Combining Git Repositories

By Eric Lathrop on

One of my favorite parts of Git is how it lets you fix your mistakes. One mistake I needed to fix a few times in the last year was having two separate repositories when they ought to be a single repository. I will present commands to merge a child repository into a parent repository as a subdirectory. The child repository's subdirectory will preserve its history and look like it was always part of the parent repository.

Back Up Your Repositories

Any time you break out the history-manipulation power tools, you should first make sure you have backups of the repositories you are manipulating.
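
One possible way to take such a backup is a mirror clone, which copies every ref rather than just the checked-out branch. This is only a sketch: the function name backup_repo and the backup directory name in the usage line are made up for illustration.

```shell
# Hypothetical helper: back up a repository as a bare mirror clone.
# Usage: backup_repo <repo-url-or-dir> <backup-dir>
backup_repo() {
    source_repo=$1
    backup_dir=$2
    # --mirror produces a bare repository containing all refs of the source
    # (branches, tags, and any other refs), not only the current branch.
    git clone --quiet --mirror "$source_repo" "$backup_dir"
}
```

For example, backup_repo git@github.com:user/parent-repo.git parent-repo-backup.git would leave a bare copy that can be cloned again if the rewrite goes wrong.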

Clone The Repositories Locally

Make sure you have separate fresh clones to work on:

vagrant@precise64:~$ git clone git@github.com:user/parent-repo.git
Cloning into 'parent-repo'...
remote: Counting objects: 698, done.
remote: Compressing objects: 100% (280/280), done.
remote: Total 698 (delta 442), reused 664 (delta 408)
Receiving objects: 100% (698/698), 478.25 KiB | 0 bytes/s, done.
Resolving deltas: 100% (442/442), done.
Checking connectivity... done

vagrant@precise64:~$ git clone git@github.com:user/child-repo.git
Cloning into 'child-repo'...
remote: Counting objects: 698, done.
remote: Compressing objects: 100% (280/280), done.
remote: Total 698 (delta 442), reused 664 (delta 408)
Receiving objects: 100% (698/698), 478.25 KiB | 0 bytes/s, done.
Resolving deltas: 100% (442/442), done.
Checking connectivity... done

Rewrite Child Repository into a Subdirectory

Now we will rewrite the history of child-repo so all files exist in the desired subdirectory my/new/subdir:

vagrant@precise64:~$ cd child-repo/
vagrant@precise64:~/child-repo$ git filter-branch --prune-empty --tree-filter '
> if [ ! -e my/new/subdir ]; then
>     mkdir -p my/new/subdir
>     git ls-tree --name-only $GIT_COMMIT | xargs -I files mv files my/new/subdir
> fi'
Rewrite 281f99d5b97676bdc50225c4a11bf9f47e4e6666 (80/80)
Ref 'refs/heads/master' was rewritten
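
Before importing the rewritten history, it may be worth confirming that every commit really keeps its files under the new subdirectory. The following is a minimal sketch of such a check; check_history_under is a hypothetical helper name, not part of Git, and it assumes file names without unusual characters (Git quotes those in ls-tree output, so they would be reported as strays).

```shell
# Hypothetical check: succeed only if, in every commit reachable from <ref>,
# all tracked files live under <subdir>.
# Usage: check_history_under <subdir> [<ref>]
check_history_under() {
    subdir=$1
    ref=${2:-HEAD}
    for commit in $(git rev-list "$ref"); do
        # Collect every path in this commit that is not inside the subdirectory.
        outside=$(git ls-tree -r --name-only "$commit" |
            while IFS= read -r path; do
                case $path in
                    "$subdir"/*) ;;              # already inside the subdirectory
                    *) printf '%s\n' "$path" ;;  # anything else is a stray file
                esac
            done)
        if [ -n "$outside" ]; then
            echo "commit $commit has files outside $subdir:" >&2
            printf '%s\n' "$outside" >&2
            return 1
        fi
    done
}
```

Run inside child-repo after the rewrite, check_history_under my/new/subdir should exit with status 0 and print nothing.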

Import Child Repository Commits into Parent Repository

Since a Git repository is just a bucket of objects named by the SHA1 of their contents, we can just dump the child-repo objects into parent-repo. First we add child-repo as a remote repository:

vagrant@precise64:~/child-repo$ cd ../parent-repo/
vagrant@precise64:~/parent-repo$ git remote add child-remote ../child-repo/

Then run git fetch to pull down the commits.

vagrant@precise64:~/parent-repo$ git fetch child-remote
warning: no common commits
remote: Counting objects: 699, done.
remote: Compressing objects: 100% (387/387), done.
remote: Total 699 (delta 337), reused 187 (delta 113)
Receiving objects: 100% (699/699), 71.97 KiB | 0 bytes/s, done.
Resolving deltas: 100% (337/337), done.
From ../child-repo
 * [new branch]      master     -> child-remote/master
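
The "no common commits" warning is expected here: the two histories do not share an ancestor yet. As a sketch, that can be checked explicitly with git merge-base, which exits non-zero when it finds no common ancestor. The helper name histories_unrelated is made up for this example.

```shell
# Hypothetical check: exit 0 if the two commits share no common ancestor,
# 1 if they do, and 2 if either argument does not name a commit.
# Usage: histories_unrelated <ref-a> <ref-b>
histories_unrelated() {
    # Make sure both arguments resolve to commits, so that a typo is not
    # mistaken for "unrelated".
    git rev-parse --verify --quiet "$1^{commit}" >/dev/null || return 2
    git rev-parse --verify --quiet "$2^{commit}" >/dev/null || return 2
    # git merge-base prints a common ancestor and exits 0 when one exists.
    if git merge-base "$1" "$2" >/dev/null; then
        return 1
    fi
    return 0
}
```

In parent-repo, histories_unrelated HEAD child-remote/master should succeed at this point and fail once the merge in the next step has been made.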

Merge Child Repository History into master

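Now that the child-repo commits exist alongside the parent-repo commits, we use git merge to combine the histories. The --allow-unrelated-histories flag is needed on Git 2.9 and later, which otherwise refuses to merge branches with no common ancestor: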

vagrant@precise64:~/parent-repo$ git merge --allow-unrelated-histories child-remote/master
Merge made by the 'recursive' strategy.
 my/new/subdir/some-file.rb                                     |  21 ++++++++++++++++++
<snip>
 39 files changed, 1711 insertions(+)
<snip>
 create mode 100644 my/new/subdir/some-file.rb

Congratulations! Now it looks like child-repo has always existed in the my/new/subdir directory of parent-repo! git log should now show the commits from both repositories in a single history.
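
To rehearse the whole procedure before touching real repositories, the steps above can be strung together on throwaway data. This is a sketch under several assumptions: the two one-commit repositories, the file names, and the demo identity are invented for the rehearsal, and FILTER_BRANCH_SQUELCH_WARNING only silences the deprecation notice that newer Git versions print for filter-branch.

```shell
#!/bin/sh
# Rehearsal of the combine procedure on throwaway repositories.
set -e
work=$(mktemp -d)
cd "$work"

# Identity for the throwaway commits, so the script does not depend on
# any global Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Build two unrelated one-commit repositories.
for name in parent-repo child-repo; do
    git init -q "$name"
    echo "$name" > "$name/$name.txt"
    git -C "$name" add .
    git -C "$name" commit -q -m "Initial $name commit"
done

# Rewrite child-repo so its files live under my/new/subdir.
cd child-repo
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --prune-empty --tree-filter '
if [ ! -e my/new/subdir ]; then
    mkdir -p my/new/subdir
    git ls-tree --name-only "$GIT_COMMIT" | xargs -I files mv files my/new/subdir
fi' >/dev/null
# The default branch may be called master or main depending on configuration.
child_branch=$(git rev-parse --abbrev-ref HEAD)

# Fetch the rewritten history into parent-repo and merge it.
cd ../parent-repo
git remote add child-remote ../child-repo
git fetch -q child-remote
git merge -q --allow-unrelated-histories \
    -m "Merge child-repo into my/new/subdir" "child-remote/$child_branch"

# List the tracked files: the child's file should appear under my/new/subdir.
git ls-tree -r --name-only HEAD
```

Running git log --oneline --graph in the resulting parent-repo is a quick way to see the two histories joined by the merge commit.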