Repairing a corrupt Git repo using a clone

Posted on February 24, 2016

Quite recently I managed to make myself a corrupt git repository due to a file system failure. See, git stores everything in content addressable blobs - the file name of something is it’s hash. Which lends itself nicely to checking repository integrity - it keeps out malicious attackers as well as my file system problems.

I already hear you saying: Why not just make a new clone, git is distributed anyway? Well, I wasn’t diligent enough to push everything. I had local commits that were quite important, so I spent some time fixing it.


Git has a command to manually check integrity of the repository: git fsck. Running it lists all the errors.

Luckily in my case the list was quite short so I went ahead and deleted all the objects that were listed as corrupted. So now my objects are fine, but I’m missing some. Luckily (again) corrupted objects did not contain any data pertaining to unpushed commits so I thought I can use a close to restore them.


So I lied a bit, git doesn’t store every blob in a separate file, that would become huge pretty quickly. Instead it uses packfiles. It packs several blobs into one file and does delta compression to reduce disk usage. So I cannot just copy over blobs from a clone.

Fortunately git has commands for dealing with packfiles as well. The one of interest is git unpack-file which takes a packfile, extracts all the blobs and dumps them into the repo. Potentially producing loose objects, but let’s not care about that for a second.

So I made a bare clone from github

And just unpacked everything

And it worked! git fsck did not complain anymore. Well at least not about garbage and corruption - just loose objects.

But that is easy to clean up: just prune them

And do a GC to re-compress.

Any my repo integrity is back!

Facebook Twitter Google Digg Reddit LinkedIn StumbleUpon Email
comments powered by Disqus