Recovering from a corrupted git repo

February 23rd, 2013 by exhuma.twn

I do a lot of work on the go. Offline. Sometimes it takes a long time before I can push changes to a remote repository. As always, Murphy’s law applies, and the one repo that blows up in my face is the one with ten days’ worth of work in it.

While I was working, my laptop suddenly hung. Music looping. No mouse movement. Nothing. The only possible solution was a cold reboot. I was not worried. Everything was saved, I had only changed a few lines, and I could easily recover if something went awry. So I rebooted.

Once back in the system, I immediately wanted to do a git status and git diff. Git spat back the following error message:

jukebox$ git st
fatal: object 9bd41c2f96f295924af92a9da175cb3686f13359 is corrupted

My laptop had already shown some strange and erratic behaviour over the last few days. Earlier this week I had left a memtest running for about 24 hours, without errors. The only possible explanation left was the hard disk.

Fun times ahead! 10 days of work at risk… 10 days of important changes! Sweat building up on my forehead. Bloody sweat!

I trust my tools to keep my code safe. I trust git. I trust vim. I do microscopic commits, and I knew my current uncommitted changes only involved a few lines. So maybe only the last commit got corrupted? Let’s see…

First I dug around my existing code base. I fired up vim to look at the source and noticed some files had become empty. The 0 bytes kind of empty! Luckily vim keeps swap files and I recovered from there. The code I recovered with vim was completely correct. It even had the last line I was working on. Good. So I have my code in a good state.

But this leaves my git history. I did not want to lose all my commits, and I did not want to re-clone from the remote repo and make one huge clunky commit.

As my code was safe now, I decided to dig into git.

My understanding at that point: git saves everything into the objects folder. Commits, tree-ishes, … everything. Every commit has a pointer to its parent. So if I find the last valid commit, I can reach my history again. With that knowledge, hop in.
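To make that mental model concrete, here is a minimal sketch (not part of the original recovery) of reading a loose object by hand. It assumes the usual loose-object layout: a zlib-compressed file containing "<type> <size>", a NUL byte, and then the payload. The names read_loose_object and parents_of are made up for illustration, and packed objects are not handled.

```python
import os
import zlib


def read_loose_object(git_dir, sha):
    """Return (type, payload) for a loose object stored under git_dir/objects."""
    # The first two hex characters are the folder, the rest is the file name.
    path = os.path.join(git_dir, "objects", sha[:2], sha[2:])
    with open(path, "rb") as handle:
        # A zero-byte (corrupted) file makes zlib.decompress raise zlib.error.
        raw = zlib.decompress(handle.read())
    header, _, payload = raw.partition(b"\x00")
    obj_type, _, _size = header.partition(b" ")
    return obj_type.decode("ascii"), payload


def parents_of(commit_payload):
    """Extract the parent hashes from the header lines of a commit payload."""
    parents = []
    for line in commit_payload.split(b"\n"):
        if not line:
            # A blank line ends the header; the commit message follows.
            break
        if line.startswith(b"parent "):
            parents.append(line[len(b"parent "):].decode("ascii"))
    return parents
```

Following the parent hashes from one valid commit to the next is what walks the history backwards.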

So the plan of action was simple: Sort objects by date, and from there, find the latest commit. And somehow get rid of the corrupted objects (as it turned out, the corrupted objects were also 0-byte files).
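Since the corrupted objects turned out to be 0-byte files, they can also be found without looking at dates at all. The following is only a sketch under that assumption (an object can be corrupted without being empty); find_empty_objects is a hypothetical helper name, not something from git itself.

```python
import os


def find_empty_objects(git_dir):
    """Yield (sha, path) for every zero-byte loose object under git_dir/objects."""
    objects_dir = os.path.join(git_dir, "objects")
    for head in sorted(os.listdir(objects_dir)):
        # Loose objects live in two-character folders; skip "pack" and "info".
        if len(head) != 2:
            continue
        folder = os.path.join(objects_dir, head)
        for tail in sorted(os.listdir(folder)):
            path = os.path.join(folder, tail)
            if os.path.getsize(path) == 0:
                # Glue folder and file name back together to get the full hash.
                yield head + tail, path
```

Calling list(find_empty_objects(".git")) from the top of the working copy gives the candidates to inspect before deleting anything.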

To find the latest commit, I decided to write a quick-and-dirty (… very dirty) Python script to order the objects by date. Let me quickly add some comments for readability’s sake:

# This script needs to be executed from inside the ".git" folder
# as it uses relative paths.

from os import listdir as ls, stat
from os.path import join

# Get the list of objects as path names.
objects = []
for head in ls('objects'):
    # Loose objects live in two-character folders. Skip everything
    # else (for example "pack" and "info").
    if len(head) != 2:
        continue
    objects.extend([join('objects', head, _) for _ in ls(join('objects', head))])

# In ``objects`` we now have a list, where each item is a
# relative path name to an object. Create a second list
# containing file system meta info (including mtime)
stats = [stat(obj) for obj in objects]

# Combine both lists, so we keep (object, metadata) together
# when sorting
object_meta = zip(objects, stats)

# Sort by ``mtime``
sorted_meta = sorted(object_meta, key=lambda item: item[1].st_mtime)

# Print the result. We'll do the rest by hand!
# Because git cuts off the first two characters of the object name
# and uses them as super-folder, we "glue" it back together when
# printing, so we can more easily copy/paste it later.
for fname, meta in sorted_meta:
    print("%s %s %s" % ("".join(fname.split('/')[-2:]), meta.st_mtime, meta.st_ctime))

So far so good. Running this gives me a list of objects, sorted by date. Example output:

.git$ python /home/exhuma/work/ | tail
a9440d0a32451efce4f78b59076d0db748b9fd77 1361466581.0 1361620964.0
24b1869bbbb63f2e6fb831bd55c0be6953443b39 1361466592.0 1361620962.0
7a603cf0091ff4ac8c139ca0ea767f806fa03dcd 1361517228.0 1361620963.0
12390d3fcc6521e10376a9e14f440cdd7825c7a3 1361517240.0 1361620962.0
7807c38aad24324992cd8b5a8d70ba8f8469a4b7 1361517251.0 1361620963.0
df6834af160328695cd5b0d1951cecc48aa10312 1361517253.0 1361620964.0
410a68fb0ee8433341664342f7a76275a724dda1 1361517253.0 1361620962.0
1120de0296566321287a7a7808d836e7103b3c7e 1361517253.0 1361620962.0
74ef956de09a18519489c0f7c2f516d9eecc2bd1 1361517253.0 1361620963.0
9bd41c2f96f295924af92a9da175cb3686f13359 1361517279.0 1361620963.0

Would you look at that: 9bd41c2f96f295924af92a9da175cb3686f13359 is the last saved object. The same as reported by the git status I ran in the beginning!

For the next part, I ran git show starting from the end, writing down each commit which was corrupted. This gave me the list of the following objects:


More than I expected :( … But still… let’s continue:

It turned out, they were all afflicted by the same illness. All had zero bytes. So lost… Forever! To get git working again, I simply deleted them, and re-ran git status:

jukebox$ git status
fatal: bad object HEAD

Hmmm… the corrupted object is gone… but now git cannot find HEAD? But that’s only a simple file in .git pointing to something in .git/refs/heads… surely it hasn’t…

jukebox$ cat .git/HEAD
ref: refs/heads/backend

jukebox$ cat .git/refs/heads/backend
cat: .git/refs/heads/backend: No such file or directory

Well fuck! Well, essentially, not too bad. refs/heads/backend is just a file containing an object name (hash). So all I would need to do is to find the proper hash and put it into the file. But let’s first dig into the git docs. I vaguely remember a git fsck command. And sure enough. It exists, with a --lost-found option. Sounds about right. Let’s run it:

jukebox$ git fsck --lost-found
error: HEAD: invalid sha1 pointer 9bd41c2f96f295924af92a9da175cb3686f13359
error: refs/heads/backend does not point to a valid object!
dangling commit eb85ee879fadc9a11c1d8df7ee003a97225f90c5
missing blob 7807c38aad24324992cd8b5a8d70ba8f8469a4b7
dangling commit 24b1869bbbb63f2e6fb831bd55c0be6953443b39
dangling commit 353f17ca0fc30887156981352c97ef1ecc5b7047
dangling commit c7c40161ccb3c68b52b1a58f55febd728bb53a52
dangling blob ed5e9f9f1ca1544da82110a7431d13267b2b4d98
missing blob 7a603cf0091ff4ac8c139ca0ea767f806fa03dcd
dangling blob 29f70899369a83894e947d4a18da4a49078522b9
dangling commit 2a79a7cc553640f1cb5e58775dc59724a4436289

According to the docs, this should give me something in .git/lost-found:

jukebox$ tree .git/lost-found/
├── commit
│   ├── 24b1869bbbb63f2e6fb831bd55c0be6953443b39
│   ├── 2a79a7cc553640f1cb5e58775dc59724a4436289
│   ├── 353f17ca0fc30887156981352c97ef1ecc5b7047
│   ├── c7c40161ccb3c68b52b1a58f55febd728bb53a52
│   └── eb85ee879fadc9a11c1d8df7ee003a97225f90c5
└── other
    ├── 29f70899369a83894e947d4a18da4a49078522b9
    └── ed5e9f9f1ca1544da82110a7431d13267b2b4d98

2 directories, 7 files

Interesting. 5 commits, and two file blobs (as it turns out). This is looking good! I was working on the “backend” branch. And only on that branch, so the error above with refs/heads/backend is correct.
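The relationship between HEAD and the branch file can be checked with a few lines of code. This is a minimal sketch that only looks at loose ref files; it deliberately ignores refs stored in .git/packed-refs, and resolve_head is a name invented for this example.

```python
import os


def resolve_head(git_dir):
    """Return (ref_name, sha); sha is None when the ref file is missing."""
    with open(os.path.join(git_dir, "HEAD")) as handle:
        content = handle.read().strip()
    if not content.startswith("ref: "):
        # Detached HEAD: the file holds an object hash directly.
        return None, content
    ref_name = content[len("ref: "):]
    ref_path = os.path.join(git_dir, *ref_name.split("/"))
    if not os.path.isfile(ref_path):
        # This is the situation described above: HEAD is fine, its target is gone.
        return ref_name, None
    with open(ref_path) as handle:
        return ref_name, handle.read().strip()
```

For the repository in this post, the result would have been the ref name refs/heads/backend together with None, which matches the fsck complaint.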

In theory, all that is left to do is look at the commits in lost-found, take the last one (by date), make refs/heads/backend point to that one, and make HEAD point to the proper ref. Finding the last one is easy: run git show on each object and note the dates.

# NOTE: The ##*/ magic is a bash parameter expansion that extracts the basename!
jukebox$ for object in .git/lost-found/commit/*; do git show "${object##*/}" | head -n 3; done
commit 24b1869bbbb63f2e6fb831bd55c0be6953443b39
Author: Michel Albert <#############>
Date:   Thu Feb 21 18:09:41 2013 +0100
commit 2a79a7cc553640f1cb5e58775dc59724a4436289
Author: Michel Albert <#############>
Date:   Tue Jan 22 07:57:54 2013 +0100
commit 353f17ca0fc30887156981352c97ef1ecc5b7047
Author: Michel Albert <#############>
Date:   Tue Feb 19 18:45:40 2013 +0100
commit c7c40161ccb3c68b52b1a58f55febd728bb53a52
Author: Michel Albert <#############>
Date:   Tue Feb 19 18:45:40 2013 +0100
commit eb85ee879fadc9a11c1d8df7ee003a97225f90c5
Author: Michel Albert <#############>
Date:   Mon Feb 11 18:26:29 2013 +0100

So, 24b1869bbbb63f2e6fb831bd55c0be6953443b39 is apparently the most recent commit!
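With only five candidates, comparing the dates by eye is fine. For a longer list, the same comparison could be scripted. The sketch below parses header text like the output above; it assumes git's default date format with English day and month names, and latest_commit is a hypothetical helper name.

```python
from datetime import datetime


def latest_commit(show_output):
    """Return (sha, date) of the newest commit found in `git show` header text."""
    commits = []
    current = None
    for line in show_output.splitlines():
        if line.startswith("commit "):
            current = line.split()[1]
        elif line.startswith("Date:") and current is not None:
            stamp = line[len("Date:"):].strip()
            # Default format, e.g. "Thu Feb 21 18:09:41 2013 +0100".
            when = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y %z")
            commits.append((when, current))
            current = None
    # Tuples compare by date first, so max() picks the most recent commit.
    when, sha = max(commits)
    return sha, when
```

Note that the Date line printed by git show is the author date, so this picks the commit that was authored last.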

Let’s recover HEAD. The file itself still contains the correct pointer; only the pointer target (refs/heads/backend) is missing, so I recreate it with the hash of the most recent commit:

jukebox$ echo 24b1869bbbb63f2e6fb831bd55c0be6953443b39 > .git/refs/heads/backend

And tadaa:

jukebox$ git st
error: unable to find 7a603cf0091ff4ac8c139ca0ea767f806fa03dcd
 M logging.ini
MM wickedjukebox/
 M wickedjukebox/core/
M  wickedjukebox/demon/players/
?? wjb_log_conf/

This is already a lot better. There are still some errors. But the important thing is that the history between the local backend and origin/backend is back:

* 24b1869 - (HEAD, backend) Icecast admin access fixed. (2 days ago) <Michel Albert>
* 653e4eb - State updates should be handled by the channel! (2 days ago) <Michel Albert>
* a3df02a - Added a default logging config to the distribution. (2 days ago) <Michel Albert>
* ca46885 - Further work on logging. Less verbose default. (2 days ago) <Michel Albert>
* 6a02bde - Additional code block based on ``nextSong`` refactored. (2 days ago) <Michel Albert>
* 140e2b1 - Minor bugfixes. (2 days ago) <Michel Albert>
* ab3893d - Duplicate code refactored into ``getNextSong`` (2 days ago) <Michel Albert>
* 2b7a0aa - Logging fix and simplification. (2 days ago) <Michel Albert>
* bf253d3 - New icecast player hooked into core channel. (3 days ago) <Michel Albert>
* f02ac0e - PEP8 indentation fix (4 days ago) <Michel Albert>
* 09eaaac - More tangible test-code. (4 days ago) <Michel Albert>
* a1792fb - Limiting memory usage (might be too small). (4 days ago) <Michel Albert>
* 11a1fde - Import cleanup (4 days ago) <Michel Albert>
* e2a4d5c - Sending data to the icecast server. (4 days ago) <Michel Albert>
* 24c30a6 - Example production code. (4 days ago) <Michel Albert>
* 85c2168 - Handling 'paused' and 'stopped' state. (4 days ago) <Michel Albert>
* 80adf42 - Disconnecting from icecast at exit. (4 days ago) <Michel Albert>
* d7a0ef8 - Connecting to icecast. (4 days ago) <Michel Albert>
* 0c4c488 - Less verbose logging when testing. (4 days ago) <Michel Albert>
* ac8a6d5 - New icecase implementation (not yet finished) (5 days ago) <Michel Albert>
* d1f1662 - Added the "build" folder to the ignore list. (11 days ago) <Michel Albert>
* 03b9e2b - My stab at a DB bootstrap script for the frontend. (11 days ago) <Michel Albert>
* bda8d02 - (origin/backend) Added the "pinnedIp" column to the database. (11 days ago) <Michel Albert>

This is all I wanted. I have now pushed this history back to the server, and to make sure everything is clean again, I will simply re-clone from the server, apply the pending local changes, and be back to a clean workspace.

All in all, in retrospect, the whole operation was very easy, and I am glad that I did not lose my history.

I will need to get myself a new HDD for my laptop. That much is certain! But I am also playing with the idea of setting up a git hook to automatically sync to a connected USB stick. That way, I would have my commits on two different file systems while offline. Should not be too difficult to set up… We’ll see.
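As a rough idea of what such a hook could look like, here is an untested sketch. It assumes a remote called "usb" has been added beforehand and points at a bare repository on the stick; both that remote name and the sync_to_usb helper are made up for this example.

```python
import os
import subprocess


def sync_to_usb(repo_dir, mount_point, remote="usb"):
    """Push all branches to `remote` if the stick is mounted; return True on success."""
    if not os.path.ismount(mount_point):
        # Stick not plugged in: skip quietly so committing is never blocked.
        return False
    # "usb" is an assumed remote name; it must exist in the repository config.
    result = subprocess.call(
        ["git", "push", "--quiet", remote, "--all"], cwd=repo_dir
    )
    return result == 0
```

An executable .git/hooks/post-commit script could then call sync_to_usb with the working copy and the mount point of the stick (for example a hypothetical /media/usb), so every commit is mirrored as soon as it is made.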
