Recommended Action Plan if a Repository Becomes Corrupted on Bitbucket Server
This is a general recommendation, based on a real-life example, for situations where a repository on your Bitbucket Server instance becomes corrupted for some reason.
According to our Support reports, no customer has yet encountered a situation where one of Bitbucket Server's repositories became corrupted. Git's internals are very robust, so as long as you are using Bitbucket Server normally, it's very unlikely that one of the repositories will become damaged, short of some form of disk failure.
But what if one does? Unfortunately, there's no "one size fits all" approach here. The best and recommended approach is to contact Atlassian Support at https://support.atlassian.com by lodging a Support request with us.
If you are still curious to know more, here are some of the steps carried out internally by our Bitbucket Server development team when repository corruption was detected while developing new Bitbucket Server features. Development was able to recover (without losing any commits) using a mix of two strategies:
- They shut down Bitbucket Server periodically and zip up their repositories. To reduce the downtime, they just copy the data to a separate location while Bitbucket Server is down (which is very fast), and then compress it after Bitbucket Server has been brought back online. This allows them to back up their repositories (some of which are quite large, for test data) with just a few minutes of downtime (generally 5 minutes or less).
- Every developer's local clone is, in effect, a backup of the Bitbucket Server repository. A clone includes the full history of every reachable ref, which makes clones very useful for "restoring" repositories.
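The first strategy above (copy while the server is down, compress once it is back up) might be sketched as follows. This is only an illustration under stated assumptions: BITBUCKET_STOP_CMD and BITBUCKET_START_CMD are hypothetical placeholders for however you stop and start your instance, the function names are made up for this example, and tar with gzip is used here in place of zip.

```shell
# Sketch only. BITBUCKET_STOP_CMD and BITBUCKET_START_CMD are hypothetical
# placeholders; set them to whatever stops and starts your instance.
# They default to "true" (a no-op) so the functions can be tried safely.
: "${BITBUCKET_STOP_CMD:=true}"
: "${BITBUCKET_START_CMD:=true}"

# Copy the repositories directory to a snapshot location while the server is down.
# $1: source directory (for example the repositories directory in BITBUCKET_HOME)
# $2: snapshot directory
snapshot_repositories() {
  src=$1; dest=$2
  mkdir -p "$dest" || return 1
  $BITBUCKET_STOP_CMD || return 1
  cp -a "$src/." "$dest/"; status=$?   # -a preserves modes and timestamps
  $BITBUCKET_START_CMD || return 1     # restart even if the copy failed
  return "$status"
}

# Compress the snapshot after the server is back online.
# $1: snapshot directory, $2: archive file to create
compress_snapshot() {
  tar -czf "$2" -C "$1" .
}
```

Keeping the compression step out of the downtime window is the point of the split: only the plain copy happens while Bitbucket Server is stopped.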
The most complicated restore that development had to do internally (which, again, was accomplished without losing any data) started by unpacking the zipped repository to serve as a base. They then used scp to copy the .pack and .idx files from some developers' machines into that unpacked repository. A simple git gc eliminated all the duplicate and corrupt objects and produced a single clean, fully functional repository. They replaced the repository directory in their BITBUCKET_HOME with the rebuilt repository, and development continued without missing a beat.
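The restore described above could be sketched roughly as follows. The function names are hypothetical, the pack files are assumed to have already been copied from a developer's machine (for example with scp) into a local directory, and running git fsck after git gc is an extra check added here rather than something development is stated to have done.

```shell
# Copy every pack and its index from a developer clone into the restored repository.
# $1: a directory holding the clone's .pack and .idx files
# $2: the unpacked repository directory that serves as the base
import_packs() {
  clone_pack_dir=$1; repo_dir=$2
  mkdir -p "$repo_dir/objects/pack" || return 1
  for f in "$clone_pack_dir"/*.pack "$clone_pack_dir"/*.idx; do
    [ -e "$f" ] || continue            # skip glob patterns that matched nothing
    cp -p "$f" "$repo_dir/objects/pack/" || return 1
  done
}

# Consolidate all objects into a fresh pack, then verify the result.
# $1: the repository directory
repack_and_verify() {
  git -C "$1" gc && git -C "$1" fsck --no-dangling
}
```

Only after repack_and_verify succeeds would the rebuilt directory be swapped into BITBUCKET_HOME, with Bitbucket Server stopped during the swap.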
This process involves removing/deleting files. Please make sure you've created backups before you begin to avoid data loss.
Determine the numeric ID and file location of the corrupt repository. Both appear in the Git client output as well as on the repository's settings page.
> git push origin my-branch
Counting objects: 6, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (6/6), done.
Writing objects: 100% (6/6), 664 bytes | 0 bytes/s, done.
Total 6 (delta 2), reused 0 (delta 0)
remote: error: object file /opt/bitbucket-home/shared/data/repositories/941/./objects/incoming-359Foi/0e/9fd1040b4080bb2986f2c4788d5cf04509fe2f is empty
remote: fatal: loose object 0e9fd1040b4080bb2986f2c4788d5cf04509fe2f (stored in /opt/bitbucket-home/shared/data/repositories/941/./objects/incoming-359Foi/0e/9fd1040b4080bb2986f2c4788d5cf04509fe2f) is corrupt
error: object file /opt/bitbucket-home/shared/data/repositories/941/./objects/incoming-359Foi/0e/9fd1040b4080bb2986f2c4788d5cf04509fe2f is empty
fatal: loose object 0e9fd1040b4080bb2986f2c4788d5cf04509fe2f (stored in /opt/bitbucket-home/shared/data/repositories/941/./objects/incoming-359Foi/0e/9fd1040b4080bb2986f2c4788d5cf04509fe2f) is corrupt
To ssh://bitbucket-server.my-company.com:7997/proj/repo.git
 ! [remote rejected] my-branch -> my-branch (missing necessary objects)
error: failed to push some refs to 'ssh://email@example.com:7997/proj/repo.git'
In the output above you can see that the repository's file location is /opt/bitbucket-home/shared/data/repositories/941, so its numeric ID is 941.
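If you have the client output saved to a file, the location can be pulled out mechanically. The helpers below are a hypothetical sketch, not part of Bitbucket Server or Git. They assume the error lines contain a path of the form .../repositories/<numeric id>/..., as in the example output.

```shell
# Read Git client output on stdin and print the first repository directory found.
# Assumes the path contains "/repositories/<numeric id>/" as in the example above.
repo_dir_from_error() {
  sed -n 's|.*[[:space:]]\(/[^[:space:]]*/repositories/[0-9][0-9]*\)/.*|\1|p' | head -n 1
}

# The numeric ID is the last component of that directory.
repo_id_from_dir() {
  basename "$1"
}
```

For example, piping the saved push output through repo_dir_from_error and passing the result to repo_id_from_dir yields the directory and the ID that the following steps operate on.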
Open a terminal on your Bitbucket Server instance (perhaps via SSH) and cd to the repository data directory.
Run git fsck --no-dangling and inspect the output. If pushes are still modifying the repository as you work, git fsck may report transient errors that you might mistake for real corruption. Before actually modifying the repository, make sure that the failures you see are old and not part of an in-progress push.
user@server:/opt/bitbucket-home/shared/data/repositories/941$ sudo -u bitbucket_user git fsck --no-dangling
- "empty" objects
Run ls -lh ./objects/0e/9fd1040b4080bb2986f2c4788d5cf04509fe2f to ensure that:
- The object really is empty, with no content
- The timestamp isn't close to "now" (i.e. run date and compare). If it's close to now or not empty, it might be coming from an in-progress push, so don't remove it!
- If the object is old and still empty, move it out of the repository's data directory or just delete it.
- Run git fsck --no-dangling again to check for more problems.
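The "really empty and not recent" check above can be expressed as a small helper. This is a sketch with a hypothetical function name. The 60-minute default threshold is an arbitrary assumption, not a Bitbucket Server rule, and -mmin is a widely supported find extension rather than part of POSIX.

```shell
# Succeeds only if the file is zero bytes long AND was last modified more than
# $2 minutes ago (default 60, an arbitrary threshold chosen for this sketch).
# $1: path to the loose object file
is_stale_empty_object() {
  [ -n "$(find "$1" -type f -size 0 -mmin +"${2:-60}" 2>/dev/null)" ]
}
```

When the check succeeds, moving the file to a location outside the repository's data directory is safer than deleting it, since it can be put back if the diagnosis turns out to be wrong.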
- missing objects
user@server:/opt/bitbucket-home/shared/data/repositories/941$ sudo -u bitbucket_user git fsck --no-dangling
Checking object directories: 100% (256/256), done.
Checking object directories: 100% (256/256), done.
Checking objects: 100% (5710164/5710164), done.
broken link from tree 5472414fa37d1db2dac2718e64f35035285bfd43 to blob badef791d6d4b90cbc02cae03d7bbdd390458103
Checking connectivity: 4238598, done.
missing blob badef791d6d4b90cbc02cae03d7bbdd390458103
It's possible that the missing object is part of an in-progress push. Run git cat-file on it a little later to see whether it is still missing.
user@server:/opt/bitbucket-home/shared/data/repositories/941$ sudo -u bitbucket_user git cat-file -p badef791d6d4b90cbc02cae03d7bbdd390458103
fatal: Not a valid object name badef791d6d4b90cbc02cae03d7bbdd390458103
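The "check again a little later" step can be scripted so that a slow push is not mistaken for a missing object. The sketch below uses hypothetical function names, and the retry count and delay are arbitrary assumptions. It relies on git cat-file -e, which exits with zero status when the object exists and is valid.

```shell
# Returns 0 if the object exists in the repository, non-zero otherwise.
# $1: repository directory, $2: object hash
object_present() {
  git -C "$1" cat-file -e "$2" 2>/dev/null
}

# Re-check several times with a pause in between, in case a push is in progress.
# Returns 0 if the object is STILL missing after all attempts, 1 if it appeared.
# $3: number of attempts (default 3), $4: seconds between attempts (default 60)
still_missing_after_wait() {
  repo=$1; sha=$2; tries=${3:-3}; delay=${4:-60}
  while [ "$tries" -gt 0 ]; do
    object_present "$repo" "$sha" && return 1   # the object showed up
    tries=$((tries - 1))
    [ "$tries" -gt 0 ] && sleep "$delay"
  done
  return 0
}
```

As in the transcript above, the commands would normally be run as the user that owns the repository files.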
- If it is still missing, you'll have to get that object from one of the developers who has recently pushed to the repository. "Recovering" those objects is accomplished by uploading the missing objects to the server, either as individual loose objects or, for larger numbers of missing objects, as packs. If a pack is uploaded, any duplication between objects in that pack and in existing packs can be resolved by a simple git gc, which will consolidate objects into a new pack and remove the old ones.
Git stores objects in folders based on the first two characters of their hash, so badef791d6d4b90cbc02cae03d7bbdd390458103 should be copied into ./objects/ba/def791d6d4b90cbc02cae03d7bbdd390458103. Double check the file permissions and use git cat-file to check that Git can recognize the object.
user@server:/opt/bitbucket-home/shared/data/repositories/941$ sudo -u bitbucket_user cp /tmp/def791d6d4b90cbc02cae03d7bbdd390458103 objects/ba
user@server:/opt/bitbucket-home/shared/data/repositories/941$ sudo -u bitbucket_user ls -lh objects/ba/def791d6d4b90cbc02cae03d7bbdd390458103
-rw-r--r-- 1 bitbucket_user bitbucket_user 626 Feb 1 06:42 objects/ba/def791d6d4b90cbc02cae03d7bbdd390458103
user@server:/opt/bitbucket-home/shared/data/repositories/941$ sudo -u bitbucket_user git cat-file -t badef791d6d4b90cbc02cae03d7bbdd390458103
blob
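The mapping from an object hash to its loose-object location is mechanical, so it can be computed rather than typed by hand. The helper name below is hypothetical. It simply splits the hash after the first two characters, as described above.

```shell
# Print the loose-object path, relative to the repository directory, for a hash.
# The first two characters name the folder; the rest is the file name.
loose_object_path() {
  sha=$1
  prefix=$(printf '%s' "$sha" | cut -c1-2)
  rest=$(printf '%s' "$sha" | cut -c3-)
  printf 'objects/%s/%s\n' "$prefix" "$rest"
}
```

For the hash in the example, loose_object_path badef791d6d4b90cbc02cae03d7bbdd390458103 prints objects/ba/def791d6d4b90cbc02cae03d7bbdd390458103, which matches the destination used in the transcript.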
- Run git fsck --no-dangling again to check for more problems.
- clean/successful output
- You're done! The repository is no longer corrupt.