How To: Check your repository's size and identify large files

Platform notice: Cloud only. This article applies only to Atlassian products on the cloud platform.


Summary

Bitbucket Cloud enforces a 4 GB size limit on repositories. A repository that exceeds this limit is set to read-only, which can block development work. To avoid this, follow the steps below to find out which files take up the most space and to troubleshoot why the repository has grown.

Environment

The steps outlined in this article are applicable for any installation of Bitbucket (Cloud, Server, and Data Center) using Git for source control versioning.

Solution

Discover the large files in a Git repo

The following command lists the files reachable from your repository's HEAD, largest first. Each output line shows the file mode, the object type, the blob's hash ID, the size in bytes, and the file name.

$ git ls-tree -r --long HEAD | sort -k 4 -n -r | less
100644 blob 25b6ebb7bfc76200ba96bee52cae9cb49113bef4 6122845	hugeFile.png
100644 blob 9b5b369768594badbad98f2566a00e35ef61e14f     592	.gitattributes
100644 blob 879b112cd96d01f605d0e380e0c9c00bfd2eb83a     127	jira.txt
100644 blob 5029834def1b27d2f2107b51aac14fe5f75d9da0     127	test.sql
100644 blob 5029834def1b27d2f2107b51aac14fe5f75d9da0     127	test.backup
100644 blob 557db03de997c86a4a028e1ebd3a1ceb225be238      12	test.txt
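
The listing above covers only files present in the current HEAD. A file that was committed and later deleted still takes up space in the history. As a minimal sketch, the pipeline below lists the largest blobs reachable from any ref. The function names filter_largest_blobs and largest_blobs_in_history are hypothetical helpers introduced for illustration, not Git commands, and the default of 10 results is an arbitrary choice.

```shell
# Hypothetical helper: keep only blob lines from
# "type hash size path" input and print the N largest (default 10).
filter_largest_blobs() {
  awk '$1 == "blob"' | sort -k3,3nr | head -n "${1:-10}"
}

# Hypothetical helper: list the largest blobs reachable from any ref.
# git rev-list --objects prints "hash path" for every reachable object;
# git cat-file --batch-check adds the type and size, and %(rest) carries
# the path through unchanged.
largest_blobs_in_history() {
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
    filter_largest_blobs "${1:-10}"
}
```

Run largest_blobs_in_history 5 inside a clone to print the five largest blobs as "blob <hash> <size in bytes> <path>". The size is the uncompressed object size, so it indicates which files matter most rather than their exact on-disk cost.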

Discover commits with the large file

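Using the blob hash from the output above, you can find every commit that contains that blob. You need two things: the hash, and the path of the directory that holds the file. In this example the file sits in the repository root, so the path is ./. Because git ls-tree lists only the entries directly under the path it is given, use the file's own directory here.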

$ git log --all --pretty=format:%H -- ./ | \
	xargs -I% sh -c "git ls-tree % -- ./ | 
	grep -q 25b6ebb7bfc76200ba96bee52cae9cb49113bef4 && echo %"
a1a23cca2e2d379c1b8162c536f8753fad0bd1ae
$
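
If you do not know which directory the file lived in, a recursive variant of the same idea can help. The sketch below is an assumption-labeled alternative rather than the command used above: commits_with_blob is a hypothetical function name, and it walks every commit reachable from any ref, so it can be slow on large histories.

```shell
# Hypothetical helper: print every commit (on any ref) whose tree
# contains the given blob hash, searching all subdirectories.
commits_with_blob() {
  blob=$1
  git rev-list --all | while read -r commit; do
    # -r makes ls-tree recurse into subtrees; -F matches the hash literally.
    if git ls-tree -r "$commit" | grep -qF "$blob"; then
      echo "$commit"
    fi
  done
}
```

For example, commits_with_blob 25b6ebb7bfc76200ba96bee52cae9cb49113bef4 prints one commit hash per line, or nothing if no reachable commit contains the blob.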

Find branches that contain a commit

This command shows which branches are affected by the large file. If the file exists in only one or two branches, you can delete those branches to remove it. Otherwise, follow the guidance on reducing your repository size.

$ git branch -a --contains a1a23cca2e2d379c1b8162c536f8753fad0bd1ae
* main
  test
$
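
git branch reports branches only, but a tag can also keep a commit reachable. As a hedged sketch, the function below lists every ref of any kind that contains a commit. The name refs_containing is hypothetical, and the sketch assumes a Git version in which git for-each-ref supports --contains.

```shell
# Hypothetical helper: list all refs (branches, remote-tracking
# branches, tags) whose history contains the given commit.
refs_containing() {
  git for-each-ref --contains="$1" --format='%(refname)'
}
```

Calling refs_containing a1a23cca2e2d379c1b8162c536f8753fad0bd1ae prints full ref names such as refs/heads/main, one per line.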

List the total size of HEAD

This command uses the output of git ls-tree to sum the sizes of all files reachable from the repository's HEAD. The result is the total in bytes.

$ git ls-tree -r --long HEAD | awk '{sum+=$4} END {print sum}'
7833793
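
A raw byte count can be hard to read at a glance. The sketch below is one possible variation on the awk summation: it also prints the total in MiB and skips entries whose size column is not a number, which is how submodule entries appear in git ls-tree --long output. The function name sum_tree_sizes is hypothetical.

```shell
# Hypothetical helper: read `git ls-tree -r --long` output on stdin and
# print the total size in bytes and in MiB. Entries without a numeric
# size (submodules show "-") are ignored.
sum_tree_sizes() {
  awk '$4 ~ /^[0-9]+$/ { sum += $4 }
       END { printf "%.0f bytes (%.2f MiB)\n", sum, sum / 1048576 }'
}
```

Use it as git ls-tree -r --long HEAD | sum_tree_sizes.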

Check the repo’s size and the number of objects

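The git count-objects command reports how many objects the local repository holds and how much disk space they use. In its output, count is the number of loose (unpacked) objects and size is the disk space they consume, while in-pack and size-pack report the same figures for packed objects. The output below shows that the local repository holds 3 loose objects taking up 15.72 MiB, and that nothing has been packed yet.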

$ git count-objects -vH
count: 3
size: 15.72 MiB
in-pack: 0
packs: 0
size-pack: 0 bytes
prune-packable: 0
garbage: 0
size-garbage: 0 bytes

Note: This command calculates the repository size based on the objects contained within the local clone. For a size that more closely matches what the remote reports, run this command on a mirror clone.
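
The mirror-clone check can be sketched as follows. This is a minimal illustration, not an official procedure: mirror_size is a hypothetical function name, and the URL you pass to it is your own repository's clone URL.

```shell
# Hypothetical helper: make a temporary mirror clone of the given
# repository URL, report its object counts and sizes, then clean up.
mirror_size() {
  url=$1
  workdir=$(mktemp -d) || return 1
  git clone --quiet --mirror "$url" "$workdir/repo.git" &&
    git -C "$workdir/repo.git" count-objects -vH
  status=$?
  rm -rf "$workdir"
  return "$status"
}
```

Call it as mirror_size <your-repository-URL>. A mirror clone fetches every ref from the remote, so the figures cover branches and tags that an ordinary clone may not have checked out locally.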

After you have identified what has caused the repository size to increase, you can follow the appropriate steps to reduce your repository size.


Last updated: April 8, 2024