How to optimize migration (export and import) of a large number of projects from one Bitbucket Server and Data Center instance to another
Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.
Support for Server* products ended on February 15, 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
The content on this page relates to platforms which are not supported; consequently, Atlassian Support cannot guarantee providing any support for it. This material is provided for your information only, and you use it at your own risk.
Summary
When migrating (exporting and importing) a large number of projects and repositories from one Bitbucket Server and Data Center instance to another, the process described on the Export and import projects and repositories page should be optimized. Without optimization, the whole process is inefficient, can take much longer than necessary, and can result in duplicate Git repositories when repository forks are present.
Environment
Bitbucket 8.9.3, but also applicable to other versions.
Solution
With many projects and repositories holding a lot of data to move, the general idea is to organize exports into smaller chunks of several projects or repositories each, and to run those exports in parallel.
We use these REST API endpoints to construct the list of projects and repositories:
- Get projects
- Get repositories for project
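As an illustration, here is a minimal Python sketch that builds that list. The instance URL and credentials are hypothetical; the endpoints are Bitbucket's standard paged `/rest/api/latest/projects` and `/rest/api/latest/projects/{key}/repos` resources.

```python
import requests

BASE_URL = "https://bitbucket.example.com"  # hypothetical instance URL
AUTH = ("admin", "secret")                  # hypothetical admin credentials

def get_paged(path, limit=100):
    """Follow Bitbucket's standard paged responses until isLastPage."""
    start = 0
    while True:
        resp = requests.get(f"{BASE_URL}{path}",
                            params={"start": start, "limit": limit},
                            auth=AUTH)
        resp.raise_for_status()
        page = resp.json()
        yield from page["values"]
        if page.get("isLastPage", True):
            return
        start = page["nextPageStart"]

# "Complete list of projects and their repositories":
# {project key: [repository slugs]}
complete_list = {}
for project in get_paged("/rest/api/latest/projects"):
    key = project["key"]
    complete_list[key] = [
        repo["slug"]
        for repo in get_paged(f"/rest/api/latest/projects/{key}/repos")
    ]
```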
Possible challenges
According to the Start export job REST API documentation, the two challenges are:
- For every selected repository, its full fork hierarchy will be considered selected, even if parts of that hierarchy would otherwise not be matched by the provided selectors. For example, when you explicitly select a single repository only, but that repository is a fork, then its origin will be exported (and eventually imported), too.
- Only 2 concurrent exports are supported per cluster node. If a request ends up on a node that is already running that many export jobs, the request will be rejected and an error returned.
Automatic selection of fork hierarchy
Fork hierarchies are always migrated fully. That means that even if we select for export only project P1 and its repository R1, if R1 itself is a fork, or there are forks of R1, the complete hierarchy will be migrated. For example, if repo R2 in project P2 is a fork of R1, it will be exported, too. Now, if another export selects P2/R2 for export, that one will again include P1/R1, and we will end up with the same repository exported and later imported twice.
In other words, this needs additional handling:
- If we know how to split projects and repositories into several chunks so there are no duplicates due to fork hierarchies, we can do the splitting manually. For example, if we are sure that forks are not created across projects, we can split on a per-project level (see the sketch after this list).
- If we don't know how to split them, we will have to apply a programmatic procedure.
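For illustration, a per-project "chunk" could be expressed as a single selector in the export request body. The `projectKey`/`slug` selector format here follows the Exporting documentation page; verify it against the docs for your version.

```python
# Hypothetical per-project export selector: the "*" slug matches every
# repository in project P1 (verify the selector format on the
# Exporting documentation page for your version).
export_request = {
    "repositoriesRequest": {
        "includes": [
            {"projectKey": "P1", "slug": "*"}
        ]
    }
}
```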
One such programmatic procedure could look like this (a Python sketch follows the list):
1. Use the Get projects and Get repositories for project REST APIs to create a "complete list of projects and their repositories".
2. Split the "complete list of projects and repositories" into smaller chunks.
3. Iterate over this list of "chunks" and call the Preview export REST API to check what would actually be exported for a given "chunk" used as a project/repo selector.
4. Add the "chunk" (export selector) to a new "list of selectors for export".
5. Check the result of the Preview export and remove from the "complete list of projects and repositories" all repositories that would be selected as part of a fork hierarchy. This ensures that automatically selected repositories won't be added again later.
6. Repeat from step 3 until you have iterated over all elements of the "complete list of projects and repositories" or that list becomes empty.
7. At the end, the "list of selectors for export" contains the selectors to use, so you can use it to launch parallel exports.
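Continuing the earlier snippet (reusing `requests`, `BASE_URL`, `AUTH`, and `complete_list`), here is a minimal sketch of that procedure. The preview endpoint path and the response field names are assumptions; verify them against the Preview export REST documentation for your version.

```python
CHUNK_SIZE = 20  # repositories per chunk; tune for your data volume

def preview_export(includes):
    """Ask Bitbucket what an export with these selectors would contain."""
    resp = requests.post(
        f"{BASE_URL}/rest/api/latest/migration/exports/preview",
        json={"repositoriesRequest": {"includes": includes}},
        auth=AUTH,
    )
    resp.raise_for_status()
    return resp.json()

# Flatten the complete list into (projectKey, slug) pairs still to export.
remaining = [(key, slug)
             for key, slugs in complete_list.items()
             for slug in slugs]
selectors_for_export = []  # the "list of selectors for export"

while remaining:
    chunk = remaining[:CHUNK_SIZE]
    includes = [{"projectKey": key, "slug": slug} for key, slug in chunk]
    selectors_for_export.append(includes)

    # Drop everything the preview reports as selected, including
    # repositories pulled in through fork hierarchies, so that no
    # repository ends up in two chunks. The "repositories" field is an
    # assumption; verify the response shape for your version.
    preview = preview_export(includes)
    selected = {(repo["project"]["key"], repo["slug"])
                for repo in preview.get("repositories", [])}
    selected.update(chunk)  # always drop the chunk itself
    remaining = [pair for pair in remaining if pair not in selected]
```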
Only 2 concurrent exports are supported per cluster node
If a request ends up on a node that is already running two export jobs, the request will be rejected, and an error will be returned. You can use that as a signal to try again, hoping your request lands on a different node.
With only two concurrent exports per cluster node, the maximal number of export chunks that can run in parallel is
2*<NUMBER_OF_CLUSTER_NODES>
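Here is a sketch of launching the exports in parallel while respecting that limit, using the `selectors_for_export` list from the previous snippet. It assumes the Start export job endpoint is `/rest/api/latest/migration/exports` and that a rejected request comes back as an HTTP error; check the exact endpoint and status code in the REST documentation for your version.

```python
import time
from concurrent.futures import ThreadPoolExecutor

NUMBER_OF_CLUSTER_NODES = 3  # hypothetical cluster size
MAX_PARALLEL = 2 * NUMBER_OF_CLUSTER_NODES

def start_export(includes, attempts=10, delay=30):
    """Start one export job, retrying when the request lands on a node
    that is already running two export jobs and gets rejected."""
    for _ in range(attempts):
        resp = requests.post(
            f"{BASE_URL}/rest/api/latest/migration/exports",
            json={"repositoriesRequest": {"includes": includes}},
            auth=AUTH,
        )
        if resp.ok:
            return resp.json()  # job details, including the export job ID
        # Rejected: wait, then retry and hope the load balancer routes
        # the request to a less busy node.
        time.sleep(delay)
    raise RuntimeError(f"Export not accepted after {attempts} attempts")

# Never submit more concurrent exports than the cluster can accept.
with ThreadPoolExecutor(max_workers=MAX_PARALLEL) as pool:
    jobs = list(pool.map(start_export, selectors_for_export))
```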
Final notes
- Please be sure to check all referenced documentation pages. They contain valuable information.
- Using a shell script for the programmatic procedure to split exports into chunks may not be optimal; a more advanced scripting tool or language can serve better.
- The Start export job REST API accepts both projects and repositories as export selectors. You can use multiple selectors to export several specific projects or repositories at once. The Exporting page has several examples.
- The Preview export REST API accepts the same selectors as Start export job.