How to obtain developer productivity statistics from Bitbucket Data Center
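Platform notice: Data Center only - This article applies only to Atlassian products on the Data Center platform.
This knowledge base article was written for the Data Center version of the product. Data Center knowledge base articles for features that are not specific to Data Center may also work with the Server version of the product, but they have not been tested. Support for Server* products ended on February 15, 2024. If you are running a Server product, review your migration options on the Atlassian Server end of support announcement page.
*Except Fisheye and Crucible
Notes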
This document was created on a best-effort basis because Atlassian Support sometimes receives queries from customers about obtaining developer productivity statistics. However:
- Bitbucket Data Center is primarily an application for hosting Git repositories, with governance surrounding access controls and workflow management. It is not a tool to measure developer productivity.
- The information in this document is presented as-is.
- Atlassian does not guarantee that the data created using this document is correct or fit for any purpose.
- Atlassian will not provide any support regarding the information in this document and will not answer support requests raised in relation to it.
Summary
Depending on your definition of "developer productivity", you might be interested in retrieving certain types of statistical data from your Bitbucket Data Center instance in order to measure the work of users.
Bitbucket Data Center does not provide any out-of-the-box means of generating such reports. However, there are ways to extract data from it that you can then use to build these reports.
This means that regardless of which report you are trying to compile, you will always need to use some additional third-party (or self-developed) tooling to process that data.
With this in mind, this document describes some common use-cases and gives some pointers to resources that might help you achieve your goal.
Solution
Extracting the Data from Bitbucket Data Center
The type of report you are trying to put together determines which data you will need to extract from Bitbucket Data Center.
Data Pipeline
Starting with Bitbucket Data Center 7.13 you can extract all of the data in an instance (except repository contents) for the purpose of reporting.
This feature (which requires a Data Center license) is called Data Pipeline, and you can read more about it at Data Pipeline. Note that some data is only available from version 7.15; see Availability at Data pipeline export schema.
This data is very comprehensive. If, for instance, you only require metadata about commits (such as who made which commits to which repositories and when) or about pull requests, it will most likely be sufficient for your reporting purposes.
Repository contents via git
If the reports you want to generate require more in-depth data regarding contributions to repositories (i.e. commits) you may need to use git commands to extract this kind of data directly from the repositories.
This is due to the fact that this data is not stored in Bitbucket Data Center's database but only in the repositories on disk.
Run any such git commands on a clone of the repository (for instance, on a workstation) and not on the original repository stored on disk in the shared Bitbucket home directory.
This avoids performance problems in case a git command performs heavy operations that slow down other git operations running in Bitbucket Data Center. Alternatively, you could perform these actions on a separate (non-production) Data Center instance that holds the same data.
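As a rough illustration of working on a clone rather than on the repository in the shared home directory, the following Python sketch clones a repository into a temporary directory and runs a single git command against that clone. The helper names (clone_url, run_git_on_clone) are hypothetical, and the URL layout {BaseURL}/scm/{projectKey}/{repositorySlug}.git with a lowercased project key is an assumption about how HTTP(S) clone URLs are typically laid out; copy the real clone URL from the repository page if yours differs. Authentication is left to your local git configuration.

```python
import subprocess
import tempfile
from typing import Sequence


def clone_url(base_url: str, project_key: str, repo_slug: str) -> str:
    """Build an HTTP(S) clone URL (assumed layout: {base}/scm/{project}/{repo}.git)."""
    return f"{base_url.rstrip('/')}/scm/{project_key.lower()}/{repo_slug}.git"


def run_git_on_clone(url: str, git_args: Sequence[str]) -> str:
    """Clone into a throwaway directory and run one git command against the clone.

    Requires git on PATH and working credentials for the instance.
    """
    with tempfile.TemporaryDirectory() as workdir:
        # A bare clone is enough for history queries such as `git log`.
        subprocess.run(["git", "clone", "--quiet", "--bare", url, workdir], check=True)
        result = subprocess.run(
            ["git", "-C", workdir, *git_args],
            capture_output=True, text=True, check=True,
        )
        return result.stdout
```

For example, run_git_on_clone(clone_url("https://bitbucket.company.com", "PRJ", "my-repo"), ["log", "--pretty=format:%ae"]) would return one author email per commit without touching the on-disk repository of the instance.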
Repository contents via REST API
As an alternative to using native git commands to access repository contents, you can use Bitbucket Data Center's REST API, which will allow you to retrieve many kinds of information from repositories.
Due to pagination and limits on the number of results returned via the REST API, this method is more limited and considerably less efficient than using native git commands.
Last but not least, using the REST API can also have a substantial performance impact on your instance (depending on the nature of the requests). As with native git commands, you must not use the REST API on your production instance to retrieve large amounts of data for reporting purposes. To protect your instance against thrashing caused by excessive use of the REST API, also see Improving instance stability with rate limiting.
Examples
The following example commands allow you to retrieve some information that could be used to assess developer productivity.
Number of commits by author
A very basic metric that one could consider a measure of developer productivity is the number of commits an author has made in a repository.
Git
A simple command that will return the list of authors with commits in the repository with the number of commits they’ve made in ascending order would be:
git log --pretty=format:"%ae" | sort | uniq -c | sort -n
Note that this includes merge commits - if you want to exclude them from the count, consider running the following command instead:
git log --pretty=format:"%ae" --no-merges | sort | uniq -c | sort -n
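If you would rather post-process the result in a script than in a shell pipeline, the same count can be sketched in Python. This is only an illustration: the function names are hypothetical, and it assumes git is available on the PATH and that repo_path points to a local clone.

```python
import subprocess
from collections import Counter


def count_commits_by_author(log_output: str) -> Counter:
    """Tally author emails, one per line, as printed by `git log --pretty=format:%ae`."""
    emails = (line.strip() for line in log_output.splitlines())
    return Counter(email for email in emails if email)


def git_author_emails(repo_path: str, include_merges: bool = True) -> str:
    """Run git log in a local clone and return its raw output (requires git on PATH)."""
    cmd = ["git", "-C", repo_path, "log", "--pretty=format:%ae"]
    if not include_merges:
        cmd.append("--no-merges")
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```

Calling count_commits_by_author(git_author_emails("/path/to/clone", include_merges=False)) yields a Counter keyed by author email; its most_common() method lists authors in descending order of commit count.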
REST API
Retrieving the list of commits in a repository is possible via a GET request to the /commits endpoint as follows:
{BaseURL}/rest/api/1.0/projects/{projectKey}/repos/{repositorySlug}/commits
where {BaseURL} is the base URL of your Bitbucket Data Center instance, {projectKey} is the key of the project that contains the repository in question and {repositorySlug} is the slug of the repository itself.
This will return the list of commits in the repository as JSON, for instance such as in this (truncated) example:
{
    "isLastPage": false,
    "limit": 25,
    "nextPageStart": 25,
    "size": 25,
    "start": 0,
    "values": [
        {
            "author": {
                "active": true,
                "displayName": "Joe Max",
                "emailAddress": "joe@max.company",
                "id": 12158,
                "links": {
                    "self": [
                        {
                            "href": "https://bitbucket.max.company/users/joe"
                        }
                    ]
                },
                "name": "joe",
                "slug": "joe",
                "type": "NORMAL"
            },
            "authorTimestamp": 1730703725000,
            "committer": {
                "active": true,
                "displayName": "Joe Max",
                "emailAddress": "joe@max.company",
                "id": 12158,
                "links": {
                    "self": [
                        {
                            "href": "https://bitbucket.max.company/users/joe"
                        }
                    ]
                },
                "name": "joe",
                "slug": "joe",
                "type": "NORMAL"
            },
            "committerTimestamp": 1730703725000,
            "displayId": "d5be9de1d43",
            "id": "d5be9de1d432dec2553127b4a4843be5b5e6d40c",
            "message": "markdown",
            "parents": [
                {
                    "displayId": "5b58e5d00d5",
                    "id": "5b58e5d00d58cf775e725214f7f2f82bae5fcb3e"
                }
            ]
        }
The /commits endpoint is paginated, so for repositories with a large number of commits you will need to iterate over further result pages to retrieve all commits in the repository. Each page is a separate request, which makes this operation slow and resource-intensive compared with Git. Retrieving the list of commits via Git is therefore preferable.
As with git, this returns all types of commits, including merge commits. If you want to exclude these from the results, append the merges=exclude parameter to the URL, like so:
{BaseURL}/rest/api/1.0/projects/{projectKey}/repos/{repositorySlug}/commits?merges=exclude
Programmatic Use
To give you an idea of how you can use the REST API to retrieve developer productivity statistics (the list of commits in a repository in this case), see the following pseudo code example of a script you might write to achieve this:
$baseUrl = "https://bitbucket.company.com"

function getListOfProjects($startPage){
    $jsonResponse = fetchFromUrl($baseUrl/rest/api/1.0/projects?start=$startPage)
    foreach member of the "values" array in $jsonResponse as $project{
        getListOfReposInProject("key" of $project, 0)
    }
    // fetch the next page only while more pages remain
    if $jsonResponse has "isLastPage": false{
        $nextProjectPageStart = nextPageStart from $jsonResponse
        getListOfProjects($nextProjectPageStart)
    }
}

function getListOfReposInProject($projectKey, $startPage){
    $jsonResponse = fetchFromUrl($baseUrl/rest/api/1.0/projects/$projectKey/repos?start=$startPage)
    foreach member of the "values" array in $jsonResponse as $repo{
        getListOfCommitsInRepo($projectKey, "slug" of $repo, 0)
    }
    if $jsonResponse has "isLastPage": false{
        $nextRepoPageStart = nextPageStart from $jsonResponse
        getListOfReposInProject($projectKey, $nextRepoPageStart)
    }
}

function getListOfCommitsInRepo($projectKey, $repositorySlug, $startPage){
    $jsonResponse = fetchFromUrl($baseUrl/rest/api/1.0/projects/$projectKey/repos/$repositorySlug/commits?start=$startPage)
    foreach member of the "values" array in $jsonResponse as $commit{
        add $commit to list of commits
    }
    if $jsonResponse has "isLastPage": false{
        $nextCommitPageStart = nextPageStart from $jsonResponse
        getListOfCommitsInRepo($projectKey, $repositorySlug, $nextCommitPageStart)
    }
}

getListOfProjects(0)
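For a single repository, the paging logic of the pseudocode could look roughly like this in Python. It is a minimal sketch under several assumptions: the names iter_commits, commits_per_author and fetch_json are hypothetical, the default fetch_json performs an unauthenticated GET (a real instance will usually require credentials, for example an Authorization header), and the tally uses the emailAddress field of each commit's author as shown in the JSON example earlier. It uses a loop instead of recursion so that very long histories do not exhaust the call stack.

```python
import json
import urllib.request
from collections import Counter
from typing import Callable, Iterable, Iterator
from urllib.parse import urlencode


def fetch_json(url: str) -> dict:
    """Plain GET without authentication (assumption; add credentials as needed)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def iter_commits(base_url: str, project_key: str, repo_slug: str,
                 exclude_merges: bool = False,
                 fetch: Callable[[str], dict] = fetch_json) -> Iterator[dict]:
    """Yield every commit of one repository, following nextPageStart until isLastPage."""
    start = 0
    while True:
        params = {"start": start}
        if exclude_merges:
            params["merges"] = "exclude"
        url = (f"{base_url}/rest/api/1.0/projects/{project_key}"
               f"/repos/{repo_slug}/commits?{urlencode(params)}")
        page = fetch(url)
        yield from page.get("values", [])
        if page.get("isLastPage", True):
            break
        start = page["nextPageStart"]


def commits_per_author(commits: Iterable[dict]) -> Counter:
    """Count commits per author email, falling back to the author name."""
    return Counter(
        c["author"].get("emailAddress") or c["author"].get("name", "unknown")
        for c in commits
    )
```

Passing the fetch function as a parameter keeps the paging logic independent of how requests are authenticated, and makes it possible to try the script against canned responses before pointing it at a non-production instance.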
Third-Party Apps
Any mention of apps in the section below does not constitute an endorsement of these apps, nor does it imply that the mentioned apps are fit for any particular purpose. Always evaluate third-party apps before making a purchasing decision and speak to the app vendor when in doubt. Atlassian does not provide any support for the functionality of third-party apps.
The Atlassian app ecosystem is rich in third-party apps for various uses. You can find them in the Atlassian Marketplace. For the particular use-case of extracting developer productivity statistics, it appears that Awesome Graphs for Bitbucket Data Center may be a suitable option.
Other Data of Interest
Besides purely looking at the number of commits you might also want to gather other data to help judge developer productivity, such as the number of lines added or modified. Extracting this data via the REST API, while possible, is not practical due to the sheer number of requests required. Atlassian Fisheye allows for some reporting about the numbers of lines of code by developer - see Fisheye reports and Viewing people's statistics for details. Otherwise, extracting the data via git commands is an option, and the first answer provided in this Stack Overflow thread should serve as a good starting point.
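As one possible starting point for the git-based approach (not necessarily what the linked thread proposes), the following Python sketch sums lines added and deleted per author from the output of git log --numstat. It assumes the log was produced on a clone with a command such as git log --no-merges --numstat --pretty=format:"author:%ae"; the "author:" prefix and the function name are choices made for this example only.

```python
from collections import defaultdict
from typing import Dict, Tuple


def lines_changed_by_author(log_output: str) -> Dict[str, Tuple[int, int]]:
    """Return {author_email: (lines_added, lines_deleted)} from numstat output.

    Expects commit header lines of the form "author:<email>" followed by
    numstat lines "<added>\t<deleted>\t<path>".
    """
    totals = defaultdict(lambda: [0, 0])
    author = None
    for line in log_output.splitlines():
        if line.startswith("author:"):
            author = line[len("author:"):].strip()
            continue
        parts = line.split("\t")
        if author is None or len(parts) < 3:
            continue  # blank separator lines or unexpected content
        added, deleted = parts[0], parts[1]
        # Binary files are reported as "-" for both counts and are skipped here.
        if added.isdigit() and deleted.isdigit():
            totals[author][0] += int(added)
            totals[author][1] += int(deleted)
    return {email: (t[0], t[1]) for email, t in totals.items()}
```

Keep in mind that raw line counts are sensitive to reformatting, generated files and vendored code, so the totals are best read as a rough indicator only.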