Enable performance monitoring for Bitbucket Mesh
This article describes how to enable performance monitoring in Bitbucket Mesh and export performance metrics to various backends.
Why would I want to enable performance monitoring within Bitbucket Mesh?
With performance monitoring, you can easily monitor the application resources consumed by Bitbucket Mesh, enabling you to make better decisions about maintaining and optimizing machine resources.
What can I monitor?
Enabling performance monitoring in Bitbucket Mesh lets you monitor a range of statistics. Some examples are listed below.
gRPC calls statistics
Statistic | Description |
---|---|
Failed | Number of failed gRPC calls since start |
Running | Number of active gRPC calls |
Succeeded | Number of successful gRPC calls since start |
Total | Total number of gRPC calls since start |
Mesh nodes
Statistic | Description |
---|---|
InconsistentCount | The number of Mesh nodes that host inconsistent replicas |
OfflineCount | The number of Mesh nodes that are offline |
TotalCount | The total number of Mesh nodes |
UnavailableCount | The number of Mesh nodes that are either disabled or offline |
Partition migration
Statistic | Description |
---|---|
FailedMigrations | The number of migrations that are in a failed state |
InProgressMigrations | The number of migrations that are in progress |
LongRunningMigrationsFor1h | The number of migrations that have been running for longer than one hour |
LongRunningMigrationsFor8h | The number of migrations that have been running for longer than 8 hours |
LongRunningMigrationsFor24h | The number of migrations that have been running for longer than 24 hours |
Repair operation statistics
These statistics describe repair operations initiated by Mesh.
Repair operation duration
A repair operation for Mesh takes place in distinct phases, and statistics for each phase are available.
Statistic | Description |
---|---|
CompareReflogs | Duration of the reflog comparison phase of repair |
CompareRefs | Duration of the ref comparison phase of repair |
FetchObjects | Duration of the fetch objects phase of repair |
Total | Total time taken for repair operations |
Repair operation calls
Statistic | Description |
---|---|
Failed | Number of failed repair operations since start |
Running | Number of active repair operations |
Succeeded | Number of successful repair operations since start |
Total | Total number of repair operations since start |
Ticket statistics
Mesh uses tickets as a mechanism for creating back pressure to prevent the system from being overloaded with requests. The following types of tickets are available:
- Hosting tickets: Limits the number of SCM hosting operations, meaning pushes and pulls, which may be running concurrently.
- Mirror Hosting tickets: Limits the number of SCM hosting operations served to mirrors, which may be running concurrently. This limit is intended to protect the system's CPU and memory from being consumed excessively by mirror operations.
- Command tickets: Limits the number of SCM commands, such as `git diff`, `git blame`, or `git rev-list`, which may be running concurrently.
- Ref Advertisement tickets: Limits the number of SCM Ref Advertisement operations that may be running concurrently. These are throttled separately from hosting operations as they are much more lightweight and much shorter, so many more of them can run concurrently.
Mesh supports the following metrics for each ticket type:
Statistic | Description |
---|---|
Available | Number of tickets currently available for use |
Used | Number of tickets currently in use |
Total | Total number of tickets configured |
Queued | Number of threads waiting for the ticket |
Rejected | Total number of rejected requests for the ticket (added in 8.9.19) |
Held duration | Duration that the tickets were held for. Requires the |
Expose performance metrics within Bitbucket Mesh
The metrics above can be published to the following supported backends:
- Datadog
- Dynatrace
- InfluxDB
- JMX (Java Management Extensions)
- New Relic
- SignalFX
- StatsD
You can publish metrics to a backend by setting the corresponding `management.metrics.export.<backend>.enabled` property to `true`.
For example, to enable publishing of metrics to SignalFX, set the following property in `mesh.properties`:
management.metrics.export.signalfx.enabled = true
Further configuration may be required depending on the backend you choose. Refer to the configuration properties for more information.
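The same pattern should apply to the other backends in the list. As a minimal sketch, the fragment below enables JMX and StatsD in `mesh.properties`. The backend keys `jmx` and `statsd` are assumptions inferred from the `management.metrics.export.<backend>.enabled` pattern, and enabling more than one backend at once is also an assumption, so check both against the configuration properties reference before relying on them.

```properties
# Assumed backend keys (jmx, statsd), inferred from the
# management.metrics.export.<backend>.enabled pattern.
management.metrics.export.jmx.enabled=true
management.metrics.export.statsd.enabled=true
```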
You can also attach tags to metrics published from a given node by specifying them as a suffix of the `metrics.tags` property.
For example, to attach a tag called `application=Mesh` to the published metrics, set the following property in `mesh.properties`:
metrics.tags.application=Mesh
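Because each tag is written as its own suffix, several tags can presumably be set side by side. The sketch below adds two hypothetical tags, `environment` and `node`, next to the `application` tag; the extra tag names and values are illustrative only and are not taken from the Mesh documentation.

```properties
# Tag from the example above.
metrics.tags.application=Mesh
# Hypothetical additional tags; names and values are illustrative.
metrics.tags.environment=production
metrics.tags.node=mesh-1
```

Tags like these can help tell nodes or environments apart once the metrics reach the backend.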