Random 500 errors when running Stash - Database connection timeouts
Symptoms
- Random 500 errors when visiting Stash in a web browser
- Stash web pages don't always render fully (missing data or sections)
- Pulling from, pushing to, or cloning from Stash intermittently fails with error code 500. The same command may fail once and succeed on the next attempt:
$ git pull origin master
fatal: unable to access 'http://10.23.165.46/stash/scm/project/repo.git/': The requested URL returned error: 500
$ git pull origin master
From http://10.23.165.46/stash/scm/project/repo.git/
* branch master -> FETCH_HEAD
Already up-to-date.
Diagnosis
Check atlassian-stash.log for exceptions like the examples below.
These exceptions can occur any time Stash uses the database. Each one should have a database-related root cause, and that cause may indicate that the database connection was closed.
2013-10-08 13:21:37,909 ERROR [http-bio-7990-exec-256] 801x6305x1 10.23.172.120,0:0:0:0:0:0:0:1 "GET /mvc/error500 HTTP/1.1" c.a.s.i.web.ErrorPageController There was an unhandled exception loading [/scm/project/repo.git/info/refs]
org.springframework.transaction.TransactionSystemException: Could not roll back Hibernate transaction; nested exception is org.hibernate.TransactionException: rollback failed
...
Caused by: org.hibernate.TransactionException: rollback failed
...
Caused by: org.hibernate.TransactionException: unable to rollback against JDBC connection
...
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed.
ERROR [http-bio-7990-exec-2442] 19x26849x1 62.14.44.226,127.0.0.1 "GET /mvc/error500 HTTP/1.1" c.a.s.i.web.ErrorPageController There was an unhandled exception loading [/scm/ffa/devflow.git/info/refs]
org.springframework.transaction.CannotCreateTransactionException: Could not open Hibernate Session for transaction; nested exception is org.hibernate.TransactionException: JDBC begin transaction failed:
...
...
Caused by: org.hibernate.TransactionException: JDBC begin transaction failed:
...
...
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet successfully received from the server was 197,409 milliseconds ago. The last packet sent successfully to the server was 1 milliseconds ago.
...
Caused by: java.io.EOFException: Can not read response from server. Expected to read 4 bytes, read 0 bytes before connection was unexpectedly lost.
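To check a large log for the same root causes, a short script can filter the "Caused by:" lines. The following is a minimal sketch, not an Atlassian tool: the function name find_closed_connection_causes is hypothetical, and the patterns are copied from the two example traces above, so other databases or driver versions will likely word a closed connection differently.

```python
import re

# Messages copied from the example stack traces above (SQL Server and MySQL).
# Assumption: other databases report a closed connection with different text.
CLOSED_CONNECTION_PATTERNS = [
    r"The connection is closed",
    r"Communications link failure",
    r"Can not read response from server",
]
_CLOSED = re.compile("|".join(CLOSED_CONNECTION_PATTERNS))


def find_closed_connection_causes(lines):
    """Return (line_number, text) for each 'Caused by:' line that matches."""
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if text.startswith("Caused by:") and _CLOSED.search(text):
            hits.append((number, text))
    return hits
```

The function accepts any iterable of lines, so an open file handle for atlassian-stash.log can be passed in directly. Frequent hits that line up with the 500 errors point toward the cause described next.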
Cause
Stash periodically "pings" the database (a throwaway operation that doesn't update any data) so that the server does not close pooled connections as idle. However, if the database server has aggressive idle timeouts configured, Stash's default heartbeat may not be frequent enough, and the server can close connections that are still in the pool. Stash expects every connection in the pool to be open, so an operation that is handed an already-closed connection fails, which is why the errors appear random.
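The timing argument can be illustrated with a simplified model. This sketch assumes an unused pooled connection only sees traffic when Stash pings it, so its longest idle gap equals the heartbeat interval. The 5-minute server timeout is a hypothetical value chosen for illustration, and the function name is likewise made up. The 10-minute figure is the default heartbeat interval for Stash.

```python
def idle_connection_survives(heartbeat_interval_min, server_idle_timeout_min):
    """True if a ping always arrives before the server's idle timeout expires.

    Simplified model: ignores network latency and assumes pings fire exactly
    on schedule, so the longest idle gap equals the heartbeat interval.
    """
    return heartbeat_interval_min < server_idle_timeout_min


# Hypothetical server-side idle timeout, not a real database default.
SERVER_IDLE_TIMEOUT_MIN = 5

print(idle_connection_survives(10, SERVER_IDLE_TIMEOUT_MIN))  # False: 10-minute heartbeat is too slow
print(idle_connection_survives(1, SERVER_IDLE_TIMEOUT_MIN))   # True: 1-minute heartbeat keeps it alive
```

Either side of the comparison can be changed: a shorter heartbeat interval or a longer server timeout both make the check pass.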
Resolution
You can either have Stash ping the database more frequently or increase the connection timeout.
Option 1) Tweak heartbeat rate on Stash
This solution works for all databases. Increase the heartbeat rate by adding the following parameter to stash-config.properties:
db.pool.idle.testInterval=1
That will cause Stash to heartbeat its database connections every minute instead of every 10 minutes (the default), which prevents the remote server from closing connections in the pool.
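If the change needs to be scripted, for example across several instances, the property can be set without creating duplicate entries. This is a minimal sketch under stated assumptions: set_property is a hypothetical helper, it only understands the plain key=value form used above (not the ":" or whitespace separators that Java properties files also allow), and the sample file contents are made up.

```python
def set_property(properties_text, key, value):
    """Return properties_text with key=value set, replacing any existing entry."""
    new_line = f"{key}={value}"
    out, replaced = [], False
    for line in properties_text.splitlines():
        stripped = line.strip()
        is_comment = stripped.startswith(("#", "!"))
        if not is_comment and stripped.split("=", 1)[0].strip() == key:
            # Keep a single entry for the key and drop any duplicates.
            if not replaced:
                out.append(new_line)
                replaced = True
            continue
        out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"


# Hypothetical existing file contents, for illustration only.
existing = "# example contents\ndb.pool.idle.testInterval=10\n"
print(set_property(existing, "db.pool.idle.testInterval", 1), end="")
```

The helper works on text rather than a file path because the location of stash-config.properties depends on the installation.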
Option 2) Tweak database connection timeout parameter
The setting to change is different for each database: