Degraded performance and intermittent outages on Portal and Agent API
Incident Report for Patchman
Resolved
Performance remains stable and the processing delays have been cleared. The incident has been resolved.
Posted Nov 01, 2017 - 09:36 CET
Monitoring
Portal and Customer API performance has stabilised as a result of the database cluster rebalance. We still see some processing delays that built up during the period of degraded performance, but they are recovering steadily. We will continue to monitor the situation for now.
Posted Oct 31, 2017 - 17:22 CET
Update
The database cluster rebalance has completed successfully. Performance may already show marginal improvement as a result. We are now continuing preparations for the full node migration.
Posted Oct 31, 2017 - 15:48 CET
Update
Despite repeated efforts, our infrastructure provider was unable to resolve the performance issues on the affected database host machine. We will now take the necessary steps to rebalance the cluster in preparation for a full node migration of the affected machine.
Posted Oct 31, 2017 - 15:19 CET
Update
Our infrastructure provider reports that investigation and remediation efforts are ongoing. We will update as progress is made.
Posted Oct 31, 2017 - 11:36 CET
Update
We have confirmation that this issue has been given top priority by our infrastructure provider, and that they are currently working on a solution. We expect to have an update within the next 30 minutes.
Posted Oct 31, 2017 - 11:00 CET
Identified
As we saw yesterday (http://status.patchman.co/incidents/nczb55f0xj79), the host machine for one of our databases is experiencing high load, which is degrading the performance of the Portal and Customer API. Current information suggests that the cause lies with our infrastructure provider, and we are actively working with them to resolve the problem.
Posted Oct 31, 2017 - 10:47 CET
Investigating
We are currently experiencing degraded performance resulting in intermittent outages on both the Portal and the Customer API. We are investigating the issue.
Posted Oct 31, 2017 - 10:27 CET