There have been no regressions on our platform. We will schedule maintenance to reintroduce the previously faulty node into the database cluster when required. For now, we are closing this incident.
Jan 26, 2018 - 16:23 CET
We continue to monitor the situation. While we have finished what we considered to be the emergency maintenance on the faulty database node, we are still performing lower-profile tests on it to validate that our steps have indeed resolved the problem.
Jan 26, 2018 - 12:13 CET
The cluster rebalancing has been completed and all services are fully operational. Emergency maintenance on the faulty database node is ongoing.
Jan 26, 2018 - 10:43 CET
Some of the necessary steps to prevent future regressions require significant downtime for one of our database nodes. We will rebalance our database cluster to take the problematic node out of service. This may cause short periods of unavailability on the Portal and Customer API.
Jan 26, 2018 - 10:31 CET
All systems have remained stable over the past hour. We are currently still investigating the root cause and taking measures to prevent this from occurring again in the future.
Jan 26, 2018 - 10:06 CET
All services are back online. We are closely monitoring the situation for regressions, as well as addressing the problem that caused this outage in the first place.
Jan 26, 2018 - 09:15 CET
The Portal and Customer API are back online. We are currently still working on the data processing backend.
Jan 26, 2018 - 09:10 CET
Due to issues with our database cluster, we are seeing intermittent failures across various services. The cause has been identified and we are currently working on a solution.