OVHcloud Web Hosting Status

Current status
Legend
  • Operational
  • Degraded performance
  • Partial Outage
  • Major Outage
  • Under maintenance
privatesql040.p19
Incident Report for Web Cloud
Resolved
English below

Suite à une maintenance sur le cluster Ceph, l'instance privatesql040.p19 est restée bloquée et a du être redémarrée. L'indisponibilité a eu lieu de 08:23 à 08:53.

---

Due to an intervention on the Ceph cluster, the privatesql040.p19 was frozen and had to be rebooted. The indisponibility was from 08:23 to 08:53.

Update(s):

Date: 2016-09-28 11:20:19 UTC
This morning, Ceph team had a polrad changed on one of their storage servers, and they reintegrated it into the farm.
When you add back a server to a farm, Ceph has to check the data integrity of the data it has, in this case, that represents ~24T of data, and files can be fragmented pretty much everywhere.
When a server using a Ceph disk tries to access a file which has not been checked yet, it has to wait, leading to a huge amount of IO Waits on our hosts.
The Ceph team continues their investigations to understand how can they avoid such problems.

Date: 2016-09-28 09:52:32 UTC
We still have issues on Ceph. Big latencies expected. We're still investigating.
Posted Sep 28, 2016 - 09:47 UTC
This incident affected: Web Hosting || Datacenter GRA (Cluster002, Cluster003, Cluster006, Cluster007, Cluster011, Cluster012, Cluster013, Cluster014, Cluster015, Cluster017, Cluster020, Cluster021, Cluster023, Cluster024, Cluster025, Cluster026, Cluster027, Cluster028, Cluster029, Cluster030, Cluster031).