* Incident on cluster015. Summary: all shared web hosting is unavailable on cluster015
* Start time: 2019-01-05 19h15
* Impact: the cluster is unreachable; no website on this cluster is working, except for web perf
* Impact type: Service unavailable
* Estimated time to recovery: 1 hour
* Actions undertaken: investigating and trying to lower the load on the cluster
Update(s):
Date: 2019-01-05 18:58:11 UTC
* Summary: some hosts are still unavailable, but this no longer disturbs the service
* Time: 2019-01-05 19h57
* Impact: none anymore
* Recovery time: 2019-01-05 19h53
* Actions undertaken: still taking care of the last faulty hosts and keeping an eye on the cluster
Date: 2019-01-05 18:53:17 UTC
* Summary: service is progressively coming back online
* Time: 2019-01-05 19h53
* Impact: service partially unavailable
* Estimated time to recovery: 15 minutes
* Actions undertaken: rebooting the last failing host, restarting Apache on the others
Posted Jan 05, 2019 - 18:26 UTC
This incident affected: Web Hosting || Datacenter GRA (Cluster002, Cluster003, Cluster006, Cluster007, Cluster011, Cluster012, Cluster013, Cluster014, Cluster015, Cluster017, Cluster020, Cluster021, Cluster023, Cluster024, Cluster025, Cluster026, Cluster027, Cluster028, Cluster029, Cluster030, Cluster031).