OVHcloud Private Cloud Status

pcc-22-n5
Incident Report for Hosted Private Cloud
Resolved
This N5 is no longer responding; we are investigating.
Service is being maintained by pcc-23-pcc.

Update(s):

Date: 2014-09-20 13:15:59 UTC
Everything is OK; redundancy is back in place.

Date: 2014-09-20 13:11:34 UTC
pcc-22-n5# sh fex
FEX     FEX           FEX              FEX               Fex
Number  Description   State            Model             Serial
------------------------------------------------------------------------
100     fex100|27C02  Online           N2K-C2232PP-10GE  SSI14210EWH
101     fex101|27C02  Online           N2K-C2232PP-10GE  SSI14210EPS
102     fex102|27C04  Online           N2K-C2232PP-10GE  SSI142203E8
103     fex103|27C05  Online           N2K-C2232PP-10GE  SSI14270R32
104     tests |28D17  Online           N2K-C2248TP-1GE   SSI1423007F
105     fex105|28B26  Online           N2K-C2248TP-1GE   SSI14270E7E
106     fex106|28B25  Online           N2K-C2248TP-1GE   SSI14270R4Y
107     fex107|28B27  Online           N2K-C2248TP-1GE   SSI14270CJS
108     fex108|28B24  Online           N2K-C2248TP-1GE   SSI14270R7X
109     fex109|28B23  Online           N2K-C2248TP-1GE   SSI14150XK0
110     fex110|28B22  Online           N2K-C2248TP-1GE   SSI142800E3

Date: 2014-09-20 13:05:13 UTC
pcc-22-n5(config-if-range)# sh fex
FEX     FEX           FEX              FEX               Fex
Number  Description   State            Model             Serial
------------------------------------------------------------------------
100     fex100|27C02  Online           N2K-C2232PP-10GE  SSI14210EWH
101     fex101|27C02  Image Download   N2K-C2232PP-10GE  SSI14210EPS
102     fex102|27C04  Online Sequence  N2K-C2232PP-10GE  SSI142203E8
103     fex103|27C05  Offline          N2K-C2232PP-10GE  SSI14270R32
104     tests |28D17  Image Download   N2K-C2248TP-1GE   SSI1423007F
105     fex105|28B26  Image Download   N2K-C2248TP-1GE   SSI14270E7E
106     fex106|28B25  Image Download   N2K-C2248TP-1GE   SSI14270R4Y
107     fex107|28B27  Image Download   N2K-C2248TP-1GE   SSI14270CJS
108     fex108|28B24  Image Download   N2K-C2248TP-1GE   SSI14270R7X
109     fex109|28B23  Offline          N2K-C2248TP-1GE   SSI14150XK0
110     fex110|28B22  Image Download   N2K-C2248TP-1GE   SSI142800E3
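
For readers who want to follow the FEX recovery from the `sh fex` output, the rows can be tallied by state with a short script. This is only a minimal sketch under stated assumptions, not a tool used during the incident: the function name `parse_fex_rows` is hypothetical, and the parser assumes that every description contains a `|` between the FEX name and its rack position, as it does in this output. The sample rows are copied from the table above.

```python
from collections import Counter

# A few rows copied from the "sh fex" output above.
SAMPLE = """\
pcc-22-n5(config-if-range)# sh fex
Number Description State Model Serial
------------------------------------------------------------------------
100 fex100|27C02 Online N2K-C2232PP-10GE SSI14210EWH
101 fex101|27C02 Image Download N2K-C2232PP-10GE SSI14210EPS
102 fex102|27C04 Online Sequence N2K-C2232PP-10GE SSI142203E8
103 fex103|27C05 Offline N2K-C2232PP-10GE SSI14270R32
104 tests |28D17 Image Download N2K-C2248TP-1GE SSI1423007F
"""


def parse_fex_rows(text):
    """Parse whitespace-separated 'sh fex' rows into dictionaries.

    Hypothetical helper. Assumption: the description ends at the token
    containing '|', so a multi-word state such as 'Image Download' can be
    told apart from a description containing a space ('tests |28D17').
    """
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        # Skip the prompt, header and separator lines.
        if len(tokens) < 5 or not tokens[0].isdigit():
            continue
        middle = tokens[1:-2]  # description tokens followed by state tokens
        cut = next((i for i, t in enumerate(middle) if "|" in t), 0) + 1
        rows.append({
            "number": int(tokens[0]),
            "description": " ".join(middle[:cut]),
            "state": " ".join(middle[cut:]),
            "model": tokens[-2],
            "serial": tokens[-1],
        })
    return rows


rows = parse_fex_rows(SAMPLE)
print(Counter(row["state"] for row in rows))
print("not yet online:", [row["number"] for row in rows if row["state"] != "Online"])
```

Parsing from both ends of each row (number first, model and serial last) keeps the sketch independent of column alignment, so it works on the output whether or not the spacing survived copy and paste.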


Date: 2014-09-20 12:47:17 UTC
The FEX is updating.
pcc-22-n5# sh fex
FEX     FEX           FEX              FEX               Fex
Number  Description   State            Model             Serial
------------------------------------------------------------------------
100     fex100|27C02  Image Download   N2K-C2232PP-10GE  SSI14210EWH

Date: 2014-09-20 12:40:52 UTC
Configuration resynced; I am powering on the FEXes one by one.


Date: 2014-09-20 09:12:02 UTC
The N5 is up to date.
PCC is resyncing the configuration.

Date: 2014-09-20 07:08:37 UTC
The N5 has been replaced; we are updating it to the latest NX-OS version validated for PCC.

Date: 2014-09-20 05:27:11 UTC
The N5 has gone down again. Its fan management is faulty: both fan trays are new, yet they failed at the same moment (according to the logs).
We are not taking any risks: we are replacing the N5 with a new one.

[local7.crit] === : 2014 Sep 20 06:51:32 CEST: %NOHMS-2-NOHMS_ENV_ERR_FAN_SPEED: System minor alarm in fan tray 2: fan speed is out of range on fan 2. 9450 to 12600 rpm expected. 8794 rpm read
[local7.crit] === : 2014 Sep 20 06:51:32 CEST: %NOHMS-2-NOHMS_ENV_ERR_FANS_DOWN: System major alarm on fans: Multiple fans missing or failed
[local7.emerg] === : 2014 Sep 20 06:51:32 CEST: %PFMA-0-SYS_SHUTDOWN_FAN_REMOVAL: System shutdown in 60 seconds due to fan removal
[local7.crit] === : 2014 Sep 20 06:51:34 CEST: %NOHMS-2-NOHMS_ENV_ERR_FAN_RECOVERED: Recovered: System minor alarm in fan tray 1
[local7.crit] === : 2014 Sep 20 06:51:34 CEST: %NOHMS-2-NOHMS_ENV_ERR_FANS_UP: Recovered: System major alarm on fans: Multiple fans missing or failed
[local7.crit] === : 2014 Sep 20 06:51:35 CEST: %NOHMS-2-NOHMS_ENV_ERR_FAN_SPEED: System minor alarm in fan tray 1: fan speed is out of range on fan 3. 12158 to 16211 rpm expected. 6087 rpm read
[local7.crit] === : 2014 Sep 20 06:51:35 CEST: %NOHMS-2-NOHMS_ENV_ERR_FAN_SPEED: System minor alarm in fan tray 1: fan speed is out of range on fan 4. 9450 to 12600 rpm expected. 6601 rpm read
[local7.crit] === : 2014 Sep 20 06:51:35 CEST: %NOHMS-2-NOHMS_ENV_ERR_FANS_DOWN: System major alarm on fans: Multiple fans missing or failed
[local7.emerg] === : 2014 Sep 20 06:51:35 CEST: %PFMA-0-SYS_SHUTDOWN_FAN_REMOVAL: System shutdown in 60 seconds due to fan removal
[local7.emerg] === : 2014 Sep 20 06:51:40 CEST: %PFMA-0-SYS_SHUTDOWN_FAN_REMOVAL: System shutdown in 55 seconds due to fan removal
[local7.emerg] === : 2014 Sep 20 06:51:45 CEST: %PFMA-0-SYS_SHUTDOWN_FAN_REMOVAL: System shutdown in 50 seconds due to fan removal
[local7.emerg] === : 2014 Sep 20 06:51:50 CEST: %PFMA-0-SYS_SHUTDOWN_FAN_REMOVAL: System shutdown in 45 seconds due to fan removal
[local7.emerg] === : 2014 Sep 20 06:51:55 CEST: %PFMA-0-SYS_SHUTDOWN_FAN_REMOVAL: System shutdown in 40 seconds due to fan removal
[local7.emerg] === : 2014 Sep 20 06:52:00 CEST: %PFMA-0-SYS_SHUTDOWN_FAN_REMOVAL: System shutdown in 35 seconds due to fan removal
[local7.emerg] === : 2014 Sep 20 06:52:05 CEST: %PFMA-0-SYS_SHUTDOWN_FAN_REMOVAL: System shutdown in 30 seconds due to fan removal
[local7.emerg] === : 2014 Sep 20 06:52:10 CEST: %PFMA-0-SYS_SHUTDOWN_FAN_REMOVAL: System shutdown in 25 seconds due to fan removal
[local7.emerg] === : 2014 Sep 20 06:52:15 CEST: %PFMA-0-SYS_SHUTDOWN_FAN_REMOVAL: System shutdown in 20 seconds due to fan removal
[local7.emerg] === : 2014 Sep 20 06:52:20 CEST: %PFMA-0-SYS_SHUTDOWN_FAN_REMOVAL: System shutdown in 15 seconds due to fan removal
[local7.emerg] === : 2014 Sep 20 06:52:25 CEST: %PFMA-0-SYS_SHUTDOWN_FAN_REMOVAL: System shutdown in 10 seconds due to fan removal
[local7.emerg] === : 2014 Sep 20 06:52:30 CEST: %PFMA-0-SYS_SHUTDOWN_FAN_REMOVAL: System shutdown in 5 seconds due to fan removal
[local7.emerg] === : 2014 Sep 20 06:52:35 CEST: %PFMA-0-SYS_SHUTDOWN_FAN_REMOVAL: System shutdown in 0 seconds due to fan removal
[local7.emerg] === : 2014 Sep 20 06:52:35 CEST: Sep 20 06:52:35 %KERN-0-SYSTEM_MSG: Shutdown Ports.. - kernel
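
To see how far each fan was below its expected range, the `NOHMS_ENV_ERR_FAN_SPEED` lines above can be pulled out with a regular expression. This is a minimal sketch, not part of the incident handling: the function name `parse_fan_alarms` is hypothetical, and the pattern assumes the message wording is exactly as it appears in these log lines. The sample lines are copied from the log above.

```python
import re

# Matches the wording of the fan-speed alarm as it appears in the log above.
FAN_SPEED = re.compile(
    r"fan tray (?P<tray>\d+): fan speed is out of range on fan (?P<fan>\d+)\. "
    r"(?P<low>\d+) to (?P<high>\d+) rpm expected\. (?P<read>\d+) rpm read"
)

# Lines copied from the log above; the FANS_DOWN line carries no rpm reading.
SAMPLE = """\
[local7.crit] === : 2014 Sep 20 06:51:32 CEST: %NOHMS-2-NOHMS_ENV_ERR_FAN_SPEED: System minor alarm in fan tray 2: fan speed is out of range on fan 2. 9450 to 12600 rpm expected. 8794 rpm read
[local7.crit] === : 2014 Sep 20 06:51:32 CEST: %NOHMS-2-NOHMS_ENV_ERR_FANS_DOWN: System major alarm on fans: Multiple fans missing or failed
[local7.crit] === : 2014 Sep 20 06:51:35 CEST: %NOHMS-2-NOHMS_ENV_ERR_FAN_SPEED: System minor alarm in fan tray 1: fan speed is out of range on fan 3. 12158 to 16211 rpm expected. 6087 rpm read
[local7.crit] === : 2014 Sep 20 06:51:35 CEST: %NOHMS-2-NOHMS_ENV_ERR_FAN_SPEED: System minor alarm in fan tray 1: fan speed is out of range on fan 4. 9450 to 12600 rpm expected. 6601 rpm read
"""


def parse_fan_alarms(text):
    """Return one dict per fan-speed alarm found in the text.

    Hypothetical helper. 'shortfall' is positive when the fan was read
    below the expected minimum.
    """
    alarms = []
    for match in FAN_SPEED.finditer(text):
        alarm = {key: int(value) for key, value in match.groupdict().items()}
        alarm["shortfall"] = alarm["low"] - alarm["read"]
        alarms.append(alarm)
    return alarms


for alarm in parse_fan_alarms(SAMPLE):
    print(
        f"tray {alarm['tray']} fan {alarm['fan']}: read {alarm['read']} rpm, "
        f"{alarm['shortfall']} rpm below the expected minimum of {alarm['low']}"
    )
```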


Date: 2014-09-20 01:37:51 UTC
The N5 is UP; we will resync the configuration in the morning.
Posted Sep 20, 2014 - 00:58 UTC
This incident affected: VMware Private Cloud || RBX (Compute).