OVHcloud Network Status

bhs4-15a/b-n56
Scheduled Maintenance Report for Network & Infrastructure
Completed
We are going to upgrade this Nexus pair to release 7.1.3.N1.2.
The work is scheduled for 11 February 2016, starting at 8:00 am CET (2:00 am EST).
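
As a quick check of the announced start time, the sketch below converts 8:00 am CET into EST and UTC. It assumes fixed winter offsets (CET = UTC+1, EST = UTC-5), which hold on 11 February; the names `convert_start`, `CET` and `EST` are illustrative and not part of the original notice.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets, valid in February when no daylight saving applies.
CET = timezone(timedelta(hours=1), "CET")
EST = timezone(timedelta(hours=-5), "EST")


def convert_start(start_cet: datetime) -> dict:
    """Return the start time formatted in CET, EST and UTC."""
    fmt = "%Y-%m-%d %H:%M"
    return {
        "CET": start_cet.strftime(fmt),
        "EST": start_cet.astimezone(EST).strftime(fmt),
        "UTC": start_cet.astimezone(timezone.utc).strftime(fmt),
    }


# Start time taken from the announcement above.
start = datetime(2016, 2, 11, 8, 0, tzinfo=CET)
times = convert_start(start)
print(times)
```

The UTC value is the one to compare against the "Date:" stamps of the updates further down, which are all given in UTC.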

The goal is to add features and bug fixes:
- Support for installing FEX 2348 (48 ports, 10GBASE-T)
- Bug fixes, including the vPC speed mismatch, as well as crashes in pfstat and ptplc

Given how disastrous the non-disruptive ISSU upgrades have been (4 crashes/failures out of 5), we will upgrade this pair with a disruptive upgrade.
This lets us control the downtime, estimated at ~5-6 min per FEX.

Here is the method:
- Install all FORCE on the primary
==> the primary upgrades itself, pre-loads the images on the FEXes, then the supervisor reloads.
==> no downtime.

- The primary reboots; the FEXes stay active via the secondary
==> no downtime.

- Once the primary is back at the login prompt, it will be running 7.1.3
==> The FEXes will still be on the secondary, but in version mismatch with the primary
==> Still no downtime.

- Reload the FEXes one by one; the image is pre-loaded, so they will reload into 7.1.3
==> Downtime for the duration of the reload, ~5-6 min
==> The FEX will attach to the primary and forward traffic

- Once all the FEXes have been migrated, we isolate the secondary, then run a disruptive ISSU on the secondary.
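
The one-by-one FEX reload step can be turned into a rough timeline. The sketch below is only an estimate under stated assumptions: reloads are strictly sequential with no gap between them, each takes between 5 and 6 minutes as announced, and the FEX numbers (100-110 and 112-120) are those that appear in the `sh fex` output later in this report. The function name `reload_windows` is hypothetical.

```python
def reload_windows(fex_ids, min_minutes=5, max_minutes=6):
    """Return (fex_id, earliest_start, latest_end) tuples, in minutes
    counted from the start of the first reload.

    Assumes the FEXes are reloaded strictly one after another.
    """
    windows = []
    for index, fex_id in enumerate(fex_ids):
        earliest_start = index * min_minutes      # all previous reloads were fast
        latest_end = (index + 1) * max_minutes    # all reloads so far were slow
        windows.append((fex_id, earliest_start, latest_end))
    return windows


# FEX numbers as they appear in the `sh fex` output of this report.
fex_ids = list(range(100, 111)) + list(range(112, 121))
windows = reload_windows(fex_ids)

total_min = len(fex_ids) * 5
total_max = len(fex_ids) * 6
print(f"{len(fex_ids)} FEXes, reload phase lasts {total_min}-{total_max} min")
```

Each FEX, and therefore each server behind it, is only expected to be down for its own 5-6 minute window; the longer figure is the length of the whole reload phase.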




Update(s):

Date: 2016-02-11 13:58:12 UTC
The pair is stable, nothing in the monitoring for the past hour, all done.

Date: 2016-02-11 12:22:24 UTC
All the FEXes are up.

bhs4-15b-n56# sh fex
FEX     FEX          FEX              FEX               Fex
Number  Description  State            Model             Serial
------------------------------------------------------------------------
100     fex100       Online           N2K-C2348TQ-10GE  FOC1847R1CG
101     fex101       Online           N2K-C2348TQ-10GE  FOC1847R1LN
102     fex102       Online           N2K-C2348TQ-10GE  FOC1930R4Z7
103     fex103       Online           N2K-C2348TQ-10GE  FOC1930R4YS
104     fex104       Online           N2K-C2348TQ-10GE  FOC1930R15G
105     fex105       Online           N2K-C2348TQ-10GE  FOC1930R51H
106     fex106       Online           N2K-C2348TQ-10GE  FOC1930R50F
107     fex107       Online           N2K-C2348TQ-10GE  FOC1930R1KH
108     fex108       Online           N2K-C2348TQ-10GE  FOC1930R3X6
109     fex109       Online           N2K-C2348TQ-10GE  FOC1930R51A
110     fex110       Online           N2K-C2348TQ-10GE  FOC1930R3XN
112     fex112       Online           N2K-C2348TQ-10GE  FOC1930R43M
113     fex113       Online           N2K-C2348TQ-10GE  FOC1930R41F
114     fex114       Online           N2K-C2348TQ-10GE  FOC1930R131
115     fex115       Online           N2K-C2348TQ-10GE  FOC1930R43E
116     fex116       Online           N2K-C2348TQ-10GE  FOC1930R11Y
117     fex117       Online           N2K-C2348TQ-10GE  FOC1930R15R
118     fex118       Online           N2K-C2348TQ-10GE  FOC1930R40Z
119     fex119       Online           N2K-C2348TQ-10GE  FOC1930R41S
120     fex120       Online           N2K-C2348TQ-10GE  FOC1930R434
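
Checking that every FEX is back by reading such a listing by eye is error-prone, so here is a minimal sketch of how the pasted `sh fex` text could be parsed. It is not a tool used during this maintenance. It assumes whitespace-separated columns, a description without spaces, and that the state may span several words (for example "Online Sequence"); `parse_sh_fex` and `not_online` are hypothetical helper names.

```python
def parse_sh_fex(output: str) -> list[dict]:
    """Parse the text of `sh fex` into one dict per FEX row."""
    rows = []
    for line in output.splitlines():
        tokens = line.split()
        # Data rows start with the numeric FEX number; headers and the
        # dashed separator do not, so they are skipped.
        if len(tokens) < 5 or not tokens[0].isdigit():
            continue
        rows.append({
            "number": int(tokens[0]),
            "description": tokens[1],
            # Everything between description and model is the state.
            "state": " ".join(tokens[2:-2]),
            "model": tokens[-2],
            "serial": tokens[-1],
        })
    return rows


def not_online(rows: list[dict]) -> list[int]:
    """Return the numbers of the FEXes whose state is not exactly 'Online'."""
    return [row["number"] for row in rows if row["state"] != "Online"]


# Sample rows copied from listings in this report.
sample = """\
FEX FEX FEX FEX Fex
Number Description State Model Serial
------------------------------------------------------------------------
100 fex100 Online N2K-C2348TQ-10GE FOC1847R1CG
106 fex106 Online Sequence N2K-C2348TQ-10GE FOC1930R50F
110 fex110 Connected N2K-C2348TQ-10GE FOC1930R3XN
"""
rows = parse_sh_fex(sample)
print(not_online(rows))  # prints [106, 110]
```

An empty list from `not_online` corresponds to the "all FEXes are up" situation reported here.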


We are going through the monitoring.

Date: 2016-02-11 12:15:08 UTC
The FEXes are being attached two at a time, which is smooth enough not to overload the port-manager.

bhs4-15b-n56# sh fex
FEX     FEX          FEX              FEX               Fex
Number  Description  State            Model             Serial
------------------------------------------------------------------------
100     fex100       Online           N2K-C2348TQ-10GE  FOC1847R1CG
101     fex101       Online           N2K-C2348TQ-10GE  FOC1847R1LN
102     fex102       Online           N2K-C2348TQ-10GE  FOC1930R4Z7
103     fex103       Online           N2K-C2348TQ-10GE  FOC1930R4YS
104     fex104       Online           N2K-C2348TQ-10GE  FOC1930R15G
105     fex105       Online           N2K-C2348TQ-10GE  FOC1930R51H
106     fex106       Online           N2K-C2348TQ-10GE  FOC1930R50F
107     fex107       Online           N2K-C2348TQ-10GE  FOC1930R1KH
108     fex108       Online           N2K-C2348TQ-10GE  FOC1930R3X6
109     fex109       Online           N2K-C2348TQ-10GE  FOC1930R51A
110     fex110       Online           N2K-C2348TQ-10GE  FOC1930R3XN
112     fex112       Online           N2K-C2348TQ-10GE  FOC1930R43M
113     fex113       Online           N2K-C2348TQ-10GE  FOC1930R41F
114     fex114       Online           N2K-C2348TQ-10GE  FOC1930R131


Date: 2016-02-11 12:03:28 UTC
The uplinks are up and forwarding traffic (which goes through the peer-link).

We are bringing up the FEXes.

bhs4-15b-n56# sh fex
FEX     FEX          FEX              FEX               Fex
Number  Description  State            Model             Serial
------------------------------------------------------------------------
100     fex100       Online           N2K-C2348TQ-10GE  FOC1847R1CG
101     fex101       Offline          N2K-C2348TQ-10GE  FOC1930R15G
102     fex102       Online           N2K-C2348TQ-10GE  FOC1930R4Z7
103     fex103       Connected        N2K-C2348TQ-10GE  FOC1930R4YS


Date: 2016-02-11 11:54:10 UTC
The vPC is up.

bhs4-15b-n56# sh vpc
Legend:
(*) - local vPC is down, forwarding via vPC peer-link

vPC domain id : 330
Peer status : peer adjacency formed ok
vPC keep-alive status : peer is alive
Configuration consistency status : success
Per-vlan consistency status : success
Type-2 consistency status : success
vPC role : secondary
Number of vPCs configured : 22
Peer Gateway : Disabled
Dual-active excluded VLANs : -
Graceful Consistency Check : Enabled
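
The health fields of this `sh vpc` output can also be checked programmatically. The sketch below is an illustration rather than part of the procedure: it assumes the "key : value" layout shown above, and the `EXPECTED` values are simply the ones this output reports as healthy. The names `parse_sh_vpc` and `vpc_problems` are hypothetical.

```python
def parse_sh_vpc(output: str) -> dict:
    """Turn the 'key : value' lines of `sh vpc` into a dict."""
    fields = {}
    for line in output.splitlines():
        # Split on the first " : " only; lines without it (legend, prompt)
        # are ignored.
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


# Assumption: values taken from the healthy output in this report.
EXPECTED = {
    "Peer status": "peer adjacency formed ok",
    "vPC keep-alive status": "peer is alive",
    "Configuration consistency status": "success",
    "Per-vlan consistency status": "success",
    "Type-2 consistency status": "success",
}


def vpc_problems(fields: dict) -> dict:
    """Return the fields whose value differs from the expected one."""
    return {k: fields.get(k) for k, v in EXPECTED.items() if fields.get(k) != v}


sample = """\
Legend:
(*) - local vPC is down, forwarding via vPC peer-link

vPC domain id : 330
Peer status : peer adjacency formed ok
vPC keep-alive status : peer is alive
Configuration consistency status : success
Per-vlan consistency status : success
Type-2 consistency status : success
vPC role : secondary
Number of vPCs configured : 22
"""
fields = parse_sh_vpc(sample)
print(fields["vPC role"], vpc_problems(fields))
```

An empty result from `vpc_problems` means the peer adjacency, the keep-alive and the three consistency checks all match the values seen here.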


We are bringing up the uplinks, then the FEXes.


Date: 2016-02-11 11:40:19 UTC
We are bringing 15B back up, starting with just the mgmt / peer-keepalive.

Date: 2016-02-11 10:37:37 UTC
We are still working on bringing some servers back up: about 20 servers out of 1000.

The spare is ready. A check of the fibers is in progress (labeling them as needed), after which we will proceed with the replacement itself.


Date: 2016-02-11 09:18:42 UTC
All the FEXes are online.
We are checking the monitoring for the last servers.

In parallel, we are setting up the spare.



Date: 2016-02-11 09:15:52 UTC
bhs4-15a-n56# sh fex |
FEX     FEX          FEX              FEX               Fex
Number  Description  State            Model             Serial
------------------------------------------------------------------------
100     fex100       Online           N2K-C2348TQ-10GE  FOC1847R1CG
101     fex101       Online           N2K-C2348TQ-10GE  FOC1847R1LN
102     fex102       Online           N2K-C2348TQ-10GE  FOC1930R4Z7
103     fex103       Online           N2K-C2348TQ-10GE  FOC1930R4YS
104     fex104       Online           N2K-C2348TQ-10GE  FOC1930R15G
105     fex105       Online           N2K-C2348TQ-10GE  FOC1930R51H
106     fex106       Online Sequence  N2K-C2348TQ-10GE  FOC1930R50F
107     fex107       Online           N2K-C2348TQ-10GE  FOC1930R1KH
108     fex108       Online           N2K-C2348TQ-10GE  FOC1930R3X6
109     fex109       Online           N2K-C2348TQ-10GE  FOC1930R51A
110     fex110       Connected        N2K-C2348TQ-10GE  FOC1930R3XN
112     fex112       Online           N2K-C2348TQ-10GE  FOC1930R43M
113     fex113       Online           N2K-C2348TQ-10GE  FOC1930R41F
114     fex114       Online           N2K-C2348TQ-10GE  FOC1930R131
115     fex115       Online           N2K-C2348TQ-10GE  FOC1930R43E
116     fex116       Online           N2K-C2348TQ-10GE  FOC1930R11Y
117     fex117       Online           N2K-C2348TQ-10GE  FOC1930R15R
118     fex118       Online           N2K-C2348TQ-10GE  FOC1930R40Z
119     fex119       Online           N2K-C2348TQ-10GE  FOC1930R41S
120     fex120       Online           N2K-C2348TQ-10GE  FOC1930R434


Date: 2016-02-11 08:57:58 UTC
bhs4-15a-n56# sh fex |
FEX     FEX          FEX              FEX               Fex
Number  Description  State            Model             Serial
------------------------------------------------------------------------
114     fex114       Online           N2K-C2348TQ-10GE  FOC1930R131
115     fex115       Online           N2K-C2348TQ-10GE  FOC1930R43E
116     fex116       Online           N2K-C2348TQ-10GE  FOC1930R11Y
117     fex117       Online           N2K-C2348TQ-10GE  FOC1930R15R
118     fex118       Online           N2K-C2348TQ-10GE  FOC1930R40Z
119     fex119       Online           N2K-C2348TQ-10GE  FOC1930R41S
120     fex120       Online           N2K-C2348TQ-10GE  FOC1930R434


Date: 2016-02-11 08:52:43 UTC
bhs4-15a-n56# sh fex |
FEX     FEX          FEX              FEX               Fex
Number  Description  State            Model             Serial
------------------------------------------------------------------------
118     fex118       Online Sequence  N2K-C2348TQ-10GE  FOC1930R40Z
119     fex119       Online           N2K-C2348TQ-10GE  FOC1930R41S
120     fex120       Online           N2K-C2348TQ-10GE  FOC1930R434


Date: 2016-02-11 08:51:54 UTC
We believe the FEXes did not reload on their own and are therefore still running the old version, 7.1.2.
To avoid the 20-30 min image download, we will reload them by hand => they will load 7.1.3, which is pre-loaded.



Date: 2016-02-11 08:30:27 UTC
We have A on the console; we are shutting the port-channels, setting the boot variables, and reloading.


Date: 2016-02-11 08:28:38 UTC
Failed again.

The labels on the switches are swapped... A was reloaded instead of B :(




Date: 2016-02-11 08:18:09 UTC
We are going to hard-reload the Nexus and, in the worst case, fall back to the spare...


Date: 2016-02-11 08:12:20 UTC
B is not coming back.

Sending all processes the TERM signal...
Sending all processes the KILL signal...
Unmounting filesystems...
[12066178.722329] Resetting board
DIMMs are not present in the system
ERROR: MRC failed
DIMMs are not present in the system
ERROR: MRC failed
DIMMs are not present in the system
ERROR: MRC failed
DIMMs are not present in the system
ERROR: MRC failed
DIMMs are not present in the system
ERROR: MRC failed

Traffic is still being forwarded by A, no downtime.



Date: 2016-02-11 08:00:12 UTC
bhs4-15a-n56.qc.ca : 2016 Feb 11 08:56:37.085 CET: %SATCTRL-FEX119-2-SATCTRL_IMAGE: FEX119 Image update in progress.
bhs4-15a-n56.qc.ca : 2016 Feb 11 08:56:37.087 CET: %SATCTRL-FEX119-2-SATCTRL_IMAGE: FEX119 Image update in progress.
bhs4-15b-n56.qc.ca : 2016 Feb 11 08:56:46.806 CET: %SATCTRL-FEX120-2-SATCTRL_IMAGE: FEX120 Image update in progress.
bhs4-15a-n56.qc.ca : 2016 Feb 11 08:56:47.085 CET: %SATCTRL-FEX120-2-SATCTRL_IMAGE: FEX120 Image update in progress.
bhs4-15a-n56.qc.ca : 2016 Feb 11 08:56:47.087 CET: %SATCTRL-FEX120-2-SATCTRL_IMAGE: FEX120 Image update in progress.
bhs4-15b-n56.qc.ca : 2016 Feb 11 08:57:30.816 CET: %SATCTRL-FEX101-2-SATCTRL_IMAGE: FEX101 Image update complete. Install pending
bhs4-15a-n56.qc.ca : 2016 Feb 11 08:57:31.085 CET: %SATCTRL-FEX101-2-SATCTRL_IMAGE: FEX101 Image update complete. Install pending
bhs4-15a-n56.qc.ca : 2016 Feb 11 08:57:31.087 CET: %SATCTRL-FEX101-2-SATCTRL_IMAGE: FEX101 Image update complete. Install pending
bhs4-15b-n56.qc.ca : 2016 Feb 11 08:58:04.826 CET: %SATCTRL-FEX100-2-SATCTRL_IMAGE: FEX100 Image update complete. Install pending
bhs4-15a-n56.qc.ca : 2016 Feb 11 08:58:05.085 CET: %SATCTRL-FEX100-2-SATCTRL_IMAGE: FEX100 Image update complete. Install pending
bhs4-15a-n56.qc.ca : 2016 Feb 11 08:58:05.087 CET: %SATCTRL-FEX100-2-SATCTRL_IMAGE: FEX100 Image update complete. Install pending
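
To follow the image pre-load across many FEXes, the SATCTRL messages above can be reduced to the latest status per FEX. This is a minimal sketch under stated assumptions: the message layout is inferred only from the lines quoted in this report, the lines are assumed to be in chronological order, and `latest_image_status` is a hypothetical helper name.

```python
import re

# Pattern inferred from the log lines quoted in this report.
PATTERN = re.compile(r"%SATCTRL-FEX(\d+)-2-SATCTRL_IMAGE: FEX\d+ (.+)$")


def latest_image_status(lines):
    """Map each FEX number to the last image status message seen for it."""
    status = {}
    for line in lines:
        match = PATTERN.search(line.strip())
        if match:
            # Later lines overwrite earlier ones for the same FEX.
            status[int(match.group(1))] = match.group(2).strip()
    return status


# Sample lines copied from the log above.
sample = [
    "bhs4-15a-n56.qc.ca : 2016 Feb 11 08:56:37.085 CET: %SATCTRL-FEX119-2-SATCTRL_IMAGE: FEX119 Image update in progress.",
    "bhs4-15b-n56.qc.ca : 2016 Feb 11 08:57:30.816 CET: %SATCTRL-FEX101-2-SATCTRL_IMAGE: FEX101 Image update complete. Install pending",
    "bhs4-15b-n56.qc.ca : 2016 Feb 11 08:58:04.826 CET: %SATCTRL-FEX100-2-SATCTRL_IMAGE: FEX100 Image update complete. Install pending",
]
status = latest_image_status(sample)
for fex, message in sorted(status.items()):
    print(fex, message)
```

A FEX whose last message is still "Image update in progress." has not finished pre-loading, which matters before any reload is triggered.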

Date: 2016-02-11 07:54:59 UTC
Switch will be reloaded for disruptive upgrade.
Do you want to continue with the installation (y/n)? [n] y

Install is in progress, please wait.

Performing runtime checks.
[####################] 100% -- SUCCESS

Setting boot variables.
[####################] 100% -- SUCCESS

Performing configuration copy.
[####################] 100% -- SUCCESS

Pre-loading modules.
[This step might take upto 20 minutes to complete - please wait.]
[*Warning -- Please do not abort installation/reload or powercycle fexes*]
[# ] 0%

Date: 2016-02-11 07:50:02 UTC
bhs4-15b-n56# install all system n6000-uk9.7.1.3.N1.2.bin kickstart n6000-uk9-kickstart.7.1.3.N1.2.bin force
Installer is forced disruptive

Verifying image bootflash:/n6000-uk9-kickstart.7.1.3.N1.2.bin for boot variable "kickstart".
[####################] 100% -- SUCCESS

Verifying image bootflash:/n6000-uk9.7.1.3.N1.2.bin for boot variable "system".
[####################] 100% -- SUCCESS

Verifying image type.

Date: 2016-02-11 07:15:18 UTC
We will begin the maintenance with B (vPC primary).
Posted Feb 09, 2016 - 10:35 UTC
This scheduled maintenance affected: Infrastructure || BHS (BHS4).