
FS#2343 — FS#6311 — rbx-g1

Attached to Project— Network
Incident
the whole network
CLOSED
100%
We have a hardware problem on the router. We have isolated a defective card on the router.
Date:  Tuesday, 31 January 2012, 14:43PM
Reason for closing:  Done
Comment by OVH - Friday, 27 January 2012, 16:39PM

We removed the defective card from the router. This caused the 7 other cards to reboot:

0/2/CPU0 A9K-8T-L MBI-BOOTING PWR,NSHUT,MON
0/3/CPU0 A9K-8T-L MBI-BOOTING PWR,NSHUT,MON
0/4/CPU0 A9K-8T-L MBI-BOOTING PWR,NSHUT,MON
0/5/CPU0 A9K-8T-L MBI-BOOTING PWR,NSHUT,MON
0/6/CPU0 A9K-8T-L MBI-BOOTING PWR,NSHUT,MON
0/7/CPU0 A9K-8T-L MBI-BOOTING PWR,NSHUT,MON


Comment by OVH - Saturday, 28 January 2012, 04:22AM

All traffic routed by rbx-g1-a9, one of our core routers in Roubaix, was impacted between 12:55 and 12:35 ET approximately. One of the new 24x10G cards that we inserted last night (http://status.ovh.co.uk/?do=details&id=2272) was found to be defective while we were activating the new ports.

Sequence of events during the outage:
- The traffic through the router started decreasing (significant packet loss).
- The new ports were immediately disabled, but the problem persisted.
- Card 0 was removed from the chassis. The packet loss stopped, but all the 8T-L cards rebooted (the other 24x10GE card did not). The router instantly lost 48x10G of its capacity. Routing was then largely handled by rbx-g2-a9.
- However, the traffic was impacted again, this time because several links were saturated and because of "side effects" caused by the loss of all these links on the other routers.
- The cards rebooted, but on this kind of equipment the linecards take several minutes to become operational again.
- Finally, we reinserted the 24x10GE card after the failure of the 8T-L cards, and we will set up the uplinks on this card. The router was back to its normal state after 20 minutes.
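
As a rough check of the 48x10G figure, the status listing from the 27 January comment can be parsed and the ports on the rebooting cards added up. This is only an illustrative sketch: the function names are hypothetical, and the port count of 8 x 10GE per A9K-8T-L card is an assumption taken from the model name, not something stated in this ticket.

```python
# Status lines copied from the listing in the 27 January comment.
STATUS_LINES = """\
0/2/CPU0 A9K-8T-L MBI-BOOTING PWR,NSHUT,MON
0/3/CPU0 A9K-8T-L MBI-BOOTING PWR,NSHUT,MON
0/4/CPU0 A9K-8T-L MBI-BOOTING PWR,NSHUT,MON
0/5/CPU0 A9K-8T-L MBI-BOOTING PWR,NSHUT,MON
0/6/CPU0 A9K-8T-L MBI-BOOTING PWR,NSHUT,MON
0/7/CPU0 A9K-8T-L MBI-BOOTING PWR,NSHUT,MON
"""

# Assumption: an A9K-8T-L linecard carries 8 x 10GE ports.
PORTS_PER_CARD = {"A9K-8T-L": 8}


def parse_status(text):
    """Split each 'slot model state config' line into a dict."""
    cards = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        slot, model, state, config = fields
        cards.append({
            "slot": slot,
            "model": model,
            "state": state,
            "config": config.split(","),
        })
    return cards


def ports_offline(cards, down_state="MBI-BOOTING"):
    """Count the 10G ports sitting on cards that are still booting."""
    return sum(
        PORTS_PER_CARD.get(card["model"], 0)
        for card in cards
        if card["state"] == down_state
    )


cards = parse_status(STATUS_LINES)
lost_ports = ports_offline(cards)
print(f"{len(cards)} cards rebooting, {lost_ports}x10G offline "
      f"({lost_ports * 10} Gbit/s)")
```

With the six cards listed, this gives 6 x 8 = 48 ports of 10G, which matches the capacity loss reported above.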

We are currently working with Cisco to identify the origin of the problem and to replace the defective card as soon as we can.


Comment by OVH - Tuesday, 31 January 2012, 14:42PM

We are waiting for the spare card, which should arrive this week.
These are very new cards and the stock of spares
is not yet in place at Cisco.