Cisco Clock Issue – This Is Really Bad
Check out this advisory from Cisco that came out a couple days ago. You need to read it and act on it immediately! I’ll summarize for you: thanks to a faulty clock signal component, certain Cisco devices will stop functioning after about 18 months and become really expensive bricks! Reading through it, you’ll see phrases like “we expect product failures” and “is not recoverable.” Seriously, what the hell? This really warms the heart.
The fault affects a couple Meraki devices, the Nexus 9504, and some models of the ISR 4000s – the ISR4331, ISR4321, and ISR4351. The 4000s are part of Cisco’s flagship branch routers, and I know several people (including myself!) who have some of the affected units deployed in production. Some unnamed people on Twitter tell me that they have 50 and even 120 of these guys deployed in the field. That’s a lot of faulty clocks.
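If you keep any kind of inventory with install dates, a quick script can tell you which boxes to worry about first. Here is a rough sketch, not anything from the advisory: the inventory format, the `flag_at_risk` name, and the 548-day stand-in for "about 18 months" are all my own assumptions, and the model list only covers the units named above (the Meraki models aren't listed in this post, so they're left out).

```python
from datetime import date, timedelta

# Models named in this post, normalized to upper case with no spaces.
# Assumption: your inventory's model strings start with one of these.
AFFECTED_MODELS = ("ISR4331", "ISR4321", "ISR4351", "NEXUS9504")

# Assumption: "about 18 months" approximated as 548 days in service.
APPROX_FAILURE_WINDOW = timedelta(days=548)


def flag_at_risk(inventory, today):
    """Return (hostname, model, days_left) for affected devices.

    inventory is a hypothetical list of (hostname, model, install_date)
    tuples. Results are sorted so the most urgent device comes first.
    A negative days_left means the approximate window has already passed.
    """
    flagged = []
    for hostname, model, installed in inventory:
        normalized = model.upper().replace(" ", "")
        if normalized.startswith(AFFECTED_MODELS):
            days_left = (installed + APPROX_FAILURE_WINDOW - today).days
            flagged.append((hostname, model, days_left))
    return sorted(flagged, key=lambda row: row[2])


if __name__ == "__main__":
    # Made-up example data, purely for illustration.
    sample = [
        ("branch-rtr-1", "ISR4331/K9", date(2016, 1, 1)),
        ("core-sw-1", "Catalyst 3850", date(2016, 1, 1)),
        ("branch-rtr-2", "isr4321", date(2016, 6, 1)),
    ]
    for hostname, model, days_left in flag_at_risk(sample, date.today()):
        print(f"{hostname} ({model}): roughly {days_left} days left")
```

Treat the output as a to-do list ordering and nothing more. The advisory, not this script, is the authority on which hardware is actually affected.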
The fix is to open a TAC case and get a new device. Cisco is using the word “platform” when talking about replacement, meaning that they’ll send you a naked device. If you have cards or memory upgrades or a particular software license, you’ll need to make arrangements to get that moved over to the replacement device. We’re talking unracking, undressing, re-dressing, re-racking, upgrading, licensing, and configuring. This is a complicated procedure that is going to cost companies a lot of time and money. I imagine companies all over the world are going to wind up flying their engineers all over creation to do hardware swaps before the 18 months have elapsed. You can’t wait until “the next time you fly out there.” You have to do something now. (I refuse to say it’s a race against the clock.) I’m sure you have nothing better to do.
Cisco says that the faulty clock signal component is used by other companies as well, though I haven’t heard of any other companies publishing advisories. Cisco is also not revealing the name of the supplier, so there’s no way to know what other products are affected. I’m not trying to make your ulcers flare up, but it’s entirely possible that other stuff in your network may just stop working. We’re all waiting to see what happens next.
Alka-Seltzer questions to me.
Re-licensing indeed. Last time I had to replace a failed Nexus switch, Cisco licensing refused to re-issue the licenses for the replacement device (aka re-host the license) until they had received and processed the RMAed device!!! It took hours and escalation to our SE and account manager (and we are a fairly large account) to get them to issue a temp license, and a day more to get a new permanent *working* one (because they also sent us two non-working license files).
Just read your post this morning and 2 hours later I got a call regarding a router being down.
The ISR4331 (HW rev. V01) was totally gone with a blinking amber status light. No amount of power cycling did anything. Had to replace it with a spare.
I think we installed it approx. 18 months ago, so the timeline fits.
So next week I will have to check the rest of the routers and newer ASAs…
So in short… Yes! This is very bad and is already affecting production environments.
Thanks for writing about it. I appreciate it.
And that clock is inside an Intel CPU, and that unit is used by many IT infrastructure devices and vendors. Cisco is only the first to raise the flag; more and more will come, I think.
My company wants me to install one that was delivered, instead of forcing Cisco to send a replacement. I think our customer may not be aware of the issue.
We have over 100 x 4331s that need to be replaced. Has anyone come up with a relatively simple procedure to swap out a router and get the new one back up and running as quickly as possible?
The number of networking, server, and storage devices affected by the clock signal flaw has expanded greatly. By our count, there are now at least 20 vendors whose products include the potentially faulty Intel Atom C2000 chip.
To help, Auvik Networks has put together a master list of all known vendors and devices affected by the flaw. We hope this can serve as a central reference for IT pros and service providers looking to assess their exposure.