Configuring Fault Tolerance on the CSM
Like (nearly) everything in the Cisco world, you can set up your CSM to fail over to another module when the primary dies a horrible death. You can have two in the same chassis or even have them in separate chassis — the process is the same no matter how you have it set up. Either way, you have a primary and a secondary module in fault tolerance (FT) mode.
First, we’ll establish a VLAN for the CSMs to sync their configuration and state over. This is just an ordinary VLAN; there’s nothing special about it, really, but it should be dedicated to CSM syncing. Let’s randomly choose VLAN 83.
vlan 83
name CSM-Sync
You will, of course, have to do this on every switch that holds a CSM, so if you’re using them in two different chassis, put the same VLAN on each and make sure the two can see each other across a trunk. Cisco recommends that you dedicate a trunk between the two switches for the sync VLAN in order to remove the chance of other traffic stepping on the sync packets, but I’m not convinced that’s necessary. Use your judgement on that one.
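If you do go with a dedicated trunk between the chassis, the switch side might look something like the sketch below. The interface name is made up for illustration, and the allowed-VLAN list assumes the link carries only the sync VLAN as Cisco suggests; adjust both to your environment.

```
! Hypothetical inter-chassis link; substitute your own interface
interface TenGigabitEthernet1/1
 description CSM-Sync trunk to the other chassis
 switchport
 switchport trunk encapsulation dot1q
 switchport mode trunk
 ! Assumption: this trunk is dedicated to the sync VLAN only
 switchport trunk allowed vlan 83
 no shutdown
```

If you share an existing trunk instead, the only change needed is to add VLAN 83 to that trunk’s allowed list.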
Back to it. Next, you need to decide on an FT group ID. This is similar to an HSRP group and lets you run multiple FT groups on the same VLAN. The group ID needs to be in the range of 1 to 254, so, since this is the first one, let’s just use 1. Get into config mode for the CSM that you want to be the primary and do this.
ft group 1 vlan 83
This takes you to the config-slb-ft prompt. Just like HSRP, we need to set priorities for each device and whether or not it should preempt, so let’s configure. Yes, we want to preempt, right? Let’s set the priorities to 100 and 90, too.
priority 100 alt 90
preempt
This sets the primary CSM to priority 100 and the secondary to 90; both will preempt.
What about configuring the secondary for FT? That’s easy. Go into CSM config mode on the secondary and enter the ft group 1 vlan 83 command. That’s it. The two CSMs will do a little arguing and come back as the primary and secondary. After that, all configuration is done on the primary, which is synced over to the secondary just like an ASA. Pretty cool, eh?
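Put together, the FT pieces from this walkthrough would look roughly like this on each side. The slot number 4 is an assumption for the example; use whatever slot your CSM actually sits in.

```
! Primary CSM (slot 4 is an assumption)
module ContentSwitchingModule 4
 ft group 1 vlan 83
  priority 100 alt 90
  preempt
!
! Secondary CSM (slot 4 in the other chassis is an assumption)
module ContentSwitchingModule 4
 ft group 1 vlan 83
```

The secondary only needs the group and VLAN because the alt value on the primary already carries the secondary’s priority.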
When configuring things like IP addresses, though, you’ll need to make provisions for the secondary with the alt directive (remember that one from the priority command). I won’t go into much detail, but you’ll need it mostly when setting IPs on VLANs. Here’s an example of setting an IP address on client VLAN 100 for both the primary and secondary.
vlan 100 client
ip address 192.168.0.11 255.255.255.0 alt 192.168.0.12 255.255.255.0
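The same pattern applies on the server side. As a hedged sketch, a server VLAN might look like the following; the VLAN number and all addresses here are invented for the example, and the alias line assumes you want a shared gateway address that follows whichever CSM is active.

```
! Hypothetical server-side VLAN; VLAN ID and addresses are assumptions
vlan 200 server
 ip address 10.0.0.2 255.255.255.0 alt 10.0.0.3 255.255.255.0
 ! Shared address the real servers can use as their gateway
 alias 10.0.0.1 255.255.255.0
```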
Alright…one more thing. The configurations don’t sync automagically (at least not on my old version of code). If you make a change to the primary CSM, you’ll see an out-of-sync message when you look at the FT status.
Switch#sh mod csm X ft
FT group 1, vlan 83
This box is active
Configuration is out-of-sync
priority 100, heartbeat 1, failover 3, preemption is on
alternate priority 90
If the primary goes down now and the secondary takes over, the changes you just made won’t be reflected on the secondary. You fix this with the hw-module contentSwitchingModule X standby config-sync command (where X is the module slot in the chassis). Alternatively, you can just type hw c X s c as a shortcut. It’ll take a few minutes depending on your configuration, so check your logs to see when it’s finished. Note that the secondary does not save the new configuration to its startup-config; you’ll have to log in and save it manually (or automatically through CiscoWorks or something).
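As a rough sketch, the whole routine after a change on the primary might go like this. Slot 4 is an assumption, and the last step assumes the standby CSM lives in a second chassis where the synced config needs saving.

```
! On the chassis with the active CSM: push the config to the standby
Switch# hw-module contentSwitchingModule 4 standby config-sync
! Once the logs show the sync has finished, confirm the FT status
Switch# show module csm 4 ft
! On the chassis holding the standby CSM: save the synced config
Switch# copy running-config startup-config
```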
Let me know if you have any questions and check out my page on getting output from Cisco’s fine mid-tier load balancer. 🙂
hey,
thanks a lot.. that’s a really easy and usable guide 🙂
just two little questions about it:
* can i configure this in a live setup where i have a fully configured csm and want to add a new/empty one?
* is there any downtime when i activate the second one?
thanks a lot
daniel
Those are great questions, Daniel.
If you already have a stand-alone CSM configured, configuration changes will be required; you would have to configure the alt addresses and FT VLAN. I imagine that would cause an outage. If you had the primary already configured for FT, however, then adding a second CSM shouldn’t take anything down at all.
Of course, this is all in theory and hasn’t been labbed out, so take my opinion at your own risk. 🙂
okay..
here’s my report:
* i just configured the alt addresses and the FT stuff on the primary – no impact
* i configured the FT on the secondary
* i synced the config – no impact
sometimes it’s the best way to just try it.. my dear little module.. i like you 🙂
too bad that the end for the csm is near 🙁
Good info, Daniel. Thanks for sharing. I totally expected a hiccup.
hmmm..
now the big question for me.. how does failover work.. have you ever seen a failover? how long does it take? is it really hitless? (cisco told me that there is no downtime when switching to the backup csm [if you have connection and sticky replication enabled])
When I was doing our current deployment, I had several hosts connect to a few vservers and just wait. I then went down to the data center and pulled power to the 6500, and I didn’t lose any connections. I also did some ping testing through the CSM and to the alias on the back-end networks with the same result. I’m pretty confident that you’ll see a seamless failover if all the replication is successful.
Just remember to run “hw-module csm X standby config-sync” when changes are made. The CSM can replicate everything but the config, which makes no sense to me.
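For reference, connection and sticky replication are turned on per vserver. A minimal sketch, assuming a CSM in slot 4 and a vserver named WEB-VIP (both made up for the example), and assuming sticky is already configured on that vserver:

```
! Slot number and vserver name are assumptions
module ContentSwitchingModule 4
 vserver WEB-VIP
  ! Replicate connection state to the standby CSM
  replicate csrp connection
  ! Replicate sticky entries to the standby CSM
  replicate csrp sticky
```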
nice.. 🙂
in three weeks i have a downtime to make a failover-test.. i will report back after that..
Aaron,
Is it possible to setup 2 CSM mods in 2 different chassis in Active-Active mode ?
Somehow I missed this, Kashi. Sorry about that. I believe that CSMs can only be configured as active-passive. Looking through configuration notes, I only see commands that relate to failing over; I don’t see any related to modes or having two modules be active at the same time.
Hello Aaron, I hope you still support this post. I just installed a second CSM in the same chassis and configured FT.
The tests were successful; both CSMs work when they are active.
I noticed that even after I run “hw-module csm 4 standby config-sync” on the active CSM, I still see Configuration is out-of-sync.
These are the events I got when I ran hw c 4 s c:
———–
Apr 2 08:18:22.209: %CSM_SLB-6-REDUNDANCY_INFO: Module 4 FT info: Active: Bulk sync started
Apr 2 08:18:22.209: %CSM_SLB-6-REDUNDANCY_INFO: Module 7 FT info: Standby: Bulk sync started
Apr 2 08:18:22.217: %CSM_SLB-6-REDUNDANCY_INFO: Module 7 FT info: STANDBY:Configuration is being received, This may take several minutes!
Apr 2 08:18:22.217: %CSM_SLB-6-REDUNDANCY_INFO: Module 4 FT info: Active: Sending configurations to Standby CSM, this may take several minutes!
Apr 2 08:18:31.541: %CSM_SLB-6-REDUNDANCY_INFO: Module 7 FT info: Standby: Started clearing configuration
Apr 2 08:18:31.545: %CSM_SLB-6-REDUNDANCY_INFO: Module 7 FT info: Standby: Completed clearing configuration
Apr 2 08:18:31.549: %CSM_SLB-4-REDUNDANCY_WARN: Module 7 FT warning: Standby: Config Sync does not save running-config to startup-config
Apr 2 08:18:31.549: %CSM_SLB-6-REDUNDANCY_INFO: Module 7 FT info: Standby: Previous configuration are being deleted from supervisor
Apr 2 08:18:31.553: %CSM_SLB-6-REDUNDANCY_INFO: Module 7 FT info: Standby: Previous configuration being deleted on Standby CSM
Apr 2 08:18:33.649: %CSM_SLB-6-REDUNDANCY_INFO: Module 7 FT info: Standby: New configuration are being configured
Apr 2 08:18:36.345: %CSM_SLB-6-INFO: Module 7 info: SLB-NETMGT: vserver V-FARM0 inservice
Apr 2 08:18:36.345: %CSM_SLB-6-INFO: Module 7 info: SLB-NETMGT: vserver V-FARM1 inservice
Apr 2 08:18:36.345: %CSM_SLB-6-INFO: Module 7 info: SLB-NETMGT: vserver V-FARM2 inservice
Apr 2 08:18:36.537: %CSM_SLB-6-REDUNDANCY_INFO: Module 4 FT info: Active: Manual bulk sync completed
————
According to the above logs, the sync works fine, but I still get this.
C6500#sh module csm 4 ft
FT group 1, vlan 121
This box is active
Configuration is out-of-sync
priority 50, heartbeat 1, failover 3, preemption is on
C6500#sh module csm 7 ft
FT group 1, vlan 121
This box is in standby state
Configuration is out-of-sync
priority 10, heartbeat 1, failover 3, preemption is on
Both CSMs are running the following version: 4.3(4)
I’ll really appreciate your help on this.
Regards
Hey, Edgar.
I’ve seen that type of thing in the past with 2 CSMs in different chassis. The solution for me was actually just to wait; it sometimes took 10 or 15 minutes to sync the configs over. Since you’re asking here, I can only assume that they’re never syncing, though.
I can only suggest opening a TAC case. Sorry that I can’t be any more help.
Hello Aaron
Sorry for not answering sooner.
Thank you for your answer.
Regards,
Edgar
Hello Aaron
Good to hear about you.
I hope you can help me figure out this question. I have a CSM module with more than 10 serverfarms, all of them working fine, and all of them on different VLANs. We are using route mode for all of them.
For example:
ServerFarm01-> Vlan10
Client_Side01-> Vlan11
ServerFarm02-> Vlan20
Client_Side02-> Vlan21
ServerFarm03-> Vlan30
Client_Side03-> Vlan31
I noticed something: when I generate outbound traffic from a real server, it does not matter whether it belongs to ServerFarm01, 02 or 03; the packet always leaves the CSM using vlan31.
Can you please help to determine what’s going on?
Actually, we want real servers from ServerFarm01 to send traffic to the internet through the CSM, and that traffic should be seen on vlan11.
Thanks and regards
Edgar