
Migrating CSM Serverfarms to Other Server VLANs

A coworker brought an interesting problem to me the other day.  He wanted to move a serverfarm from one server VLAN to another without taking an outage.  Since I didn’t want to have to come into the office late at night to do work, I decided to see what we could do.

It turned out to be pretty easy.  We tend to think of CSM VLANs as pairs — you have the client VLAN for the web servers where the vserver sits and the server VLAN where the serverfarm sits.  The CSM doesn’t know about these relationships; all it cares about is whether the servers are in a server VLAN, and we can use that to our advantage here.

Here’s a snippet of what the original config looked like (not really, since I’m not telling you how my company’s network is set up).  The original config had a serverfarm called SFARM-ORIG that included 192.168.0.10[12].  That farm is used by the vserver VSERV-ORIG, which listens on 1.1.1.1 for HTTP.  The probe is in there, too.

probe HTTP tcp
 port 80
!
serverfarm SFARM-ORIG
 real 192.168.0.101
  inservice
 real 192.168.0.102
  inservice
 probe HTTP
!
vserver VSERV-ORIG
 virtual 1.1.1.1 tcp http
 vlan 100
 serverfarm SFARM-ORIG
 inservice

To make the move, we start by creating a new vserver and a new serverfarm that contains all the IPs involved: both the original IPs that are already in service and the new IPs to which the servers will migrate.  The new vserver listens on 2.2.2.2.  In this case, we’re moving the servers to 10.10.1.10[12].

serverfarm SFARM-NEW
 real 192.168.0.101
  inservice
 real 192.168.0.102
  inservice
 real 10.10.1.101
  inservice
 real 10.10.1.102
  inservice
 probe HTTP
!
vserver VSERV-NEW
 virtual 2.2.2.2 tcp http
 vlan 200
 serverfarm SFARM-NEW
 inservice

When you first drop in the config, the original RIPs should come up as operational, and the new ones should fail since they don’t exist yet (duh!).  When everyone’s ready, you then move the service over to the new VIP and run off of that for a while to make sure everything’s working as expected.  When all the parties involved are happy, you can then start moving over the servers one at a time.  The probe should fail out a server pretty quickly, then, when the server is reconfigured and put on the right VLAN, the CSM should eventually see the new RIP come up and put it back in the available server pool for the farm.
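
If you want to watch the reals change state while this happens, something like the following should do it.  This is only a sketch: it assumes the CSM sits in slot 2 (borrowed from the cookie example further down, not from this setup), and the exact keywords may vary by CSM release, so check the command reference for your version.

Switch#sh mod csm 2 real sfarm SFARM-NEW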

Configured like that, you can move the servers whenever you would like, and the CSM will automatically detect the changes and act accordingly.  You just have to remember to remove the old IPs from the serverfarm once a server has moved.
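
As a rough sketch of that cleanup, assuming the first server has already moved to 10.10.1.101 and its new RIP is showing as operational, removing the old entry would look something like this.  The order and timing here are my assumption; repeat for each server as it moves.

! Assumption: 192.168.0.101 has already been re-addressed to 10.10.1.101
serverfarm SFARM-NEW
 no real 192.168.0.101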

Send any alternative study techniques questions my way.

CSCtd31622 – CSM, Cookies, and the year 2010

It seems that we have another piece of evidence that Cisco doesn’t like the CSM.  From what I’m able to creatively interpret, the software developers didn’t think anyone would be running the CSM for very long, so they set a variable that expires CSM-inserted cookies at 01:01:50 GMT on 1 January 2010 [1].  If you’re using cookies to make connections sticky, that means you may see some unexpected results; this shouldn’t affect the web servers’ cookies.

The Bug Toolkit lists 4.3(3) as the “first found in” version, but I’m fairly confident that it exists in every version before 4.3(3) as well.  If you want to be sure you have the bug, you can run the show mod csm # variable command and look for the COOKIE_INSERT_EXPIRATION_DATE value.  It should look something like this.

Switch#sh mod csm 2 variable

variable                        value
----------------------------------------------------------------
ARP_INTERVAL                    300
ARP_LEARNED_INTERVAL            14400
ARP_GRATUITOUS_INTERVAL         15
ARP_RATE                        10
ARP_RETRIES                     3
ARP_LEARN_MODE                  1
ARP_REPLY_FOR_NO_INSERVICE_VIP  0
ADVERTISE_RHI_FREQ              10
AGGREGATE_BACKUP_SF_STATE_TO_VS 0
COOKIE_INSERT_EXPIRATION_DATE   Fri, 1 Jan 2010 01:01:50 GMT
...

The “real fix” is to upgrade to 4.3(3.1) or 4.2(12.1).  Of course that means a reboot of the CSM and an outage and all that.  A workaround includes setting the COOKIE_INSERT_EXPIRATION_DATE variable to some time in the future.  The bug text gives an example of some time in 2020, but any time in the distant future will do.  Assuming your CSM is in slot 2 and you’ve selected 1 Jan 2020 at 00:00:00 for your expiration, you would do this.

Switch(config)#mod csm 2
Switch(config-module-csm)#variable COOKIE_INSERT_EXPIRATION_DATE "Wed, 1 Jan 2020 00:00:00 GMT"

That’s much easier than upgrading the CSM, eh?  If you’re still using your CSM by 2020, you can set it again if you want, but you’ll be well past the EOL on that guy (4.1 goes EOL on 13 Oct 2012 [2]).

Send any space shuttle launch tickets questions my way.

Sources:

1: CSCtd31622 *
2: Cisco EOL/EOS for CSM 3.1.x and 4.1.x *

*  May require CCO access