Archive for April, 2009

Using SSH to Run Commands on a Router or Switch

SSH is more than just a shell.  You can copy files from and to a server or piece of network gear with it.  You can use it to tunnel traffic.  Possibly my favorite, though, is to use SSH to run a command on a remote box without interacting with a shell.

One of my biggest pet peeves with IOS (or pretty much any Cisco OS) is the lack of complex filtering.  Let’s say I want to look at all the downed ports and interfaces on modules 3 and 6 of my 6509.  I can’t easily do that with commands in IOS itself, but, on my Linux box, I can chain multiple grep commands to get exactly what I want really easily.  Let’s work through the example, shall we?

To start with, let’s just do a show ip int brief without getting a shell on the switch.

ssh my.switch.com "show ip int brief"

When you run this and give your password, you see the output we’ve all learned to love, and, now that you’ve got it in STDOUT on your Linux box, you can start filtering. Now, let’s use grep to find the downed ports and interfaces on modules 3 and 6.

ssh my.switch.com "show ip int brief" | grep down | grep 'Ethernet[36]'

How about downed ports and interfaces on modules 3 and 6 that are not administratively down?

ssh my.switch.com "show ip int brief" | grep down | grep 'Ethernet[36]' | grep -v admin
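
If you run this kind of filter a lot, the chain of greps can live in a small shell function.  Here's a minimal sketch, assuming the usual show ip int brief layout with the interface name first; down_ports is a name I made up for illustration, not a standard tool, and it only handles single-digit module numbers.

```shell
# down_ports: read "show ip int brief" output on stdin and print the
# interfaces on the given modules that are down but not administratively down.
# Hypothetical helper; assumes names like GigabitEthernet3/1 and at least
# one single-digit module number as an argument.
down_ports() {
    # Glue the arguments together so "3 6" becomes the character class [36].
    modules=$(printf '%s' "$@")
    grep down | grep "Ethernet[${modules}]/" | grep -v admin
}
```

Usage would look like ssh my.switch.com "show ip int brief" | down_ports 3 6.  The pattern is quoted so the shell doesn't try to expand the brackets as a filename glob before grep ever sees them.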

I’ll stop there, but it can go on and on.  Read up on regular expressions and/or grep if you don’t know what we’re doing here.

What’s really happening is that we’re taking the output of the command “ssh ….” and piping it (with |) to the command grep.  We can send it to whatever command we want, though, so don’t be shy.  I’ve actually written several scripts that take output of commands like show int description on a router to generate some reports.  When I want to run one of those, I do something like this.

ssh my.switch.com "show int desc" | parseOutput.pl
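
To give an idea of what such a script might do, here's a sketch of a stand-in written as a shell function around awk.  It is not my parseOutput.pl; undocumented_ports is a hypothetical name, and it assumes the show int desc columns are Interface, Status, Protocol, and Description.  It lists the ports that are up but have no description.

```shell
# undocumented_ports: read "show int desc" output on stdin and print the
# interfaces whose status is up but whose description column is empty.
# Hypothetical report; assumes columns Interface/Status/Protocol/Description.
undocumented_ports() {
    awk '
        $1 == "Interface" { next }      # skip the header row
        NF == 0           { next }      # skip blank lines
        {
            # "admin down" is two words, so the later columns shift by one.
            first_desc = ($2 == "admin") ? 5 : 4
            status     = ($2 == "admin") ? "admin down" : $2
            # Fewer fields than first_desc means there is no description.
            if (status == "up" && NF < first_desc)
                print $1
        }
    '
}
```

You'd run it the same way: ssh my.switch.com "show int desc" | undocumented_ports.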

There’s always a gotcha or two to watch for, isn’t there?  I’ve found a couple.

First, your command runs at your privilege level, so, if your user is priv 1, you’re not going to be able to do a show run or reload.  You could just ignore security for a bit and set your privilege to 15, but I don’t recommend doing anything like that.  Before you say it, you’ll probably have a hard time with enabling as well.  You can only run one command at a time, so you would just enable yourself and get kicked off.  Not very helpful.

Another problem I see is the lack of public/private key pair support on Cisco devices.  On a Linux box, you can copy your keys around, and those are presented in lieu of a password.  Since (most) Cisco devices don’t have home directories, there’s no place to drop the keys, and we’re left with just using passwords.  Support for this would be nice, but the security problems associated with keeping SSH keys and user home directories are probably too much to even think about.

What else?  Oh, yeah.  The PIX/FWSM/ASA family supports SSH, but it acts differently from the IOS guys.  When you run a command through SSH, you actually get an interactive shell with the command already on the CLI for you. This is probably by design; the only thing you can really do from a non-priv prompt is to enable.

Anyway, send any grilling tips questions my way.

The Most Random Things Can Hurt The Network

This is a great one that I have to share.

A couple of coworkers walk in today and ask for some help on an issue.  It seems that a business unit was having latency problems with a web app, and, after research by the product team and sysadmins, nothing wrong could be found.  Lots of sites use the product, and only this one was having issues.  Also, the site was having no problems getting to other web sites and apps like Yahoo! or Google.

I informed the guys that their Internet access went over a T1 to an ISP, but, since the application was housed at our place, access to it was actually over the WAN circuit.  The guys told me that the unit had just called to complain about it again, so we checked out their WAN circuit.  Eureka!  The circuit was at 98% utilization.  There’s your cause.  The app is slow because the circuit is full.  Looking back through history, we see weeks of high utilization that explain why the users are having issues with the app.

Of course, the next logical step is to find out what’s talking so much.  I pulled up our netflow box to see what conversations were the big talkers.  I found a single IP at the site talking to a single IP at our home office via HTTP and HTTPS that accounted for 40% or so of the total bandwidth usage in the last 24 hours.  And the previous 24 hours.  And the previous 24 hours.  Hmmmm.

I didn’t recognize the IPs, so I asked the systems guys what they were.  They had no clue, and neither IP was reserved in our IP management system.  Quite strange.  I ran Nmap against both boxes; that fine app told me that they were both printers.  Printers?  That makes no sense.  I noticed that HTTP was open (we saw that in netflow, too), so we pointed a browser at the one at the home office to see what turned up.  It’s the security system.  That makes even less sense.  Pointing a browser at the other IP, we find that it’s a security camera that’s looking at some server racks.  Great!  Somebody’s streaming video over the WAN.

We march upstairs to talk to the security guys.  They’re not streaming directly, but they pull up their app and notice that the camera in question has been sending 10 seconds of video every few minutes for the past 6 weeks.  The cameras only send video if they detect movement in the room, so what’s going on?  You can expect several movement events in a “computer room” as they call it, but people go home at night, and you shouldn’t see movies coming across for weeks at a time.  We pull up a few clips to see what the camera saw.

We see people going in.  We see people going out.  We see the rack doors open.  We see the usual stuff, but most of the videos are nothing.  Ten seconds of no change.  Finally, though, we see it.  Something stupid.  Someone had taped a piece of paper to one of the racks, and that guy was moving ever so slightly.  I couldn’t even really detect the paper itself moving; I could only see the text on the page move just a tad.  The camera saw it, though, and was doing its job and recording back to the home office.  A piece of paper took up 40% of the WAN circuit?  It’s always something new, eh?

I really hope they get our message to take down the paper.

Server NIC Aggregation to a Cisco Switch

Have you ever noticed that your new servers all have 2 NICs on the board?  At least all of them that I’ve seen in the last 3 years have.  A lot of server admins actually use them in a NIC teaming scenario where both NICs are used as one logical device, much the same as EtherChannel on a switch.  This provides some fault tolerance and availability in case of failure, which is a good idea in most cases.

There are a few different ways to configure teaming on the box (usually called bonding in Linux), and each has its own advantages and disadvantages.  The network dude(tte) may have to do some things on the switch side for some of them to work, though.  If you want to run in link aggregation mode (mode 4, also known as 802.3ad), for example, the switch ports need to be in the same channel group to work appropriately.
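
For the server side, here's a rough sketch of a mode 4 bond, assuming a Linux box with iproute2 and the bonding driver available.  The bond_commands name, bond0, and the miimon value of 100 are my own picks for illustration.  The function only prints the commands so you can look them over before running anything as root.

```shell
# bond_commands: print (not run) the iproute2 commands that would build an
# 802.3ad (mode 4) bond from the given NICs.
# Hypothetical helper. Usage: bond_commands bond0 eth0 eth1
bond_commands() {
    bond=$1
    shift
    # miimon 100 (link check every 100 ms) is an assumed value; tune to taste.
    echo "ip link add $bond type bond mode 802.3ad miimon 100"
    for nic in "$@"; do
        # Take the NIC down first; the bonding driver may refuse to enslave
        # a NIC that is still up.
        echo "ip link set $nic down"
        echo "ip link set $nic master $bond"
    done
    echo "ip link set $bond up"
}
```

Running bond_commands bond0 eth0 eth1 prints six commands.  Addressing and making the setup survive a reboot are left to your distro's network configuration.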

Let’s look at mode 4 a little closer to see what we need to do.  The scenario is that you have eth0 plugged into F0/15 of a 2950 and eth1 is in F0/16.  You’ve seen the configuration for channelling between switches before, so you know the basics.  Put the ports in the same channel-group and configure the proper Port-channel interface to do the work.  In this case, we’re just configuring the ports to house a host instead of being trunks.

! mode active runs LACP, which is what bonding mode 4 (802.3ad) negotiates
int F0/15
 channel-group 1 mode active

int F0/16
 channel-group 1 mode active

int Port-channel 1
 speed 100
 duplex full
 switchport
 switchport mode access
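
Once both ends are configured, the Linux side can tell you whether the bond actually came up.  This sketch assumes the bond is called bond0 and that the bonding driver exposes its state in /proc/net/bonding/bond0 with the usual "Bonding Mode", "Slave Interface", and "MII Status" lines; bond_status is a hypothetical name.

```shell
# bond_status: read /proc/net/bonding/<bond> on stdin and print the bonding
# mode followed by the link state of each slave NIC.
# Hypothetical helper. Usage: bond_status < /proc/net/bonding/bond0
bond_status() {
    awk -F': ' '
        $1 == "Bonding Mode"    { print "mode: " $2 }
        $1 == "Slave Interface" { slave = $2 }
        # The first "MII Status" line belongs to the bond itself, so only
        # report the ones that follow a "Slave Interface" line.
        $1 == "MII Status" && slave != "" { print slave ": " $2; slave = "" }
    '
}
```

If the mode line doesn't mention 802.3ad, or a slave shows as down, go back and check the channel-group on the switch ports.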

I detect at least one problem with our setup, though.  Both NICs are plugged into the same switch; what happens when the switch goes down?  The server goes away.  Logic should tell you, then, to put the NICs in different switches to fix that, but you can’t do EtherChannel across two different switches.  The ports have to be in the same device for the aggregation to work.  What’s the fix?

You can look at getting a nice chassis switch and putting each NIC in a different module.  Modern IOS versions allow etherchanneling across modules, so, if one module fails, you still have the other.  That would do it, but I’m sure you don’t have the money for a 4500 in the budget, right?

Another solution is to use a couple of 3750s which, when connected using the StackWise cable, are one logical device.  That gives you two separate switches that you can configure with the same channel group.  An upgrade to this solution is to use a pair of 6500s with VSS-capable supervisors running VSS 1440 so that the two chassis act as one logical switch.  A stack of 6500s!  I’m sure that’s not expensive at all, though.

Send any white shoes questions my way.