DHCP ACK Error on Avaya Phones
We're an Avaya voice shop (for now if I have my way) and have Avaya systems of various sizes and shapes all around the Enterprise. I was at one of our remote locations a few weeks back and helped the guys there replace a non-PoE switch so they could get the old power injector panel out of their rack. When we moved stuff around, the phones didn't come back and had the dreaded DHCP Ack Error.
In the environment, we have the usual data and voice VLANs on every switch port, and PCs are connected through the built-in switch in Avaya 4610SW IP phones. The DHCP server is a Windows something-something-something server (maybe 2008) serving both VLANs. When the phones boot, they come up on the data VLAN, get DHCP option 176 (remember it's Avaya and not Cisco), reboot onto the voice VLAN, and wait for an address; after a few seconds, they get the DHCP Ack Error displayed on the phone. Thanks for that very descriptive error, Avaya. The data VLAN was working fine, and, when I put my switch port into the voice VLAN natively, I got an address on my laptop. It's just the phones.
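For anyone who hasn't looked inside option 176, here is a minimal Python sketch of how such a string could be picked apart. It assumes the option carries a comma-separated list of NAME=value settings, with L2QVLAN naming the voice VLAN, which is how I understand the 4600-series phones to read it. The addresses and VLAN number in the example string are made up, not taken from this site.

```python
def parse_option_176(value: str) -> dict[str, str]:
    """Split an option 176 style string into a name -> value mapping.

    Assumes a comma-separated list of NAME=value pairs.
    """
    settings = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma
        name, sep, setting = item.partition("=")
        if not sep:
            raise ValueError(f"expected NAME=value, got {item!r}")
        settings[name.strip().upper()] = setting.strip()
    return settings


# Hypothetical string as it might be set on the data VLAN scope.
example = "MCIPADD=10.0.20.5,MCPORT=1719,TFTPSRVR=10.0.20.6,L2Q=1,L2QVLAN=20"

settings = parse_option_176(example)
print("Voice VLAN the phone would move to:", settings.get("L2QVLAN"))
```

If the L2QVLAN value here doesn't match the voice VLAN on the switch port, the phone ends up tagging into the wrong VLAN, so it's worth checking before chasing anything else.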
A quick search showed a solution that I've seen a thousand times but keep forgetting (another in a long list). If the DHCP server is in the data VLAN and is also tagged in the voice VLAN, the phones won't get an address. Huh? We didn't change the DHCP server. Well, it turns out that the server was in the original non-PoE switch, and we had moved it to a port configured for a workstation (a fact I left out in the original edit of this post). As soon as I took the switchport voice vlan X off of the server port, the phones started getting addresses, and we could finally hear the sweet sound of dial tone. Problem solved.
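Since the culprit was a single stray line on one port, a quick way to catch it next time is to scan the switch config for every port that carries a voice VLAN and check whether the server's port is among them. The sketch below is only that, a sketch: the interface names, descriptions, and VLAN numbers are hypothetical, and it assumes an IOS-style config where interface sub-commands are indented.

```python
def ports_with_voice_vlan(config_text: str) -> dict[str, str]:
    """Map interface name -> voice VLAN for every port that has one."""
    result = {}
    current = None
    for line in config_text.splitlines():
        stripped = line.strip()
        if line.startswith("interface "):
            current = line.split(None, 1)[1].strip()
        elif line[:1] not in (" ", "\t"):
            # Any other unindented line ("!", blank, global command)
            # ends the current interface block.
            current = None
        elif current and stripped.startswith("switchport voice vlan "):
            result[current] = stripped.split()[-1]
    return result


# Hypothetical config excerpt; not from the switch in this post.
SAMPLE_CONFIG = """
interface GigabitEthernet0/1
 description DHCP server
 switchport access vlan 10
 switchport mode access
 switchport voice vlan 20
!
interface GigabitEthernet0/2
 description phone and PC
 switchport access vlan 10
 switchport voice vlan 20
!
interface GigabitEthernet0/3
 description printer
 switchport access vlan 10
"""

SERVER_PORT = "GigabitEthernet0/1"  # assumption: where the server is plugged in

voice_ports = ports_with_voice_vlan(SAMPLE_CONFIG)
if SERVER_PORT in voice_ports:
    print(f"{SERVER_PORT} has voice VLAN {voice_ports[SERVER_PORT]}; remove it")
```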
According to this Avaya users group post, this happens because the DHCP server receives a DHCP request in a tagged packet; the server doesn't like it and NAKs it. The fix works, but I'm not really satisfied with the answer of "it doesn't like it". I may take a day and test this myself in my home lab.
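If I do get to the lab, one way to look past "it doesn't like it" would be to capture the DHCP replies and group them by transaction ID to see exactly what the phone is being told. Here's a rough Python sketch of that idea. It works on raw DHCP payloads (the bytes after the UDP header) and relies only on the standard BOOTP layout: transaction ID at bytes 4 to 7, magic cookie at byte 236, and option 53 for the message type. The `build_reply` helper just fabricates test payloads and is not something you'd use against a real capture.

```python
import struct
from collections import defaultdict

MAGIC_COOKIE = bytes([0x63, 0x82, 0x53, 0x63])
MESSAGE_TYPES = {
    1: "DISCOVER", 2: "OFFER", 3: "REQUEST", 4: "DECLINE",
    5: "ACK", 6: "NAK", 7: "RELEASE", 8: "INFORM",
}


def dhcp_summary(payload: bytes):
    """Return (transaction id, message type name or None) for one payload."""
    if len(payload) < 240 or payload[236:240] != MAGIC_COOKIE:
        raise ValueError("not a DHCP payload")
    (xid,) = struct.unpack("!I", payload[4:8])
    i = 240
    while i < len(payload):
        code = payload[i]
        if code == 255:      # end option
            break
        if code == 0:        # pad option has no length byte
            i += 1
            continue
        if i + 1 >= len(payload):
            break            # truncated option
        length = payload[i + 1]
        if code == 53 and length == 1 and i + 2 < len(payload):
            return xid, MESSAGE_TYPES.get(payload[i + 2], "UNKNOWN")
        i += 2 + length
    return xid, None


def replies_by_transaction(payloads):
    """Group server reply types (OFFER/ACK/NAK) by transaction id."""
    seen = defaultdict(set)
    for payload in payloads:
        xid, kind = dhcp_summary(payload)
        if kind in ("OFFER", "ACK", "NAK"):
            seen[xid].add(kind)
    return dict(seen)


def build_reply(xid: int, msg_type: int) -> bytes:
    """Fabricate a minimal reply payload for testing (hypothetical data)."""
    header = struct.pack("!BBBBI", 2, 1, 6, 0, xid) + bytes(228)
    return header + MAGIC_COOKIE + bytes([53, 1, msg_type, 255])


sample = [build_reply(0x1234, 5), build_reply(0x1234, 6), build_reply(0x9999, 5)]
for xid, kinds in replies_by_transaction(sample).items():
    if {"ACK", "NAK"} <= kinds:
        print(f"xid {xid:#x} received both an ACK and a NAK")
```

A transaction that shows both an ACK and a NAK would be a strong hint that the server is answering the same request twice.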
Your blog entry caught my attention since I'm very much into IP telephony, specifically Avaya/Nortel. You actually have your Windows DHCP server connected to the network over an 802.1q trunk? And you're tagging multiple VLANs up to the server? I'm curious why you don't just utilize a DHCP relay and connect the server with an access port configuration. The discussion forum thread you linked to seems to revolve around an issue with the DHCP server when the packets arrive at the server with an 802.1q header.
Thanks for the information!
Hey, Michael. The DHCP server is supposed to be on an access port on the data VLAN with DHCP relay in place for the voice VLAN. Since its port was configured with a voice VLAN, it was receiving 802.1q tagged traffic from the voice VLAN. Once that config was removed from the switch, everything was kosher again.
Hey Aaron, I think what is happening is that the server is receiving two DHCP requests: one from the relay agent on the untagged VLAN and one broadcast hitting the server directly over the Dot1q port. The server NAKs one and responds to the other. The phone receives both the NAK and the response to the broadcast, and this is causing the problem. I haven't had a chance to test it out, but I bet if you took a couple of traces you'd see multiple packets hitting the phone with the same DHCP session ID.
So, what does this have to do with removing the midspan PoE box? Did the DHCP server get migrated to the new switch too, and the 'switchport voice vlan X' configuration on the server port was unique to the new switch?
Windows is weird with 802.1Q tagging. In my experience, the Windows drivers configure the NIC to pass tagged traffic, but /strip/ the 802.1Q header. It's about the most unhelpful combination I can imagine. I think Daniel Wood hit the nail on the head here. If the NIC had dropped the tagged packet, or left the tag intact (for the OS to drop), then everything would probably have been fine.
Ever try assigning the VLAN ID with LLDP? The last place I worked tried and failed. I didn't get enough details to know what went wrong. LLDP seems like the way to go, because it saves the administrative overhead of maintaining (possibly thousands of) VLAN assignments in the DHCP server.
Chris: The DHCP server did indeed get moved in the switch migration, but I actually didn't mention it in the post for some reason. Silly me. It's included now.
I haven't tried LLDP but I'm sure we'll all have to deal with it sooner or later. I'll check it out; thanks for the info!
Just had this very problem today after changing DHCP servers, and a quick Google search led me right to your post. Small world (I’m sure you get the irony when you take a look at my name). Anyway, this was exactly the problem that I was having, so kudos, helpful article.
That’s funny, Daniel, because I wrote this article after we had this problem at another location with which we are both quite familiar. 🙂
I ran into the same problem here with Avaya phones on a Cisco network and Windows 2008 DHCP. Because we didn’t know exactly which switchport the DHCP server was on, we just “quick and dirty” added the voice VLAN to all access ports, assuming this would do the job. Apparently it didn’t, and it took me an hour of headaches before I ran into your blog.
Your solution worked like a charm! So much for “Worthless Words”. Thanks very much for blogging this.
Thank you so much for your post. We just had a similar problem and were pulling our hair out trying to figure out what was wrong. Once I found your article, we made a similar change and all was well!
Had a similar problem, “dhcp ack error”, but the phones worked when the address was fixed. Ran Wireshark on a PC and did ipconfig /renew. Instead of seeing a handful of DHCP packets, the screen filled with two DHCP servers fighting it out. Turned out a wireless router had its uplink cable plugged into a LAN port. Use “bootp” as the Wireshark filter to see DHCP traffic.
Thanks so much for this! I just performed the exact maintenance that you did, replacing an old non-PoE switch with an HP 5406. All phones gave me the ACK error. Once I removed the voice VLAN tag from the port the server was in, all the phones obtained IPs and came up. Thanks so much!
Thank you. Was pulling my hair out and found your article. Fixed the problem in seconds..
Thank you for this! I had the same error and I just untagged the voice VLAN on the DHCP server for the PCs and everything worked like a charm!
I had the same problem and was going around in circles. Your article was helpful; we need more on this.