
NetworkJutsu

Networking & Security Services | San Francisco Bay Area


Switching

DHCP Issues (CoPP Profile)

01/12/2016 By Andrew Roderos


There are several ways to troubleshoot DHCP issues. One could verify that the DHCP scope is healthy, verify that the DHCP relay is configured correctly, and so on. This post covers a less obvious cause that one might encounter, although how likely it is depends on the environment.

Symptom

Several IP phones spread throughout the enterprise that had been working stopped getting IP addresses, while other phones on the same VLAN were getting IP addresses just fine. Several buildings were affected, and replacing the phones did not help: the replacements did not get IP addresses either. When the phones were brought back to the tech’s office, they did get IP addresses. The topology there is not the same, though, so the only thing that test tells us is that the phones were not the issue.

Troubleshooting

A few steps had already been taken before the escalation, so one could either verify everything again or save some time and continue where the previous work left off. For example, the DHCP scope had already been checked to make sure it was correct and had IP addresses left to give out, the voice VLAN was set on the ports, the VLAN was in the switch’s VLAN database, the DHCP relay configuration was correct, etc. One interesting piece of information was that the DHCP server never saw a DHCP Discover or DHCP Request from the phones that were having issues. The majority of the time (at least in my experience), the issue is the relay configuration. This time, it was not, which makes sense since only a few phones were affected and not the whole voice VLAN.

So far, this tells us that something in the network is dropping packets. Before busting out Wireshark or Ethanalyzer, if available, to track where the packets are being dropped, consider one likely location: the gateway. In this scenario, the gateway is a pair of Nexus 7000 switches running HSRP. With Nexus switches, CoPP (Control Plane Policing) is one of the features that gets set during the initial configuration.

In this scenario, the CoPP profile is set to strict. This can be verified in several ways; one is the show run | i “copp profile” command. To check whether the CoPP policy is being violated, issue the command below.

N7K-enable# show policy-map interface control-plane class copp-system-p-class-normal-dhcp
Control Plane
  service-policy  input: copp-system-p-policy-strict
    class-map copp-system-p-class-normal-dhcp (match-any)
      match access-group name copp-system-p-acl-dhcp
      match redirect dhcp-snoop
      set cos 1
      police cir 680 kbps , bc 250 ms
      module 3 :
        conformed 3430912611 bytes; action: transmit
        violated 1331596081 bytes; action: drop
      module 4 :
        conformed 6375239885 bytes; action: transmit
        violated 2201866339 bytes; action: drop

As shown above, the CoPP policer for DHCP is being violated and packets are being dropped. In some instances the nodes will recover on their own, since they will send a DHCP Discover again, but there might be times when a custom CoPP policy is needed. To verify that this class is the right one for DHCP, issue the commands below.

N7K-enable# sh run copp all | sec normal-dhcp
class-map type control-plane match-any copp-system-p-class-normal-dhcp
  match access-group name copp-system-p-acl-dhcp
  match access-group name copp-system-p-acl-dhcp
  match redirect dhcp-snoop
N7K-enable# show ip access-lists copp-system-p-acl-dhcp
IP access list copp-system-p-acl-dhcp
        10 permit udp any eq bootpc any
        20 permit udp any neq bootps any eq bootps

Custom CoPP

To customize the CoPP policy to fit one’s environment, the recommendation is to double the rate and monitor for issues. If the problem persists, continue to raise the rate until the issue is gone. Make sure to monitor the CPU as well, because it might be impacted by the change: we’re allowing more packets to reach the control plane.

The first step in customizing the CoPP is to copy the profile.

N7K-enable# copp copy profile strict suffix custom-copp

Once the CoPP profile has been copied, the value(s) that need modification can be changed.

N7K-enable(config)# policy-map type control-plane copp-policy-strict-custom-copp
N7K-enable(config-pmap)# class copp-class-normal-dhcp-custom-copp
N7K-enable(config-pmap-c)# police cir 2048 kbps bc 1250 ms conform transmit violate drop

The configuration is well above the recommended rate, but to eliminate the possibility of the CoPP being the issue, it might be a good idea to increase it high enough so that no packets will be dropped.
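
To make "well above the recommended rate" concrete, here is a rough Python sketch comparing the default, doubled, and configured policer values. It assumes that a bc given in milliseconds means "that much time at the CIR" and that 1 kbps is 1000 bits per second; both are assumptions for illustration, so check the NX-OS documentation for your release. The function name burst_bytes is hypothetical.

```python
def burst_bytes(cir_kbps: int, bc_ms: int) -> int:
    """Approximate burst size in bytes, assuming bc in ms is time at CIR.

    kbit/s multiplied by ms gives bits; dividing by 8 gives bytes.
    """
    return cir_kbps * bc_ms // 8

default_cir, default_bc = 680, 250     # strict profile values from the output above
custom_cir, custom_bc = 2048, 1250     # values configured in this post

doubled_cir = default_cir * 2          # the "double and monitor" starting point

print("doubled CIR (kbps):", doubled_cir)
print("configured CIR vs default:", round(custom_cir / default_cir, 2), "x")
print("default burst (bytes):", burst_bytes(default_cir, default_bc))
print("custom burst (bytes):", burst_bytes(custom_cir, custom_bc))
```

Under those assumptions, the configured rate is about three times the default rather than double, and the burst allowance grows far more than the rate does.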

The last step is to assign the policy to the control plane and verify the result.

N7K-enable(config)# control-plane
N7K-enable(config-cp)# service-policy input copp-policy-strict-custom-copp
N7K-enable(config-cp)# exit
N7K-enable(config)# exit
N7K-enable# copy run start
N7K-enable# show copp status
Last Config Operation: service-policy input copp-policy-strict-custom-copp
Last Config Operation Timestamp: 06:04:03 UTC Jan 1 2015
Last Config Operation Status: Success
Policy-map attached to the control-plane: copp-policy-strict-custom-copp
N7K-enable# show policy-map interface control-plane class copp-class-normal-dhcp-custom-copp
Control Plane
  service-policy  input: copp-policy-strict-custom-copp
    class-map copp-class-normal-dhcp-custom-copp (match-any)
      match access-group name copp-acl-dhcp-custom-copp
      match redirect dhcp-snoop
      set cos 1
      police cir 2048 kbps , bc 1250 ms
      module 3 :
        conformed 980075838 bytes; action: transmit
        violated 0 bytes; action: drop
      module 4 :
        conformed 275544063 bytes; action: transmit
        violated 0 bytes; action: drop

Thoughts

This may be an uncommon cause of DHCP issues, but if a similar symptom is being experienced, it does not hurt to verify whether the CoPP policer is being hit. There are still a lot of things I have to learn about the Nexus platform since it is quite different from the Catalyst platform, and this is one of those differences.

Want to learn more about NX-OS?

NX-OS and Cisco Nexus Switching: Next-Generation Data Center Architectures

Disclosure

NetworkJutsu.com is a participant in the Amazon Services LLC Associates Program, an affiliate advertising program designed to provide a means for sites to earn advertising fees by advertising and linking to Amazon.com.

How to configure Quanta T3040-LY3

07/17/2015 By Andrew Roderos


I was looking at the Nexus 3064-T for one of the projects that I was involved in that required 10GBASE-T, but while I was waiting for a demo unit I was tasked to play with the Quanta T3040-LY3. I’ve worked with Quanta switches a little bit since some of our clients have them deployed, and sometimes I get phone calls asking for help getting them working. Technically, I didn’t have to, but I didn’t want to be the guy who doesn’t try to help.

The Quanta’s syntax is almost identical to IOS, which is not a surprise since there are vendors out there that copy the CLI commands. The only one I’ve encountered so far that is completely different from IOS is Junos OS. Having said that, it was quite easy to convert my template to the Quanta equivalent commands. However, I did have to read the docs when I had questions on how to do a specific thing on their OS.

Configuration

The configuration shown here is pretty basic, so if you’re looking for advanced stuff then you might want to move on to the next site. If you work with IOS, there’s really no need to explain it line by line since it’s pretty self-explanatory, for the most part. Here’s a sample configuration:

hostname networkjutsu-switch
!
no username guest
enable password passwordhere
username admin password passwordhere
!
ip domain-name networkjutsu.com
ip name-server 192.168.200.100
ip name-server 192.168.201.100
!
port-channel load-balance src-dst-ip all
vtp
vtp mode transparent
lldp med all
no cdp run all
!
vlan database
!Do you remember vlan database in Cisco IOS?
 vlan 99
 vlan name 99 MGMT_192.168.1.0/24
 vlan 10
 vlan name 10 DATA
 exit
!
interface vlan 1
 no ip address
 shutdown
interface vlan99
 ip address 192.168.1.100 255.255.255.0
 no shutdown
 no ip redirects
 no ip unreachables
!
serviceport protocol none
!This is to disable DHCP on the management port of the switch.
!To statically assign an IP address to the management port then
!use serviceport ip ipaddresshere subnetmaskhere gatewayhere command.
!
ip dhcp snooping vlan 1-4093
!
no ip dhcp snooping information option
no ip dhcp snooping verify mac-address
ip dhcp snooping
!
errdisable recovery cause bpdu
spanning-tree edgeport bpduguard
!
sflow rate 2000
sflow receiver 1 ip 192.168.100.200
sflow source-interface vlan 99
!
interface range 0/1 - 0/40
 no shutdown
 switchport access vlan 10
 switchport mode access
 storm-control broadcast
 spanning-tree edgeport
!
interface range 0/41 - 46
 no shutdown
 switchport tagging 1,10,99
 switchport allowed vlan add 10,99
 ip dhcp snooping trust
 exit
!
interface range 0/47 - 48
 no shutdown
 channel-group 1 mode active
interface port-channel 1
 no shutdown
 switchport tagging 1,10,99
 switchport allowed vlan add 10,99
 ip dhcp snooping trust
!
ip default-gateway 192.168.1.1
no ip http server
no ip http secure-server
!
logging traps debug
logging host 192.168.202.100 ipv4
!
line console
line vty 
 no sessions
line ssh
!
sntp clock timezone CA 8 0 after-utc
sntp server 172.16.100.50 ipv4
sntp server 172.16.100.60 ipv4
sntp source-interface vlan 99
!
aaa authentication login default radius enable
aaa authentication enable default enable
!
radius source-interface vlan 17
radius server timeout 2
radius server retransmit 1
radius server host auth 192.168.210.100 name radius01 port 1812
radius server host acct 192.168.210.100 name radius01 port 1813
radius server key auth 172.16.100.100 keyusedbytheradiusserverhere
!RADIUS key cannot be more than 16 characters.
radius server host auth 172.16.100.101 name radius02 port 1812
radius server host acct 172.16.100.101 name radius02 port 1813
radius server key auth 172.16.100.101 keyusedbytheradiusserverhere
!
exit
copy run start

Thoughts

As you can see, the configuration is almost identical to IOS. There are some differences, but they are minor. The OS does allow you to configure the switch via the web, but that was quite painful to use. It’s still better to just use the CLI than the web GUI.

The switch is pretty inexpensive compared to its competitors. However, there are a few things that I didn’t quite like about it. First, the switch didn’t have QSFP ports. If I am not mistaken, at the time I played with this switch Quanta wasn’t selling a 10GBASE-T switch with QSFP uplinks like the Cisco Nexus 3064-T, 3172TQ, or Juniper QFX5100. Second, the RADIUS key was limited to 16 characters. Not such a big deal, but it is quite a hassle to involve the RADIUS person/department to generate a shorter key for the switch. Third, I was using a demo unit whose trial OS license had expired, and the OS didn’t warn me that an expired license makes the ports unusable. It was quite annoying to waste time figuring out why the ports kept going into a disabled state when I had literally just brought them up administratively. Last, I was not able to configure an SNMP ACL. I looked at the docs and I didn’t find a way to configure it. I believe this should come standard with the OS.


Implementing Wired 802.1X

06/29/2015 By Andrew Roderos


I first learned about 802.1X when I was studying for one of the CCNP exams, the BCMSN exam (SWITCH equivalent), at Ohlone College. At the time, I assumed that the short material covered in the book was all of it. Of course, that was a bad assumption on my part. It’s probably a normal assumption for someone who at the time had just finished Cisco Network Academy Program CCNA 1 to 4 and was a newly minted CCNA with no professional experience.

What is 802.1X?

Essentially, 802.1X is a security feature that provides a mechanism to authenticate devices before they can access network resources. While it’s a good idea to have this security feature implemented, I’ve worked for companies that didn’t have this feature or anything similar implemented, or only had it on their roadmap. It’s a shame that it wasn’t on their roadmap a long time ago since it was ratified in 2001. Then again, implementing any technology has its challenges.

How does it work?

[Figure: 802.1X wired protocols. Image from Wikipedia.]

While there are other sources that will explain this in detail, this post includes a very short description of how it works. Basically, when a device connects to the wired network, the authenticator (switch) will send an EAP message to the supplicant (computer). If the computer has a supplicant, it will send an EAP response to the authenticator. The authenticator will then send a RADIUS message to the authentication server (RADIUS server). The authentication server will then challenge the supplicant to verify its identity. Once verified, the device will be able to connect to the organization’s network resources.

Environment

Every organization has its own unique implementation of technologies, so gather what you can and go from there. For example, in this scenario the requirement was to have two sets of RADIUS servers: one for switch-based authentication and the other for port-based authentication. This seems to be an uncommon setup, so it required some research to split the two sets of RADIUS servers. My initial assumption was that it wasn’t possible. That assumption is only correct for older code; with IOS 15.x the feature is supported.

This is a multivendor organization, so LLDP is used instead of CDP. CDP is disabled globally by the switch template configuration.

A lot of users are using Apple notebooks and/or desktops and most of these users are running VMware Fusion to run Windows and/or Linux.

IP phones are ubiquitous, so they require a great deal of attention. If my memory serves me right, the 802.1X topic in BCMSN didn’t cover how to implement it with IP phones, so Cisco’s documentation and Google were my friends during my research.

A lot of devices do not have a supplicant, and there are instances where PXE boot is needed.

In addition, some locations need the WoL (Wake on LAN) feature, so that needs attention as well.

Configuration

As mentioned earlier, the requirement is to have two separate sets of RADIUS servers, one for switch-based and one for port-based authentication. That said, let’s take a look at how to do this. But first, let me show you how it was done prior to IOS 15.x code. This format still works in 15.0(2), but you’ll receive a warning saying that it will soon be deprecated.

Old format

radius-server host 192.168.1.1 auth-port 1812 acct-port 1813
radius-server host 192.168.1.2 auth-port 1812 acct-port 1813
radius-server retransmit 1
radius-server timeout 2
radius-server key 7 hashkeyhere

Since the requirement is to split the RADIUS servers, we need to use the new format for defining RADIUS servers, because the AAA groups we create later refer to the servers by name.

New format

radius server switch-auth1
 address ipv4 192.168.1.1 auth-port 1812 acct-port 1813
 timeout 2
 retransmit 1
 key 7 hashkeyhere
radius server switch-auth2
 address ipv4 192.168.1.2 auth-port 1812 acct-port 1813
 timeout 2
 retransmit 1
 key 7 hashkeyhere
radius server dot1x-auth1
 address ipv4 192.168.1.3 auth-port 1812 acct-port 1813
 timeout 2
 retransmit 1
 key 7 hashkeyhere
radius server dot1x-auth2
 address ipv4 192.168.1.4 auth-port 1812 acct-port 1813
 timeout 2
 retransmit 1
 key 7 hashkeyhere

Enable AAA

Once AAA is enabled, the authentication method for 802.1X needs to be defined. I included the switch-based authentication alongside the port-based authentication for completeness’ sake. RADIUS accounting is turned on as well since it is listed as a best practice in Cisco’s deployment guide.

aaa new-model
aaa group server radius switch-auth
 server name switch-auth1
 server name switch-auth2
aaa group server radius dot1x-auth
 server name dot1x-auth1
 server name dot1x-auth2
aaa authentication login default group switch-auth enable
aaa authentication dot1x default group dot1x-auth
aaa accounting dot1x default start-stop group dot1x-auth
aaa authorization network default group dot1x-auth

Enable 802.1X

Issue the command below.

dot1x system-auth-control

Configure switch ports

The next step is to configure each switch port that will use 802.1X. On the IOS version that I was using, the authentication port-control auto command automatically added dot1x pae authenticator to the running configuration, so don’t be alarmed if you see it there. This is to ensure that dot1x authentication still works on legacy configurations without manual intervention. If you do not see it, please make sure to add the command yourself.

interface range g1/0/1 - 48
 ! The ports must at least have switchport mode access configured or they won't take the commands.
 authentication port-control auto
 dot1x pae authenticator

In IOS 12.x, this was a different command: dot1x port-control auto.

Technically, the commands above are all we need to configure for the 802.1X to work. However, the environment in this scenario requires more things from us that we still need to address.

VoIP phones

Let’s address the IP phones first since they’re ubiquitous within the enterprise environment. By default, the interfaces are set to single-host mode. This means only one MAC address is allowed in the data VLAN. This mode technically allows another MAC address, but only on the voice VLAN and only if CDP is supported. Since CDP is disabled on all of the switches deployed in this scenario, it needs to be enabled. I included the single-host mode command below for clarity; it won’t show up in the running configuration because it is the default.

cdp run
interface range g1/0/1 - 48
 authentication host-mode single-host

While this configuration works, there are a few things that we need to keep in mind. Single-host mode means only a single MAC address can be authenticated on a switch port. If a different MAC address is detected on a port after an endpoint has authenticated, a security violation is triggered on the port. This puts the port in the errdisabled state, which requires manual intervention unless errdisable recovery is configured.

Since the computers are daisy chained off the back of the Cisco phones, there are technically two MAC addresses that will be seen on the port. As mentioned earlier, single-host mode ignores the MAC address seen in the voice VLAN, so this should work. It does work. However, once you shut the port down and enable it again, or the phone or switch reboots, the switch port will see two MAC addresses on the data VLAN.

Now, you’re probably wondering why the switch would see two MAC addresses in the data VLAN when the IP phone should only show up in the voice VLAN, especially when the boot process is described in books like this. But I’ve seen this happen in all three organizations I’ve worked for: the phone’s MAC address shows up in both the data and voice VLANs, as shown below. If you do a quick search, you’ll see that more people are seeing the same thing, so it appears that this is the default behavior.

switch#sh mac add int g1/0/1
          Mac Address Table
-------------------------------------------
Vlan    Mac Address       Type        Ports
----    -----------       --------    -----
  10    0004.f2f0.4d98    DYNAMIC     Gi1/0/1
  20    0004.f2f0.4d98    DYNAMIC     Gi1/0/1
Total Mac Addresses for this criterion: 2

As you can imagine, this could turn into an operational nightmare, especially when you have facilities people going in and out of the closet to do some work and they occasionally bump into the power cord of the switch by accident. The solution is to change the host mode to something that will not cause a security violation. One option is to use multi-domain authentication (MDA), which is shown below.

interface range g1/0/1 - 48
 authentication host-mode multi-domain

MDA vs Multi-Auth

Multi-domain authentication (MDA) allows one MAC address on each of the data and voice VLANs. It is similar to single-host mode, but this mode requires the device in the voice VLAN to authenticate. In initial testing it worked as expected. I didn’t see the same behavior where the phone’s MAC address shows up on both VLANs when I bounced the port.

MDA does not address the fact that users in our scenario run VMware Fusion on their computers. When a user configures a VM with bridged networking, the switch sees two MAC addresses on the data VLAN, which results in a security violation. This needs to be addressed, so we need another mode. Fortunately, there is one, and it is called multiple authentication.

interface range g1/0/1 - 48
 authentication host-mode multi-auth

The difference between MDA and multiple authentication is that the latter allows multiple MAC addresses in the data VLAN. However, every device must be authenticated to access network resources.
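
To compare the three host modes side by side, here is a simplified Python sketch of how many MAC addresses each mode tolerates per VLAN. The limits table paraphrases the descriptions in this post; the one-device voice limit for multi-auth, the PC and VM MAC addresses, and the function name causes_violation are assumptions made for illustration. It models only the MAC counts, not the authentication itself.

```python
# Assumed per-port limits, paraphrasing the descriptions in this post.
# None means "no fixed limit modelled here".
HOST_MODE_LIMITS = {
    "single-host":  {"data": 1, "voice": 1},     # voice device learned via CDP
    "multi-domain": {"data": 1, "voice": 1},     # voice device must authenticate
    "multi-auth":   {"data": None, "voice": 1},  # every data device must authenticate
}

def causes_violation(mode, data_macs, voice_macs):
    """Return True if the MACs seen on a port exceed what the host mode allows."""
    limits = HOST_MODE_LIMITS[mode]
    for domain, seen in (("data", data_macs), ("voice", voice_macs)):
        limit = limits[domain]
        if limit is not None and len(set(seen)) > limit:
            return True
    return False

phone = "0004.f2f0.4d98"   # phone MAC from the table above
pc = "0011.2233.4455"      # hypothetical computer behind the phone
vm = "0011.2233.4466"      # hypothetical bridged VM on that computer

# Phone MAC leaking into the data VLAN next to the PC, as seen after a port bounce.
print(causes_violation("single-host", [pc, phone], [phone]))
# A bridged VM adds a second data MAC.
print(causes_violation("multi-domain", [pc, vm], [phone]))
print(causes_violation("multi-auth", [pc, vm], [phone]))
```

The first two calls report a violation and the last one does not, which matches the reasoning for moving from single-host to MDA and then to multi-auth.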

As mentioned, there is a way to automatically recover from a security violation; by default the recovery interval is five minutes. Before I show you the command for it, consider that the whole port will be in the errdisabled state once a security violation occurs. That means the phones will be out of commission too. This is going to be frustrating for users, so we need a solution that only errdisables the VLAN where the security violation occurred. Fortunately, the switch has the voice aware 802.1X security feature, which is shown below together with errdisable recovery.

errdisable detect cause security-violation shutdown vlan
errdisable recovery cause security-violation

MAC Authentication Bypass

Devices that do not support 802.1X still need to access network resources, so we need a way to let them in without disabling port-based authentication on the ports where they are connected. Cisco supports fallback mechanisms for when a device fails to authenticate using 802.1X. A great option for devices that do not support 802.1X is MAC Authentication Bypass (MAB).

With MAB, the device’s MAC address is entered into the RADIUS server, and when the device fails to authenticate using 802.1X the switch will fall back to MAB. The switch will then forward a message, with the MAC address of the device, to the RADIUS server. The RADIUS server will check its database to see if the MAC address is in its list. If it is, the RADIUS server will signal the switch to allow access to the network. To enable MAB, issue the command below.

interface range g1/0/1 - 48
 mab

Another version of this command is shown below. If this command is used, the IOS will change it to mab in the running and startup config.

interface range g1/0/1 - 48
 dot1x mac-auth-bypass

While this fallback mechanism works, Cisco Catalyst switches have default values that delay the transition of a non-802.1X-capable device from unauthorized to authenticated by 90 seconds. This might cause issues with DHCP or PXE clients, so it is recommended to tweak the defaults so that devices without 802.1X support get access to network resources faster.

The 90 seconds comes from the dot1x max-reauth-req and dot1x timeout tx-period values. The default for the former is two and for the latter is 30 seconds. Multiply both values and the result is 60 seconds. So where are the other 30 of the 90 seconds? That is the initial request for the device to authenticate: the switch waits one tx-period for a response before it starts retrying, and it keeps retrying up to the configured max-reauth-req value while there’s no response from the device. It is recommended to test what’s best for your network since there are really no recommended values. For our scenario, let’s configure them with a value of one and 10 seconds.

interface range g1/0/1 - 48
 dot1x max-reauth-req 1
 dot1x timeout tx-period 10
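
The timer arithmetic described above can be sketched in a few lines of Python. The function name mab_fallback_delay is hypothetical; the formula is simply the initial request plus the retries, each waiting one tx-period.

```python
def mab_fallback_delay(max_reauth_req: int, tx_period: int) -> int:
    """Seconds before the switch gives up on 802.1X and falls back to MAB."""
    # One initial EAP request plus max_reauth_req retries, each waiting tx_period.
    return (max_reauth_req + 1) * tx_period

default_delay = mab_fallback_delay(max_reauth_req=2, tx_period=30)
tuned_delay = mab_fallback_delay(max_reauth_req=1, tx_period=10)

print("default delay (s):", default_delay)
print("tuned delay (s):", tuned_delay)
```

With the values configured above, a device without a supplicant waits 20 seconds instead of 90 before MAB kicks in.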

The last thing that we need to address is the WoL feature that some people use in the environment. By default, traffic through an unauthorized port is blocked in both directions, so the magic packet (the WoL packet sent by the server) never gets to the sleeping computer.

To support the WoL feature in 802.1X environment, we’ll need to configure the switch to allow outbound traffic to the unauthorized port but still control the incoming traffic. The command to do this is shown below.

interface range g1/0/1 - 48
 authentication control-direction in

Other considerations

Not every scenario is covered here, so I recommend reading Cisco’s configuration and deployment guides for 802.1X. For example, what if all RADIUS servers that handle the port-based authentication are unreachable? That would mean unauthorized ports could not move to the authenticated state. Configuring a critical VLAN for both data and voice may be necessary for this environment.

For partners’ devices, how would you like to handle their access to network resources? Would you allow them by implementing a Guest VLAN feature?

If you opt for using EAP-TLS, how would you manage the deployment of the certificates to all devices including mobile? This might frustrate users and may also overwhelm the desktop support staff if not handled properly.

What if your organization uses non-Cisco phones? What will happen to a device behind the phone once it gets authenticated and is then unplugged from the phone’s port? Do the phones support EAPoL Logoff/Proxy EAPoL Logoff? This is not an issue with Cisco phones with CDP since they support CDP Enhancement for Second Port Disconnect. With this feature, when the user disconnects from the phone’s port, the phone will signal the Catalyst switch to move the data VLAN from authenticated to unauthorized state.

How do you want to authenticate the phones? Do you want to use EAP-MD5, MIC (Manufacturer Installed Certificate), or LSC (Locally Significant Certificate)?

If you do allow MAB fallback mechanism, how do you combat the possibility of unauthorized users spoofing MAC addresses that are in your RADIUS’s MAC address database? If the organization is big enough, how do you manage adding MAC addresses to the database? How do you maintain the database properly without leaving temporary entries?

Thoughts

Deploying 802.1X definitely has its challenges. This could be the reason why some organizations choose not to have any type of port-based authentication: it may affect the availability of network resources. When it comes to deployment, I believe proper planning and testing are needed to make it smooth. A few things that can help are the phased deployment modes: monitor mode, low impact mode, and closed mode, which are covered in this Cisco Live! presentation. Some might just opt for lab testing and then move to a pilot phase, which is doable in my opinion.


References

CCNP SWITCH
Wired 802.1X Deployment Guide
Catalyst 2960X Configuration Guide


NetFlow-Lite on Catalyst 2960-X

04/19/2015 By Andrew Roderos


Several months ago, sFlow became instrumental in figuring out an issue with HP switches that we inherited. Just to give you an idea of what the issue was, the HP switches would sporadically drop off the network while user data traffic was still flowing. Good thing it was only the management traffic that was dropping and not user traffic. With the help of an sFlow collector, I was able to correlate the timestamps of when several HP switches went down, and I found out that MLD (Multicast Listener Discovery) was the culprit. I searched the web for answers but had no luck. I upgraded the code on the switches but still had no luck. Finally, I decided to contact HP Tech Support since they offer a lifetime warranty on hardware and software. When the support engineer asked for the config, he saw that igmp querier was turned on, and after we turned it off the problem never came back. Since we’ve been replacing the HP switches with Cisco Catalyst switches, I wanted to replicate some level of the sFlow functionality. Luckily, the Catalyst 2960-X supports NetFlow-Lite.

What is NetFlow-Lite?

Cisco defines it as shown below. If you want to read more about NetFlow-Lite, please read this. To me, it’s a way for a network professional to get some visibility into what’s on the wire and gather statistics.

NetFlow-Lite collects packets randomly, classifies them into flows, and measures flow statistics as they pass through the switch. It is a true flow-based traffic-monitoring mechanism that conserves valuable forwarding bandwidth when exporting flow-based data for analysis and reporting.

Prior to sFlow and NetFlow-Lite, I had some exposure to NetFlow, but it was a very limited implementation. That NetFlow implementation was good enough for what we used it for. Besides, the traffic generated by devices and computers on the network was very specific to the business applications, and the computers were locked down tight, so more was not needed. The places where we needed application visibility had protocol analyzers deployed, so there was not a whole lot of push to deploy NetFlow.

NetFlow-Lite is not available on all Catalyst switches. I believe it was first supported on the Catalyst 4948 platform and is now supported on newer Catalyst switches. NetFlow-Lite requires an FPGA (Field-Programmable Gate Array) that contains the logic to implement the NetFlow engine. Without it, there is no NetFlow-Lite support, hence no support on older platforms.

NetFlow-Lite Configuration

If you want to know what the commands do, please visit the configuration guide here.

flow record netflow
 match datalink mac source address input
 match datalink mac destination address input
 match ipv4 protocol
 match ipv4 source address
 match ipv4 destination address
 match ipv6 protocol
 match ipv6 source address
 match ipv6 destination address
 match transport source-port
 match transport destination-port
 collect transport tcp flags
 collect interface input
 collect flow sampler
 collect counter bytes long
 collect counter packets long
 collect timestamp sys-uptime first
 collect timestamp sys-uptime last
!
flow exporter collector
 description To NetFlow Collector
 destination 192.168.1.100
 source Vlan100
 transport udp 9985
 template data timeout 60
 option interface-table
!
flow monitor netflow
 record netflow
 exporter collector
 cache timeout active 30
!
sampler netflow
 mode random 1 out-of 32
!
!
interface range Gi1/0/1 - 48
 ip flow monitor netflow sampler netflow input
!
interface range Te1/0/1, TeX/0/1
 ip flow monitor netflow sampler netflow input
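
Once a configuration like the one above is applied, a few show commands can confirm that flows are being sampled and exported. This is a rough sketch that reuses the monitor, exporter, and sampler names from the configuration above; the exact output and available options may vary by platform and IOS release.

```
! Confirm which monitor and sampler are attached to each interface
show flow interface
! Look at the flows currently held in the monitor's cache
show flow monitor netflow cache
! Check that export packets are being sent toward the collector
show flow exporter collector statistics
! Verify the sampler mode and rate (1 out-of 32 in this example)
show sampler netflow
```

If the exporter statistics show packets being sent but the collector shows nothing, the UDP port and source interface in the exporter are a reasonable first thing to check.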

NetFlow/sFlow Collector

Many vendors sell flow collector software, including InMon (the creator of sFlow), Plixer, ntop, and SolarWinds. Make sure the product supports NetFlow v9 or IPFIX, since those are the formats NetFlow-Lite can export. Most of these vendors offer trial software so you can evaluate the product. I am sure they’ll be happy to do a webinar to introduce you to the product before you start playing with the software.

Thoughts

NetFlow-Lite gave us some visibility, although I noticed that sFlow provided more information. Still, it is better than not having any visibility at all. If your switches are capable of doing NetFlow-Lite, I suggest you run a trial to see if it’s going to be helpful for your environment. For us, it’s definitely helpful to have the visibility, so it is still being used. Another feature that I find very convenient is that it can tell you the switch and port number of the device you’re looking for. While it’s not a big deal to log in to routers and switches to trace a device, it’s rather inconvenient, especially if you implement two-factor authentication on your switches.

Disclosure

NetworkJutsu.com is a participant in the Amazon Services LLC Associates Program, an affiliate advertising program designed to provide a means for sites to earn advertising fees by advertising and linking to Amazon.com.

Same Data and Voice VLAN

12/24/2014 By Andrew Roderos

Yes, it’s not best practice to put both data and voice traffic in the same VLAN and subnet, but I recently encountered it in production and it caused us some head scratching. While it was probably best to redesign the whole thing since it doesn’t follow best practices, that was not going to fly in this case.

Configuration

The stack’s configuration was copied from an existing Catalyst 3750-X production switch. For the most part, it’s a basic configuration that you will encounter on a lot of production switches. Some of the configured features are AAA, BPDU Guard, PortFast, Storm Control, and access and voice VLAN (same VLAN number). The 3750-X stack’s configuration had not been changed for more than two years. There were no issues or complaints from the clients, so it seemed safe to copy the configuration to the new Catalyst 2960-X stack. The new stack serves the same users as the 3750-X stack; they are just moving to a new location.

Issue

The desktop support guy started hooking up Avaya phones and desktops to the switch and noticed that all desktops and phones connected to switch 1 (of two in the stack) were not communicating with the network. The desktops and phones weren’t getting IP addresses, but once he connected them to switch 2 everything started to work. The interfaces on both switches were configured the same way, but for whatever reason it wasn’t working. I rebooted switch 1, and all the desktops and phones connected to it started communicating with the network.

Several hours later, the desktop support guy said that everything connected to switch 2 had stopped working. I decided to reboot the whole stack at this point. Before I rebooted the stack, I noticed that one of the switches was running EX4 and the other was running EX5. Normally, there would be a version mismatch error when members run two different IOS versions, and the mismatched switch shouldn’t join the stack, but in this particular instance both switches were in the stack. I decided to upgrade the EX4 switch to EX5 since that is the latest code anyway and we’re making it our standard IOS.
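
For reference, a version difference between stack members like the one described above can usually be spotted with two common show commands. This is only a sketch of where to look, not the exact steps taken at the time, and the output layout depends on the platform and release.

```
! List stack members with their role and current state;
! a member running different code is normally flagged here
show switch
! The per-member table near the end of the output lists
! the software version and image on each switch
show version
```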

After upgrading and rebooting the stack, I noticed that computers and phones on switch 2 were still not communicating. At that point, our client was panicking because they were not compliant with their contract with their own client. That said, we decided to replace the whole stack with new switches. The switch replacement worked for a while, but the issue came back, which we had half expected; we were never fully convinced that it would resolve the issue. Since we had been using the Catalyst 2960-X with EX5 code for several months, we knew it worked just fine with our standard configuration. The only difference between this stack and the others was the shared voice and data VLAN. While the switch is solid, we’ve hit bugs starting with EX1 all the way to EX4. That said, I began to suspect that we might be hitting a bug in EX5, so I decided to open a TAC case while the stack was in a working state.

Troubleshooting

A Cisco TAC engineer contacted me via e-mail to troubleshoot the issue, since it was working at the time I opened the case. As usual, the first thing I had to give them was the show tech output. I still do not know why I do not include that by default when opening a case. Anyway, as mentioned, the issue came back, and I engaged the TAC engineer as soon as I was informed it had stopped working.

The TAC engineer started checking a few things to try to diagnose the issue. The first thing he mentioned was that the voice and data VLAN were the same, which is not a recommended practice. I agreed with him, but I also reminded him that we have this deployed in two locations and that it was working fine on Catalyst 3750-X switches. He even asked me to connect to those switches so he could see it with his own eyes.

After issuing a few show commands, we finally found a clue as to what the problem was. There was no spanning tree instance on the interfaces where the phones and desktops were connected. The switch acted as if nothing was connected to those ports even though it saw link on them. He decided to take out the switchport voice vlan vlan-id command, and the desktop started pinging and the spanning tree instance on the port showed forwarding.
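
The post does not record exactly which show commands were used, so the following is only a sketch of commands that would surface this kind of symptom. The interface name GigabitEthernet1/0/1 is a hypothetical example.

```
! Confirm the port is physically up and connected
show interfaces GigabitEthernet1/0/1 status
! Check the access and voice VLAN settings applied to the port
show interfaces GigabitEthernet1/0/1 switchport
! A healthy access port lists its VLAN here in a forwarding state;
! in the failure described above, no instance was present
show spanning-tree interface GigabitEthernet1/0/1
```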

While that configuration worked for desktops, I still needed the phones to work. Having only switchport access vlan vlan-id on the port would not work with the phones, even if I separated the phone and desktop connections. That said, we needed a way to configure the port so that the phones would also work. At the time, I only knew two ways of configuring a port with both phones and data devices connected: configuring the access vlan and voice vlan commands on the interface, or configuring the interface as a trunk with the native VLAN set for data. Essentially, the two behave the same way, so we didn’t want to do that.
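
To make the two options concrete, here is a minimal sketch of each. VLAN 10 and the interface names are hypothetical placeholders, not values from the actual switches.

```
! Option 1: access port with the voice VLAN set to the same VLAN
! as data (the configuration that was failing on the 2960-X stack)
interface GigabitEthernet1/0/1
 switchport mode access
 switchport access vlan 10
 switchport voice vlan 10
!
! Option 2: trunk port with the data VLAN as the native VLAN
interface GigabitEthernet1/0/2
 switchport mode trunk
 switchport trunk native vlan 10
 switchport trunk allowed vlan 10
```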

After talking to his colleagues and probably doing some research, he came back and said we were going to try the switchport voice vlan dot1p command. That solved the issue! While he explained what the command did, I was not quite happy with his explanation because I felt it was incomplete, so I decided to read the configuration guide and was quite disappointed that it didn’t have more information about it.

If you do a search, you will find plenty of information about the command, which I had hoped the configuration guide would have. From what I’ve gathered both online and from the TAC engineer, when the command is configured, the switch instructs the phone to tag its voice traffic with VLAN 0 and mark it with the proper CoS value, 5 or 3 depending on the type of traffic. When the switch receives the frame, it accepts the VLAN 0 voice traffic and places it in the same VLAN as the data traffic, the one configured on the port with the access vlan command.
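
Put together, the working port configuration would look roughly like the sketch below. VLAN 10 and the interface name are hypothetical placeholders; only the switchport voice vlan dot1p command comes from the TAC engineer’s fix.

```
interface GigabitEthernet1/0/1
 switchport mode access
 ! Data VLAN; priority-tagged (VLAN 0) voice frames land here too
 switchport access vlan 10
 ! Replaces "switchport voice vlan 10" on the port
 switchport voice vlan dot1p
```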

Thoughts

While the dot1p configuration worked, I still would’ve preferred to have two different VLANs for data and voice, which is the standard configuration in our environment anyway. But, for whatever reason, they decided to design it this way and it’s going to stay that way. Maybe when it breaks they will decide to separate the two, but at this point we had to choose which battle we were going to fight.

Copyright © 2011–2023 · NetworkJutsu · All Rights Reserved · Privacy Policy · Terms of Use