Saturday, September 18, 2010

Lesson 13 - Layer 2 Connectivity Troubleshooting Part 3

This lesson is the last one in the series on troubleshooting connectivity issues at layer 2 of the OSI model. Bear in mind that it does not cover layer 2 technologies such as VLANs or Spanning Tree Protocol, since we have not talked about those yet.

The steps presented in this lesson are not all of the diagnostics you can perform, and they do not have to be done in this specific order. I am merely listing some logical steps that may help to 'nail down' the root cause of the problem.

Trouble Ticket 3
A new installation, shown in Pic. 1, has no connectivity between the two computers. Initial diagnostics revealed the following facts:
  • Switches are connected via a fibre optic cable and the ports show proper status (interface is up, line protocol is up).
  • Computers have proper addresses and subnet masks assigned.
  • Cables connecting the computers to the switches have been tested and work correctly.
  • Computers reply to echo request packets (firewalls disabled).
The technician who set up this new network calls you for help.

Pic. 1 - New design with connectivity problem
Icons designed by: Andrzej Szoblik

To resolve this trouble ticket, we are going to bring together the tools we used in the previous lessons.

Step 1
First, let's try to 'divide and conquer' (a concept mentioned in lesson 11) by sending ping packets from PC1 to PC2. Before we do that, though, we need to purge the existing ARP cache on PC1 and PC2 so that we work with fresh information. We do that by opening a command line window and typing: arp -d host-address (Linux), or arp -d (MS Windows).
Test results:
  • The pings timed out. No reply from PC2.
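The preparation in Step 1 can also be scripted. Below is a minimal Python sketch, not part of the original lab: the function names and the address 192.0.2.2 (a documentation-range placeholder standing in for PC2) are my assumptions. The commands are only built, not executed, because deleting ARP entries normally requires administrative rights.

```python
import platform


def arp_flush_command(peer_ip, system=None):
    """Build the ARP cache purge command from Step 1 for the given OS."""
    system = system or platform.system()
    if system == "Windows":
        # MS Windows form used in this lesson: arp -d
        return ["arp", "-d"]
    # Linux form used in this lesson: arp -d host-address
    return ["arp", "-d", peer_ip]


def ping_command(peer_ip, count=4, system=None):
    """Build a ping command that sends a fixed number of echo requests."""
    system = system or platform.system()
    # Windows ping takes -n for the count, Linux ping takes -c.
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), peer_ip]


PC2_IP = "192.0.2.2"  # hypothetical address, not taken from the lesson

linux_flush = arp_flush_command(PC2_IP, system="Linux")
windows_flush = arp_flush_command(PC2_IP, system="Windows")
linux_ping = ping_command(PC2_IP, system="Linux")
windows_ping = ping_command(PC2_IP, count=10, system="Windows")
```

Passing each returned list to subprocess.run() on PC1 would perform the actual purge and ping.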
Step 2
Since ARP cache entries age out relatively quickly (how quickly depends on the operating system you use), we need to check what they contain straight away. Alternatively, we can send a long series of ping packets while we check.
Test results:
  • PC1 does NOT contain the expected MAC-to-IP mapping. We expected to see PC2's IP address at 00:1e:4f:b0:b2:fc, but that entry is not there.
  • PC2 DOES contain the right MAC-to-IP mapping. It shows PC1's IP address at 00:50:bf:9c:45:6a.
These are very interesting results, don't you think? Before we take the next step let's gather what we know so far.

Since the ping was initiated by PC1, PC1 sent an ARP request broadcast, and that query must have been delivered to PC2. We can conclude this because we cleared PC2's ARP cache in step 1 and it now holds the proper mapping. PC2 did receive the ARP request from PC1 and learned which MAC and IP addresses PC1 uses.
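The cache check from Step 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample text imitates the 'ip at mac' layout printed by the Linux arp -a command, the IP addresses 192.0.2.1 and 192.0.2.2 are placeholders (the real ones are not given in this lesson), and the function names are mine.

```python
import re

# Matches "(ip) at mac" as printed by Linux "arp -a"; an unresolved entry
# shows "<incomplete>" instead of a MAC address and is therefore skipped.
ARP_ENTRY = re.compile(
    r"\((\d+\.\d+\.\d+\.\d+)\) at ((?:[0-9a-f]{2}:){5}[0-9a-f]{2})",
    re.IGNORECASE,
)


def parse_arp_cache(text):
    """Return a dict mapping each resolved IP address to its MAC address."""
    return {ip: mac.lower() for ip, mac in ARP_ENTRY.findall(text)}


def has_mapping(text, ip, mac):
    """True if the cache text maps the given IP to the given MAC."""
    return parse_arp_cache(text).get(ip) == mac.lower()


# Illustrative cache contents (assumed, not captured from the lab).
PC1_CACHE = "? (192.0.2.2) at <incomplete> on eth0\n"
PC2_CACHE = "? (192.0.2.1) at 00:50:bf:9c:45:6a [ether] on eth0\n"

pc1_knows_pc2 = has_mapping(PC1_CACHE, "192.0.2.2", "00:1e:4f:b0:b2:fc")
pc2_knows_pc1 = has_mapping(PC2_CACHE, "192.0.2.1", "00:50:bf:9c:45:6a")
```

With these samples, pc1_knows_pc2 is False and pc2_knows_pc1 is True, which is the asymmetry observed in Step 2.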

Step 3
We could omit this step, but we are curious whether PC2 replies to the ARP request from PC1. We launch Wireshark on PC2, ping again from PC1 to PC2, and capture all packets on PC2. The packet trace shows that PC2 answered the ARP request with a proper unicast ARP reply addressed to PC1.
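What Wireshark shows us in Step 3 comes down to two fields: the ARP opcode (1 for a request, 2 for a reply) and the destination MAC address of the Ethernet frame. The Python sketch below decodes both from raw frame bytes. It is an illustration only: the frames are built by hand rather than captured, the IP addresses are placeholders, and the helper names are my own.

```python
import struct

BROADCAST = b"\xff" * 6
ETHERTYPE_ARP = 0x0806


def mac_str(raw):
    """Format six raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def build_arp(oper, src_mac, dst_mac, src_ip, dst_ip, target_mac):
    """Build an Ethernet frame carrying an IPv4-over-Ethernet ARP message."""
    eth = struct.pack("!6s6sH", dst_mac, src_mac, ETHERTYPE_ARP)
    arp = struct.pack("!HHBBH6s4s6s4s", 1, 0x0800, 6, 4, oper,
                      src_mac, src_ip, target_mac, dst_ip)
    return eth + arp


def describe_arp(frame):
    """Summarise an ARP frame, or return None if it is not ARP."""
    if len(frame) < 42:
        return None
    dst, _src, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype != ETHERTYPE_ARP:
        return None
    fields = struct.unpack("!HHBBH6s4s6s4s", frame[14:42])
    oper, sender_mac = fields[4], fields[5]
    kind = {1: "request", 2: "reply"}.get(oper, "other")
    return {"kind": kind, "broadcast": dst == BROADCAST,
            "from": mac_str(sender_mac), "to": mac_str(dst)}


PC1_MAC = bytes.fromhex("0050bf9c456a")
PC2_MAC = bytes.fromhex("001e4fb0b2fc")
PC1_IP = bytes([192, 0, 2, 1])  # placeholder address
PC2_IP = bytes([192, 0, 2, 2])  # placeholder address

request = describe_arp(
    build_arp(1, PC1_MAC, BROADCAST, PC1_IP, PC2_IP, b"\x00" * 6))
reply = describe_arp(
    build_arp(2, PC2_MAC, PC1_MAC, PC2_IP, PC1_IP, PC1_MAC))
```

Here request describes the broadcast sent by PC1 and reply describes the unicast answer from PC2, the same pair we see in the trace.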

Step 4
Clearly, something between the computers (the switches) does not work properly. It seems that we have some sort of unidirectional communication. What we need to establish is where this unidirectional communication is taking place. We log in to SW1 and SW2 and issue the following commands ('x' here stands for the switch number in Pic. 1):

SWx#show mac address-table interface f0/1
SWx#show mac address-table interface f0/24

Test results:
  • SW1 learns the source MAC address of PC1 (00:50:bf:9c:45:6a) on its Fa0/1 interface. This is expected.
  • SW1 does NOT learn the MAC address of PC2 (00:1e:4f:b0:b2:fc) on its Fa0/24 interface. This is unexpected. It should learn it from the ARP reply sent by PC2.
  • SW2 learns the source MAC address of PC2 (00:1e:4f:b0:b2:fc) on its Fa0/1 interface. This is expected.
  • SW2 learns the source MAC address of PC1 (00:50:bf:9c:45:6a) on its Fa0/24 interface. This is expected.
This way, we have discovered that the link between SW1 and SW2 is unidirectional: SW2 sends frames towards SW1, but SW1 does not seem to receive them. The fibre optic connection is probably faulty (grease, dirt, a broken strand, etc.).
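The reasoning in Step 4 can be written down as a small decision function. The Python sketch below takes the MAC addresses each switch learned per port (typed in by hand from the test results above, not parsed from switch output) and reports which direction of the inter-switch link appears to fail. The data layout and function name are my own assumptions.

```python
PC1 = "00:50:bf:9c:45:6a"
PC2 = "00:1e:4f:b0:b2:fc"

# MAC addresses learned per switch and port, as reported in Step 4.
learned = {
    "SW1": {"Fa0/1": {PC1}, "Fa0/24": set()},
    "SW2": {"Fa0/1": {PC2}, "Fa0/24": {PC1}},
}


def diagnose_uplink(tables, uplink="Fa0/24"):
    """Infer the failing direction of the SW1-SW2 link from learned MACs.

    A switch learns a remote host's MAC on its uplink only if frames
    from that host actually arrive over the link.
    """
    sw1_receives = PC2 in tables["SW1"].get(uplink, set())
    sw2_receives = PC1 in tables["SW2"].get(uplink, set())
    if sw1_receives and sw2_receives:
        return "link passes frames in both directions"
    if sw2_receives:
        return "SW2 -> SW1 direction is failing"
    if sw1_receives:
        return "SW1 -> SW2 direction is failing"
    return "no frames cross the link in either direction"


verdict = diagnose_uplink(learned)
```

For the tables from this ticket, verdict is "SW2 -> SW1 direction is failing", which matches the conclusion above. The check is only meaningful right after traffic has been generated, since learned entries age out of the table.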

Once again, this lesson illustrates how useful the commands and knowledge described in the previous posts can be in real-life scenarios.

In my next post, I will show you how to log system messages so they can be analyzed later. System messages are invaluable pieces of information when troubleshooting networking issues.