Network Admin Stuff: Lesson 19 - Spanning-Tree Protocol Overview

Vlans described in the previous posts are very important elements of building modern networks. Equally important piece of technology is IEEE 802.1D, commonly known as Spanning-Tree Protocol. In the following few posts, I will focus on its application and basic operation.

If your network consists of layer 2 switches that allow computers connect and exchange data, you will need to consider the design that can withstand some types of failure.

Redundant Connections

Consider the following layer 2 design. Imagine that the SW1, SW2 and SW3 switches connect many devices and there is only a single connection between the switches like depicted in the Pic1.

Pic. 1 - Switch Topology Without Redundancy

Icons designed by: Andrzej Szoblik - http://www.newo.pl

Should either of the links between the switches break, the communication between many devices fail. Such design creates a single point of failure. We could easily tweak this simple design to make it more resilient by adding an extra path between SW2 and SW3. The below picture shows this modified design.

Pic. 2 - Redundant Paths

Icons designed by: Andrzej Szoblik - http://www.newo.pl

Unfortunately, creating the extra path here comes at a cost. The redundant connection (Pic. 2) between SW2 and SW3 creates a loop. The loop in turn, will create three serious problems. The last one in the list will eventually render our system unavailable. Let's see what these problems are.

Duplicate Frame Delivery

Pic. 3 - Problem 1 - Duplicate Frame Delivery

Icons designed by: Andrzej Szoblik - http://www.newo.pl

Look at the pic. 3 and imagine SW2 and SW3 do not have the MAC address of PC3 (0000.3333.3333) in their databases (CAM). This can happen if the PC3 doesn't speak for more than five minutes. This is the default time MAC address is kept in the database without refreshing it. Then, we have PC1 sending frame towards PC3. As you recall, SW2 will flood the frame out of its active ports if it does not know where PC3 is located (unknown destination MAC address). The frame travels out SW2's port F0/13 towards SW1 and out the port F0/12 towards SW3. SW2 will deliver the frame to PC3. Since SW3 floods the frame out as well, it will be sent towards SW1 out of its port F0/14. Then, SW1 obediently delivers the same copy of the frame to PC3 again.

MAC Address Table Instability
Another issue caused by the loop we have created will make switches change the MAC addresses depending on where they hear the sender. Consider pic. 4 below.

Pic. 4 - Problem 2 - MAC address table instability

Icons designed by: Andrzej Szoblik - http://www.newo.pl

Again, let us assume that none of the switches in the picture knows where PC3 is connected. This means they have not learned its MAC address yet. In our scenario, PC1 sends the frame to PC3 (destination MAC: 0000.3333.3333). SW2 floods the frame out F0/12 and F0/13 ports.

Now, SW3 receives this frame sourced with 0000.1111.1111 MAC address (PC1). It learns the source MAC address and maps it to its F0/12 port where it arrived. Since SW1 does not know where PC3 is connected (at least right now) it will flood this frame out all active ports. This way, the frame is sent out SW1's port F0/14 towards SW3. SW3, upon receiving the frame on its F0/14 port, reads the source MAC address (0000.1111.1111) and maps it to port F0/14 this time. This causes a little confusion as SW3 learned it earlier on and it was port F0/12 before. Previous mapping is removed and F0/14 becomes the outbound port for 0000.1111.1111 now.

Broadcast Storm
The last problem is really severe. It can bring our traffic to a halt. Take a look at pic. 5 below.

Pic. 5 - Problem 3 - Broadcast Storm

Icons designed by: Andrzej Szoblik - http://www.newo.pl

In this scenario, PC1 sends a broadcast frame. SW2 upon receiving it, floods it out all its active ports. SW1 receives it on port F0/13 and floods it out of other ports. SW3 receives the broadcast frame on its F0/12 port and floods it. Then, a tad later it receives this same broadcast frame from SW1 and again it floods it out all active ports except the port it arrived on. You can write the rest of the story on your own. This broadcast is running in the loop in both directions endlessly. Well, not exactly endlessly. It is true that there is not mechanism to stop it, but all three switches in the topology will be so busy sending out this broadcast, that eventually all its resources are consumed and they stop sending anything at all. If you look at switches that experience a broadcast storm, you will notice that all their LEDs are flashing amber like a Christmas tree. In a few seconds the switches become unresponsive. An attempt to access them remotely using SSH/telnet will fail. Even console connection may refuse to accept your commands. The only way to bring the switches back to the operation is to break the loop by pulling one of those cables.

So, what can we not have redundancy in our layer 2 topology? Of course, we can.

We will run Spanning-Tree Protocol (turned on by default), which will dynamically block redundant connections creating a loop free topology. Should the primary link fail, the one that is in the blocking state will start forwarding the traffic in about 30 seconds by default. Of course, we will need something much faster than 30 seconds, but I will show you that as soon as we know how STP works.

Here I am going to give you just an overview of its operation. But the devil is in the details which we will scrutinize in my next post.

Spanning-Tree Protocol Overview
STP is a layer 2 loop prevention mechanism. Switches running this protocol use special frames called Bridge Protocol Data Unit (BPDU). These frames contain enough information to allow the switches to create a loop free topology. This magic is accomplished using three distinct phases:

Elect a single switch to be the root bridge machine which is the central device in the layer 2 network. This machine will have all its ports in the forwarding state (designated port role).
All other switches (non-root switches), will select a single path towards the root bridge. That port is called the 'root port' and will be forwarding traffic that is destined out of the switch through the root bridge. This path is the least cost (best) path towards the root.
All other switches will select a single path per segment in order to block stop the loop. The port that is forwarding traffic is called designated port. The port that is blocking traffic to stop the loop is called non-designated port.

I will explain all the terms and the above process in details in my next post. Meanwhile, check the pic. 6 first.

Pic. 6 - Spanning-Tree Protocol

Icons designed by: Andrzej Szoblik - http://www.newo.pl

In the above picture, SW1 has been elected as the root bridge. SW2 uses port F0/13 as its root port (the best, or the least cost path towards the root). SW3 uses it port F0/14 as the root port. SW3 blocks the port F0/12 to stop the loop. SW2 keeps sending BPDU frames originated by the root bridge (SW1) out its F0/12 port towards SW3.

Now, what is really fascinating that the loop free structure like the above is done automatically (although you want and will affect how it works), and the fact that if the communication between SW2 and SW1, or SW3 and SW1 is broken, the SW3 port F0/12 will be put in the forwarding state.

If you are interested in the details how STP works please read my next post (lesson 20).

Network Admin Stuff

Sunday, October 17, 2010

Lesson 19 - Spanning-Tree Protocol Overview

Cisco Is Easy - Main