Sunday, August 15, 2010

Lesson 4 - Introduction to TCP/IP Layers

In this lesson we take a sneak peek at the fundamentals regarding TCP/IP. This is one of the most important aspects to understand in order to follow the upcoming lessons. It is not my ambition to explain everything here as this would turn into a hefty book to read. You can find a lot of them on the market anyway. Instead, I will try to focus on some basic aspects of TCP/IP. They will constitute the minimum knowledge to help us understand how computers communicate.

If you feel the urge to learn more on TCP/IP right now please visit the following site: http://www.tcpipguide.com/free/t_toc.htm

NOTICE!
Familiarize yourself with ALL terms in red. They will be used throughout the classes.


Introduction
Computer communication follows some well defined rules and guidelines which we call protocols. In order for the computers to exchange data they have to agree on using the same rules, otherwise they become incompatible. That was the case in the past. This was one of the reasons to create a common model for communication. It was called OSI Model (Open Systems Interconnection). This was an attempt to make different vendor's computers exchange data easily. This way IBM machines could talk to DEC machines and so on. Today however, it is the TCP/IP model that is all-pervasive. This protocol suite is derived from OSI model and somewhat loosely follows its rules and terminology. This is going to be the focus of our discussion and the main topic of this lesson.

Note!
Remember that TCP/IP model does not follow OSI model exactly. OSI model is now used as a reference. What's described in this blog is TCP/IP model, not OSI or IPX/SPX model. Look at the comparison below:

TCP/IP Layers
The designers decided to break down the whole complexity of data exchange and created five layers of functions to accomplish the goal. This approach helps develop and modify certain layers of code without touching other layers. For instance, if you are an application programmer, you will be writing a code in the 'application layer' which allows you to use already written code dealing with the transport of data between computers. You do not even have to learn how this transport is done. It has already been written for you. This way, you focus on the application you're creating, what it does and how it works rather than learning about network adapter drivers, signaling and other gory, hardware details.

TCP/IP model divides the functions related to data transmission by using five distinct layers of responsibility. Below area these layers.

Layer 5-7 - Application
This is where the data's journey begins. Everyday, you use many applications that  rely on network services. Those applications are classified to be layer 5 code. Your web browser and web server, mail client and mail server, ssh client and ssh server etc. You may have noticed the term client and server often used in the above description. Pretty much all applications use this architecture. Client, is an application that requests some services from the server application. Server application is providing a client with what they want. A common example of that architecture is your Firefox or Internet Explorer web browser (client application) requesting a page from Apache or IIS web server (server application). Applications, in general, provide a User Interface (UI) which offloads us from a burden of knowing how a computer does things internally.

So, once your application formed the request, that one is sent down to the layer 4 (transport layer) asking for the delivery to the host somewhere in the network.

Layer 4 - Transport
This layer accepts all requests coming from the upper layer (application layer) and tries to organize the transport of that request across the network. In TCP/IP model this layer of software is responsible for:
  • Breaking down big files that are sent across into smaller chunks called segments. There are technology limitations that do not allow our computers to send large files in one piece. It would not be a good idea anyway as any small change of the data during transmission would make the sender re-transmit the whole file again instead of the smaller chunk only. That of course, would take more time and resources to successfully transmit the data.
  • As your computer uses many applications that will transmit something across the network at the same time, the system must know how to mark those request such that they are delivered to the right receiver applications. And once the replies are coming back, they should be delivered back to the same process that initiated them. The concept of the port number has been introduced to deal with that. Source and destination ports ensure that all requests and replies are delivered to the appropriate processes on the computers exchanging data. More on that later in the upcoming lessons.
  • This layer also allows the application to use connection-oriented or connectionless services. The former, establishes communication with the receiving computer (or more generally: destination host) before data can be exchanged, the latter will send data without ensuring that the destination application is running and willing to receive anything. This form of transmission is used primarily for voice and video applications.
  • This layer will also give applications some options in terms of the reliability. Depending on which layer 4 protocol the application is designed to use, the reception of data can be verified or not. That creates reliable versus unreliable transport respectively. In the reliable transport any data that has not been delivered will be retransmitted, unlikely the unreliable transport.
  • One other function of layer 4 could be to moderate the transmissions so that the receiving host is neither swamped by the excess of packets coming in nor is it waiting and doing nothing because the sender's speed of transmission is too slow. Majority of the functions above are performed by TCP protocol, not UDP as applications choose one of them to use.
Once all aspects and functions in this layer have been taken care of, layer 4 sends the data it received from layer 5 down to the layer 3 requesting its service.

Layer 3 - Internet
Upon receiving a request from layer 4, this code is going to process the incoming information. Since, typically we have more than one path between the sender (source) and the receiver (destination), the function of this layer is to find the best path between them. In order to accomplish that there are two concepts I need to introduce here.

Firstly, we need to know how computers find themselves in the network. This is accomplished by using specially designed, layer 3 addresses uniquely identifying computers in any network. The addresses used by this layer consist of four bytes delimited by the dots (e.g. 10.1.1.1) which are followed by a, so called, 'netmask' also consisting of four bytes with the dot used as the delimiter (e.g. 255.255.255.0). The whole IP address can look like this:
10.1.1.1 255.255.255.0. More on those later.

Secondly, because the destination of our data can be outside of our own network, a device called router has been introduced to find the optimal paths between the different networks in which the computers reside. The data processed by layer 3 is called a packet or datagram. This layer also uses a mapping to the upper layer 4 that has requested its services. This is due to the fact that there are more than one protocols available in layer 4 (TCP or UDP). This information (which layer 4 protocol is sending the data) is going to be useful when the data arrives at the destination and the destination's layer 3 process needs to send the content to the appropriate layer 4 protocol for processing. It has to be the SAME protocol that the sender used in layer 4.


Layer 2 - Network Interface (Data Link in OSI model)
There is a great variety of technologies that handle data transmission on media such as copper and fiber optics cables or air (wireless). In order to offload the layer 3 protocols from learning all possible signaling methods, layer 2 was created. Thus, layer 3 can focus on finding the best path between the source and the destination, and the layer 2 functions will handle the details of preparing the data to be placed on the actual media (copper wire, fiber, air etc.). The piece of information processed at this layer is called a frame. This layer will also use specially designed addressing scheme to recognize the next device which a computer is sending the data to. For instance, in the commonly used layer 2 technology called Ethernet, this address uniquely identifying hosts in the same network is called MAC Address. The reason why we use different addressing schemes: layer 3 and layer 2, will become clearer when we get into some details of the actual data exchange. Please, bear with me till we reach the right lesson that explains it in more detail. The device that is capable of understanding the structure of a frame and delivers the data between the hosts in the SAME network is called bridge, or switch. As of the time of writing this, switches are very popular devices and bridges can be found mostly in museums.

Once the layer 2 has prepared the data which layer 3 requested to send, (the process is called 'framing'), layer 2 will send the request to layer 1 asking for the data to be placed onto the wire/fiber/air using the appropriate signaling method.

Layer 1 - Network Interface (Physical Layer in OSI)
This layer receives requests from layer 2 (data traveled from layer 5 to layer 1 now). The physical layer is going to encode data received from layer 2 software and place them in the form of bits (1s and 0s) on the medium. This way, ones and zeros travel across the media to deliver them to their receipient. The bits can traverse multiple devices as they go across such as hubs, switches, routers. What type of devices will forward those bits depends on the design of the network. The devices referred to as layer 1 devices are hubs, cables, network adapters, connectors, transceivers etc. Also the data processed by this layer is called bits. This layer defines low level aspects of the transmission such as cables used, maximum distance the cable can reliably sent bits across, types of the connectors, speed of the transmissions etc.

NOTICE!
The names given to the data at each of the layers described above (segment, packet, frame, bits) are taken from the OSI model terminology. They are referred to as: Protocol Data Unit (PDU).


Understanding the terminology and the concepts presented in this lesson are the pre-requisites to understand the next lesson.

Please, make sure that you read carefully this lesson before proceeding to lesson 5 in which I am going to show you some details regarding the protocols and headers used in TCP/IP transmissions.