Mastering Linux Network Administration

Understanding the TCP/IP protocol suite

TCP/IP is the most popular networking protocol in existence. Not only is it the primary protocol suite of the Internet, it's something that you can find on just about any device that supports network connectivity in one form or another. Your computer understands this suite very well, but nowadays your phone, TV, and perhaps even a kitchen appliance or two support this technology. It really is everywhere. Although TCP/IP is often referred to as a protocol, it's actually a protocol suite made up of several individual protocols. From the name, I'm sure you can gather that two of them are TCP and IP. A third, UDP, is part of this protocol suite as well.

TCP is an acronym for Transmission Control Protocol. It's responsible for breaking down network transmissions into numbered pieces called segments (often loosely referred to as packets), which are then sent to the target node and reassembled back into the original message by TCP on the other end. In addition to managing packets, TCP also ensures that they were properly received (to the best of its ability). It does this via error detection and retransmission. If a packet is not acknowledged by the target, TCP will resend it. It knows to do this because of the retransmission timer, which expires when no acknowledgment arrives in time.
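To make the splitting and reassembly idea concrete, here is a minimal sketch in Python. It is only an illustration: the function names `segment` and `reassemble` and the segment size of 8 bytes are made up for this example, and real TCP does all of this inside the kernel with far more bookkeeping.

```python
import random


def segment(message: bytes, size: int) -> list[tuple[int, bytes]]:
    """Split a message into (sequence_number, payload) pairs of at most `size` bytes."""
    return [(offset, message[offset:offset + size])
            for offset in range(0, len(message), size)]


def reassemble(segments: list[tuple[int, bytes]]) -> bytes:
    """Rebuild the original byte stream by ordering segments by sequence number."""
    return b"".join(payload for _, payload in sorted(segments))


message = b"TCP splits a byte stream into segments."
segments = segment(message, size=8)   # 8 is an arbitrary, illustrative size
random.shuffle(segments)              # simulate segments arriving out of order
restored = reassemble(segments)
print(restored == message)            # prints True
```

Because each piece carries its position in the stream, the receiving side can put the message back together no matter what order the pieces arrive in.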

Before we discuss error detection and retransmission, let's first take a look at the actual process that TCP uses to send data. When setting up a connection, TCP performs a three-way handshake, which consists of three special packets that are sent between the communicating nodes. The first packet, SYN (synchronize), is sent to the receiver by the sender. Essentially, it's how the node announces that it wants to start a communication. On the receiving end, once (and if) the packet is received, a SYN/ACK (synchronize/acknowledge) packet is sent back to the sender. Finally, an ACK (acknowledge) packet is sent to the receiver from the sender, which confirms that the transfer is all set to proceed. From that point forward, the connection is established and the two nodes are able to send information to each other. Further packets are then sent, which make up the remainder of the communication.
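You never send SYN or ACK packets by hand; the operating system performs the handshake for you whenever a program opens a TCP connection. The following sketch, which assumes a machine with a working loopback interface, starts a throwaway server on 127.0.0.1 and connects to it. The helper name `accept_one` and the `results` dictionary are invented for this example.

```python
import socket
import threading


def accept_one(server: socket.socket, results: dict) -> None:
    """Accept a single, already established connection and read from it."""
    conn, peer = server.accept()
    with conn:
        results["received"] = conn.recv(1024)
        results["peer"] = peer


server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))   # port 0 lets the OS pick any free port
server.listen(1)
port = server.getsockname()[1]

results: dict = {}
worker = threading.Thread(target=accept_one, args=(server, results))
worker.start()

# Opening the connection makes the kernel send SYN, wait for SYN/ACK,
# and answer with ACK. Only then does create_connection() return.
client = socket.create_connection(("127.0.0.1", port), timeout=5)
client.sendall(b"hello")        # ordinary data follows the handshake
client.close()

worker.join(timeout=5)
server.close()
print(results["received"])
```

If you run a packet capture tool on the loopback interface while this script executes, you should be able to see the three handshake packets ahead of the one carrying the data.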

If we lived in a perfect world, this would be all that is needed. Packets would never get lost in transmission, bandwidth would be unlimited, and packets would never get corrupted during transmission. Unfortunately, we don't live in a perfect world and packets are lost and/or corrupted all the time. TCP has built-in features to deal with these types of things. Error detection helps ensure that the packet that was received is the same as the one that was sent. TCP packets contain a checksum, which the receiver recomputes and compares. If the verification fails, the packet is deemed incorrect and is then discarded; because it is never acknowledged, the sender eventually resends it. This verification isn't perfect, so it's still possible that the file you just downloaded may still have an error or two, but it's better than nothing. Most of the time, it works just fine.
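The checksum TCP uses is a 16-bit one's-complement sum. The sketch below implements that arithmetic in Python so you can see how a single flipped bit is caught. It is simplified: a real TCP checksum is computed by the kernel (or the network card) over the TCP header, the data, and a few fields borrowed from the IP header, whereas the hypothetical `internet_checksum` function here only looks at the bytes you hand it.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's-complement checksum of `data`."""
    if len(data) % 2:
        data += b"\x00"                            # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]      # add each 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)   # fold carries back in
    return ~total & 0xFFFF


payload = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(payload)
print(hex(checksum))                               # prints 0x220d

# Receiver side: data plus its checksum sums to zero when nothing changed.
intact = payload + checksum.to_bytes(2, "big")
print(internet_checksum(intact) == 0)              # prints True

# Flip one bit in the first byte to simulate corruption in transit.
corrupted = bytes([intact[0] ^ 0x01]) + intact[1:]
print(internet_checksum(corrupted) == 0)           # prints False
```

A 16-bit sum is cheap to compute but fairly weak, which is why some corruption can still slip through unnoticed.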

The flow control feature of TCP handles the speed at which data is transferred. While most of us geeks have a very nice set of networking hardware that is able to handle a ton of bandwidth, the Internet is not a consistent place. Your uber high-end switch may be able to handle whatever you throw at it, but that really doesn't matter if there is a weak link somewhere upstream within the connection. A network transmission is only as fast as its slowest point. (Strictly speaking, slow links in the middle of the path are the job of TCP's closely related congestion control; flow control is about the receiving node itself.) While you're sending a transmission to another node, you're only able to send as much data as its buffer is able to hold. At some point, its buffer may fill up, and it will then be unable to receive any additional packets until it deals with the ones it already has. Any additional packets sent to the receiver at this time would simply be dropped, so the receiver tells the sender how much room it has left, and the sender backs off and slows down its rate of transfer accordingly. This is the method that TCP uses in order to adjust the transfer speed according to what receiving nodes are able to handle.

Flow control works by utilizing what is known as a sliding window. The receiving node specifies what is known as a receive window, which tells the sender how much data it's able to receive before it becomes overwhelmed. Once this receive window is used up, the sender waits for the receiver to announce that it's ready to receive data again. Of course, if the receiving end sends an update to the sender that it is ready to receive data and the sender never gets the memo, we could run into a real problem if the sender waited forever for an all-clear message that was lost in transmission. Thankfully, we have a persist timer in place to help deal with this. Essentially, the persist timer represents how long the sender is willing to wait before it checks whether the receiver's window has opened up again. Once the persist timer elapses, the sender transmits a small probe packet to the receiver, to see whether it is able to deal with it. If a reply is sent, the reply packet will contain the current receive window, which tells the sender whether the receiver is ready to continue the conversation.
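The interplay between the receive window and the persist timer is easier to follow in a toy simulation. The sketch below is not how a real TCP stack is written; it is a deliberately simplified model with made-up names (`simulate`, `drain`, `persist_ticks`) that counts time in abstract ticks. It also assumes the worst case described above, in which every all-clear message from the receiver is lost, so the sender only learns about a reopened window by probing.

```python
def simulate(total: int, buffer_size: int, drain: int, persist_ticks: int) -> list[str]:
    """Simulate a sender that is limited by the receiver's advertised window."""
    if drain <= 0:
        raise ValueError("the receiving application must read some data per tick")
    log = []
    buffered = 0            # bytes waiting in the receiver's buffer
    remaining = total       # bytes the sender still has to send
    window = buffer_size    # most recent window the sender knows about
    idle = 0                # ticks spent waiting on a zero window
    while remaining > 0:
        if window > 0:
            sent = min(remaining, window)
            remaining -= sent
            buffered += sent
            window = buffer_size - buffered       # the ACK advertises free space
            log.append(f"sent {sent}, window now {window}")
            idle = 0
        else:
            idle += 1
            if idle >= persist_ticks:             # persist timer elapsed: probe
                window = buffer_size - buffered
                log.append(f"probe, window now {window}")
                idle = 0
            else:
                log.append("waiting")
        buffered = max(0, buffered - drain)       # the application reads data
    return log


for line in simulate(total=10, buffer_size=4, drain=2, persist_ticks=2):
    print(line)
```

With these numbers the sender fills the 4-byte buffer, waits, probes once the persist timer elapses, and then carries on, which is the stop-and-go pattern described above.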

IP (short for Internet Protocol) handles the actual sending and receiving of the packets that TCP wants to send or receive. Within each packet, there is a source and a destination, each identified by an IP address (which we'll discuss further in this chapter). Each connected network interface will have its own IP address, which IP will use to figure out where a packet needs to go, or which device it is from. Together, TCP and IP make up a powerful team. TCP splits up a communication into packets, and IP handles routing them to their destination.

Of course, there's also UDP (short for User Datagram Protocol), which is part of the suite as well. It's similar to TCP in that it carries an application's data across the network in packets (called datagrams). The main difference, however, is that UDP is connectionless. This means that UDP does not set up a connection first and does not verify that anything arrived. It sends the packets, but does not guarantee delivery. If a packet isn't received by the target, it will not be resent.
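For comparison with TCP, here is a small Python sketch that sends a single UDP datagram across the loopback interface. Notice that there is no listen, accept, or connection step at all. The message contents and the two-second timeout are arbitrary choices for this example.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))     # port 0 lets the OS pick any free port
receiver.settimeout(2)              # give up waiting after two seconds
port = receiver.getsockname()[1]

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# No handshake: the datagram is simply handed to the network and forgotten.
sender.sendto(b"ping", ("127.0.0.1", port))

try:
    data, addr = receiver.recvfrom(1024)
except socket.timeout:
    data = None                     # UDP would never tell the sender about this

print(data)

sender.close()
receiver.close()
```

On the loopback interface the datagram will almost always arrive, but nothing in UDP promises that. If it were lost, `sendto()` would still have returned successfully, and it would be up to the application to notice and react.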

Those learning about UDP for the first time may question why such an untrustworthy protocol would even be considered. The fact is, in some cases, a connection-oriented protocol such as TCP may add unwanted overhead to certain types of transmissions. One example of this is contacting a colleague via Skype, which offers audio calls over the Internet as well as video calls. If a packet were lost by either end during a communication, it wouldn't make much sense to resend it. You would just hear a bit of static for a second or so, and retransmitting a packet certainly wouldn't change the fact that you had difficulty hearing a word or two. Adding retransmission to such a transmission would be pointless and add overhead.

Discussing TCP/IP in its entirety would be a book in and of itself. In Linux, this protocol suite is handled in much the same way as on other platforms; the real difference lies in how it is managed. Throughout this book, we'll talk about ways we can manage this protocol suite and tweak our network.