TCP

Author: Bowen Y

Transport Level Protocol

The application-level processes that use a transport protocol's services have certain requirements. The following list itemizes some of the common properties that a transport protocol can be expected to provide:

  • Guarantees message delivery

  • Delivers messages in the same order they are sent

  • Delivers at most one copy of each message

  • Supports arbitrarily large messages

  • Supports synchronization between the sender and the receiver

  • Allows the receiver to apply flow control to the sender

  • Supports multiple application processes on each host

Lower-Level Protocol (Best-Effort Service)

The underlying network upon which the transport protocol operates has certain limitations in the level of service it can provide. Some of the more typical limitations of the network are that it may

  • Drop messages

  • Reorder messages

  • Deliver duplicate copies of a given message

  • Limit messages to some finite size

  • Deliver messages after an arbitrarily long delay

TCP’s demux key

(SrcPort, SrcIPAddr, DstPort, DstIPAddr)
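
As a rough illustration (the names and data structures here are hypothetical, not a real kernel's), demultiplexing amounts to looking up the incoming segment's 4-tuple in a table of open connections:

```python
# Hypothetical sketch: demultiplexing incoming segments by TCP's 4-tuple.
connections = {}  # (SrcPort, SrcIPAddr, DstPort, DstIPAddr) -> connection state

def demux(src_port, src_ip, dst_port, dst_ip, payload):
    key = (src_port, src_ip, dst_port, dst_ip)
    conn = connections.get(key)
    if conn is None:
        # No established connection: a real stack would next check
        # listening sockets (keyed on the destination only) or send a RST.
        return None
    conn["recv_buffer"].append(payload)
    return conn

# Two connections from the same client host to the same web server stay
# separate because their source ports differ.
connections[(51000, "10.0.0.1", 80, "192.0.2.7")] = {"recv_buffer": []}
connections[(51001, "10.0.0.1", 80, "192.0.2.7")] = {"recv_buffer": []}
demux(51000, "10.0.0.1", 80, "192.0.2.7", b"GET / HTTP/1.1\r\n")
```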

Why does TCP use a three-way handshake?

Because the client and server each have to make sure that the other side knows which sequence numbers it is going to accept.

So, to achieve this, we would naively need two SYN/ACK exchanges (4 packets in total: 2 × (SYN + ACK)).

A ---SYN--> B

A <--ACK--- B

A <--SYN--- B

A ---ACK--> B

However, the ACK and SYN that B sends (the second and third packets above) can be combined into a single SYN-ACK segment, which yields the three-way handshake.
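
As a rough illustration with made-up initial sequence numbers (ISNs), the three segments carry the following Seq/Ack values:

```python
# Sketch of the three-way handshake with illustrative, made-up ISNs.
client_isn = 1000   # chosen by A
server_isn = 5000   # chosen by B

# 1. A -> B : SYN,      Seq = client_isn
syn     = {"flags": "SYN",     "seq": client_isn}
# 2. B -> A : SYN+ACK,  Seq = server_isn, Ack = client_isn + 1
syn_ack = {"flags": "SYN+ACK", "seq": server_isn, "ack": syn["seq"] + 1}
# 3. A -> B : ACK,      Seq = client_isn + 1, Ack = server_isn + 1
ack     = {"flags": "ACK",     "seq": client_isn + 1, "ack": syn_ack["seq"] + 1}

for pkt in (syn, syn_ack, ack):
    print(pkt)
```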

A two-way handshake is not enough.

  • A two-way handshake would involve only a SYN and an ACK (which also carries the server’s SYN). While this might establish a connection, it lacks an explicit final acknowledgment from the client that it is ready and has received the server’s initial sequence number correctly. This could potentially lead to scenarios where the server starts sending data without confirmation that the client is properly prepared, leading to data loss or connection errors right at the start.

Handshakes with four or more packets are redundant.

  • Extending to a four-way handshake would involve additional messages beyond the SYN, SYN-ACK, and ACK. This could be used for further negotiations or settings (like in SSL/TLS protocols during HTTPS communications), but for TCP’s purpose of merely establishing a reliable connection, a four-way handshake would introduce unnecessary complexity and delay in establishing the connection. TCP's three-way handshake strikes a balance between reliability and efficiency.

Both the Two Generals' Problem and the three-way handshake involve the challenge of coordinating and ensuring reliable communication over an unreliable channel. While the Two Generals' Problem illustrates the need for reliable communication strategies in theory, it does not dictate the use of a three-way handshake specifically. The choice of a three-way handshake in TCP is more a product of practical engineering requirements, balancing reliability, speed, and simplicity, than a direct solution to the Two Generals' Problem. The TCP protocol design effectively addresses real-world network conditions and aims to establish a reliable and efficient connection with minimal overhead.

QUESTION: Can a delayed or duplicate SYN create multiple connections in TCP?

No.

  1. Initial Sequence Numbers (ISNs)

    TCP uses a sequence number in the SYN packet to uniquely identify the start of a new connection. Each new connection attempt from a client includes a unique ISN, which is chosen based on a time-based algorithm that makes it very unlikely to repeat recent sequence numbers.

  2. Client Response to Unexpected SYN-ACK

    • If a duplicate SYN packet (with the same ISN as the original) reaches the server after the original connection has been established, the server's response (a SYN-ACK with what should be the next sequence number) will not align with the client's expectations for a new connection (as the client would use a new ISN for a truly new connection).

    • If the client receives a SYN-ACK that does not match any active connection attempt (meaning the ACK number doesn't match the ISN+1 of any SYN it sent), it will respond with a RST (reset) packet. This RST informs the server that the SYN-ACK it sent was unexpected and effectively closes that pseudo-connection from the server's side, preventing resource waste and potential confusion (this check is sketched in code after the list).

  3. Server Handling of RST

    Upon receiving the RST, the server understands that the client did not initiate the connection it is acknowledging and can safely discard it, thus cleaning up any state or resources allocated to what it initially perceived as a valid connection.
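
A minimal sketch of the client-side rule described in item 2 (the data structures are hypothetical): a SYN-ACK is accepted only if its acknowledgment number matches ISN+1 of a SYN the client actually has outstanding; anything else gets a RST.

```python
# Hypothetical sketch of the client-side check that prevents stale or
# duplicate SYNs from turning into extra connections.
outstanding_syns = {1000}   # ISNs of connection attempts still in progress

def handle_syn_ack(ack_num):
    if (ack_num - 1) in outstanding_syns:
        return "ACK"   # handshake completes normally
    return "RST"       # this SYN-ACK answers a SYN we never (recently) sent

print(handle_syn_ack(1001))  # ACK: matches ISN 1000
print(handle_syn_ack(7001))  # RST: no outstanding SYN with ISN 7000
```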

Sliding Window

  • Data Link Layer: Uses sliding window protocols like Go-Back-N and Selective Repeat for reliable frame delivery between directly connected nodes.

  • Transport Layer (TCP): Uses a sliding window mechanism to ensure reliable, ordered delivery of data segments over a network.

Sliding window in TCP serves several purposes:

  • (1) it guarantees the reliable delivery of data,
  • (2) it ensures that data is delivered in order, and
  • (3) it enforces flow control between the sender and the receiver.

TCP’s use of the sliding window algorithm is the same as at the link level for the first two of these functions; folding in flow control is where TCP’s version differs.
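
To make the first two purposes concrete, here is a small sketch (not real TCP code) of how a receiver-side window can buffer out-of-order segments and still deliver bytes to the application in order, acknowledging only the contiguous prefix:

```python
# Sketch: in-order delivery from out-of-order arrivals (cumulative-ACK style).
def deliver_in_order(segments, next_expected=0):
    """segments: dict mapping a segment's starting byte offset -> bytes."""
    stream = b""
    while next_expected in segments:
        data = segments.pop(next_expected)
        stream += data
        next_expected += len(data)
    return stream, next_expected   # next_expected is what we would ACK

# Bytes 5-6 are missing, so b"world" (offset 7) is buffered, not delivered.
arrived = {0: b"he", 7: b"world", 2: b"llo"}
print(deliver_in_order(arrived))   # (b'hello', 5)
```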

[Figure: TCP sliding window]

Flow Control

The main difference from the link-level algorithm is that the sending and receiving application processes are filling and emptying their local buffers, respectively.

TCP uses the AdvertisedWindow field in the header to implement flow control.

TCP always sends a segment in response to a received data segment, and this response contains the latest values for the Acknowledgment and AdvertisedWindow fields, even if these values have not changed since the last time they were sent.

Whenever the other side advertises a window size of 0, the sending side persists in sending a segment with 1 byte of data every so often. It knows that this data will probably not be accepted, but it tries anyway, because each of these 1-byte segments triggers a response that contains the current advertised window. Eventually, one of these 1-byte probes triggers a response that reports a nonzero advertised window.

Note that these 1-byte messages are called Zero Window Probes and in practice they are sent every 5 to 60 seconds. As for what single byte of data to send in the probe: it’s the next byte of actual data just outside the window. (It has to be real data in case it’s accepted by the receiver.)
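
A minimal sketch of the sender-side arithmetic (following the usual textbook formulation, EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked)), including the zero-window-probe case:

```python
# Sketch: how the advertised window limits what the sender may transmit.
def effective_window(advertised_window, last_byte_sent, last_byte_acked):
    return advertised_window - (last_byte_sent - last_byte_acked)

def bytes_to_send(advertised_window, last_byte_sent, last_byte_acked,
                  last_byte_written):
    win = effective_window(advertised_window, last_byte_sent, last_byte_acked)
    unsent = last_byte_written - last_byte_sent
    if win > 0:
        return min(win, unsent)   # send up to the effective window
    if unsent > 0:
        return 1                  # zero window: periodically probe with 1 byte
    return 0

# Receiver advertises 0, but data is waiting: the sender probes with 1 byte.
print(bytes_to_send(0, 5000, 5000, 6000))   # -> 1
```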

Protecting Against Wraparound

TCP’s SequenceNum field is 32 bits long (a sequence space of 2^32 bytes), while the AdvertisedWindow field is 16 bits long (a maximum window of 2^16 bytes).

The 32-bit sequence number space is adequate at modest bandwidths, but given that OC-192 links are now common in the Internet backbone, and that most servers now come with 10Gig Ethernet (10 Gbps) interfaces, we are now well past the point where 32 bits is too small. Fortunately, the IETF has worked out an extension to TCP that effectively extends the sequence number space to protect against the sequence number wrapping around.
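
A quick back-of-the-envelope check (assuming the link is kept fully loaded) shows how quickly the 32-bit sequence space wraps as bandwidth grows:

```python
# Time to consume the 32-bit sequence number space: 2**32 bytes of data.
SEQ_SPACE_BITS = (2 ** 32) * 8

def wrap_time_seconds(bandwidth_bps):
    return SEQ_SPACE_BITS / bandwidth_bps

for name, bps in [("T3 (45 Mbps)", 45e6),
                  ("OC-48 (2.5 Gbps)", 2.5e9),
                  ("10GigE (10 Gbps)", 10e9)]:
    print(f"{name}: wraps in {wrap_time_seconds(bps):.1f} s")
# At 10 Gbps the space wraps in only a few seconds, well under the time a
# stray old segment can survive in the network.
```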

Triggering Transmission

TCP maintains a variable, typically called the maximum segment size (MSS), and it sends a segment as soon as it has collected MSS bytes from the sending process. MSS is usually set to the size of the largest segment TCP can send without causing the local IP to fragment. That is, MSS is set to the maximum transmission unit (MTU) of the directly connected network, minus the size of the TCP and IP headers.
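
For example, on an Ethernet path with a 1500-byte MTU and no IP or TCP options, the MSS works out to 1460 bytes:

```python
# MSS = MTU - IP header - TCP header (assuming no options on either header).
MTU = 1500         # typical Ethernet MTU, in bytes
IP_HEADER = 20     # IPv4 header without options
TCP_HEADER = 20    # TCP header without options

MSS = MTU - IP_HEADER - TCP_HEADER
print(MSS)  # 1460
```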

Silly Window Syndrome

  • Receiver Side:

    When the receiving application reads data from the TCP buffer slowly, the receiver advertises small window sizes to the sender. As a result, the sender can only transmit small amounts of data, leading to inefficient use of the network.

  • Sender Side:

    When the sender generates data slowly or in small segments, it can result in sending small packets. This can happen if the sender doesn't wait to accumulate a larger amount of data before transmitting, thus leading to a high number of small packets.

Nagle’s Algorithm

This algorithm is implemented on the sender side to prevent small packet transmissions. It works by combining a number of small outgoing messages and sending them all at once. Specifically, it prevents the sender from sending more than one small packet per round-trip time (RTT) and waits until it can send a full-sized segment or until all outstanding data has been acknowledged.
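
The decision rule can be sketched roughly as follows (a simplification that leaves out the check against the advertised window):

```python
# Rough sketch of Nagle's decision rule on the sender side.
def can_send_now(buffered_bytes, mss, data_in_flight):
    if buffered_bytes >= mss:
        return True     # a full-sized segment is always worth sending
    if not data_in_flight:
        return True     # nothing awaiting an ACK: send the small segment now
    return False        # otherwise wait for an ACK (or for more data)

print(can_send_now(1460, 1460, data_in_flight=True))   # True: full segment
print(can_send_now(10,   1460, data_in_flight=True))   # False: coalesce
print(can_send_now(10,   1460, data_in_flight=False))  # True: nothing in flight
```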

Some applications cannot afford such a delay for each write they do to a TCP connection, so the socket interface allows the application to turn off Nagle’s algorithm by setting the TCP_NODELAY option. Setting this option means that data is transmitted as soon as possible.
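
For example, with the standard Python socket API this is a one-line option on the connected socket:

```python
import socket

# Disable Nagle's algorithm so small writes are transmitted immediately.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
```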

Adaptive Retransmission

Adaptive Retransmission in TCP is a mechanism that adjusts the retransmission timeout dynamically based on the observed round-trip times. It involves calculating the smoothed RTT and RTT variance to determine an appropriate RTO, balancing timely retransmissions with network efficiency. This approach enhances TCP's ability to handle varying network conditions, leading to improved reliability and performance.
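
A minimal sketch of the usual calculation (the Jacobson/Karels algorithm; the gains below are the commonly used values, treated here as assumptions):

```python
# Sketch of adaptive retransmission: smoothed RTT, RTT deviation, and RTO.
ALPHA, BETA = 0.125, 0.25   # commonly used gains for the two estimators

def update_rto(estimated_rtt, deviation, sample_rtt):
    diff = sample_rtt - estimated_rtt
    estimated_rtt += ALPHA * diff
    deviation += BETA * (abs(diff) - deviation)
    rto = estimated_rtt + 4 * deviation   # timeout = SRTT + 4 * deviation
    return estimated_rtt, deviation, rto

est, dev = 100.0, 10.0                 # milliseconds, illustrative start
for sample in (110, 95, 300, 105):     # the 300 ms spike inflates the RTO
    est, dev, rto = update_rto(est, dev, sample)
    print(f"sample={sample:3d} ms -> RTO={rto:.1f} ms")
```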

TCP Extensions

  • The first extension helps to improve TCP’s timeout mechanism.
  • The second extension addresses the problem of TCP’s 32-bit SequenceNum field wrapping around too soon on a high-speed network.
  • The third extension allows TCP to advertise a larger window, thereby allowing it to fill larger delay × bandwidth pipes that are made possible by high-speed networks.
  • The fourth extension allows TCP to augment its cumulative acknowledgment with selective acknowledgments of any additional segments that have been received but aren’t contiguous with all previously received segments.

Alternative Design Choices (SCTP, QUIC)

There is a basic contrast between stream-oriented protocols like TCP and request/reply protocols like RPC. TCP is a full-duplex, byte-stream protocol.

  • No Message Boundary

    Since TCP deals with a continuous stream of bytes, it does not provide any mechanism to distinguish where one message ends and another begins. It simply ensures that the bytes are delivered in the same order they were sent.

  • Message-Oriented Applications

    Request/reply applications, such as HTTP or SMTP, operate with discrete messages. A message is a distinct unit of data that represents a complete request or response, often with clear boundaries (e.g., start and end markers).

    In these applications, the concept of a "message" is defined at the application layer, not by the transport layer (TCP). The application layer protocol must handle the parsing and recognition of message boundaries within the byte stream provided by TCP.

The first complication is that TCP is a byte-oriented protocol rather than a message-oriented protocol, and request/reply applications always deal with messages.
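
A common way applications cope is to impose their own framing on the byte stream; a minimal sketch using a 4-byte length prefix (the function names here are illustrative):

```python
import struct

# Minimal length-prefix framing on top of a TCP byte stream:
# every message is preceded by its length as a 4-byte big-endian integer.
def send_message(sock, payload: bytes) -> None:
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exactly(sock, n: int) -> bytes:
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        buf += chunk
    return buf

def recv_message(sock) -> bytes:
    (length,) = struct.unpack("!I", recv_exactly(sock, 4))
    return recv_exactly(sock, length)
```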

The second complication is that, in those situations where both the request message and the reply message fit in a single network packet, a well-designed request/reply protocol needs only two packets to implement the exchange, whereas TCP would need at least nine: three to establish the connection, two for the message exchange, and four to tear down the connection.