TCP Reno
TCP is a reliable, connection-oriented protocol that implements flow control by means of a sliding-window algorithm. TCP Reno, which makes use of the slow start (SS) and congestion avoidance (CA) algorithms to adjust the window size, has enjoyed much success to date. In particular, Reno is currently the most widely deployed TCP stack, enabling all sorts of Internet applications.
The flow on a TCP connection should obey a ‘conservation of packets’ principle. By ‘conservation of packets’ we mean that for a connection ‘in equilibrium’, i.e., running stably with a full window of data in transit, the packet flow is what a physicist would call ‘conservative’: a new packet isn’t put into the network until an old packet leaves. The physics of flow predicts that systems with this property should be robust in the face of congestion, and if this principle were obeyed, congestion collapse would become the exception rather than the rule. Observation of the Internet, however, suggests that it has not been particularly robust.
There are only three ways for packet conservation to fail:
1. The connection doesn’t get to equilibrium, or
2. A sender injects a new packet before an old packet has exited, or
3. The equilibrium can’t be reached because of resource limits along the path.
Slow-Start
Thus Reno congestion control involves finding places that violate conservation and fixing them. To reach equilibrium, Reno introduces the slow-start algorithm, which gradually increases the amount of data in transit until the connection reaches equilibrium. A congestion window controls the amount of data in transit: for each acknowledgement received from the receiver, the window grows by one segment, which doubles the window every round-trip time, until a packet loss or timeout event occurs.
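As a rough illustration, here is a minimal sketch of slow-start growth in Python; the names cwnd and ssthresh and the per-RTT loop are our own simplifications, not any real stack's code:

    # Minimal sketch of slow start: cwnd grows by one segment per ACK,
    # which doubles the window once per round trip.
    def slow_start(cwnd, ssthresh, acks_this_rtt):
        """Grow the congestion window for one round trip of ACKs."""
        for _ in range(acks_this_rtt):
            if cwnd >= ssthresh:      # leave slow start at the threshold
                break
            cwnd += 1                 # +1 segment per ACK => doubling per RTT
        return cwnd

    cwnd = 1
    for rtt in range(5):
        print(f"RTT {rtt}: cwnd = {cwnd}")          # 1, 2, 4, 8, 16
        cwnd = slow_start(cwnd, ssthresh=64, acks_this_rtt=cwnd)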
Round-trip Timer
From the above, we can see that a good round-trip time estimator, the core of the retransmit timer, is the single most important feature of any Reno implementation that expects to survive heavy load. TCP Reno estimates the mean round-trip time via the low-pass filter:
R = aR + (1 − a)M
where R is the average RTT estimate, M is a round-trip time measurement from the most recently acknowledged data packet, and a is a filter gain constant with a suggested value of 0.9. Once the R estimate is updated, the retransmit timeout interval, rto, for the next packet sent is set to bR, where b is a constant that allows for RTT variation.
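The filter translates directly into code. A minimal sketch, using the suggested gain a = 0.9 and assuming b = 2, the value commonly used in early TCP implementations:

    # Minimal sketch of the low-pass RTT filter: R = a*R + (1 - a)*M,
    # with the retransmit timeout set to rto = b * R.
    A = 0.9   # filter gain suggested in the text
    B = 2.0   # assumed value for b; early TCP implementations used 2

    def update_rtt(r_est, measurement, a=A, b=B):
        """Fold one RTT measurement M into the smoothed estimate R."""
        r_est = a * r_est + (1 - a) * measurement
        return r_est, b * r_est   # (new estimate, retransmit timeout)

    r = 1.0  # initial RTT estimate in seconds (illustrative)
    for m in (0.8, 1.2, 2.5):    # sample measurements
        r, rto = update_rtt(r, m)
        print(f"M={m:.1f}s -> R={r:.3f}s, rto={rto:.3f}s")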
“Congestion avoidance” strategy
If the timer is in good shape, the second failure mode listed above is easily handled. We can also assume that a loss signalled by a timeout indicates, in the vast majority of cases, that the network is congested: a lost packet is taken to mean congestion. TCP Reno therefore uses a “congestion avoidance” strategy, an AIMD window policy, summarized below (a short code sketch follows the list).
- On any timeout or packet loss, set the congestion window to half of the current window size (this is the multiplicative decrease).
- On each ACK for new data, increase the congestion window by 1/cwnd, i.e., by roughly one segment per round-trip time (this is the additive increase).
- When sending, send the minimum of the receiver’s advertised window and the sender’s congestion window.
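A minimal sketch of these three rules, assuming the window is measured in segments:

    # Sketch of Reno's AIMD rules; cwnd is in segments.
    def on_loss(cwnd):
        """Multiplicative decrease: halve the window on timeout/loss."""
        return max(1.0, cwnd / 2)

    def on_ack(cwnd):
        """Additive increase: +1/cwnd per ACK, ~one segment per RTT."""
        return cwnd + 1.0 / cwnd

    def send_window(cwnd, receiver_window):
        """Send the minimum of the advertised and congestion windows."""
        return min(cwnd, receiver_window)

    cwnd = 10.0
    cwnd = on_ack(cwnd)                           # 10.1
    cwnd = on_loss(cwnd)                          # 5.05
    print(send_window(cwnd, receiver_window=8))   # 5.05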
TCP Vegas
In an effort to improve on the loss-based TCP Reno, TCP Vegas is designed around a delay-based scheme. It predicts whether congestion is about to occur by observing throughput; the point of the prediction is to adjust the sending rate before an actual packet loss happens. The improvements cover the following three areas: congestion avoidance rather than congestion control, earlier packet loss detection (a new retransmission technique), and a modified slow-start algorithm. These modifications enable more efficient use of the available bandwidth, and TCP Vegas is reported to attain throughput improvements ranging from 37% to 71%, with data retransmissions reduced to between one-fifth and one-half of TCP Reno’s.
Congestion Avoidance
To start with TCP Vegas, let us begin with the congestion-avoidance mechanism. Vegas uses the difference between the expected throughput and the actual throughput to predict congestion. The expected throughput is defined as the best possible throughput (the window size divided by BaseRTT, the minimum observed round-trip time), while the actual throughput is measured at the sender using packet timestamps. The difference, DIFF, is used to adjust the congestion window according to the following conditions:
If (DIFF × BaseRTT < a): increase the congestion window by 1
If (DIFF × BaseRTT > b): decrease the congestion window by 1
Else: the congestion window remains unchanged,
where a and b are constants whose values can be determined by experimentation, and DIFF × BaseRTT is the backlog at the queue, referred to as N. The key is to keep the queue size N between the a and b thresholds in order to achieve high throughput while avoiding packet overflow (a sketch of this adjustment follows).
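A minimal sketch of this adjustment in Python, approximating the actual throughput as cwnd divided by the observed RTT; the threshold values (one and three packets of backlog) are assumptions in line with values reported for Vegas:

    # Sketch of Vegas congestion avoidance; thresholds are assumed values.
    A_THRESH = 1.0   # lower backlog threshold (packets)
    B_THRESH = 3.0   # upper backlog threshold (packets)

    def vegas_adjust(cwnd, base_rtt, observed_rtt):
        """Adjust cwnd by comparing expected vs. actual throughput."""
        expected = cwnd / base_rtt           # best possible throughput
        actual = cwnd / observed_rtt         # measured throughput
        n = (expected - actual) * base_rtt   # DIFF * BaseRTT = backlog N
        if n < A_THRESH:
            return cwnd + 1                  # queue too small: speed up
        if n > B_THRESH:
            return cwnd - 1                  # queue building up: back off
        return cwnd                          # backlog within [a, b]: hold

    print(vegas_adjust(cwnd=20, base_rtt=0.100, observed_rtt=0.105))  # 21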
Retransmission mechanism
Moving on, a new retransmission mechanism is introduced that detects packet loss earlier and reduces the congestion window by a multiplicative factor of 3/4. When a duplicate ACK is received, Vegas retransmits the packet, without waiting for three duplicate ACKs, if the round-trip time of the oldest unacknowledged packet exceeds the fine-grained timeout value. When a non-duplicate ACK is received, Vegas repeats the check: if the time since the unacknowledged packet was sent exceeds the timeout, it retransmits. Using this technique, Vegas detects losses faster and avoids multiple window reductions when several packets are lost within the same window; the window should only be reduced for losses that occurred at the current sending rate. Hence Vegas avoids Reno’s problem with multiple drops in the same window of data.
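As a rough sketch of this early-retransmission check (the function and variable names, and the use of a monotonic clock, are our own illustrative assumptions):

    import time

    # Sketch of Vegas's early-retransmit check: on a duplicate ACK,
    # retransmit immediately if the oldest unacked packet has been
    # outstanding longer than the fine-grained timeout.
    def should_retransmit(send_time_of_oldest_unacked, fine_timeout, now=None):
        now = now if now is not None else time.monotonic()
        return (now - send_time_of_oldest_unacked) > fine_timeout

    # On a duplicate ACK: no need to wait for three duplicates.
    if should_retransmit(send_time_of_oldest_unacked=0.0,
                         fine_timeout=0.2, now=0.35):
        print("retransmit without waiting for 3 dup-acks")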
Modified Slow start algorithm
Lastly, the slow-start algorithm has been modified. The basic principle is similar to Reno’s, but Vegas is able to find an appropriate window size without incurring an actual loss. The window grows exponentially only every other round-trip time; in between, it is held fixed so that the actual rate can be validly compared with the expected rate. Slow start comes to a halt when Vegas detects a queue building up, i.e., the actual rate falls below the expected rate by a threshold, at which point the connection moves into its congestion-avoidance state.
Vegas’s problem areas
Even though TCP Vegas can provide performance improvements, it also suffers from problems. In an asymmetric network, N measures the ACK backlog instead of the actual data backlog; a bottleneck on the reverse (ACK) path can cause the available bandwidth of the forward (data) path to be underutilized. Rerouting may also change the propagation delay of the connection, which Vegas uses as an estimate when adjusting the window size, thereby seriously affecting throughput. Persistent congestion can also pose a problem: when connections keep many packets in the network, the propagation delay estimate becomes overly conservative.
TCP Veno
Wireless has become part of the current Internet, and it stands to hold a leading position among future access networks. For TCP, packet loss indicates the occurrence of network congestion, but this does not hold for wireless networks, where other factors such as noise and link errors must be considered. Losses caused by these factors are referred to as random losses, not due to network congestion. Differentiating between random loss and congestion loss is a key issue, because reacting to a random loss with the rate reduction meant for congestion causes an unnecessary drop in transfer rate; any misinterpretation can therefore severely hurt performance over wireless networks. TCP Veno is an end-to-end congestion control mechanism designed to deal with random losses effectively. In the simplest terms, TCP Veno is the combined essence of TCP Reno and TCP Vegas.
TCP Vegas, which employs proactive congestion detection, was proposed with the claim of achieving throughput improvements ranging from 37% to 71% compared with Reno. Vegas can indeed offer higher throughput than Reno; however, the performance of Vegas connections degrades significantly when they coexist with concurrent Reno connections. TCP Veno overcomes this problem while requiring no change to the receiver protocol stack or to intermediate network equipment: the modification from Reno to Veno involves only the sender side. It therefore enables harmonious coexistence.
Packet Loss Identification
The initial step for TCP Veno is identifying the type of loss correctly. The essence of Vegas is deployed here: the measured backlog at the queue, N, is used to determine the congestion state of the network when a loss occurs, rather than to control the window size as in Vegas. A loss detected during a non-congestive state is labeled a random loss, whereas a loss detected during a congestive state is labeled a congestion loss. Veno retains the usual Reno window-adjustment scheme for congestion losses and uses a different scheme for random losses. When no packet loss occurs, Reno’s progressive window increment remains intact.
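A minimal sketch of this classification, assuming the Vegas-style backlog estimate N and a threshold of 3 packets (the threshold value is an assumption, though it is the one commonly cited for Veno):

    # Sketch of Veno's loss classifier: the backlog estimate N decides
    # whether a loss is random or congestive.
    BETA = 3.0   # assumed backlog threshold, in packets

    def classify_loss(cwnd, base_rtt, observed_rtt, beta=BETA):
        expected = cwnd / base_rtt
        actual = cwnd / observed_rtt
        n = (expected - actual) * base_rtt   # backlog at the queue
        return "congestion" if n >= beta else "random"

    print(classify_loss(cwnd=20, base_rtt=0.100, observed_rtt=0.105))  # random
    print(classify_loss(cwnd=40, base_rtt=0.100, observed_rtt=0.150))  # congestion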
Veno’s AIMD scheme
The following discusses the new AIMD scheme that Veno has adopted.
Veno still uses Reno’s slow-start algorithm at the start-up phase of a new connection, without any modification. What TCP Veno refines is Reno’s additive-increase algorithm. When the window size is below the slow-start threshold, the slow-start algorithm is still used to adjust the window. Above the threshold, the window grows by one per round-trip time as in Reno while the backlog N remains small, but by one only every two round-trip times once N indicates that the available bandwidth is nearly used up. This defers the onset of self-induced congestion loss, keeping the connection in the good operating region longer and therefore improving the transfer rate.
The multiplicative-decrease algorithm of Reno has also been refined. A packet is considered lost when a timeout occurs; the slow-start algorithm is then invoked, with the threshold set to half of the current window size and the window reset to one. When the lost packet is retransmitted because duplicate acknowledgements were received and the loss is judged to be random, the threshold is set to 4/5 of the current window instead of half, as in original Reno. In general, Veno could use any factor larger than 1/2 but smaller than one, so that the cutback in window size is less drastic than in the case where the loss is due to congestion.
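Putting the two refinements together, a sketch of Veno’s window cutback on loss (names and structure are our own):

    # Sketch of Veno's refined multiplicative decrease on fast retransmit:
    # cut to 4/5 for a random loss, 1/2 for a congestion loss. On a
    # timeout, both variants fall back to slow start.
    def veno_fast_retransmit(cwnd, loss_type):
        if loss_type == "random":
            return cwnd * 4 / 5      # gentle cutback for non-congestive loss
        return cwnd / 2              # ordinary Reno halving

    def veno_timeout(cwnd):
        return cwnd / 2, 1           # (new ssthresh, new cwnd)

    print(veno_fast_retransmit(30, "random"))      # 24.0
    print(veno_fast_retransmit(30, "congestion"))  # 15.0
    print(veno_timeout(30))                        # (15.0, 1)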
However, it is worth noting that with TCP Veno there is a possibility that the type of loss is misinterpreted, causing the window to be reduced more than required. In such situations the performance of TCP Veno degrades towards Reno’s, but it is never worse than Reno’s.
Veno’s deployment areas
In general, TCP Veno is highly desirable in the following three aspects:
- Deployability: Veno only needs to modify the sender-side algorithm to achieve its enhanced performance compared with Reno.
- Compatibility: Veno coexists harmoniously with Reno without “stealing” bandwidth from it.
- Flexibility: Veno is more flexible than Reno in that it can deal with random loss in wireless networks better, alleviates the suffering in asymmetric networks [1], and has comparable performance in wired networks.
TFRC
TFRC is an equation-based congestion control mechanism for unicast traffic. With TFRC, the sender explicitly adjusts its sending rate as a function of the measured rate of loss events, where a loss event consists of one or more packets dropped within a single round-trip time. By comparison, TCP’s sending rate is controlled by a congestion window, which is halved for every window of data containing a packet drop and increased by roughly one packet per window of data otherwise. End-to-end congestion control of best-effort traffic is required to avoid congestion collapse of the global Internet, but for real-time applications halving the sending rate in response to a single packet drop is undesirable, because it causes a noticeable drop in user-perceived quality. Equation-based congestion control is therefore designed to provide relatively smooth congestion control for such applications.
The algorithm for calculating the loss event rate is the key design issue in equation-based congestion control. The sending rate in bytes/sec is determined as a function of the packet size, the round-trip time, the steady-state loss event rate, and the TCP retransmit timeout value. The aim of TFRC is not to aggressively seek out available bandwidth, as TCP does, but to maintain a steady sending rate matched to the level of congestion; this is the trade-off. TFRC does not halve the sending rate on a single loss event, though it does halve the rate after several successive loss events. For equation-based congestion control to work, the receiver must send back the loss event rate (or the calculated sending rate) to the sender. The round-trip time is measured by the sender and receiver together using sequence numbers. The estimated loss rate is measured in terms of loss events, each of which may consist of several packets lost within one round-trip time, rather than the raw packet loss rate.
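For concreteness, the control law can be sketched as follows. This is the widely used TCP throughput equation that equation-based congestion control builds on; taking b = 1 packet acknowledged per ACK and t_RTO = 4R are common simplifying assumptions, made here for illustration:

    import math

    # Sketch of the TCP throughput equation behind TFRC: sending rate as
    # a function of packet size s, round-trip time R, and steady-state
    # loss event rate p. b = 1 and t_RTO = 4R are assumed simplifications.
    def tfrc_rate(s, rtt, p, b=1.0):
        t_rto = 4 * rtt
        denom = (rtt * math.sqrt(2 * b * p / 3)
                 + t_rto * 3 * math.sqrt(3 * b * p / 8) * p * (1 + 32 * p ** 2))
        return s / denom   # bytes per second

    # 1460-byte packets, 100 ms RTT, 1% loss event rate:
    print(f"{tfrc_rate(1460, 0.100, 0.01):,.0f} B/s")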
Response to persistent congestion
TFRC requires from three to eight round-trip times to halve its sending rate in response to persistent congestion. TCP increases its sending rate by one packet per round-trip time in the absence of congestion, while TFRC does not increase its rate at all until a longer period has passed without congestion. TFRC thus responds more slowly to persistent congestion and also increases its sending rate more slowly when congestion abates.
Experimental results on TFRC
Experiments show that TFRC is generally fair to TCP traffic across different network types and conditions. The transmission rate of TFRC is lower than TCP’s on average, but it is much smoother than that of a TCP flow, which varies strongly even over relatively short periods of time.
Multicast Congestion Control
Unicast equation-based congestion control is a suitable basis for sender-based multicast congestion control, particularly its mechanisms for estimating the loss event rate and adjusting the sending rate accordingly. Equation-based congestion control thus lays a foundation for the later development of multicast congestion control.
DCCP
DCCP is a transport protocol that provides congestion control over unreliable datagrams.
Applications like streaming prefer timeliness to reliability. These applications use UDP, which is not equipped with a congestion control mechanism, and implementing one’s own congestion control is a difficult task. DCCP is designed to ease the deployment of such applications without risking congestion collapse; in effect, it is an unreliable alternative to TCP. Congestion control is an important aspect of the Internet, and customizing a transport protocol such as DCCP serves a distinct group of applications. There is strong growth in online streaming and VoIP-type applications, which depend heavily on UDP, and increased usage of such congestion-unaware applications poses a threat of Internet congestion collapse.
Although applications can implement their own congestion control mechanisms, such implementations are difficult and error-prone. An unreliable datagram protocol equipped with congestion control is therefore needed to support existing applications, simplify deployment, and reduce the risk of congesting the Internet.
DCCP design factors
DCCP’s design had to serve a specific group of applications while remaining general enough to support future uses. The main considerations were:
- Selection of congestion control mechanisms: which congestion control algorithms to offer, how congestion feedback is implemented, and so on.
- Minimizing the growth of the packet header, which would otherwise add overhead.
- Including support for Explicit Congestion Notification (ECN).
- Improving on UDP’s behavior across middleboxes such as network address translators and firewalls.
- Considering what alternatives, other than a new transport protocol, could provide congestion control.
- Providing a negotiation mechanism for reliable acknowledgements over an unreliable connection.
The DCCP header has a four-bit type field supporting 16 different types of DCCP packets. DCCP also has an option mechanism similar to TCP’s, with options such as acknowledgement reporting and feature negotiation.
DCCP negotiation
At the start of a DCCP connection, both endpoints must agree on a set of parameters. DCCP provides minimal options for negotiation: either endpoint can send a “change” option request, and the receiving party can respond with “prefer”, meaning it prefers a different value, or “confirm”, meaning the feature’s value has changed. An endpoint may reset the connection if a negotiation takes too long. Generic and reliable feature negotiation is designed to support additional functionality, and DCCP chooses a handshake approach for reliable acknowledgements. A DCCP sequence number is a 24-bit number that increases by one on every packet sent, including acknowledgements. A DCCP connection is made up of two half-connections; feature negotiation for the two half-connections is completely independent and may happen simultaneously. Half-connections have significant benefits in flexibility of use.
Quiescence mechanism
Many applications have most data flowing from server to client. A quiescence mechanism therefore ensures that the protocol can handle unidirectional communication without unreasonable overhead. An endpoint is recognized as having gone quiescent if it stops sending data for some amount of time; the other endpoint then shifts to a unidirectional pattern of communication. The quiescent endpoint eventually sends only acknowledgements, and both endpoints adjust their feedback and acknowledgements to exactly those acks-to-acks required by the congestion control mechanism in use.
With DCCP, an application has a choice of congestion control mechanism, indicated using congestion control IDs (CCIDs), which are negotiated at connection start-up. DCCP provides a TCP-like congestion control mechanism, labeled CCID 2. Since DCCP operates on unreliable datagrams, the congestion control framework differs from TCP’s: TCP-like congestion control still uses the sender’s congestion window to limit the number of unacknowledged packets outstanding in the network, but a cumulative acknowledgement field cannot be implemented in DCCP. Instead, a method similar to SACK TCP accomplishes this over an unreliable transfer: the receiver transmits acknowledgement information using an Ack Vector, plus acks-to-acks. The Ack Vector describes exactly which packets have been received and whether those packets were ECN-marked in the network, so the congestion control mechanism can react accordingly. Acknowledgement congestion control is important for bandwidth-asymmetric networks; unlike TCP, DCCP can detect reverse-path congestion using per-packet sequence numbers. In CCID 2, the DCCP sender responds by adjusting the Ack Ratio, which controls the rate of the acknowledgement stream from the receiver.
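As a rough illustration of how such acknowledgement information can be encoded compactly, here is a simplified decoder for a run-length-encoded ack vector, loosely modeled on DCCP’s Ack Vector option; the exact bit layout below (a 2-bit state in the top bits, a 6-bit run length covering run + 1 packets, walking backwards from the acknowledged sequence number) should be read as an assumption for illustration:

    # Simplified sketch of decoding a run-length-encoded ack vector.
    STATES = {0: "received", 1: "received-ecn", 3: "missing"}

    def decode_ack_vector(ack_no, vector):
        seq = ack_no
        for byte in vector:
            state, run = byte >> 6, byte & 0x3F
            for _ in range(run + 1):
                yield seq, STATES.get(state, "reserved")
                seq -= 1

    # ack_no 100: one packet received, then two missing, then one
    # received with an ECN mark.
    for seq, state in decode_ack_vector(100, [0x00, 0xC1, 0x40]):
        print(seq, state)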
TCP-Friendly Rate Control (TFRC) in DCCP’s CCID 3
The sender maintains a sending rate, and the receiver sends feedback to the sender roughly once per round-trip time. This feedback consists of the loss event rate calculated by the receiver, and the sender determines its sending rate from it. If the sender has received no feedback after several round-trip times, it halves its sending rate. This basic feedback mechanism alone is not sufficiently flexible and could be a problem in the future.
A sender application may also want to know exactly which packets were received, for several reasons. For these cases, a CCID 3 half-connection can additionally include Ack Vectors and acks-to-acks, as in CCID 2.
Some applications, for example voice or video, would prefer partially damaged payloads to be delivered rather than discarded by the network; this motivates the partial checksum in UDP-Lite. Since DCCP is a congestion-controlled transport protocol, implementing the partial-checksum concept is more subtle. If DCCP’s checksum covered the packet’s payload, any detected bit error would cause the packet to be dropped, and DCCP would treat the drop as an indication of network congestion; corruption, however, does not generally mean the network is congested. DCCP therefore uses separate checksums for the header and the payload, which prevents it from mistaking corrupted packets for a signal of congestion. Partial checksums are of limited benefit, though, if the data-link layer CRC over a noisy link always discards corrupted frames anyway.
SCTP
Purpose
SCTP was conceived as a reliable data transport protocol for use on top of an unreliable, connectionless packet-switched network. It features a checksum to detect data corruption, while packet loss and duplication are handled by sequence numbers, which enable the receiver to request a resend.
Association Setup
An association (connection) between the server and client is established only after the exchange of four messages (a sketch of the server’s stateless cookie follows the exchange):
Client: Repeatedly sends an association request (INIT) and waits for an INIT-ACK from the server. If no INIT-ACK is received after a number of sends, an error is reported to the application.
Server: On receiving INIT, generates a COOKIE with a MAC and returns the cookie to the sender in an INIT-ACK.
Client: On receiving the INIT-ACK, stops the INIT timer and repeatedly sends COOKIE-ECHO to the server. If no COOKIE-ACK arrives after a number of sends, an error is reported to the application.
Server: On receiving COOKIE-ECHO, determines from the MAC whether the COOKIE originated from itself. If so, it initializes and allocates resources for the SCTP connection and sends COOKIE-ACK to the client. The server is now ready to send and receive data.
Client: On reception of COOKIE-ACK, stops the COOKIE-ECHO timer and is ready to send and receive data.
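The cookie exchange above can be illustrated with a small sketch of the stateless-cookie idea: the server MACs the association parameters with a secret only it knows, so it can later verify on COOKIE-ECHO that a returned cookie is its own without having kept any per-client state. The field layout and key size here are illustrative assumptions:

    import hmac, hashlib, os

    SECRET = os.urandom(16)   # server-local secret key

    def make_cookie(init_params: bytes) -> bytes:
        """Build cookie = params || MAC, sent back in INIT-ACK."""
        mac = hmac.new(SECRET, init_params, hashlib.sha256).digest()
        return init_params + mac

    def verify_cookie(cookie: bytes) -> bool:
        """On COOKIE-ECHO: check the MAC before allocating resources."""
        params, mac = cookie[:-32], cookie[-32:]
        expected = hmac.new(SECRET, params, hashlib.sha256).digest()
        return hmac.compare_digest(mac, expected)

    cookie = make_cookie(b"client-tag=42,rwnd=65536")
    print(verify_cookie(cookie))                # True on genuine COOKIE-ECHO
    print(verify_cookie(b"x" + cookie[1:]))     # tampered params -> False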
Association Termination
SCTP can perform a direct abort of the association, as well as a graceful termination that ensures no loss of data. In the case of graceful termination, the protocol proceeds as follows:
Peer 1: Requested by the application to SHUTDOWN, stops accepting new data and waits for all outstanding data to be acknowledged. Once acknowledged, sends SHUTDOWN to its peer.
Peer 2: Upon receipt of SHUTDOWN, waits for all of its own data to be acknowledged and replies with SHUTDOWN-ACK.
Peer 1: Receives SHUTDOWN-ACK and responds with SHUTDOWN-COMPLETE. All resources for the association instance are freed.
Peer 2: On receiving SHUTDOWN-COMPLETE, frees its resources as well.
For a direct abort, the procedure is as follows:
Peer 1: The application requests an abort; Peer 1 sends ABORT to its peer. This chunk must contain the peer’s Verification Tag and must not have any data chunks bundled with it.
Peer 2: Verifies the chunk. If verified, frees resources and reports the ABORT to the application.
Flow Control
SCTP features a flow control scheme similar to TCP’s. The receiver may control the window size by returning the value together with a SACK (Selective Acknowledgement) chunk. In SCTP, data chunk acknowledgements contain a Cumulative TSN Ack, which indicates that all data chunks with TSN less than or equal to that value have been successfully received, and Gap Ack Blocks, which indicate that certain later chunks have arrived but there are gaps in between. Lost data chunks are retransmitted by the sender after its retransmission timer has expired, or, when four SACKs report the same missing chunks, those chunks are retransmitted using the Fast Retransmit scheme.
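A minimal sketch of how a sender might interpret such a SACK and apply the four-report fast-retransmit rule (the data structures and the gap-block representation as offsets from the cumulative TSN are our own simplifications):

    # Sketch: cumulative TSN ack + gap blocks determine missing TSNs;
    # chunks reported missing by 4 SACKs are fast-retransmitted.
    def missing_tsns(cum_tsn, gap_blocks, highest_sent):
        received = set(range(cum_tsn + 1))           # everything up to cum TSN
        for start, end in gap_blocks:                # offsets from cum_tsn
            received |= set(range(cum_tsn + start, cum_tsn + end + 1))
        return [t for t in range(cum_tsn + 1, highest_sent + 1)
                if t not in received]

    miss_counts = {}

    def on_sack(cum_tsn, gap_blocks, highest_sent, threshold=4):
        retransmit = []
        for tsn in missing_tsns(cum_tsn, gap_blocks, highest_sent):
            miss_counts[tsn] = miss_counts.get(tsn, 0) + 1
            if miss_counts[tsn] == threshold:
                retransmit.append(tsn)               # fast retransmit
        return retransmit

    # TSNs 1-10 sent; SACK says: all <= 4 received, plus TSNs 6-7.
    for _ in range(4):
        todo = on_sack(cum_tsn=4, gap_blocks=[(2, 3)], highest_sent=10)
    print(todo)   # [5, 8, 9, 10] on the 4th identical SACK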
Congestion Control
Congestion control is required for any application running on a large-scale packet-switched network. SCTP’s congestion control scheme has been largely derived from TCP’s, adapted for multihoming. In SCTP, flow and congestion state information is kept for each transmission path.
Multihoming
For failure tolerance, an SCTP association can be multi-homed. This is set up via the initial INIT and INIT-ACK chunks exchanged between client and server. SCTP monitors the state of all transmission paths by sending HEARTBEAT chunks on all non-primary paths, which must be acknowledged by HEARTBEAT-ACK. A transmission path may be in an active or inactive state; SCTP considers a path inactive if HEARTBEATs, or any other SCTP chunks (on the primary path), repeatedly fail to be acknowledged. On user request, or when a transmission path changes state, a notification is issued, upon which the upper layer may request SCTP to switch the primary path to an alternative active one.
Streams
SCTP offers partial ordering of datagrams through the use of streams. Within an association, multiple streams may be set up; within each stream, the ordering of datagrams is ensured through sequence numbers. The streams are independent of each other, so no ordering of datagrams is imposed between streams. SCTP also offers unordered delivery, which passes a message to the upper layer as soon as it is completely received, ignoring sequence constraints (a small reordering sketch follows).
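A minimal sketch of per-stream ordered delivery (the buffer structure and names are illustrative): each stream reorders by its own stream sequence number, independently of the others, while unordered messages would bypass this buffer entirely:

    from collections import defaultdict
    import heapq

    buffers = defaultdict(list)     # stream id -> heap of (ssn, payload)
    next_ssn = defaultdict(int)     # stream id -> next SSN to deliver

    def on_message(stream, ssn, payload):
        """Buffer a message; return whatever is now deliverable in order."""
        heapq.heappush(buffers[stream], (ssn, payload))
        delivered = []
        while buffers[stream] and buffers[stream][0][0] == next_ssn[stream]:
            delivered.append(heapq.heappop(buffers[stream])[1])
            next_ssn[stream] += 1
        return delivered

    print(on_message(1, 1, "b"))   # [] - stream 1 waits for SSN 0
    print(on_message(2, 0, "x"))   # ['x'] - stream 2 is independent
    print(on_message(1, 0, "a"))   # ['a', 'b'] - stream 1 now in order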
Future of Veno, Veno II
The transport layer is currently composed of SCTP, DCCP, UDP, and various implementations of TCP, each with application-specific uses. Veno could conceivably evolve to encompass all of these separate protocols in a common standard, Veno II. Higher layers could then be built on a common API, easing their maintenance and development. Veno II would provide a universal transport layer covering the different needs of applications, along with better security for wireless communication.
Since the transition from IPv4 to IPv6 is likely to take a while, Veno II could act as a seamless layer that offers transparent access regardless of the version of IP used in the lower layer.
The problems of increasingly heterogeneous networks could be abstracted away from the higher layers by Veno II. It could offer both reliable and unreliable transport, with better QoS, as well as an arbitration scheme to better distribute bandwidth across multiple connections.
References
http://en.wikipedia.org/wiki/TCP_congestion_avoidance_algorithm
http://www.hep.ucl.ac.uk/~ytl/tcpip/background/vegas.html
http://www.cs.arizona.edu/projects/protocols/
http://nms.csail.mit.edu/6829-papers/congavoid.pdf
http://www.icsi.berkeley.edu/~widmer/tfrc/
http://www.icir.org/tfrc/tcp-friendly.TR.pdf
http://www.read.cs.ucla.edu/dccp/dccp-icnp03s.pdf