Monday, September 29, 2008

A Comparison of Mechanisms for Improving TCP Performance over Wireless Links

This paper discusses several schemes designed to improve the performance of TCP over wireless links that suffer from significant packet losses due to bit errors and handoffs. Standard TCP control mechanisms are ill-suited to such links because they assume that all losses are congestion-related and thus can unnecessarily reduce end-to-end throughput. The authors classify these schemes as end-to-end protocols, link-layer protocols and split-connection protocols. Using end-to-end throughput over wired and wireless hops as the performance metric, a TCP-aware link-layer protocol with selective acknowledgments provided the highest throughput. End-to-end schemes also proved effective: they require no additional support from intermediate nodes and can improve throughput dramatically when combined with an Explicit Loss Notification (ELN) mechanism.
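
To make the role of ELN concrete, here is a minimal Python sketch, under my own assumptions about naming and structure, of how a sender might skip the congestion response when a loss is flagged as wireless corruption; it is illustrative only and not the paper's implementation.

```python
# Hypothetical sketch: a TCP-like sender that treats ELN-flagged losses as
# corruption on the wireless hop rather than as congestion. Class and method
# names are my own assumptions.

class ElnAwareSender:
    def __init__(self):
        self.cwnd = 1.0        # congestion window, in segments
        self.ssthresh = 64.0

    def on_loss(self, eln_flag):
        if eln_flag:
            # Wireless bit-error loss: retransmit without shrinking the window,
            # since the loss carries no congestion signal.
            self.retransmit()
        else:
            # Assume congestion: standard multiplicative decrease.
            self.ssthresh = max(self.cwnd / 2, 2.0)
            self.cwnd = self.ssthresh
            self.retransmit()

    def retransmit(self):
        pass  # resend the missing segment (omitted in this sketch)
```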

Reading this paper provided good insight into a variety of methods for effective TCP-based transmission over lossy wireless links without having to learn the details of each method. The focus was on the bottom line: optimizing end-to-end throughput. However, the evaluation seemed rather simplistic, as the protocols were tested over only a basic range of bit-error rates. This produced valuable results, but simulation and analysis under real network conditions may reveal new information. Overall, this was a very informative paper and built somewhat on the previous one.

MACAW: A Media Access Protocol for Wireless LAN's

In light of the recent emergence of mobile computing devices that will necessitate controlled access to indoor wireless media, this paper proposes a new media access protocol for wireless LAN's: MACAW. Given that the relevant congestion in the described infrastructure is at the receiver, carrier-sense approaches are avoided and Karn's MACA (multiple access, collision avoidance) proposal is used as a starting point. The MACA protocol uses an RTS-CTS-DATA packet exchange and a binary exponential backoff (BEB) scheme. The authors argue that reliable transmission can be enforced more efficiently using link-layer recovery. Moreover, bi-directional transmission schemes require attention to congestion at both the receiver and the sender. Instead of adopting a standard CSMA/CA approach, the MACAW protocol sends a data-sending packet (DS) to block all other transmissions until the ACK packet has been transmitted - this prevents useless transmission requests in the exposed terminal case. Hence, MACAW uses an RTS-CTS-DS-DATA-ACK packet exchange. Moreover, a coordinated MILD (multiplicative increase, linear decrease) backoff scheme is used to ensure fairness of access.
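
As a rough illustration of the MILD backoff rule, here is a small Python sketch; the bounds and the 1.5 multiplier are assumptions on my part rather than values taken from the paper.

```python
# Illustrative sketch of a MILD (multiplicative increase, linear decrease)
# backoff counter, as used for fairness in MACAW. The bounds and the 1.5
# multiplier are assumptions for this sketch.

BO_MIN, BO_MAX = 2, 64

def mild_update(backoff, success):
    if success:
        # Linear decrease after a successful RTS-CTS-DS-DATA-ACK exchange.
        return max(BO_MIN, backoff - 1)
    # Multiplicative increase after a failed request (collision at the receiver).
    return min(BO_MAX, int(backoff * 1.5))
```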

The development of MACAW revolves around fair access to wireless media - the multiple stream model, RRTS mechanism and coordinated backoff scheme are all introduced with this goal in mind. That said, I wonder why the focus is so completely on fairness, with hardly any discussion of the trade-off against raw performance. Moreover, the discussion of the MILD (effectively multiplicative increase, additive decrease) backoff scheme made me wonder why AIMD is not used here, given that it helped achieve fairness in the congestion control scheme discussed earlier. These are different manifestations of fairness, but I was surprised to see no mention of AIMD. Finally, the illustrations of the exposed terminal, hidden terminal and other situations were very effective in explaining the reasoning behind the features of the MACAW protocol. Overall, I really enjoyed this paper.

Tuesday, September 23, 2008

A Fast Switched Backplane for a Gigabit Switched Router

This paper presents a strong case for routers with a switched backplane. Initially, the authors provide a very useful overview of router architecture, breaking the path of a packet down into the forwarding decision, the backplane and the output-link scheduler. Router architectures are moving away from shared backplanes because they easily become congested and limit the performance of the system. Using crossbar switches instead allows multiple packets to be transferred across the backplane simultaneously, thereby increasing the aggregate bandwidth of the system. Simply making shared backplanes faster has proven insufficient to keep up with high packet arrival rates, so a switched backplane is presented as the way forward. The authors briefly discuss why handling variable-length packets leads to lower throughput - essentially, an abnormally long packet at one input can hold up the packets behind it while an output sits idle - so packets are segmented into fixed-size cells for transfer across the backplane. Head-of-line (HOL) blocking severely limits throughput, since the cell at the head of a FIFO input queue blocks the cells behind it from reaching free outputs. The authors present a solution in virtual output queueing (VOQ), which was also discussed in the previous paper. Next, the iSLIP algorithm is introduced as an input-output matching algorithm that provides a fast, high-throughput, starvation-free solution.
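
A minimal sketch of the VOQ idea, with names of my own choosing: each input keeps one FIFO per output, so a cell waiting for a busy output cannot block cells bound for other outputs.

```python
# Minimal sketch of virtual output queueing (VOQ). Class and method names are
# assumptions for illustration.
from collections import deque

class InputPort:
    def __init__(self, num_outputs):
        # One FIFO per output port instead of a single FIFO per input.
        self.voq = [deque() for _ in range(num_outputs)]

    def enqueue(self, cell, output):
        self.voq[output].append(cell)

    def requests(self):
        # One request per non-empty VOQ; a matching algorithm such as iSLIP
        # arbitrates among these requests every cell time.
        return [out for out, q in enumerate(self.voq) if q]
```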

I truly enjoyed reading this paper. First of all, the background material was reviewed very well before jumping into any algorithms or proposals. Router architecture was covered in great detail, and the choice of which aspects of that architecture to focus on was explained quite well. This paper should definitely be part of the syllabus. One thing I wasn't sure about was whether these ideas were already being implemented at the time or whether they were proposed ideas for future implementation.

Scaling Internet Routers Using Optics

This paper introduces the use of optical technology inside routers to increase their capacity and keep up with the rapid growth in Internet traffic. Despite the expected growth of router capacity, traffic will likely outgrow it by a wide margin within a decade. In describing the problems of existing multi-rack routers, the authors reveal the need for a scalable solution that guarantees 100% throughput. Having expressed the motivation behind the new technology, the authors first present Chang's load-balanced switch architecture and explain how it satisfies the aforementioned requirements. To address the first major issue present in the load-balanced switch, the expensive optical switch fabric can be replaced by a single fixed mesh of optical channels. Here, the actual packet switching occurs in the intermediate linecards. Due to TCP's poor performance in the presence of out-of-order packets, the Full Ordered Frames First (FOFF) scheme is used to prevent packet mis-sequencing. The FOFF scheme is efficient, practical and has no pathological traffic patterns. The router must also be robust: it must support an arbitrary placement of an arbitrary number of linecards. This requires a reconfigurable switch that scatters traffic uniformly across the linecards. Here, the authors propose a hybrid electro-optical switch, as well as a purely optical switch that should be implementable in the near future.
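
To give a feel for the load-balancing idea, here is a toy Python model under my own assumptions: the first stage spreads arriving cells round-robin over the intermediate linecards regardless of destination, and each intermediate linecard then forwards them toward the correct output. It ignores reordering, which is exactly what FOFF exists to handle.

```python
# Toy model of a two-stage load-balanced switch; names and structure are
# assumptions for illustration, not the paper's optical design.

class LoadBalancedSwitch:
    def __init__(self, n):
        self.n = n
        self.next_middle = 0                      # round-robin pointer (stage 1)
        self.middle = [[] for _ in range(n)]      # buffers at intermediate linecards

    def stage1(self, cell, dest):
        # Spread cells evenly across intermediate linecards, ignoring dest.
        self.middle[self.next_middle].append((cell, dest))
        self.next_middle = (self.next_middle + 1) % self.n

    def stage2(self):
        # Each intermediate linecard forwards one buffered cell toward its output.
        delivered = []
        for buf in self.middle:
            if buf:
                delivered.append(buf.pop(0))
        return delivered
```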

The primary obstacle for me throughout this paper was my limited knowledge of router internals such as linecards and racks. To an extent, I had to deduce what these terms meant from the early sections of the paper. Otherwise, this was an interesting read on router technology and the need to increase router capacity. I wouldn't say it's absolutely essential for the syllabus.

Wednesday, September 17, 2008

Supporting Real-Time Applications in an Integrated Services Packet Network: Architecture and Mechanism

This paper considers real-time applications in an Integrated Services Packet Network (ISPN) and proposes a modified ISPN architecture. The authors observe that real-time applications may not always require the pre-computed worst-case delay bound provided by a guaranteed service. Instead, they propose a predicted service that tracks actual network performance and uses it as a guide in computing dynamic delay bounds. The introduction of this predicted service would provide not only superior application performance through the use of adaptive playback, but also higher link utilization and thereby a broader base over which to apportion service costs. The authors provide a unifying mechanism for real-time as well as datagram traffic - the concepts of isolation and sharing are central to this mechanism. Finally, there is an emphasis on the importance of pricing to encourage users to request the less demanding class of service when they can tolerate it. This avoids having to enforce sharing by refusing a large fraction of requests for real-time traffic. Price discrimination is also used to explain the potential value of the predicted service.
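
As a hedged illustration of the adaptive playback behavior that predicted service is meant to support, here is a small Python sketch; the window size and safety margin are arbitrary assumptions.

```python
# Sketch of adaptive playback: the receiver measures recent network delays and
# sets its playback point just above them, rather than relying on a precomputed
# worst-case bound. Window size and margin are illustrative assumptions.
from collections import deque

class AdaptivePlayback:
    def __init__(self, window=100, margin=1.1):
        self.delays = deque(maxlen=window)
        self.margin = margin

    def observe(self, packet_delay):
        self.delays.append(packet_delay)

    def playback_delay(self):
        # Track recent behavior instead of the absolute worst case.
        return self.margin * max(self.delays) if self.delays else 0.0
```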

Given that I worked with real-time streaming and multimedia over the summer, I definitely found myself quite engaged in this paper. The innovative concept of predicted service also intrigued me after having worked with fixed-delay packet-dropping mechanisms in video streaming. This paper definitely builds upon the previous one by giving an explicit case of providing various classes of service to suit different application requirements. The discussion of price discrimination felt a little incomplete as a few assumptions were made about lower prices leading to requests for lower-quality service.

Fundamental Design Issues for the Future Internet

This paper considers whether the Internet should adopt a new service model to satisfy the traffic characteristics and service requirements of the variety of applications growing in popularity among the general public. One example of the problems posed by such a variety of applications is that real-time multimedia applications not only suffer from the delays and dropped packets typical of the Internet architecture, but also do not back off in the face of congestion - in the process, data applications that adhere to congestion control receive very little bandwidth. Essentially, the primary goal in designing a network architecture is to maximize its efficacy, i.e., the total utility delivered to the various applications. This can be achieved by supplying more bandwidth or by extending the service model to accommodate applications requiring different classes of service. The authors condense the trade-off between increased bandwidth and extended services to the observation that while supplying enough bandwidth to give every Internet application real-time-quality service is not feasible, extending the service model for real-time applications would automatically free up bandwidth for data applications. To maintain the service interface and the layering model, the particular class of service should be "explicitly requested" by a flow rather than detected and "implicitly supplied" by the network.

The authors add further commentary on the trade-off between bandwidth and mechanism. Extensions to the service model could be rendered superfluous if incorrect assumptions are made about application service requirements. On the other hand, additional bandwidth may not be cheap enough for overprovisioning alone to be practical, leaving a bandwidth-only architecture suboptimal. The authors clearly state that their intention is to open a discussion and present the issues rather than to offer definite solutions. This paper didn't seem to add as much value as some of the others, since it simply touches upon issues that are somewhat obvious. However, it does present a picture of the future direction of the Internet and the issues that must be resolved for it to flourish even more. This is not essential, but it's worth reading and keeping in the syllabus.

Sunday, September 14, 2008

Congestion Control for High Bandwidth-Delay Product Networks

This paper suggests that the inevitable move towards networks with high bandwidth-delay products will cripple the congestion control mechanism currently provided by TCP. In its place, the authors propose a new "explicit" congestion control protocol, XCP. The key feature of XCP is that, unlike TCP, it separates efficiency control from fairness control. The authors argue that since efficiency and fairness are independent concerns, decoupling them allows each to be handled with the most suitable scheme. For example, AIMD achieves fairness, but additive increase is far from the most efficient policy for high-bandwidth connections, since too many RTTs are wasted ramping up to the available capacity. XCP therefore keeps an AIMD scheme for fairness control but uses MIMD for efficiency control - the separation allows each controller to use an appropriate scheme. Despite using the AIMD scheme employed by TCP, XCP converges to fairness more quickly than TCP, since multiplicative decrease is performed routinely rather than only upon the detection of dropped packets. After describing the XCP control mechanisms, the authors present simulations that compare XCP with TCP running over previously discussed queueing schemes such as RED and CSFQ. The results demonstrate that XCP matches TCP under normal conditions and outperforms it when the per-flow delay-bandwidth product becomes large.
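
A hedged sketch of the flavor of XCP's efficiency controller: aggregate feedback at a router is computed from the spare bandwidth and the persistent queue, and the fairness controller then apportions it among flows AIMD-style. The constants are the stability values I recall the paper suggesting (roughly 0.4 and 0.226); treat the whole thing as a toy model rather than the protocol itself.

```python
# Toy model of XCP's per-control-interval aggregate feedback; not the real
# protocol. ALPHA and BETA are taken from memory of the paper's suggested
# stable values and should be treated as assumptions.

ALPHA, BETA = 0.4, 0.226

def aggregate_feedback(capacity, input_rate, persistent_queue, avg_rtt):
    spare = capacity - input_rate
    # Positive feedback when there is spare capacity, negative when a standing
    # queue has built up; the fairness controller divides this among flows so
    # that increases are additive and decreases are multiplicative.
    return ALPHA * avg_rtt * spare - BETA * persistent_queue
```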

This paper was a surprise after reading so many older publications that assumed TCP congestion control. The arguments made for XCP seem quite convincing; that said, I'd be surprised if there were no trade-offs in TCP's favor, as this paper appears to imply. The provision for security extensions is compelling, and the section on XCP's TCP compatibility is a very good idea considering the resistance to departures from the status quo that always arises to some degree in technology. Papers like this really diversify the syllabus and differentiate this course from the standard undergraduate networking course.

Random Early Detection Gateways for Congestion Avoidance

This paper explores the use of Random Early Detection (RED) gateways as a congestion avoidance mechanism operating in cooperation with network transport protocols such as TCP. The primary goal of RED gateways is to keep the network in a high-throughput, low-delay region by detecting incipient congestion and responding appropriately. Other crucial goals stem from the shortcomings of Random Drop and Drop Tail gateways. One is to avoid the global synchronization that arises when many connections reduce their windows in response to congestion at the same time, leading to suboptimal network throughput. Another is to accommodate bursty traffic by allowing transient congestion. RED gateways use minimum and maximum thresholds on the average queue size to control packet dropping. If the average queue size is below the minimum threshold, no packets are dropped; if it exceeds the maximum threshold, every arriving packet is dropped. In the critical region between the two thresholds, RED gateways mark packets for dropping with a probability that increases linearly with the average queue size. This approach ensures that a particular connection is notified of congestion with a probability roughly proportional to that connection's share of the gateway bandwidth, and it avoids the global synchronization problem. To compute the average queue size, RED gateways use an exponentially weighted moving average, which acts as a low-pass filter and largely eliminates any bias against bursty traffic. The authors assert that the optimum average queue size for maximizing throughput and minimizing delay depends on network characteristics.
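
Here is a minimal Python sketch of the marking decision just described: an EWMA of the queue length, no drops below the minimum threshold, certain drops above the maximum, and a linearly increasing probability in between. The parameter values are loosely in the spirit of the paper's simulations but should be treated as assumptions.

```python
# Minimal sketch of RED's drop/mark decision; parameters are illustrative.
import random

class RedQueue:
    def __init__(self, min_th=5, max_th=15, max_p=0.02, w_q=0.002):
        self.min_th, self.max_th = min_th, max_th
        self.max_p, self.w_q = max_p, w_q
        self.avg = 0.0

    def should_drop(self, queue_len):
        # Low-pass filter so transient bursts do not trigger drops.
        self.avg = (1 - self.w_q) * self.avg + self.w_q * queue_len
        if self.avg < self.min_th:
            return False                                  # accept the packet
        if self.avg >= self.max_th:
            return True                                   # drop/mark every arrival
        # Drop probability grows linearly between the two thresholds.
        p = self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)
        return random.random() < p
```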

This paper was an enjoyable read, especially because the explanations were very lucid and there was no unnecessary mathematical notation to explain a relatively simple concept. The discussion of using packet-based vs byte-based queue control was excellent, especially because that was a central issue in the previous papers on gateway FQ-based congestion control. The mention of using separate queues for separate classes of traffic was essential. Even though the authors accounted for the ambiguity in the optimal queue size, some examples of average queue sizes and min/max thresholds used for RED gateways would have been appropriate. The coverage of misbehaving hosts felt a little inadequate as well, but overall this is a must for the syllabus.

Wednesday, September 10, 2008

Core-Stateless Fair Queueing: Achieving Approximately Fair Bandwidth Allocations in High Speed Networks

This paper addresses the drawbacks of the Fair Queueing (FQ) algorithm proposed in the previous paper: the need for per-flow state maintenance, buffer management and per-flow packet scheduling. Instead, the authors propose a simpler architecture comprised of edge routers and core routers. Edge routers maintain per-flow state and insert flow rate information into packet headers; core routers use that flow rate information to implement a probabilistic dropping algorithm. The authors analyze the performance of this Core-Stateless Fair Queueing (CSFQ) scheme and present simulations to illustrate it. The goal here is to achieve a reasonable approximation to fair bandwidth allocation rather than to match the FQ scheme perfectly. The packet dropping algorithm is carefully explained using a fluid model, followed by a discussion of rate estimation at the edge routers and of how the fair share is determined. One impressive feature of this algorithm is the weighted CSFQ mechanism, which enables differential services such as the file server pair that was problematic with the FQ scheme of the previous paper. The authors also argue that each operation the CSFQ scheme requires can be implemented efficiently. Finally, CSFQ is compared to other queueing algorithms through simulation and evaluation of the results. The authors also explore the effect of high latency in an "all-core" design. Later, the identification approach is introduced as an alternative method of achieving gateway-based congestion control.
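
A minimal sketch of the core router's dropping rule, with hypothetical function names: packets arrive labeled with their flow's estimated rate, and the core drops each one with probability max(0, 1 - fair_share/labeled_rate), requiring no per-flow state.

```python
# Sketch of CSFQ's probabilistic drop at a core router. The labeled rate is
# estimated and stamped by the edge router; fair_share is the core's current
# estimate of the fair rate. Names are illustrative.
import random

def core_forward(labeled_rate, fair_share):
    drop_prob = max(0.0, 1.0 - fair_share / labeled_rate)
    return random.random() >= drop_prob   # True: forward the packet, False: drop it
```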

I really enjoyed this paper as the concept of separation into edge routers and core routers seems quite innovative. The authors did an excellent job of identifying the areas where this scheme may or may not thrive, rather than presenting a purely positive assessment. This should definitely remain in the syllabus.

Analysis and Simulation of a Fair Queueing Algorithm

This paper discusses the role of gateway queueing algorithms in congestion control and proceeds to present the many advantages of a proposed fair queueing algorithm. Unlike the previous papers that detailed source-based congestion control schemes, this one focuses on the effect of gateway-based queueing algorithms on congestion control. The authors argue that standard FCFS queuing algorithms place the burden of congestion control completely on hosts, thereby leaving the network bandwidth vulnerable to ill-behaved hosts that do not comply with standard TCP congestion control schemes. In this light, Nagle's round-robin-based fair queueing (FQ) algorithm aims to provide protection against "antisocial" behavior by employing gateway queueing-based congestion control. Unfortunately, Nagle's proposal does not ensure fairness for several reasons - for one, it does not account for packet lengths. Before diving into their own proposal, the authors clarify the motivation behind a fair queueing algorithm.

The fair allocation of bandwidth, promptness and buffer space among users is complicated by the question of what defines a user. Rather than defining the user as the source, the destination or a process running on the source host, the best trade-off is achieved by allocating per conversation (source-destination pair). The proposed algorithm uses a packetized queueing mechanism based on packet finish times, which properly accounts for packet lengths. Moreover, when the buffer is full, newly arriving packets are accommodated by dropping the last packet of the conversation currently occupying the most buffer space. In effect, this punishes ill-behaved hosts by charging them for their poor congestion control. Finally, the authors compare the performance of the FQ scheme to that of standard FCFS-based congestion control. In conclusion, the authors assert that achieving low delay or avoiding dropped packets is very difficult with congestion control implemented only at the hosts, especially in a non-cooperative environment. Two objections to FQ-based congestion control are also addressed. The first is that some conversations need more bandwidth than would be deemed "fair" (file server pairs, for example) - one solution is to introduce bandwidth priorities into the queueing algorithm. The second is that such an FQ algorithm requires more sophisticated gateways, which may not be technologically or economically feasible.
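
A toy Python sketch of the packetized mechanism, under simplifying assumptions of my own: each arriving packet gets a finish number derived from a virtual round clock and its length, and the gateway always transmits the queued packet with the smallest finish number. The real algorithm derives the round number from a bit-by-bit fluid model and also includes a promptness term, both of which are omitted here.

```python
# Toy sketch of finish-number-based fair queueing; a simplification of the
# paper's algorithm (no bit-by-bit round number, no promptness term).

class FairQueue:
    def __init__(self):
        self.finish = {}     # last finish number assigned per conversation
        self.queue = []      # entries of (finish_number, conversation, packet)
        self.round = 0.0     # crude stand-in for the paper's round number R(t)

    def enqueue(self, conv, packet, length):
        start = max(self.round, self.finish.get(conv, 0.0))
        fin = start + length                 # longer packets finish later
        self.finish[conv] = fin
        self.queue.append((fin, conv, packet))

    def dequeue(self):
        if not self.queue:
            return None
        self.queue.sort(key=lambda entry: entry[0])   # smallest finish number first
        fin, conv, packet = self.queue.pop(0)
        self.round = max(self.round, fin)             # advance the crude round clock
        return packet
```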

This paper was very intriguing because it provided a new angle to congestion control, one which I'd never seen before. I'm surprised that this aspect of congestion control is rarely covered in undergraduate networking. The authors did a fairly good job of covering the caveats involved, such as applications that require "unfair" bandwidth allocation, but elaboration on assigning bandwidth priorities would have been interesting indeed. Definitely one to keep in the syllabus.

Tuesday, September 9, 2008

Congestion Avoidance and Control

This paper describes several early algorithms put in place for TCP congestion control and avoidance. Initially, the authors introduce the slow-start algorithm as a means to gradually increase the amount of data in transit. This algorithm starts with a congestion window of one packet and increases the window in response to acknowledgments - in this sense, it is a self-clocking mechanism that moves the connection towards equilibrium. Unlike other start-up mechanisms, the authors argue, slow start keeps the data rate within twice the maximum allowable on the network path. Next, a method for estimating the round-trip time is presented and a case is made for exponential back-off. Although the need for such a retransmission mechanism is rather obvious, the actual proof is not provided here. Finally, the authors delve into congestion avoidance - what happens around the "knee" of the curve. The additive increase, multiplicative decrease scheme from the previous paper is substantiated here as a congestion avoidance algorithm for TCP. There is also some discussion of adding congestion control mechanisms at network gateways, which seems very important in promoting fair sharing of the network. One interesting point here is that misbehaving hosts will have their packets dropped even if they try to ignore gateway signals - that's pretty neat.
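
A hedged Python sketch pulling together the two mechanisms summarized above: window growth under slow start and congestion avoidance, and a mean-plus-deviation round-trip estimator used to set the retransmit timer. The constants are the usual textbook gains and are assumptions here, not values copied from the paper.

```python
# Toy TCP sender: slow start / congestion avoidance plus an RTT estimator.
# Constants (1/8, 1/4, 4x deviation) are the conventional gains; treat them
# as illustrative assumptions.

class TcpSender:
    def __init__(self):
        self.cwnd = 1.0          # congestion window, in segments
        self.ssthresh = 64.0
        self.srtt = None         # smoothed round-trip time
        self.rttvar = 0.0        # mean deviation of the RTT

    def on_ack(self):
        if self.cwnd < self.ssthresh:
            self.cwnd += 1                    # slow start: window doubles per RTT
        else:
            self.cwnd += 1.0 / self.cwnd      # congestion avoidance: +1 per RTT

    def on_timeout(self):
        self.ssthresh = max(self.cwnd / 2, 2.0)
        self.cwnd = 1.0                       # restart with slow start

    def on_rtt_sample(self, rtt):
        if self.srtt is None:
            self.srtt, self.rttvar = rtt, rtt / 2
        else:
            err = rtt - self.srtt
            self.srtt += 0.125 * err
            self.rttvar += 0.25 * (abs(err) - self.rttvar)
        return self.srtt + 4 * self.rttvar    # retransmit timeout
```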

Given the background from the previous paper, this was very appropriate and took a different angle on congestion control and avoidance. The authors probably should have spent more time talking about edge cases as the basic mechanisms presented seem pretty obvious, but that's probably with hindsight more than anything else. Finally, it was good to see appended sections with more rigorous explanations of the algorithms.

Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks

This paper explores a class of increase/decrease algorithms that can be used in congestion avoidance schemes and concludes that the additive increase, multiplicative decrease algorithm converges quickly to an efficient and fair operating point. The authors frame both congestion avoidance (operating around the "knee" of the network load-throughput curve) and congestion control (staying to the left of the "cliff") as dynamic resource management problems. Congestion avoidance mechanisms specifically increase or decrease the packet output rate in response to the network load. For simplicity, the analysis in this paper is based on the binary feedback model, which assumes a synchronous feedback and control loop. Four general increase/decrease algorithms are presented - MIMD (multiplicative increase/multiplicative decrease), MIAD, AIMD and AIAD - and the criteria used to evaluate their effectiveness are also established: efficiency, fairness, distributedness and convergence. Next, the additive increase, multiplicative decrease scheme is graphically shown to converge to the optimal point marked by the intersection of the fairness and efficiency lines. The path traced by the additive increase, additive decrease scheme, on the other hand, remains parallel to the fairness line and never converges to it. The case for an AIMD scheme is further supported by sections of somewhat rigorous algebraic and vector analysis.
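
A small numerical sketch of the paper's vector-diagram argument, with arbitrary parameters of my own choosing: two users adjusting their rates under AIMD drift toward equal shares, whereas under AIAD the gap between them never closes.

```python
# Toy simulation of two users sharing one bottleneck under binary feedback.
# Parameters are arbitrary illustrative choices.

def step(x1, x2, capacity, additive=0.1, multiplicative=0.5, aimd=True):
    overloaded = (x1 + x2) > capacity
    if overloaded:
        if aimd:
            return x1 * multiplicative, x2 * multiplicative   # multiplicative decrease
        return x1 - additive, x2 - additive                   # additive decrease (AIAD)
    return x1 + additive, x2 + additive                       # additive increase

x1, x2 = 0.1, 0.7
for _ in range(200):
    x1, x2 = step(x1, x2, capacity=1.0, aimd=True)
print(round(x1, 2), round(x2, 2))   # under AIMD the two rates end up close together
```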

I really enjoyed this paper because it provided detailed, technical reasoning behind the choice of the AIMD increase/decrease scheme in TCP transport, something that is taken for granted throughout undergraduate networking. Moreover, there was no inherent bias towards the AIMD scheme - the authors set out to explore all four classes of increase/decrease algorithms and demonstrated exactly why they arrived at this one. Even though the authors provided a preview of delayed feedback and asynchronous operation at the end of the paper, an actual effort to incorporate them into the initial analysis would have been interesting. Some of the assumptions made may be a little too strong. Overall, a very interesting paper and definitely a keeper for the syllabus.

Wednesday, September 3, 2008

On Inferring Autonomous System Relationships in the Internet

This paper proposes a method to infer the relationships between ASes that were described in the previous paper. These relationships can provide insights into network traffic and interdomain routing policy. The author argues that BGP routing table entries can be used to generate annotated graphs that accurately represent AS relationships, classified as provider-customer, peer-to-peer or sibling. The inference heuristics rest on the observation that valid AS paths follow a "valley-free" pattern and that a provider typically has a larger degree than its customers.
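
A hedged sketch of the valley-free property underlying this kind of inference, with link labels assumed as given (the paper's contribution is inferring them): a plausible AS path climbs customer-to-provider links, crosses at most one peer-to-peer link, and then only descends provider-to-customer links.

```python
# Sketch of a valley-free check over a labeled AS path. Labels are assumed
# known here; in the paper they are what gets inferred from BGP tables.

UP, PEER, DOWN = "c2p", "p2p", "p2c"

def valley_free(links):
    phase = 0                      # 0: still climbing, 1: peered or descending
    for link in links:
        if link == UP:
            if phase != 0:
                return False       # cannot climb again after peering/descending
        elif link == PEER:
            if phase != 0:
                return False       # at most one peer link, only at the top
            phase = 1
        else:                      # DOWN
            phase = 1
    return True
```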

Even though this paper was an interesting read, it isn't quite as fundamental to the Internet as the previous papers. Reading this immediately after Interdomain Internet Routing is a great idea for two reasons: the author elaborates on aspects of AS routing policy that may have been previously unclear, and the reader already has a basic understanding of the topic before trying to delve into the proposed algorithm for generating graphs that depict AS relationships. I'm not sure if this is vital for the syllabus.

Interdomain Internet Routing

This paper clearly explains the routing mechanisms that connect administrative domains in the Internet topology. The Internet comprises Autonomous Systems, each administered by a single commercial entity, that exchange reachability information with one another using the Border Gateway Protocol (BGP). Unlike intradomain routing, interdomain routing places a great emphasis on scalability. Internet Service Providers (ISPs) can engage in transit relationships: financial agreements in which a customer pays a provider for access to the provider's routing tables and thus to the rest of the Internet. Peering relationships, on the other hand, involve two ASes granting access to a subset of each other's routing tables - typically their own routes and those of their customers - for mutual benefit, usually without payment. In this commercial reality, interdomain routing decisions must be based not only on reachability but also on economic considerations. For example, an ISP may deny transit to a competitor or seek financial compensation before agreeing to carry packets from another AS. Rather than seeking to optimize path cost, BGP revolves around scalability, policy and cooperation.
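
A hedged Python sketch of the economic route preference this implies, with illustrative numbers: routes learned from customers are preferred over those from peers, which are preferred over those from providers, before shorter AS paths break ties.

```python
# Sketch of economics-driven route selection: an ISP prefers customer-learned
# routes (it gets paid), then peer routes (free), then provider routes (it
# pays). The preference values and tie-break are illustrative assumptions.

PREFERENCE = {"customer": 300, "peer": 200, "provider": 100}

def best_route(routes):
    """routes: list of (neighbor_type, as_path). Pick the highest local
    preference, breaking ties on the shorter AS path."""
    return max(routes, key=lambda r: (PREFERENCE[r[0]], -len(r[1])))

# Example: a customer-learned route wins even with a longer AS path.
print(best_route([("provider", [3, 2, 1]), ("customer", [7, 6, 5, 4, 1])]))
```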

The content of this paper is still very relevant today and the clarity of the explanation makes it an essential read in the syllabus. This is the first paper that has explored the real-world, commercial aspect of the Internet rather than focusing on theoretical mechanisms and hypothetical situations. Perhaps some elaboration on potential security threats for interdomain routing would have been appropriate. Overall, this was a very informative read.

The Design Philosophy of the DARPA Internet Protocols

This paper discusses the design goals that drove the development of the early Internet architecture. The primary objective was to interconnect and unify existing networks such as the ARPANET and the packet radio network. Secondary goals included survivability in the face of network failures, support for multiple types of service and a variety of networks, and distributed management of resources. Survivability ranked highest among these and motivated the decision to keep connection state in the end hosts ("fate-sharing") rather than inside the network, which in turn favored a stateless, datagram-based architecture. The desire to support multiple types of service, meanwhile, led to the separation of TCP from IP and the introduction of UDP.

Tuesday, September 2, 2008

End-to-End Arguments in System Design

This paper presents an argument against excessive low-level functionality in distributed systems, especially those with a data communication component. Initially, this line of reasoning is illustrated through the example of a file transfer mechanism in which one host reads a file from disk and communicates it across a network to a second host, which then writes the data to disk. The communication subsystem is only one among several possible sources of error for the described mechanism. Since the file transfer application must perform an extensive end-to-end check even in the presence of low-level reliability measures, the communication subsystem should not implement expensive functionality to guarantee reliability. The degree of low-level reliability to strive for should be carefully determined based on the performance tradeoff. Low-level functionality may be efficient for applications that can perfectly leverage the provided reliability measures; on the other hand, it may force applications to bear the burden regardless of whether they actually require the reliability measures.
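
To make the example concrete, here is a minimal Python sketch, with hypothetical function names, of the end-to-end check the application performs regardless of what the communication subsystem guarantees: compare digests of the file as stored at each end and redo the transfer if they differ.

```python
# Sketch of an application-level end-to-end check for the file transfer
# example; function names are illustrative.
import hashlib

def file_digest(path):
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def end_to_end_check(sender_path, receiver_path):
    # If the digests differ, the transfer is redone, regardless of any
    # link-level acknowledgments or retries along the way.
    return file_digest(sender_path) == file_digest(receiver_path)
```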

The author proceeds to provide further illustrative examples of this end-to-end principle such as delivery acknowledgment (the case where acknowledgments must be sent from the application layer), secure transmission (the case where data must not be vulnerable before or after network transmission) and duplicate message suppression (the case where the application generates duplicates beyond the recognition of lower layers). Finally, the end-to-end argument should not be used as an absolute guide but in conjunction with a careful assessment of application requirements and performance tradeoffs.

Given that this paper presents a principle that is the foundation of modern data communication networks, it should be a fundamental part of the syllabus. The author suggests that this end-to-end principle applies to general systems but only elaborates on applications that involve data communication systems. Some of the examples seem slightly far-fetched but overall, the argument is quite convincing.