Thursday, November 27, 2008

Improving MapReduce Performance in Heterogeneous Environments

This paper introduces a new scheduling algorithm, Longest Approximate Time to End (LATE), to improve MapReduce (specifically the Hadoop implementation) performance in heterogeneous environments. The motivation lies in the fact that Hadoop's task scheduler assumes linear task progress and node homogeneity when deciding whether to re-execute a straggler, a task that is running unusually slowly for various reasons (e.g., faulty hardware on its node). Since this "speculative execution" is crucial to MapReduce's high performance, the authors set out to make the scheduler speculate on the tasks expected to finish latest, without wasting resources on unnecessary re-execution.
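
To make the heuristic concrete, here is a minimal sketch (in Python) of how a LATE-style scheduler might estimate each task's time to end and pick a speculation candidate. The task fields, the dictionary representation and the threshold value are my own illustrative assumptions; the actual Hadoop patch is considerably more involved.

```python
# Hypothetical sketch of the LATE heuristic: estimate each running task's time
# left from its observed progress rate, then speculate on the task expected to
# finish farthest in the future. Field names and thresholds are illustrative.

def estimate_time_left(task, now):
    elapsed = now - task["start_time"]
    progress = task["progress"]          # fraction in [0, 1] reported by the task
    if elapsed <= 0 or progress <= 0:
        return float("inf")              # no information yet; treat as slowest
    rate = progress / elapsed            # assume a roughly constant progress rate
    return (1.0 - progress) / rate

def pick_speculative_task(running_tasks, now, slow_task_threshold=0.25):
    if not running_tasks:
        return None
    # Only consider tasks whose progress rate falls in the slowest quartile,
    # mirroring LATE's cap on needless speculation.
    rates = sorted(t["progress"] / max(now - t["start_time"], 1e-9)
                   for t in running_tasks)
    cutoff = rates[int(slow_task_threshold * (len(rates) - 1))]
    candidates = [t for t in running_tasks
                  if t["progress"] / max(now - t["start_time"], 1e-9) <= cutoff
                  and not t.get("speculated", False)]
    # Among the slow tasks, choose the one with the longest estimated time to end.
    return max(candidates, key=lambda t: estimate_time_left(t, now), default=None)
```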

In short, the scheduler should not simply identify slow tasks but, as with any scheduling policy, target those that will hurt the response time the most. Scheduling priority should be given to failed tasks, followed by non-running tasks, and finally tasks for speculative execution; moreover, speculative tasks should be launched on fast nodes. The ideas presented in this paper are all very intuitive and not too complicated - honestly, I'm surprised that MapReduce implementations didn't already use at least this level of sophistication in task scheduling. Overall, I definitely enjoyed this paper as I have plenty of Hadoop experience; this is certainly a keeper for the syllabus.

A Policy-aware Switching Layer for Data Centers

This paper addresses the need for more control over middlebox traversal in today's data centers. Data centers use a variety of middleboxes to improve the performance and security of services, but existing networks provide only crude and fragile means of steering traffic through a specific sequence of middleboxes. For this reason, the authors propose a policy-aware switching layer (PLayer) for data centers that is flexible, efficient and resilient to harsh network conditions. Essentially, the solution is to wrap an additional header around each Ethernet frame in order to track its progress through the policy (the intended sequence of middleboxes).
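
The following sketch illustrates the kind of decision a policy-aware switch (pswitch) has to make: given where a frame came from and a simple traffic classifier, pick the next middlebox it should visit. The policy table format, MAC addresses and port-based classification here are invented for illustration; the paper's rule language and frame encapsulation are richer.

```python
# Hypothetical sketch of a pswitch-style policy lookup: map (previous hop,
# traffic class) to the MAC address of the next middlebox in the policy.

POLICY = {
    # (previous hop, destination port) -> next-hop middlebox MAC
    ("core",     80): "00:11:22:33:44:01",   # web traffic goes to the firewall first
    ("firewall", 80): "00:11:22:33:44:02",   # then to the load balancer
    ("loadbal",  80): None,                  # then straight to the server
}

def next_hop(prev_hop, frame):
    key = (prev_hop, frame["dst_port"])
    if key not in POLICY:
        return None                          # no policy applies: forward normally
    return POLICY[key]

frame = {"src": "client", "dst_port": 80}
print(next_hop("core", frame))      # -> the firewall's MAC
print(next_hop("firewall", frame))  # -> the load balancer's MAC
```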

In many ways, the ideas here are similar to those in the DOA paper, which advocated a smoother integration of middleboxes into the Internet by adding new identifiers. However, the solution proposed here appears to integrate more seamlessly and with less overhead. The applications are also very obviously relevant (unlike the other middlebox paper, which didn't have a very convincing motivation in my opinion), and the concepts are generally explained clearly. I didn't completely understand the use of MAC addresses for routing frames between pswitches, though.

Scalable Application Layer Multicast

This paper presents an overlay-style protocol for scalable, application-layer multicast delivery. The primary focus is on a hierarchical organization of nodes that minimizes control overhead. Even though the paper targets low-bandwidth applications, the authors claim that applications with different requirements can be handled by varying the overlay paths and metrics. The protocol builds a tree to avoid loops and data duplication, with a careful explanation of how nodes join it. However, this tree consists of several clusters at each layer, and that may present problems - nodes near the top are overloaded compared to those at the bottom, which effectively lose bandwidth in one direction.
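
To give a feel for the join procedure, here is a minimal sketch of the layered descent described in the paper: a new host starts at the rendezvous point and, at each layer from the top down, probes the members of the current leader's cluster and moves toward the closest one until it reaches a layer-0 cluster. The topology representation and distance function are my own simplifications.

```python
# Hypothetical sketch of the hierarchical join: descend one layer at a time,
# always moving toward the cluster member that appears closest to the new host.

def join(new_host, rendezvous, layers, distance):
    """layers[i] maps a cluster leader at layer i to that cluster's members."""
    current = rendezvous
    for layer in reversed(range(len(layers))):
        members = layers[layer].get(current, [current])
        # Probe each member of the current cluster and descend toward the closest.
        current = min(members, key=lambda m: distance(new_host, m))
    return current   # the new host joins the layer-0 cluster around this member
```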

The cluster-based model may get complicated in a real network, especially when clusters must be merged and the tree grown. To an extent, this paper presents general approaches to application-layer multicast rather than a specific protocol that can be readily implemented. It's more interesting than the previous one, but once again, it may not really belong in the syllabus; the previous paper provides a better basic exposure to the requirements of multicast delivery.

A Reliable Multicast Framework for Light-weight Sessions and Application Level Framing

This paper describes SRM, a framework designed to support scalable, reliable multicast delivery. The authors importantly explain that several integral TCP parameters lose meaning in a multicast environment; for example, the round-trip time used for retransmission timers or the delay-bandwidth product is difficult to pin down when receivers differ by several orders of magnitude. For this reason, receiver-based control is preferred to the traditional sender-based, fate-sharing approach. The SRM framework builds upon application-level framing (ALF) as well as the light-weight sessions (LWS) model designed for large-scale conferencing applications. SRM includes some interesting features, such as localized retransmissions that avoid spamming the entire multicast group with repairs. However, I found some of the detailed discussion of the simulations somewhat tedious. This was an interesting paper but not necessarily essential to the syllabus.
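
A small sketch of one mechanism that stuck with me: SRM's randomized repair-request timers, which suppress duplicate NACKs. A receiver that detects a loss schedules its request at a random time proportional to its estimated distance from the source, and cancels it if it overhears another receiver's request first. The constants C1 and C2 follow the paper's notation, but the values and the overall simplification are mine.

```python
import random

# Hypothetical sketch of SRM's randomized request timers for NACK suppression.
C1, C2 = 2.0, 2.0

def schedule_request(distance_to_source):
    # Uniform in [C1*d, (C1+C2)*d]: receivers nearer the source tend to fire first.
    d = distance_to_source
    return random.uniform(C1 * d, (C1 + C2) * d)

def should_send(my_timer, overheard_request_times):
    # Suppress our own request if someone else asked before our timer expired.
    return all(my_timer < t for t in overheard_request_times)
```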

Monday, November 17, 2008

A Delay-Tolerant Network Architecture for Challenged Internets

This paper presents the DTN architecture, designed for networks with exceptionally poor or unpredictable performance characteristics. The existing TCP/IP service model makes basic assumptions about the existence of an end-to-end path between any two nodes, as well as upper bounds on round-trip times and packet drop probabilities. The growing importance of networks that violate these assumptions - for example, sensor networks or military ad-hoc networks - is the motivation behind the DTN architecture. Rather than adopting band-aid solutions such as link-repair approaches or protocol proxies, DTN uses store-and-forward message switching over a partially connected network graph model.
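
The essence of the store-and-forward idea can be sketched in a few lines: a node holds messages until a contact (a period when a link to the next hop is actually available) comes up. The class and method names below are invented, and contact scheduling and routing are simplified away entirely.

```python
from collections import deque

# Hypothetical sketch of DTN-style store-and-forward: queue bundles until a
# usable contact appears, then drain them toward the next hop.

class DTNNode:
    def __init__(self, name):
        self.name = name
        self.queue = deque()      # bundles waiting for a usable contact

    def accept(self, bundle):
        # Taking custody: store the bundle until it can be forwarded.
        self.queue.append(bundle)

    def contact_up(self, next_hop):
        # A contact became available: forward everything we are holding.
        while self.queue:
            next_hop.accept(self.queue.popleft())

a, b = DTNNode("gateway"), DTNNode("relay")
a.accept({"dst": "rover", "payload": b"telemetry"})
a.contact_up(b)       # the bundle moves only when the intermittent link is up
print(len(b.queue))   # -> 1
```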

Unlike the majority of papers covered in this course (apart from the really early ones), this one challenged some fundamental assumptions of the Internet. Given that deep-space and military communication, to name a couple of examples, will remain essential for years to come, the motivation behind a DTN-like architecture is very convincing. It's compatible with the current TCP/IP stack, appears to be a practical solution at first glance and definitely merits further study. This one definitely belongs in the syllabus as a bit of a wild-card paper.

An Architecture for Internet Data Transfer

This paper introduces an extensible data-oriented transfer (DOT) service that decouples application-specific content negotiation from generic data transfer. One motivation behind the DOT service is to optimize bandwidth usage by minimizing control overhead (e.g., from HTTP or SMTP messages). Moreover, this modular architecture not only eliminates the need to re-implement transfer machinery for each new service but also enables more focused efforts to improve data transfer itself. The authors acknowledge that since the DOT service is essentially designed for bulk transfers, services that require low round-trip latency (e.g., CDNs or Google web services) or those with small transfers are an inherently poor fit. The solution offered here is to fall back to traditional delivery mechanisms for such cases rather than compromise the advantages of DOT.
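
The decoupling is easy to illustrate: the application exchanges only an opaque object ID (here a content hash), while a separate transfer service is responsible for actually moving the bytes. The in-memory "store" below stands in for DOT's pluggable transfer methods; the function names are my own.

```python
import hashlib

# Hypothetical sketch of the DOT idea: applications negotiate over object IDs,
# and the transfer service fetches the bytes for an ID however it likes.

STORE = {}   # object ID -> data, standing in for the sender-side transfer service

def put(data: bytes) -> str:
    oid = hashlib.sha256(data).hexdigest()
    STORE[oid] = data
    return oid                      # the application only ships this ID around

def get(oid: str) -> bytes:
    data = STORE[oid]               # a real transfer service would pick a plugin here
    assert hashlib.sha256(data).hexdigest() == oid   # integrity check comes for free
    return data

oid = put(b"large email attachment")
print(get(oid) == b"large email attachment")   # -> True
```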

The concept proposed in this paper already appears in parts of the Internet today; for example, real-time media can use a TCP-based control channel alongside RTP over UDP for bulk data delivery. Even though several complications arise with the large-scale deployment of a DOT-like service (many of them addressed in the paper), it seems worthy of further exploration nevertheless. The idea is not too far-fetched and somewhat feasible, even if not completely practical for the Internet at large.

Wednesday, November 12, 2008

X-Trace: A Pervasive Network Tracing Framework

This paper introduces X-Trace, a unified network tracing framework that spans several protocol layers (as opposed to tcpdump, for example) and applications. The core idea is to carry X-Trace metadata in packets: metadata received at one layer of a node is propagated both to the next node at that layer and down to the layers below, and related operations (such as the DNS lookups triggered by a web request) carry metadata from the same task so that they are effectively linked together into a single trace. Finally, a reporting infrastructure (with a website, etc.) is presented to make trace data visible to the relevant parties.
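
Here is a toy sketch of the propagation idea: each operation carries a task ID and an operation ID, one helper creates the metadata for the next hop at the same layer and another for the layer below, and every node reports the edge it observed. The function names (push_next, push_down) echo the paper's terminology, but the data structures and the report "infrastructure" (a plain list) are my own simplifications.

```python
import itertools, uuid

# Hypothetical sketch of X-Trace-style metadata propagation and reporting.
_op_counter = itertools.count()
REPORTS = []   # stand-in for the reporting infrastructure

def new_task():
    return {"task_id": uuid.uuid4().hex[:8], "op_id": next(_op_counter)}

def _propagate(md, edge_type):
    child = {"task_id": md["task_id"], "op_id": next(_op_counter)}
    REPORTS.append((md["task_id"], md["op_id"], child["op_id"], edge_type))
    return child

def push_next(md): return _propagate(md, "next")   # same layer, next node
def push_down(md): return _propagate(md, "down")   # lower layer, same node

md = new_task()                 # e.g., an HTTP request
tcp = push_down(md)             # the TCP segment carrying it
dns = push_next(md)             # a related lookup in the same task
print(REPORTS)                  # the edges let the task tree be rebuilt offline
```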

This is a neat idea that could potentially be quite useful for today's large-scale distributed systems. That said, I'm not certain whether the packet overhead due to X-Trace is justified by its benefits. Some of the applications that are mentioned (tracing messages, ISP troubleshooting, etc.) make a lot of sense and this is definitely a concept worthy of additional investigation. The paper goes well with Paxson's study on packet dynamics and definitely belongs in the syllabus.

End-to-End Internet Packet Dynamics

This paper sets out to measure and model packet dynamics over the Internet. The author approaches the Internet as a mysterious quantity that must be probed to discover "pathologies" in the network; specifically, out-of-order delivery, packet replication and packet corruption are examined using TCP-based measurement techniques. There is, of course, an emphasis on the need to account for TCP protocol-related behavior in order to obtain accurate measurements. Statistical analysis reveals several interesting trends, such as spikes in the timing of out-of-order deliveries (indicating a minimum gap related to common 56 kbps links). Moreover, the measurements are interpreted in the context of TCP congestion control and acknowledgment mechanisms and reveal valuable insight into, for example, why TCP uses three duplicate acknowledgments to trigger fast retransmission (and why reducing the threshold to two, while tempting, is problematic).
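
As a flavor of the simpler end of this analysis, the sketch below counts out-of-order arrivals by comparing each packet's sequence number against the highest one seen so far. This is only the crude core of the idea; Paxson's tcpanaly does far more careful work (duplicates, retransmissions, clock issues), and the trace below is made up.

```python
# Hypothetical sketch: count packets that arrive after a later-sent packet.

def count_reordered(seqs):
    highest = None
    reordered = 0
    for s in seqs:
        if highest is not None and s < highest:
            reordered += 1          # arrived after a packet sent later
        else:
            highest = s
    return reordered

trace = [1, 2, 4, 3, 5, 7, 6, 8]    # invented arrival order of sequence numbers
print(count_reordered(trace))       # -> 2
```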

One thing that I appreciated here was the clear acknowledgment of possible improvements in measurement techniques and the drawbacks of some techniques employed in this study. This paper definitely gives an idea of just how complex and unpredictable Internet packet dynamics can be and provides approaches to measuring them; such measurements may be more difficult in today's significantly more evolved Internet, but several notions discussed here still hold value. For its time, Paxson's paper dispels several crucial assumptions about the Internet.

Thursday, November 6, 2008

Internet Indirection Infrastructure

This paper proposes an architectural revamping of the Internet to decouple the act of sending from the act of receiving. In this scheme, packets carry identifiers rather than destination IP addresses; all packets with a given identifier are directed towards the responsible i3 node using a Chord-style mapping infrastructure. Receivers, in turn, insert triggers into i3 that map identifiers to their own addresses, and the i3 node forwards each packet to the receivers whose triggers match its identifier. In addition to exact identifier matching, partial (longest-prefix) matches and multiple triggers on the same identifier can be used to simplify multicast, anycast and load-balancing mechanisms.
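
The rendezvous mechanics can be sketched compactly: receivers insert (identifier, address) triggers, senders address packets to identifiers, and the node holding the triggers delivers a copy to every best-matching trigger. The string-prefix matching and the single global trigger list below are deliberate simplifications of the paper's identifier scheme and Chord-based partitioning.

```python
# Hypothetical sketch of i3-style triggers and identifier matching.
TRIGGERS = []    # (identifier, receiver address) pairs at the responsible i3 node

def insert_trigger(identifier, addr):
    TRIGGERS.append((identifier, addr))

def send(identifier, data):
    if not TRIGGERS:
        return []
    def match_len(trigger_id):
        n = 0
        for a, b in zip(identifier, trigger_id):
            if a != b:
                break
            n += 1
        return n
    best = max(match_len(tid) for tid, _ in TRIGGERS)
    # Every trigger with the longest matching prefix gets a copy, so several
    # triggers on the same identifier behave like a small multicast group.
    return [(addr, data) for tid, addr in TRIGGERS if match_len(tid) == best]

insert_trigger("video-feed-0001", "10.0.0.5")
insert_trigger("video-feed-0001", "10.0.0.6")    # second subscriber: multicast
print(send("video-feed-0001", b"frame"))         # both receivers get a copy
```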

One immediate concern with this mechanism, quite like the ones that arose for the previous DOA paper, is that the DHT look-up and intermediate i3 node traversal could significantly hamper the round-trip latency. Implementing reliability mechanisms (i.e., TCP) on such an architecture could also be a mess, as could security complications. Overall, there are some neat ideas presented here (more so than in the DOA paper) but the concept seems quite unrealistic and impractical.

Middleboxes No Longer Considered Harmful

This paper underscores the vital role of middleboxes - specifically, NAT boxes and firewalls - in the modern Internet. Traditional networking principles hold that such middleboxes violate not only the end-to-end principle but also the notion that end hosts must be uniquely identifiable throughout the network. The authors claim, however, that these components provide essential security mechanisms, better address space utilization and performance enhancements (in the case of caching). Instead of trying to achieve compliance with the "rules" of the Internet, the authors present a Delegation-Oriented Architecture (DOA) that seamlessly incorporates middleboxes. Essentially, DOA provides globally unique endpoint IDs in a flat namespace and gives senders and receivers the ability to specify intermediaries that must be traversed (for example, a NAT that must process the packet).
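
The delegation step can be sketched as a resolution loop: an endpoint ID (EID) resolves, via a DHT-like lookup, either to a network address or to another EID that must be traversed first. The table contents, EID strings and loop guard below are invented to illustrate the chain-building idea, not DOA's actual resolution protocol.

```python
# Hypothetical sketch of DOA-style delegation: resolve an EID repeatedly until
# a concrete network address is reached, collecting the intermediaries on the way.

RESOLVE = {
    "eid:host-A": "eid:nat-7",      # host A delegates to its NAT box
    "eid:nat-7":  "192.0.2.1",      # the NAT itself is directly reachable
}

def resolve_path(eid, max_hops=8):
    path = []
    for _ in range(max_hops):       # guard against delegation loops
        target = RESOLVE[eid]
        path.append(target)
        if not target.startswith("eid:"):
            return path             # reached a concrete network address
        eid = target
    raise RuntimeError("delegation chain too long")

print(resolve_path("eid:host-A"))   # -> ['eid:nat-7', '192.0.2.1']
```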

In my mind, the proposed DOA architecture has several issues, starting with the fact that explicitly specifying intermediaries still appears to defy the end-to-end principle. More importantly, the overhead introduced by the DOA packet header is considerable, and I'm not sure why the authors underplayed this issue. Finally, the prospect that DHT look-ups (which, going by previous papers, are already intricate) will be required on a regular basis raises doubts about the performance and scalability of the architecture. The concept was interesting, but the motivation is not very convincing (do we really want to introduce so much overhead just to maintain the sanctity of some networking "rules" while keeping middleboxes?) and the potential drawbacks are not sufficiently addressed.

Sunday, November 2, 2008

DNS Performance and the Effectiveness of Caching

This paper evaluates the performance of DNS over several network traces, looking specifically at query failures (no response, or a response indicating a nonexistent record) and the latency perceived by Internet clients. Moreover, it takes the discussion of DNS caching in the previous paper one step further by questioning its impact on overall performance. The authors claim that DNS caching may inherently conflict with some applications that use the naming system, such as Content Delivery Networks (I'm still a bit confused about exactly why there's a conflict here). From the trace analysis, they conclude that caching does not significantly improve performance for clients but mainly acts as a load-controlling mechanism for the name servers.
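
The methodology behind the caching conclusion is easy to mock up: replay a sequence of (timestamp, name) lookups through a TTL-based cache and measure the hit rate as the TTL varies. The trace below is made up and far too small to mean anything; it only illustrates the kind of trace-driven analysis the paper performs.

```python
# Hypothetical sketch of trace-driven DNS cache analysis: hit rate vs. TTL.

def hit_rate(trace, ttl):
    cache = {}                         # name -> expiry time
    hits = 0
    for t, name in trace:
        if name in cache and cache[name] > t:
            hits += 1
        else:
            cache[name] = t + ttl      # miss: query the name server and cache
    return hits / len(trace)

trace = [(0, "a.com"), (5, "a.com"), (20, "b.com"), (70, "a.com"), (75, "b.com")]
for ttl in (10, 60, 300):
    print(ttl, hit_rate(trace, ttl))
```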

Even though the evaluation results were convincing, an intuitive explanation for the minimal impact of caching on client-perceived latency (for example) seemed missing. Yet again, several critical performance issues (negative responses, negative caching, etc.) were identified but a deeper insight into these problems was not necessarily offered. The paper provided a useful understanding of the practical issues surrounding the DNS, especially as it begins to interact with other protocols (e.g. TCP) and mechanisms such as CDNs.

Development of the Domain Name System

This paper discusses the motivation and design behind the Internet's Domain Name System (DNS). The authors first describe the centralized HOSTS.TXT file it replaced (which seems like a terrible idea in hindsight, given the single point of failure) and the clear need for a distributed scheme. The hierarchical, tree-like structure that is presented seems quite intuitive, and there is little mention of possible architectural alternatives. Unlike most previous papers in the syllabus, this one spends less time foreseeing potential problems; instead, the focus is on the DNS design, its functionality and a few immediately observed performance issues.
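
The hierarchical design is perhaps easiest to appreciate as a resolution walk: start at the root and follow referrals down the tree until a server answers authoritatively for the name. The zone data and lookup logic below are invented toy stand-ins, not real DNS record handling.

```python
# Hypothetical sketch of iterative resolution over a hierarchical namespace.

ZONES = {
    "root":            {"edu.": "edu-server"},
    "edu-server":      {"berkeley.edu.": "berkeley-server"},
    "berkeley-server": {"www.berkeley.edu.": "128.32.0.1"},
}

def resolve(name, server="root"):
    zone = ZONES[server]
    # Find the longest suffix of the name that this server knows about.
    for label, answer in sorted(zone.items(), key=lambda kv: -len(kv[0])):
        if name.endswith(label):
            if answer in ZONES:            # a referral to a lower-level server
                return resolve(name, answer)
            return answer                  # an authoritative answer (an address)
    raise KeyError("name does not exist")

print(resolve("www.berkeley.edu."))        # -> 128.32.0.1
```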

Caching was probably one of the more interesting points of discussion. One issue that was mentioned is that while successful query results are cached, negative results for nonexistent records are not. In my opinion, this is a valid but somewhat trivial concern. Instead, some foresight into security issues (e.g., cache poisoning) would have been appropriate. Interestingly, there appears to be a trend throughout many of the original Internet papers - strong assumptions are made about the trustworthiness of participants. Overall, this was an informative paper that underscored the reasons behind the DNS architecture. However, it did not pose as many questions going forward; in a sense, it felt like undergraduate networking material.