Monday, September 8, 2008

Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks

By mathematically characterizing the effect of different back-off schemes on the fairness and efficiency of a network system, Chiu and Jain show that an additive-increase/multiplicative-decrease feedback control scheme converges to both fairness and efficiency from any initial starting state. Chiu and Jain define fairness as the degree (or percentage) to which nodes utilize an equal share of network resources. Efficiency is defined in terms of the distance from the "knee" of the throughput vs. load curve, where closer always means more efficient. Using a simple two-node example, they show that additive increase combined with multiplicative decrease always moves the system closer to the optimal point (i.e. 50% utilization of network resources by node 1 and node 2). The paper later generalizes this to n nodes and shows that, for optimal convergence to fairness, additive increase and multiplicative decrease is in fact the best feedback policy. Nonlinear controls are considered but not explored in depth, and possibilities for future work are presented.
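Since the core mechanism is so simple, a small simulation makes the convergence argument concrete. Below is a minimal sketch of the two-node example with synchronized binary feedback; the constants CAPACITY, ALPHA, and BETA are values I picked for illustration, not numbers from the paper.

```python
CAPACITY = 1.0   # total normalized network capacity (assumed value)
ALPHA = 0.05     # additive increase step (assumed value)
BETA = 0.5       # multiplicative decrease factor (assumed value)

def aimd_step(x1, x2):
    """One synchronized feedback round: both nodes see the same binary signal."""
    if x1 + x2 > CAPACITY:
        # Overloaded: multiplicative decrease keeps the ratio x1/x2 fixed
        # but shrinks the absolute gap between the two allocations.
        return x1 * BETA, x2 * BETA
    # Underloaded: additive increase keeps the absolute gap fixed but moves
    # the ratio of allocations toward 1, i.e. toward fairness.
    return x1 + ALPHA, x2 + ALPHA

# Start from an arbitrary, very unfair initial state.
x1, x2 = 0.9, 0.1
for _ in range(200):
    x1, x2 = aimd_step(x1, x2)

# Jain's fairness index for two users: 1.0 means perfectly equal shares.
fairness = (x1 + x2) ** 2 / (2 * (x1 ** 2 + x2 ** 2))
print(f"x1={x1:.3f}, x2={x2:.3f}, fairness={fairness:.3f}")
```

Each overload/underload cycle shrinks the gap between the two allocations geometrically while keeping total load oscillating around the capacity line, which is exactly the convergence behavior the paper proves.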

Overall the paper did an excellent job of characterizing the system mathematically and showing that additive increase/multiplicative decrease was the best policy under the assumptions made. However, the paper did not thoroughly justify its assumptions that the system sends the same feedback response to all nodes and that the system operates in synchronized rounds, in which all nodes react and then all nodes receive feedback.

Interesting discussion topics would be how this work has influenced network design today, and how today's networks implement feedback.

Tuesday, September 2, 2008

On Inferring Autonomous System Relationships in the Internet

Gao essentially classifies AS relationships into providers, customers, peers, and siblings. Her main argument is that by understanding the relationships between ASes today we may increase our understanding of the current architecture of the internet. She also contends that while studies have been done on the physical existence of links in the internet, these studies do not reveal the true connectivity because of the import and export policies between ASes. Thus, she embarks on a valiant attempt to uncover these relationships, despite their confidentiality in the business environment, by developing a set of characteristics of routing tables that implement particularly common import and export policies. Although stated in my previous blog entry, I feel the following is most relevant to this work, so it is repeated here:

Route Export Policy (to whom: what routes are exported to them)
Customer: {provider, customer, peer, internal}
Sibling: {provider, customer, peer, internal}
Peer: {customer, internal}
Provider: {customer, internal}

Using this export policy and a good deal of formal mathematics, Gao proves that if all ASes follow these rules, then the AS path of every routing table entry fits one of six explicit patterns of inter-AS paths. Given this conclusion, a set of routing tables, some heuristics, and a dash of algorithmic magic, Gao produces experimental results of what may well be the AS relationships in the internet. Ultimately, 99.1% of the relationships inferred by her algorithm were confirmed by AT&T internal information. Quite impressive, if you ask me.
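To make the inference idea concrete, here is a rough sketch of the degree-based heuristic as I understand it. This is my own simplification, not Gao's published algorithm: it treats the highest-degree AS on each path as the top of the "uphill/downhill" walk and ignores the sibling and peer cases that the full algorithm handles.

```python
from collections import defaultdict

def node_degrees(as_paths):
    """Count distinct neighbors of each AS across all observed paths."""
    neighbors = defaultdict(set)
    for path in as_paths:
        for a, b in zip(path, path[1:]):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {asn: len(n) for asn, n in neighbors.items()}

def infer_relationships(as_paths):
    """Return a dict mapping (a, b) -> 'provider', meaning a is b's provider."""
    degree = node_degrees(as_paths)
    rel = {}
    for path in as_paths:
        # The highest-degree AS on the path is assumed to be its top provider.
        top = max(range(len(path)), key=lambda i: degree[path[i]])
        for i, (a, b) in enumerate(zip(path, path[1:])):
            if i < top:
                rel[(b, a)] = 'provider'   # uphill segment: b provides transit to a
            else:
                rel[(a, b)] = 'provider'   # downhill segment: a provides transit to b
    return rel

# Toy example with made-up AS numbers and paths.
paths = [[1, 701, 7018, 2], [3, 701, 7018, 4]]
print(infer_relationships(paths))
```

The real algorithm additionally reconciles conflicting inferences (e.g. both directions marked provider) into sibling or peer relationships, which is where the routing tables and heuristics do the heavy lifting.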

Background material for this paper is the previous paper, "Interdomain Internet Routing".

This paper was very impressive in the results it presented: 99.1% of the inferred relationships were confirmed by actual information, in the face of a few interesting assumptions and some heuristic guesswork. The paper probably could have been a bit easier to read had it used less formal mathematics, but this is a small matter of style; Gao's mathematical formalisms ultimately got the point across. The analysis of the results was also excellent work. Overall an excellent and inspiring paper. Definitely a keeper.

Interdomain Internet Routing

The wide area network architecture is a collection of interconnected Autonomous Systems (ASes). BGP is the protocol used to communicate routes and route policies between these ASes. IGPs are used to route packets (even BGP packets) through an AS, and typically use either link-state or distance-vector protocols. The goals for BGP were scalability, policy enforcement, and cooperation between ASes despite their being competitors. BGP links are either transit links or peering links. A transit link is one over which a provider carries paid-for traffic from a customer. A peering link is one over which two ASes carry traffic for mutual benefit, usually to allow customers on both ends to reach each other. Neither AS pays the other because the traffic carried is roughly symmetric.

An AS wants to make as much money as possible and provide the best service to its customers. Therefore, an AS's best interest routing policy boils down to the following:

Route Export Policy (to whom: what routes are exported to them)
Customer: provider routes, customer routes, peer routes, and some internal routes
Peer: customer routes and some internal routes
Provider: customer routes and some internal routes
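The export rules above are essentially a small lookup table. A minimal sketch, with route classes and names of my own choosing rather than anything from the lecture notes:

```python
# Which classes of learned routes an AS advertises to each type of BGP neighbor.
EXPORT_POLICY = {
    "customer": {"provider", "customer", "peer", "internal"},  # customers see everything
    "peer":     {"customer", "internal"},                      # peers see only our customers
    "provider": {"customer", "internal"},                      # providers see only our customers
}

def exportable(route_learned_from, neighbor_type):
    """True if this AS will advertise such a route to that kind of neighbor."""
    return route_learned_from in EXPORT_POLICY[neighbor_type]

# Customer routes are the ones we are paid to carry, so they go everywhere;
# provider and peer routes are only passed down to customers.
assert exportable("customer", "provider")
assert exportable("provider", "customer")
assert not exportable("provider", "peer")
```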

Import policy is based on the following preferences, in order of priority. Local preference comes first: a route learned from a customer is preferred over one from a peer, which is preferred over one from a provider. If two routes to the same destination are equally preferred (say, both learned through peers), then we choose the one with the shortest AS path length. If those match, we choose the smallest multi-exit discriminator (MED). Otherwise we prefer eBGP-learned paths over iBGP-learned paths, and after that come the shortest internal path and finally the lowest router ID, if all else is equal. MEDs are useful for preventing the free-loading problem between ASes, which is also responsible for asymmetric traffic paths.
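This preference order reads naturally as a sort key. The following is a hedged sketch of that route-selection order; the field names are mine, not a real BGP implementation, and real BGP compares MEDs only among routes from the same neighboring AS, which this simplification glosses over.

```python
from dataclasses import dataclass

@dataclass
class Route:
    local_pref: int          # set by import policy: customer > peer > provider
    as_path_len: int         # length of the advertised AS path
    med: int                 # multi-exit discriminator, lower is preferred
    learned_via_ebgp: bool   # eBGP-learned beats iBGP-learned
    igp_cost: int            # cost of the internal path to the exit router
    router_id: int           # final tie-breaker

def preference_key(r: Route):
    """Tuple ordering mirrors the priority list in the text above."""
    return (
        -r.local_pref,                   # 1. highest local preference wins
        r.as_path_len,                   # 2. then shortest AS path
        r.med,                           # 3. then lowest MED
        0 if r.learned_via_ebgp else 1,  # 4. then eBGP over iBGP
        r.igp_cost,                      # 5. then shortest internal path
        r.router_id,                     # 6. then lowest router ID
    )

def select_best(routes):
    return min(routes, key=preference_key)

# A customer-learned route (local_pref=100, an assumed value) beats a shorter
# peer-learned path (local_pref=80) because local preference is checked first.
candidates = [
    Route(100, 4, 0, True, 10, 1),
    Route(80, 2, 0, True, 5, 2),
]
print(select_best(candidates).local_pref)  # 100
```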

Some interesting background information here is the difference between link-state routing and distance-vector routing in IGPs. In link-state routing, each switch in the network constructs its own image of the network topology and makes routing decisions based on that image. Distance-vector routing uses the Bellman-Ford algorithm, so at each time step neighbors exchange "best path" information with one another.
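For concreteness, here is a tiny distance-vector example on an assumed three-node topology (my own toy, not from the paper), where each node repeatedly relaxes its distance vector with information from its neighbors until nothing changes:

```python
INF = float("inf")

# Assumed toy topology: node -> {neighbor: link cost}.
LINKS = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2},
    "C": {"A": 4, "B": 2},
}

# Each node starts knowing only the distance to itself.
vectors = {n: {d: (0 if d == n else INF) for d in LINKS} for n in LINKS}

changed = True
while changed:  # repeat the neighbor exchange until no vector improves
    changed = False
    for node, nbrs in LINKS.items():
        for nbr, cost in nbrs.items():
            # "Best path" information received from this neighbor.
            for dest, dist in vectors[nbr].items():
                if cost + dist < vectors[node][dest]:
                    vectors[node][dest] = cost + dist  # Bellman-Ford relaxation
                    changed = True

print(vectors["A"])  # expected: {'A': 0, 'B': 1, 'C': 3}
```

In a link-state protocol, by contrast, each node would flood its local link costs to everyone and run shortest-path computations over a full map of the topology instead of trusting its neighbors' summaries.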

This was an educational "paper", and having already read the next one, I can see it serves as excellent background reading. The organization could have been better by juxtaposing the export and import policy sections instead of placing an interlude on the design of BGP between the two. The paper also lacks much discussion of failover and scalability, instead just mentioning a few issues here and there.

A question I have is: if the AS path that's received contains duplicate entries, why can't the route importer detect this and eliminate the duplicate AS entries in order to know the true AS hop length? Are there honor codes among ASes that prevent them from cheating the system? Another interesting discussion topic would be the security flaws of BGP. What damage could a malicious AS do to earn more money without being detected? What are the legal ramifications of this?

Monday, September 1, 2008

End-to-End Arguments in System Design

It is commonplace today in network engineering for features and functionality to be placed at the end hosts instead of being implemented in the network architecture itself. Although this seems appropriate at first glance, justifying the end-to-end argument is non-trivial and requires careful consideration. The argument itself is stated briefly as: any feature between end hosts must always depend, at least in part, on the end hosts themselves; thus, the communication system cannot completely implement any end-host feature. To justify this argument the paper delves into a classic example, implementing reliable communication either in the network architecture itself or in the end hosts. If reliability were a network feature, the end hosts would still have to do their own error checking, since the file system, the file transfer program, or the buffering system may fail. Thus, the burden on the end hosts is not lifted. It is noted, however, that an appropriately low failure rate is still necessary to ensure that an exponential number of retried packets doesn't completely flood the network. The authors, in fact, note that it is probably sufficient to have an error rate below that of the application level.
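The file-transfer example boils down to an end-to-end check that only the hosts themselves can perform. A minimal sketch, with hypothetical helper names (transfer_file and request_retransmission are placeholders, not real APIs):

```python
import hashlib

def sender_digest(path):
    """Checksum the file as it exists on the sending host's disk."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def receiver_verify(path, expected_digest):
    """End-to-end check: recompute the checksum after the transfer completes."""
    with open(path, "rb") as f:
        actual = hashlib.sha256(f.read()).hexdigest()
    return actual == expected_digest

# Usage sketch: even if every link were "reliable", only this comparison
# between the two ends establishes that the file arrived intact, because
# disks, buffers, and programs on either host can still corrupt the data.
# digest = sender_digest("data.bin")
# transfer_file("data.bin", dest_host)           # hypothetical transfer step
# if not receiver_verify("/incoming/data.bin", digest):
#     request_retransmission()                   # hypothetical retry hook
```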

Another reason for not implementing reliability at the network level is simply that some applications are not going to need it, or even want it. By placing the burden of implementation on the higher levels, only the applications that need it have to pay for it. This reasoning applies not only to reliable communication, but also to data acknowledgments. For some applications these are completely unimportant (e.g. voice communication). Additionally, simple acknowledgments may be useless to the end hosts, which are only interested in knowing whether or not the receiver acted appropriately on the data; in that case, the receiver would be sending its own acknowledgments anyway. The paper also delves into encryption, observing that if it were a network feature then authenticity would still have to be checked by the application, and the network would have to be completely trusted by those that use it. By implementing encryption on the end hosts, they do nearly the same work and don't have to worry about trusting the routers. The paper does, interestingly, note that one good reason for network-level encryption may be to prevent end hosts from freely transmitting unencrypted information that they shouldn't. The paper goes on to apply similar arguments to duplicate message suppression, FIFO message delivery, and transaction management.

Some good background information would actually be "The Design Philosophy of the DARPA Internet Protocols", which gives a more anecdotal analysis of why the network architecture turned out to be so basic, in contrast to this paper's style of formal reasoning. For instance, part of the reason for leaving greater functionality to the end hosts was simply that it was difficult to achieve consistency and correctness across a group of interconnected networks if they were all forced to implement anything more than the very basics. Some interesting discussion topics may include what problems we've run into as a result of pushing features more and more onto network programmers. How difficult has it been to develop features on the end hosts (such as reliable file transfer), and what mistakes have we made along the way?

The paper's main focus was to give straightforward and logical arguments in support of the end-to-end argument, which I believe it did quite successfully and very clearly. However, in the design of something as complex as the internet, it might have been more convincing to give examples of systems whose network architectures were feature-rich and consequently experienced problems supporting a variety of communications. Of course, such an example may not exist, but if it did, it would lend incredible support to the authors' arguments.

The paper should be kept in the syllabus because its reasoning is clear, straightforward, and most importantly, convincing. Although what the paper says is repeated in other papers, it may be useful to have it as a condensed statement of the end-to-end argument.

Because of the paper's clear reasoning, few issues are left for future research, aside from possibly throwing the end-to-end argument out the window and seeing how far you can get in "beating" the internet in terms of the 7 goals listed in the previously discussed paper.

The implication of the work is that we can rest easy at night, knowing that over the last 20 years we made the right engineering decisions in designing the Internet.

The Design Philosophy of the DARPA Internet Protocols

In the U.S. Department of Defense's initial efforts to build a network of networks, 7 clearly stated goals governed the design. First on the list was network resiliency, which naturally led to a stateless networking infrastructure, as it was too costly to replicate state sufficiently to survive network failures. Instead, "fate-sharing" was adopted, trading less expensive network hardware for more expensive and potentially faulty end hosts. The second goal was to support many types of communication services, leading to the simplicity of the services offered by the network architecture. The third goal, to accommodate a variety of networks, also argued in favor of this approach, since that was generally felt to be a much easier task if the network were kept simple.

Some of the other goals were not met as successfully. Distributed management lacked (at the time of writing) sufficient software tools, and routers were often configured manually. Cost effectiveness has also been questioned, as a more featureful network would better support particular types of communication, though clearly not all. Attaching hosts with little effort was also questioned, because the bare-bones internet architecture forced features to be implemented on the end hosts. Finally, accountability was hardly addressed at all, and is still an active area of research.

Some background on the OSI model and how different types of services are partitioned in the network stack today reveals the fruit of the design decisions analyzed in the paper. In particular, the hourglass shape of the internet, where IP is the common protocol over which all other protocols must communicate, demonstrates the simplicity of the services the internet provides at its core and which services must be implemented at the end hosts.

I felt the paper did a relatively good job of recalling the historical events in the creation of the internet architecture and why particular design decisions were made. Enumerating the 7 goals at the beginning and digging into each one sequentially provided a writing structure that was easy to follow. Indeed, looking at current internet protocols reveals little about why they were constructed, and the paper does a good job of recalling why they became what they are today.

Topics for class discussion may include what is currently being done in the area of accountability in internet architectures, and whether such things can be accomplished without burdening the network to such a degree as to degrade service to customers. Also, what were the tools like during the initial construction phases of the internet for providing selective connectivity between separate administrative domains (ASes, as we call them now)? What are the privacy/legal issues with implementing detailed accountability of customer traffic?

The open issues now are, as far as I know, accountability and the cost effectiveness of the internet architecture. Could we develop a mesh of partially specialized networks and achieve greater network utilization per dollar? How can we track traffic flows on a per-customer basis and charge appropriately? What are the privacy issues with this?

Taken as a whole, Clark's work allows us to see with much more clarity why, quite simply, things are the way they are. Years of guesswork and trials went into building an internet architecture that met the initial 7 goals, and this paper takes a proud look at what has been accomplished through the best engineering insights we could muster. The implication of this paper is that we are now more educated about these insights and can use them in our own analysis of the Internet.