Thoughts on Systems

Emil Sit

May 25, 2006 - 5 minute read - Research conferences

NSDI 2006, Day 2 Afternoon Sessions

Following up my summaries of the morning sessions, this post reviews the second half of the second day of NSDI 2006. (This post was updated to correct a bogus timestamp and add links to the papers in the first session.)

Measurement and Analysis

Haifeng Yu began the afternoon session with his award paper on the availability of multi-object operations (PDF). The main concern is that prior work has largely focused on the availability of single objects. However, it is often the case that there are many objects stored in a distributed system that are related—in fact, an application-level operation likely requires accessing multiple objects at the storage layer. This paper takes a look at how object replica placement policies impact the ability of such operations to complete successfully. The basic trade-off is in whether related objects are clustered on the same hosts (partitioning) or distributed across multiple hosts (random placement). The paper discusses when one or the other may be more appropriate.
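
To make the trade-off concrete, here is a back-of-the-envelope sketch (not the paper’s actual model): an operation must read k related objects, each replicated r times, and hosts fail independently with probability f. The numbers below are invented for illustration.

```python
# Back-of-the-envelope comparison of replica placement policies for an
# operation that must read k related objects. Assumes independent host
# failures; parameters are hypothetical, not taken from the paper.

def clustered_availability(f, r):
    # Partitioning: all objects' replicas live on the same r hosts, so the
    # operation succeeds if any one of those hosts is up.
    return 1 - f ** r

def spread_availability(f, r, k):
    # Random placement (approximated): each object sits on its own disjoint
    # set of r hosts, and the operation needs every object to be reachable.
    return (1 - f ** r) ** k

f, r, k = 0.1, 3, 20  # 10% host failure probability, 3 replicas, 20 objects
print(clustered_availability(f, r))    # ~0.999
print(spread_availability(f, r, k))    # ~0.980
```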

Suman Nath followed up with a related paper on correlated failures (PDF). The key points in this paper concern how correlated failures affect the accuracy and utility of failure prediction models, and what can be done about it. Based on an analysis of several traces, Nath first argued that existing failure predictors work well enough but don’t necessarily help: correlated failures of small numbers of nodes tend not to have much impact anyway, so prediction doesn’t matter much, whereas correlated failures of large numbers of nodes are quite hard to predict but tend to have a huge impact on availability and durability. This means, for example, that additional replicas or fragments quickly offer diminishing returns. He introduced a model for generating failures based on a bi-exponential distribution that better matched the analyzed traces and argued that such considerations are important to keep in mind during design. The natural question raised by this was how generalizable these results are, since different node populations will have very different failure distributions: choose your traces carefully.
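
The flavor of a bi-exponential (mixture of two exponentials) failure-size model is easy to sketch; the weights and means below are made up, not the parameters fit in the paper.

```python
import random

# Illustrative sampler for correlated-failure event sizes drawn from a
# mixture of two exponentials: mostly small events, occasionally a large
# one. All parameters here are invented for illustration.

def sample_failure_size(p_small=0.95, mean_small=2.0, mean_large=40.0):
    """Return the number of nodes taken down by one correlated failure event."""
    mean = mean_small if random.random() < p_small else mean_large
    return max(1, int(round(random.expovariate(1.0 / mean))))

sizes = [sample_failure_size() for _ in range(10000)]
print(max(sizes), sum(sizes) / len(sizes))  # rare large events dominate the tail
```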

Bianca Schroeder closed the session by presenting her experiences with various benchmarking tools (PDF). She observed that despite identical load parameters (e.g., requests per second), two different benchmarking tools could show very different results. The reason turns out to lie in whether requests are generated using an open or a closed system model. In an open system model, “users” (and hence requests) arrive continuously: there is no linkage from the completion of one request to the arrival of the next. Such a model might accurately capture the effect of Slashdot flash crowds. A closed system model simulates a fixed number of users executing some browsing workload: here request completions spawn new requests. The paper goes into detail about different web benchmarking tools, the model each uses, and when it may be appropriate to use one type of model or the other.
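
The distinction is easiest to see in a toy load generator. This is a minimal sketch of the two models, not any particular benchmarking tool; the arrival rate, user count, think time, and service time are placeholders.

```python
import random

# Open model: requests arrive in a Poisson stream, independent of how long
# earlier requests take to complete.
def open_model(rate, duration):
    t, arrivals = 0.0, []
    while t < duration:
        t += random.expovariate(rate)
        arrivals.append(t)
    return arrivals

# Closed model: each simulated user issues its next request only after the
# previous one completes and a think time elapses (service time is fixed here
# to keep the sketch short).
def closed_model(users, think_time, service_time, duration):
    arrivals = []
    for _ in range(users):
        t = 0.0
        while t < duration:
            arrivals.append(t)
            t += service_time + random.expovariate(1.0 / think_time)
    return sorted(arrivals)

print(len(open_model(rate=50, duration=60)))
print(len(closed_model(users=100, think_time=2.0, service_time=0.1, duration=60)))
```

In the closed model, slow responses automatically throttle the offered load; in the open model they do not, which is exactly why the two kinds of tools disagree under identical nominal request rates.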

Architectures and Abstractions (and Email)

The final session of the day had two talks about Internet architectures and two about dealing with spam.

Niraj Tolia spoke about data-oriented transfer, an architecture designed to make innovation in data transfer protocols easier. Currently, many protocols (e.g., HTTP, SMTP, SFTP) involve two basic phases: content negotiation and then content transfer. Negotiation involves selecting content (perhaps based on language, supported encodings, encryption algorithms, etc.); transfer is just moving the actual bits around. There are many mechanisms for moving bits around, from HTTP chunked encoding to straight TCP writes to BitTorrent, but adding support for a new bit-moving mechanism to each protocol is difficult. The paper proposes an architecture that separates negotiation from delivery and shows how it can be easily incorporated. The concept seems similar to GSSAPI or PAM, which abstract out security/authentication protocols. I suspect it will be hard for this idea to gain traction, though.
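
The separation can be pictured as a plugin interface: the application protocol negotiates what to fetch and hands an opaque object identifier to whichever transfer mechanism is available. This is not the paper’s actual API, just an illustration of the split; the class and method names are invented.

```python
from abc import ABC, abstractmethod

class TransferPlugin(ABC):
    """How the bits move; new mechanisms plug in here without touching
    the application protocol."""
    @abstractmethod
    def fetch(self, object_id: str) -> bytes:
        """Retrieve the bytes named by an opaque object identifier."""

class HTTPPlugin(TransferPlugin):
    def fetch(self, object_id: str) -> bytes:
        ...  # plain HTTP download

class SwarmPlugin(TransferPlugin):
    def fetch(self, object_id: str) -> bytes:
        ...  # BitTorrent-style swarming download

def receive(negotiated_object_id: str, plugin: TransferPlugin) -> bytes:
    # HTTP, SMTP, etc. decide *what* to get; any plugin decides *how*.
    return plugin.fetch(negotiated_object_id)
```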

Dilip Joseph spoke about Ocala, which is designed to facilitate interoperability between different overlays. It does this by (naturally) introducing a layer of abstraction between applications and the underlying overlay. Additional naming tricks allow users to cleanly specify which overlay they want to use. The implementation is complete and runs on the major operating systems; Dilip gave a live demo during the talk.
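
The naming idea amounts to letting the name itself select the overlay. The sketch below is purely illustrative; the suffix convention and overlay names are invented for this example, not Ocala’s actual scheme.

```python
# Dispatch on a name suffix to pick an overlay, falling back to direct IP.
# Overlay names and the suffix convention here are hypothetical.
OVERLAYS = {
    "overlay-a": lambda host: f"connect to {host} via overlay A",
    "overlay-b": lambda host: f"connect to {host} via overlay B",
}

def resolve(name: str) -> str:
    host, _, suffix = name.rpartition(".")
    if suffix in OVERLAYS:
        return OVERLAYS[suffix](host)
    return f"connect to {name} directly"

print(resolve("server.example.overlay-a"))
print(resolve("server.example.com"))
```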

Cleverly, the program committee placed two alternative proposals for handling spam together at the end of the longest day: everyone has a stake in spam, and anti-spam proposals always face harsh criticism, so two competing proposals were sure to generate some discussion.

Mike Walfish started off with the grand vision of distributed quota enforcement (DQE): a distributed mechanism for ensuring that no one (spammer or not) sends more than an allocated amount of mail per day. Since there is a huge disparity between the number of mail messages a spammer sends per day and the number that you and I send, this concept should work out quite nicely. It would avoid perhaps the worst problem of all: legitimate mail that gets mis-filed as spam and subsequently ignored. Mike carefully deflected criticism of the concept by focusing on the technical aspects of quota enforcement: stamps (not necessarily associated with money) are generated based on the quota and canceled in a loosely managed distributed system. The simple requirements allow for a relatively efficient solution.
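
The core loop is simple enough to sketch: attach a stamp from your daily allocation to each message, and have receivers cancel stamps so reuse is detected. This is a greatly simplified sketch; the real design replaces the single in-memory set with a loosely consistent distributed store, and the key handling below is invented for illustration.

```python
import hashlib, hmac

QUOTA_KEY = b"per-sender secret issued by the quota allocator"  # hypothetical

def make_stamp(sender: str, day: str, n: int) -> str:
    """Stamp number n of the sender's daily quota."""
    msg = f"{sender}|{day}|{n}".encode()
    return hmac.new(QUOTA_KEY, msg, hashlib.sha256).hexdigest()

canceled = set()  # stand-in for the distributed cancellation service

def accept(stamp: str) -> bool:
    """Accept mail only if this stamp has not been used (canceled) before."""
    if stamp in canceled:
        return False
    canceled.add(stamp)
    return True

s = make_stamp("alice@example.com", "2006-05-25", 1)
print(accept(s))  # True: first use
print(accept(s))  # False: reuse detected
```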

Michael Kaminsky followed with a whitelisting proposal based on social networks, also with the goal of avoiding false positives. Re allows mail through without spam filtering if it is from your friend or a friend of a friend. The details of Re are in how it provides friend lists and friend-of-friend checking without sacrificing privacy requirements: friend lists should be private, and queries should reveal only whether the person sending mail shares a friend with you. The authors use a relatively new cryptographic mechanism to perform this matching task while protecting confidentiality; Michael briefly discussed the evaluation against real e-mail traces, where false positives were inferred from other known patterns. Unlike DQE, Re can be deployed at smaller scale but may not be as comprehensive; it also requires more work on the part of users (specifying friend lists).
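
The whitelisting decision itself is just a friend-of-friend check; the hard part, and the paper’s contribution, is computing it without exposing anyone’s friend list, which is where the cryptographic private-matching protocol comes in. The sketch below shows only the plain, non-private logic, with a made-up friend graph.

```python
# Non-private friend-of-friend whitelisting check (illustrative data only;
# Re computes this result without revealing the friend lists themselves).
friends = {
    "me": {"alice", "bob"},
    "alice": {"me", "carol"},
    "bob": {"me", "dave"},
}

def whitelisted(recipient: str, sender: str) -> bool:
    """Accept without spam filtering if the sender is a friend, or a friend
    of one of the recipient's friends."""
    my_friends = friends.get(recipient, set())
    if sender in my_friends:
        return True
    return any(sender in friends.get(f, set()) for f in my_friends)

print(whitelisted("me", "alice"))  # friend
print(whitelisted("me", "carol"))  # friend of a friend (via alice)
print(whitelisted("me", "eve"))    # stranger: falls back to normal filtering
```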