Thoughts on Systems

Emil Sit

May 12, 2006 - 5 minute read - Research conferences

NSDI 2006, Day 1

This week was the Third Symposium on Networked Systems Design and Implementation, sponsored by Usenix and held this year in San Jose. As with any conference, there were many opportunities for networking and meeting other researchers: I met and caught up with students (mostly) from CMU, Cornell, NYU, UCSB, UCLA, UMass Amherst, and UT Austin, as well as old MIT colleagues who are now out taking over the world.

There were a lot of talks that were interesting to me. I liked talks that presented systems with a neat underlying idea, such as Evan Cooke’s system for expanding the view of honeypots or Kyoung-Soo Park’s way of leveraging OS optimizations for web server performance while cleanly factoring out common functionality (presented by Vivek Pai); there were also impressive systems that resulted from clean design and solid engineering, like Dilip Joseph’s overlay network layer (which he demoed live during his talk!) and Ashwin Bharambe’s multi-player game system. This post summarizes the talks from the first day; I’ll try and post my notes from the next few days soon.

Udi Manber, Google: Keynote

The conference opened with a keynote by Udi Manber, now of Google. He recycled many standard slides about the growth of Google and their goals. He presented a summary of the Google architecture: a dispatch server integrating results from index servers and document servers, and a number of services on the side like spell-checking and AdSense. The main point of his talk was to show why search is still a hard and interesting problem. He gave three main reasons: first, users ask questions using one or two words that are hard to interpret; second, websites don’t always make it easy to find answers; third, there is no real curriculum or research agenda for “search” in academia.

Google works very hard on the first two. He gave some examples of how hard it is to understand users (e.g., you need domain knowledge to correctly interpret a query for “Cofee Anan” and some luck to handle a request for “Briney Spears” from someone looking for pickles) and how websites don’t always make it easy to index information (e.g., a website that’s just a scanned image of product brochures). Google’s index and spell-checker try to take into account context and domain knowledge to solve the first case, and they’ve produced some technologies like sitemaps for the second. He did say that Google does not yet try OCR on all images. The third point is something that conference attendees could help with. However, he did not offer any concrete suggestions about curricula or research directions in the talk or in response to questions. (The talk was mostly advertising Google and focusing on Google’s features.)

In terms of teaching users how to search, there are books like O’Reilly’s Google Hacks. Offhand, there are a few areas that I would focus on as a graduate student if I wanted to build large search systems: natural language processing, sub-linear algorithms, and distributed systems; Mr. Manber’s point is that there’s probably no school that explicitly groups those three areas together for a single degree!

Replication

Unfortunately, I missed the first session of the conference because I was doing last minute polishing of my own talk. I opened the second session by presenting our paper on efficient replica maintenance.

Jiandan Zheng from UT Austin gave a talk about PRACTI replication, which is a framework that can capture a wide range of replication algorithms. PRACTI captures three dimensions of distributing and maintaining up-to-date replicas: partial replication (as opposed to full replication), arbitrary consistency (allowing clients to see stale or out-of-order updates), and topology independence (allowing any structure of node synchronizations). They have an implementation of basic replication algorithms, and one of their goals is to implement the algorithms of existing systems like Bayou or Coda within their framework.

James Mickens gave a talk about predicting node availability based on past observations and whether it was possible to take advantage of any predictability. He analyzed a number of traces and classified the types of node availability, for example, unstable nodes versus always-on nodes. The talk did not discuss the possibility of automatic classification, perhaps using basic machine learning. PlanetLab appeared to be relatively hard to predict.

Tools

The last session of the first day presented three tools. Time dilation was introduced as a technique for evaluating the potential impact of future trends using current hardware: virtual machines are told that time is passing more slowly than it really is, which allows a VM to do more in the same amount of virtual time.
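To make the arithmetic behind time dilation concrete, here is a minimal sketch. The function names and the idea of a single scalar "dilation factor" are illustrative assumptions, not the interface of the system presented in the talk.

```python
def virtual_elapsed(real_seconds: float, dilation_factor: float) -> float:
    """Time the guest believes has passed for a given amount of real time.

    A dilation_factor of 10 means the guest's clock runs 10x slower
    than the wall clock (hypothetical parameter, for illustration only).
    """
    if dilation_factor <= 0:
        raise ValueError("dilation_factor must be positive")
    return real_seconds / dilation_factor


def perceived_rate(physical_rate: float, dilation_factor: float) -> float:
    """Rate of a resource (e.g. bits per second) as measured by the guest.

    The same amount of work completes in less virtual time, so the
    resource looks faster by the dilation factor.
    """
    if dilation_factor <= 0:
        raise ValueError("dilation_factor must be positive")
    return physical_rate * dilation_factor


# A 1 Gbps physical link observed by a guest whose clock runs 10x slower.
link_bps = 1e9
factor = 10
bits_moved = link_bps * 10            # 10 real seconds of transfer
guest_seconds = virtual_elapsed(10, factor)
guest_view_bps = bits_moved / guest_seconds
```

In this sketch the guest sees ten real seconds as one virtual second, so the 1 Gbps link appears to it as a 10 Gbps link.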

Evan Cooke presented the Dark Oracle, which is a clever idea for expanding the view of honeynets and network telescopes. These have traditionally relied on measuring traffic to small and contiguous unused address spaces. However, these may be easy for attackers to avoid. Fortunately, based on their analysis of announced eBGP routes and looking internally at some iBGP/OSPF routes, many addresses within allocated and active address spaces are unused. Dark Oracle dynamically discovers these addresses and redirects incoming traffic destined for those unused (dark) addresses to a honeynet or analysis tool. It can also be used to trap outbound traffic for unused addresses as a way to detect malicious hosts on a local network.
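The core idea of finding dark space inside an allocated prefix can be sketched with Python's standard ipaddress module. This is only an illustration under simplifying assumptions: the prefixes are documentation examples, the in-use list is hard-coded rather than learned from routing data, and the helper names are hypothetical. It is not the Dark Oracle implementation.

```python
import ipaddress


def dark_subnets(allocated, in_use):
    """Return the parts of `allocated` not covered by any prefix in `in_use`.

    Both arguments are CIDR strings of the same IP version. A real system
    would learn the in-use prefixes from routing data; here they are given.
    """
    remaining = [ipaddress.ip_network(allocated)]
    for used in (ipaddress.ip_network(p) for p in in_use):
        next_remaining = []
        for net in remaining:
            if net.subnet_of(used):
                continue  # entirely in use, nothing dark left here
            if used.subnet_of(net):
                # Carve the used prefix out and keep the leftover pieces.
                next_remaining.extend(net.address_exclude(used))
            else:
                next_remaining.append(net)  # disjoint, keep as-is
        remaining = next_remaining
    return sorted(remaining)


def is_dark(address, dark):
    """True if `address` falls inside one of the dark subnets."""
    addr = ipaddress.ip_address(address)
    return any(addr in net for net in dark)


dark = dark_subnets("192.0.2.0/24", ["192.0.2.0/25", "192.0.2.128/26"])
```

With these example inputs the only dark range is 192.0.2.192/26, so a packet for 192.0.2.200 would be a candidate for redirection to a honeynet, while one for 192.0.2.10 would be delivered normally.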

Finally, Patrick Reynolds from Duke closed the day by presenting Pip, a system for detecting anomalies in distributed systems. It works by allowing the programmer to annotate the system with events (or extending the middleware to automatically annotate, like I’d probably want to do for dm’s libasync); events are then collected from nodes in the system and analyzed centrally. The programmer can specify expectations for system behavior in a domain-specific language; Pip identifies when these expectations are violated and can present a trace of actual behavior. It also includes a graphical tool for exploring traces. Pip has been successfully used to find bugs in the implementation of a number of published systems. I’m excited about the idea of this tool, but it looks like it would be a bit of work to get it usable with my current development toolchain.
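As a rough illustration of the annotate, collect, and check workflow, here is a small sketch. It does not use Pip's expectation language or its event format; the record fields, the function names, and the assumption that timestamps from different nodes are directly comparable are all simplifications made up for this example.

```python
from collections import defaultdict


def group_by_path(events):
    """Group event names by request path, ordered by timestamp.

    Assumes timestamps from different nodes are comparable, which a
    real distributed system cannot take for granted.
    """
    paths = defaultdict(list)
    for event in sorted(events, key=lambda e: e["ts"]):
        paths[event["path"]].append(event["name"])
    return dict(paths)


def find_violations(events, expected):
    """Return {path_id: observed_names} for paths that differ from `expected`."""
    return {
        path_id: names
        for path_id, names in group_by_path(events).items()
        if names != expected
    }


# Events as they might be collected centrally from annotated nodes.
events = [
    {"path": 1, "node": "a", "ts": 1, "name": "recv_request"},
    {"path": 1, "node": "b", "ts": 2, "name": "lookup"},
    {"path": 1, "node": "a", "ts": 3, "name": "send_reply"},
    {"path": 2, "node": "a", "ts": 4, "name": "recv_request"},
    {"path": 2, "node": "a", "ts": 5, "name": "send_reply"},
]
expected = ["recv_request", "lookup", "send_reply"]
violations = find_violations(events, expected)
```

Here path 2 replies without ever performing a lookup, so it is reported along with the trace of what actually happened, which is the kind of output a programmer would then explore.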