Reduce your context switch delay

June 25th, 2008

Sometimes, simple shell scripts can save a lot of time. Recently, I noticed myself waiting for various unit tests to complete by surfing the web: a surefire way to be distracted for more than the time it takes for the tests to complete (or fail). Enter the following script, which I call notify:

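#!/bin/sh
# Run the command given as arguments, then pop up a window with its exit status.
"$@"
status=$?
# Quote "$1" so a command path containing spaces is handled correctly.
xmessage -center "$(basename "$1") done, status $status"
exit $status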

You run it like this:

notify make all

at which point make runs along merrily. Of course, you replace make with whatever command you run that takes a long time to complete. When make exits, a small window appears in the middle of your screen that says “make done, status 0”. This immediately notifies you to stop surfing and get back to work.
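If you would rather keep this in your shell startup file, here is a minimal sketch of the same idea as a shell function. The fallback to printing on stderr when there is no X display (or no xmessage) is my own assumption for illustration, not part of the script above.

```shell
#!/bin/sh
# notify: run a command, report how it finished, and pass its exit status through.
# Assumption: when xmessage or an X display is unavailable, print to stderr instead.
notify() {
    "$@"
    status=$?
    msg="$(basename "$1") done, status $status"
    if [ -n "${DISPLAY:-}" ] && command -v xmessage >/dev/null 2>&1; then
        xmessage -center "$msg"
    else
        printf '%s\n' "$msg" >&2
    fi
    # Hand the wrapped command's status back to the caller.
    return "$status"
}
```

Because the function returns the original status, something like `notify make all && notify make install` still stops at the first failure.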

So… get back to work!

Characterizing failures in a data center

April 30th, 2008

Part of my research has been investigating how to build storage systems that can provide availability and durability despite failures. It’s been interesting to see recent papers that characterize failures, such as Ethan Katz-Bassett’s NSDI paper about Hubble, or last year’s papers about drive failure characteristics from Google and from several high performance computing facilities. Today, while catching up on reading High Scalability, I came across a Jeff Dean presentation about scalability at Google, which includes fascinating anecdotal tidbits about failures over a year in a typical Google data center, with frequency, impact and recovery time, such as:

  • 0.5 overheating events: power down most machines in <5 mins, ~1–2 days to recover.
  • 1 PDU failure: 500–1000 machines suddenly disappear, ~6 hours to recover.
  • 1 network rewiring: rolling 5% of machines down over a 2 day span.
  • 5 racks go wonky: 40–80 machines see 50% packet loss.

His presentation includes several other classes of rack, network, and machine failures that you can expect to see with real hardware, and that scalable, distributed systems have to cope with and hopefully mask. For the full list of failures, you can view the presentation (1.2 Mbyte PDF) or the video. I wonder how well Chord/DHash would fare in such an environment…

Choosing a camera for your small business

April 1st, 2008

If you are in a small business that needs the occasional picture—for record-keeping, documenting events, or including in promotional material—having a digital camera on hand is definitely useful. A friend recently wrote:

We want something that isn’t too complex, but takes decent shots for print and web. Possibly with a good zoom so we can get wide-angle [...] as well as portraits. Nothing crazy on either end, we don’t need to have multiple lenses and all that.

Would you recommend an slr?

For my friend’s uses, I have no hesitation recommending a digital SLR over a compact, ultra-compact, or superzoom. Normally, the main reasons for getting a smaller camera are price and portability. DSLRs have come down significantly in price (for almost the same cost as a high-end superzoom, you can purchase a quality, entry-level DSLR), although there are definitely functional compact cameras available for under $200. DSLRs do not fit into your pants pocket, unlike ultra-compacts; then again, a more functional compact or superzoom will also not fit in your pocket. A small business, however, is unlikely to require a camera that can be carried off in someone’s pants pocket.

Digital SLRs currently have numerous advantages, including:

  • high quality sensors,
  • fast focus and shutter response time, and
  • flexibility in terms of lens, lighting and processing.

This means excellent image quality, never missing a shot because of the camera, and room to grow. Much has already been written about the importance of a quality sensor: pixel for pixel, DSLRs will have better sensors, capable of taking clearer pictures with less noise in low-light conditions (e.g., the interior of a business). Similarly, even a basic lens will have higher quality optics than the average compact camera. The ergonomics and usability of digital SLRs are excellent: they are comfortable to hold, turn on instantly, and take pictures when you click the shutter. Modern DSLRs have excellent auto-exposure (“program”) modes, allowing them to function as point and shoots, but with the option of additional photographer control.

The reason I highly recommend a digital SLR for a small business in particular, however, is flexibility. First, digital SLRs almost always offer the option of RAW capture, which allows for great latitude in image-processing after the fact. Second, with a known brand like Canon or Nikon, it is easy to incrementally improve the capability of the camera by using additional lenses and off-camera lighting equipment. You may not want to own a plethora of lenses, but you may occasionally want to rent the highest quality professional gear. With rates starting from $25/weekend (from a local store like Calumet) or $50/week (from a mail-order shop like BorrowLenses.com), you can get what you need for a specific project, while having a quality camera around for regular use.

How to pick the camera for you? For a small business, don’t worry about perusing the many specs at dpreview.com. An entry-level (or one-generation-old medium-level) camera from Canon or Nikon, purchased at a reputable store like Amazon, B&H Photo or Adorama, will serve you well. If you know any photographers, choose the brand that they own, in case you have any questions or want to borrow lenses or flashes. A more expensive camera is generally unnecessary—you will know when you are ready to use one.

Clean up a Twitter feed with a Yahoo Pipe

March 17th, 2008

Twitter provides RSS/Atom feeds of your posts; with these feeds, your posts can be easily tracked in news readers like Google Reader, monitored in aggregators like FriendFeed or SocialThing!, and cross-posted into other blog services such as Tumblr. This idea works fine, except for the fact that Twitter has been co-opted to be not only an ambient intimacy service, but also a chat service. This can create noise in other people’s view of your feed—consider the chat versus status/micro-blog updates on, for example, Adam Darowski’s blog:

Twitter Microblog example

Lacking context for the replies, the individual messages may be hard to follow. Using Yahoo! Pipes, we can generate a clean RSS feed that can be used in FriendFeed or Tumblr. Seeing that the hashpip.es service (which filters out #hashtags from a Twitter feed) was built in part with Pipes, I built a simple pipe called Twitter Feed without Replies that anyone can parameterize and use to filter out their replies. Simply visit the pipe’s information page, enter your username, and get the results as RSS (under “More options”). The main downside of the current implementation is that the feed description and title cannot be parameterized as well.

Incidentally, Yahoo! Pipes were really easy to use and seem nicely designed for easy integration with other services. The above pipe took an hour to build and it was my first experience with the service. With a little more work it would probably be possible to build a pipe that parses/generates JSON, for use in programs such as the WordPress Twitter Widget, as well as RSS for feed readers. On the other hand, in those cases it is probably easier to take Twitter JSON output and filter that directly.
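To make the filtering rule itself concrete, here is a rough sketch in shell. It is not how the pipe is implemented and it does not parse RSS or JSON; it assumes a hypothetical input of one post per line and simply drops any line whose text starts with an @reply. The function name drop_replies is made up for this example.

```shell
#!/bin/sh
# drop_replies: read posts on stdin (assumed: one post per line) and
# print only those that do not begin with an @reply.
drop_replies() {
    awk '!/^[[:space:]]*@/'
}
```

For example, `printf '%s\n' 'at the cafe' '@bob see you then' | drop_replies` keeps only the first line. A real feed would need its item text extracted before a filter like this could be applied.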

Do @replies in microblogs bother you? Would you care enough to remove them?

How to use ssh to securely access the net

March 14th, 2008

Public wireless networks can be scary; you never know who might be sniffing your traffic, recording your GMail authentication cookies, or worse. Ideally, all of your net activity should be end-to-end authenticated and encrypted. Since this is not always feasible, ssh fortunately makes it easy to use an untrusted network by routing your traffic through a trusted end-point. All you need is an ssh client (OpenSSH, standard on most Linux/Mac systems, or PuTTY for Windows), an HTTP/HTTPS proxy (optional), and clients that support SOCKS5 (most software these days). These techniques are not new, but I didn’t really learn them until I started working at cafés, so it may be worth re-summarizing.

The steps are pretty straightforward.

  1. Enable dynamic port forwarding for ssh. This creates a SOCKS proxy on your localhost at a port you specify; this proxy will handle the connection forwarding, over the secure (authenticated and encrypted) ssh connection.

    I connect to our trusted server at work; if you don’t have a trusted server, you can try getting a free shell account. You can automatically enable dynamic port forwarding by setting DynamicForward in your ssh_config file (or creating a PuTTY profile) for your shell host.

  2. (Optional) Set up Polipo with a configuration file that points its parent proxy at the port you used for dynamic forwarding. I like using a separate web proxy so I can easily switch between tunneling through ssh and a direct connection by just swapping the web proxy configuration, without reconfiguring all my applications individually. A proxy also ensures that your DNS requests are not visible to the local insecure network.

  3. Configure all of your network applications to use the SOCKS proxy (or HTTP proxy). For application-specific instructions, you can view the Torify HOWTO; the “anonymizing” Tor network’s interface also uses an HTTP or SOCKS proxy, so the same instructions apply. (Unfortunately, Tor is neither secure (it has untrusted exit points) nor really anonymous (see any of Steven Murdoch’s papers about Tor) so I can’t recommend it. It’s slow too.) I tunnel Firefox, my Twitter client, and my IM client through the web proxy. If you choose not to use an HTTP proxy, Firefox and Pidgin both support directly talking to the SOCKS proxy.
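As a rough sketch of steps 1 and 2, the script below only prints the two configuration fragments involved; it does not modify any files. The host name shell.example.org and port 1080 are placeholders I picked for illustration, and the function names are made up. DynamicForward is the ssh_config option mentioned above; socksParentProxy and socksProxyType are the Polipo settings I would expect to use for chaining to a SOCKS proxy, so check them against your Polipo version.

```shell
#!/bin/sh
# Placeholder values (assumptions): substitute your own trusted host and port.
TRUSTED_HOST=shell.example.org
SOCKS_PORT=1080

# Print an ssh_config stanza that enables dynamic (SOCKS) forwarding
# whenever you connect to the given host.
ssh_stanza() {
    printf 'Host %s\n    DynamicForward %s\n' "$1" "$2"
}

# Print Polipo settings that send its requests through the SOCKS proxy
# that ssh opens on localhost.
polipo_stanza() {
    printf 'socksParentProxy = "localhost:%s"\n' "$1"
    printf 'socksProxyType = socks5\n'
}

ssh_stanza "$TRUSTED_HOST" "$SOCKS_PORT"
polipo_stanza "$SOCKS_PORT"
```

With the first stanza in your ssh_config, a plain `ssh -N shell.example.org` opens the tunnel; the one-off equivalent is `ssh -N -D 1080 shell.example.org`. To check the tunnel from the command line, something like `curl --socks5-hostname localhost:1080 https://example.org/` should fetch the page through the trusted host, with the DNS lookup done on the far side.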

Also, if you do not use a webmail service like GMail, make sure you configure your mail client both to read mail over SSL/TLS (e.g., secure IMAP) and to authenticate the outgoing mail server as well. I have been in a hotel that transparently redirected all outgoing mail traffic (port 25) into the void.

The result: all traffic between your laptop and your trusted host is secure from prying eyes on the local network. A side benefit is knowing that your traffic reaches the rest of the Internet from a trusted host.