Thoughts on Systems

Emil Sit

How to Extract PlanetLab Geographic Data

During the course of a given week, I answer a lot of technical questions. They range from a friend asking, “What laptop should I buy?” to strangers with very specific questions about the source code used in my research. I rather enjoy solving technical problems and, taking a line from Jon Udell’s “Too busy to blog?” post, I’m going to start posting some useful answers on this blog. If you have a question, please send it along!

This week’s question comes by way of the planetlab-users mailing list. PlanetLab is “an open platform for developing, deploying, and accessing planetary-scale services.” It now consists of over 800 distributed nodes, (over)used by systems and networking researchers to approximate real wide-area deployments and validate research ideas. For example, if you’ve ever used the Coral content distribution network or the CoDeeN web proxy, then you have used PlanetLab.

Jeff Sedayao asked:

I’d like geographical data on nodes - I know that there is lat long data in the PLC, but I don’t seem to be able to find an API for getting it out. Ideally I like it to be queryable through comon so I can do queries like “find nodes that are usable within the following geographical areas” but I don’t see a queryable hook for that. If anyone can help with a pointer at getting this data out, I’d appreciate it.

The following code (here shortened for the web) does not use the PLC API; rather, it makes use of an XML dump of the node database that PlanetLab publishes periodically. The node database dump groups nodes by site. This code extracts the geographic coordinates for each site using Python’s included XML parser, calculates the distance from center using geopy, and filters out sites that are more than range miles away. It then prints out the hosts at any remaining sites.

#!/usr/bin/env python3
import xml.dom.minidom
from urllib.request import urlopen
# Requires geopy from http://exogen.case.edu/projects/geopy/
from geopy import distance

# Print hosts at sites within range miles of the center.
center = (32.877, -117.237)
range  = 100  # miles; note that this shadows the built-in range()

fh = urlopen("https://www.planet-lab.org/xml/sites.xml")
sites = xml.dom.minidom.parse(fh)

# Each child element of the document's root element describes one site.
for e in sites.documentElement.childNodes:
    try:
        lat = float(e.getAttribute("LATITUDE"))
        long = float(e.getAttribute("LONGITUDE"))
    except (AttributeError, ValueError):
        # Skip text nodes and sites without usable coordinates.
        continue
    a = (lat, long)

    d = distance.distance(center, a).miles
    if d > range:
        continue

    # Each child element of a site is a host.
    for h in e.childNodes:
        try:
            name = h.getAttribute("NAME").lower()
        except AttributeError:
            # Skip whitespace text nodes between host elements.
            continue
        print(name)

The resulting list is suitable for passing to tools like vxargs.
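
For readers who want to experiment without fetching the live dump or installing geopy, the same filter can be sketched with only the standard library. This is a rough sketch under stated assumptions, not the script above: the sample XML, its element names (SITES, SITE, HOST), the hostnames, and the helper names haversine_miles and hosts_within are all made up for illustration. Only the LATITUDE, LONGITUDE, and NAME attributes come from the script. The haversine formula treats the Earth as a sphere, so distances may differ slightly from what geopy reports.

```python
import math
import xml.dom.minidom

# Hypothetical sample data: element names and hostnames are invented.
# Only the LATITUDE, LONGITUDE, and NAME attributes mirror the script above.
SAMPLE_XML = """<?xml version="1.0"?>
<SITES>
  <SITE NAME="Near" LATITUDE="32.88" LONGITUDE="-117.23">
    <HOST NAME="Node1.Near.Example.org"/>
    <HOST NAME="node2.near.example.org"/>
  </SITE>
  <SITE NAME="Far" LATITUDE="42.36" LONGITUDE="-71.09">
    <HOST NAME="node1.far.example.org"/>
  </SITE>
  <SITE NAME="NoCoords" LATITUDE="" LONGITUDE="">
    <HOST NAME="node1.nocoords.example.org"/>
  </SITE>
</SITES>
"""

EARTH_RADIUS_MILES = 3958.8  # mean radius; a spherical approximation


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def hosts_within(xml_text, center, max_miles):
    """Return lowercased host names at sites within max_miles of center."""
    doc = xml.dom.minidom.parseString(xml_text)
    hosts = []
    for site in doc.documentElement.childNodes:
        if site.nodeType != site.ELEMENT_NODE:
            continue  # skip whitespace text nodes between elements
        try:
            where = (float(site.getAttribute("LATITUDE")),
                     float(site.getAttribute("LONGITUDE")))
        except ValueError:
            continue  # site has no usable coordinates
        if haversine_miles(center, where) > max_miles:
            continue
        for host in site.childNodes:
            if host.nodeType == host.ELEMENT_NODE and host.hasAttribute("NAME"):
                hosts.append(host.getAttribute("NAME").lower())
    return hosts


if __name__ == "__main__":
    for name in hosts_within(SAMPLE_XML, (32.877, -117.237), 100):
        print(name)
```

Run as a script, this prints one hostname per line, the same shape of output as the script above. Wrapping the filter in a function also makes it easy to try different centers and radii against a saved copy of the dump.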

PlanetLab’s list archives have the full thread with some other options presented as well.
