DirtDB: A plugin-based network daemon to dig up Internet dirt

I’m currently working on DirtDB, an important component of the Cloud Mirror for installation at the 2010 Sundance Film Festival. DirtDB is a network daemon that answers requests via XML-RPC to “dig up dirt” for people on the Internet.

It’s been up and working for two hours, and in that time I’ve developed a Twitter plugin and an IMDb plugin. Next up are the National Sex Offender Registry and Flickr, with plenty more to come in the weeks ahead.

It expects a small pool of information about a victim to be available in a MongoDB database. It then invokes a series of plugin modules, each of which uses that corpus to mine the Internet (or local databases) for interesting information about the person, and augments the person’s MongoDB document with whatever it finds. Because DirtDB is designed with the Cloud Mirror in mind, the plugins are biased to produce small snippets of text (for presentation in the victim’s thought bubble) and images (for augmentation of the victim’s badge).

But DirtDB is a general framework for human data-mining operations. It follows a blackboard pattern I’ve been fixated upon recently.
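Here is a minimal sketch of that blackboard loop, not DirtDB’s actual code. A plain dict stands in for the person’s MongoDB document, and the names `snippet_plugin`, `badge_plugin`, and `run_plugins` are hypothetical, made up for illustration. Each plugin reads the shared document and returns whatever it found; the runner writes those finds back so that later plugins can build on them.

```python
from typing import Callable, Dict, List

Person = Dict[str, object]
Plugin = Callable[[Person], Dict[str, object]]


def snippet_plugin(person: Person) -> Dict[str, object]:
    # Hypothetical plugin: a real one would query an external service by name.
    name = person.get("name")
    if not name:
        return {}
    return {"snippets": [f"{name} was mentioned somewhere online"]}


def badge_plugin(person: Person) -> Dict[str, object]:
    # Hypothetical plugin that builds on what an earlier plugin wrote.
    snippets = person.get("snippets", [])
    if not snippets:
        return {}
    return {"badge_images": [f"badge-{len(snippets)}.png"]}


def run_plugins(person: Person, plugins: List[Plugin]) -> Person:
    # The person document is the blackboard: every plugin reads from it,
    # and every find is written back before the next plugin runs.
    for plugin in plugins:
        for key, value in plugin(person).items():
            existing = person.get(key)
            if isinstance(existing, list) and isinstance(value, list):
                existing.extend(value)
            else:
                person[key] = list(value) if isinstance(value, list) else value
    return person
```

Calling `run_plugins({"name": "Ada"}, [snippet_plugin, badge_plugin])` leaves both a `snippets` list and a `badge_images` list on the document. Order matters in this sketch: `badge_plugin` finds nothing unless `snippet_plugin` has already run.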

Developing DirtDB has exposed an interesting problem: plenty of people share a name. How then to distinguish you from all the other people who call themselves you? The solution lies in a technique that I have yet to relate to a formal model, but for which one certainly exists.

Assume that DirtDB knows some additional information about you beyond your name: say, your email address. Many Internet databases and APIs allow you to search by name, but not by email. Realizing that a name-only search is likely to bring up both you and your doppelgängers, you might conclude that knowing your email address is not useful in these cases. But in fact, using name-only searches, DirtDB can mine the Internet for representations of many possible identities. Each name-only search yields a new proto-identity with attributes that, though juicy, don’t conclusively belong to the victim.

Once a sufficient number of these proto-identities are built, DirtDB can try to collapse them by looking for correlations between their attributes. Did two name-only searches yield a similar physical address? Then those two identities are the same identity! Did one of those results include an email address that matches the one already on file? Then we have a chain of implication.
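A rough sketch of that collapsing step, under some loud assumptions: proto-identities are plain dicts, only the hypothetical `email` and `address` keys count as evidence, and “similar” is simplified to an exact match after trimming and lowercasing. The function name `collapse_identities` is made up for illustration. It uses a union-find structure so that chains of implication merge transitively.

```python
from typing import Dict, List, Sequence, Set, Tuple


def collapse_identities(
    protos: List[Dict[str, str]],
    keys: Sequence[str] = ("email", "address"),
) -> List[Set[int]]:
    """Group proto-identities (by index) that share any normalized attribute."""
    parent = list(range(len(protos)))

    def find(i: int) -> int:
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a: int, b: int) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Remember the first proto-identity seen with each (key, value) pair.
    first_seen: Dict[Tuple[str, str], int] = {}
    for idx, proto in enumerate(protos):
        for key in keys:
            value = proto.get(key)
            if not value:
                continue
            token = (key, value.strip().lower())  # assumption: crude normalization
            if token in first_seen:
                union(first_seen[token], idx)
            else:
                first_seen[token] = idx

    groups: Dict[int, Set[int]] = {}
    for idx in range(len(protos)):
        groups.setdefault(find(idx), set()).add(idx)
    return list(groups.values())
```

If proto-identity 0 is the record on file (email only), 1 has only a physical address, and 2 has both that address and that email, all three land in one group: 1 is tied to the known email only through 2, which is the chain of implication described above.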

Now that I think about it, this sounds like “ABox reasoning,” which I learned about during semantic web research.


A formula for accurate freeway traffic visualization

Maps are a tool to represent the distance between any two points. Distances are subject to a distortion induced by map projection, but let’s set that aside, because we’re going to make a map of Los Angeles, over which projection errors are negligible.

I want to make a different type of traffic map: one that represents the time required to travel between points at fluctuating speed. It does so by distorting or stretching the map along freeways (actually, along any route for which we have traffic data) so that the road and all the map areas connected to it appear farther away.

The formula for doing this is as follows:

  • Get OpenStreetMap data for Los Angeles.
  • Map each point on the map to a vertex on a 2D mesh.
  • Apply a surface deformation along each shortest path between points for which there exists traffic data (do any of the providers have a usable API for this stuff?). Note: we have to finely divide the mesh, so that high-traffic routes that are “contained” by low traffic routes are turned into zig-zags.
  • Apply the deformation on the grid back to an image of the original map.
  • Post the data live on the web, as a warning for those who dare to travel the infinite streets of Los Angeles in traffic on the evening of a long holiday weekend

Me… I’ll be biking it.
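To make the stretching idea in the formula above concrete, here is a one-dimensional sketch that leaves out the full mesh deformation. It assumes a route is a list of (x, y) vertices with one observed speed per segment, and it lengthens each segment by the ratio of free-flow speed to observed speed, so that drawn length tracks travel time. The name `stretch_route` and its parameters are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def stretch_route(
    points: Sequence[Point],
    speeds: Sequence[float],
    free_flow_speed: float,
) -> List[Point]:
    """Redraw a route so each segment's length is proportional to travel time."""
    if len(speeds) != len(points) - 1:
        raise ValueError("need exactly one speed per segment")
    if any(speed <= 0 for speed in speeds):
        raise ValueError("speeds must be positive")

    stretched: List[Point] = [points[0]]
    for (x0, y0), (x1, y1), speed in zip(points, points[1:], speeds):
        # A segment driven at half the free-flow speed is drawn twice as long.
        factor = free_flow_speed / speed
        px, py = stretched[-1]
        stretched.append((px + (x1 - x0) * factor, py + (y1 - y0) * factor))
    return stretched
```

For a straight route `[(0, 0), (1, 0), (2, 0)]` with speeds `[60, 30]` and a free-flow speed of 60, the second segment doubles in length and the route ends at (3, 0). A real version would still have to spread that displacement into the surrounding mesh vertices, which this sketch does not attempt.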

brainstormed with @kjkjerstin and @armedneutrality


Egometrics [build day 1]

I was inspired by Dorkbot this afternoon, so once I could extract no more sympathy for my hangover I decided to get some work done. There is no better cure.

egometrics board build 1


I am not mostly a geek

Or, Besmoke demo-ed at the BIL conference.


I am not mostly a geek. from eric gradman on Vimeo.


Augmented Reality and Physical Simulation

For an upcoming interactive demo, I am experimenting with using augmented reality devices to manipulate virtual spaces.  In this experiment, I insert a “paddle” into the observed world.  I then drop some (virtual) cubes from the ceiling and demonstrate that the paddle interacts with these virtual objects.

This is one night’s work; soon you’ll see this fleshed out into a more compelling interactive demo.


Experiments in high-energy awesome

My current art piece/science fair project/experiment in high-energy awesome, as explained to Nightshade via IM.

2:05:19 PM Eric Gradman: two opponents face off over a table. there is an image projected on the table by a projector mounted overhead. the image features “bubbles” that are drifting around. players interact with the bubbles using their shadows. however the degree to which a given player’s shadow couples to the bubbles is a function of their brain’s beta signature, as read by an EEG (OCZ NIA)
2:05:59 PM Eric Gradman: each player is bathed in light from an overhead LED panel. if you’re fucking up, you’re bathed in red light. else blue.
2:06:35 PM Nightshade: hahaha
2:06:38 PM Eric Gradman: There’s really no rhyme or reason to how this particular configuration came about. Actually, it was going to be two separate projects which I slammed together out of sheer laziness.
2:07:03 PM Eric Gradman: And then gave myself the artificial deadline of next Thursday, just to keep things interesting.
