September 27, 2008

Open Source World Library

 Here’s a new section from the book beta, a topic I’ve been enthusing about recently.  Comments are welcome, as are pointers to any existing open-source projects which are similar to the idea.  If there are no such projects, well, someone should get cracking on writing one!  Oh, and if someone wants to make the logo described, that would be cool too 🙂 (thanks to krustad for the logo).

>A scene at Ephemerisle

>_"Aha", John said, as he saw the pirate flag with an image made from 1’s and 0’s.  "This is what I was telling you about!"_

>_"A place where you can copy some guy’s music?  What’s the big deal?" replied Richard._

>_"Oh, it’s a lot more than that.  Just watch."_

>_John approached the desk, where cat-10 cabling lay everywhere, connecting Ethernet hubs blinking crazily to a home-built rack of servers.  "It’s a Linux-based redundant filesystem, based on Google’s GFS.  Like RAID, but managed in software.  Each bit is replicated several times so any one hard drive can fail.  The rack has 100 terabytes of storage – and I bet they’ll need more soon."_

>_He sat down, took out his laptop, and plugged it in.  "First I run the client program, and point it at my media folders – music, movies, and that sheet music I’ve been collecting from school.  It hashes every file, finds out what it doesn’t have, and copies it.  Meanwhile, we browse what it has…"_

>_"Whoa", said Richard, as the menu popped up.  "Music, movies, books, academic papers, sheet music, code, DNA, fizzobs…that’s a lot of categories!  And what are fizzobs?"_

>_"Physical Objects – 3d models of useful things like, well, the parts to make a 3d object printer!"_

>_"Are they pirated, though?  Isn’t a lot of that stuff available online?"_

>_"Definitely", replied John.  "But the idea isn’t just to have things protected by copyright – it’s to collect all the useful knowledge of the human race, whether or not people want it to be free.  Some fizzobs are parts from commercial products, or scans of famous sculptures which are illegal on land; others are open-sourced.  The World Library collects them all.  Fancy a Styrofoam Rodin for our dorm room?  Or should we spring for plastic?  I don’t think we could afford a granite one – information may be free, but raw materials sure aren’t!"_

>_"Forget the sculpture – do they really have every movie ever released in digital form?"_


The idea here is to develop a world library of digital media, managed by open-source software.  Typical forms of interaction would be:


* *Contributing*.  You bring your laptop or a portable hard drive, and hook it up to the network, where there is a large distributed filesystem.  You run an app and point it at your media directories.  It copies anything you have that it doesn’t.

* *Copying*.  It gives you a browsable / searchable interface to find and copy from the library.  This needs to be done with nice tools.  One way would be to piggyback on an existing system – for example, if you know the ISBN for a book, you can annotate web pages about that book with links to the library.  Similarly with movies and IMDb pages.  So IMDb is your database and searching system.

* *Cleanup*.  The tricky parts are in metadata and data quality.  Metadata matters so you can find the information you are looking for, and so you can avoid duplicates – you don’t want 25 versions of the same song / movie.  Hashing the contents protects you from exact duplicates, but since music and movies can be encoded in many different ways, you may still get "soft duplicates".  There are algorithmic techniques for identifying these, but they are imperfect.  In the import stage, the library might reject MP3s that aren’t well ID3-tagged, simply because there is going to be so much data that it can afford to reject contributions with poor metadata.  It might also require users to do some data cleanup in return for copying from the library – say, identifying an untagged movie.
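
To make the contributing step concrete, here is a minimal sketch of "hash every file, find out what the library doesn't have, and copy it".  It is only an illustration under some assumptions of mine: the library is a plain directory, files are stored under their SHA-256 digest, and the names `file_digest` and `contribute` are made up for this example.  It only catches exact duplicates; the "soft duplicates" described above would need something smarter.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def contribute(media_dir: Path, library_dir: Path) -> list[str]:
    """Copy every file under media_dir whose content the library lacks.

    Assumption for this sketch: the library is a flat directory in which
    each file is named after its content digest, so byte-identical files
    collapse to a single copy no matter what they were called.
    Returns the digests that were newly added.
    """
    library_dir.mkdir(parents=True, exist_ok=True)
    added = []
    for path in sorted(media_dir.rglob("*")):
        if not path.is_file():
            continue
        digest = file_digest(path)
        target = library_dir / digest
        if target.exists():
            continue  # the library already has these exact bytes
        shutil.copyfile(path, target)
        added.append(digest)
    return added
```

You would call it as `contribute(Path("~/Music").expanduser(), Path("/mnt/library"))`, where both paths are placeholders.  Naming files by digest throws away the original file names, so a real Library would need to keep the metadata somewhere else; that is exactly the cleanup problem.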


We’re envisioning this as an open-source project, which has a lot of advantages.  There isn’t any single world library – not at the beginning.  Anyone can use this software to set up their own collection – whether at a LAN party, or at a conference or festival.  People all over the world would contribute to the code, so that functionality increases over time.  Given many different Libraries, one useful function would be to diff two Libraries, and put anything the first has that the second wants onto a hard drive, to be moved by sneakernet (i.e. physically carrying it).  This lets Libraries sync up without information going over the network.  If the Libraries have fast enough network connections, they could also sync over an encrypted connection.
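
The diff-and-sneakernet idea might look something like the sketch below.  I am assuming, purely for illustration, that each Library is a flat directory whose file names are content hashes, so a Library's "manifest" is just its set of file names; `manifest` and `stage_for_sneakernet` are hypothetical names, and "wants" is simplified to "doesn't already have".

```python
import shutil
from pathlib import Path


def manifest(library_dir: Path) -> set[str]:
    """Return the set of item names (assumed to be content hashes) in a Library."""
    return {p.name for p in library_dir.iterdir() if p.is_file()}


def stage_for_sneakernet(
    source_dir: Path, dest_manifest: set[str], drive_dir: Path
) -> list[str]:
    """Copy onto drive_dir everything the source Library has that the other lacks.

    Only the destination's manifest is needed, not its data, so the
    list of names can be exchanged ahead of time and the bulk data
    travels on the physical drive. Returns the names that were staged.
    """
    missing = sorted(manifest(source_dir) - dest_manifest)
    drive_dir.mkdir(parents=True, exist_ok=True)
    for name in missing:
        shutil.copyfile(source_dir / name, drive_dir / name)
    return missing
```

The set difference is the whole trick: a manifest is tiny compared to the media it describes, so two Libraries can work out what to ship before anyone picks up a hard drive.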


As in our story, this has the potential to extend well beyond the obvious areas like music and movies.  We are moving more and more into a world of information – where 3d printers can make any object, if they know its shape.  Where biotech hardware can create gene sequences.  Where both of these things – gene sequences and object models – are currently being patented, and reserved for the use of their discoverers.  The incredible potential of this technology to bring on a world of plenty, a world where you print a new part when one breaks, instead of sending to the factory, is threatened by these government-granted monopolies on information.


Exploring this fascinating topic in detail is outside the scope of this book, but suffice it to say that the only way to stop the World Library is the establishment of a draconian world government that would be far worse than letting information be free.  We’re going to build seasteads to help make sure that global fascism scenario doesn’t happen, and so the World Library is an inevitable consequence.  It’s time to think about how to use it, and how to encourage content creation in this new regime, rather than continuing to fight the inevitable.