Graphs are Sexy (and Bow-Ties are Cool)
“I can’t thank Facebook and Google enough for making the terms ‘social graph’ and ‘graph search’ and ‘knowledge graph’ popular.”
I’m talking to Dr. Jim Webber about Neo4j, the highly scalable open-source graph database that is powering both oil producers in Scandinavia and Silicon Roundabout startups.
Facebook’s controversial Graph Search feature has been two years in the making, and was announced live last month. Facebook has on average one billion new posts added every day, with their posts index containing more than one trillion total posts, altogether comprising hundreds of terabytes of data. Graph Search indexes this data and returns real-time results to queries.
Dr. Webber, Chief Scientist for Neo4j, is frank about how the major players’ adoption of graphs has meant more attention for graph databases.
“I think they have done a lot of PR for graphs we couldn’t have managed as a smaller company. A lot of people draw inspiration from what those guys are doing, and would like to try and replicate some of those features in their own systems.”
But he’s quick to point out that Neo4j has been around a long time, longer than Facebook’s graph team has, and that its team has burned a lot of shoe leather getting graph databases out into the mainstream, and not just for social media startups.
To the uninitiated, a graph database is literally a database that stores data in a graph, the most generic of data structures, capable of elegantly representing any kind of data in a highly accessible way. In a graph, every element contains direct pointers to its adjacent elements, so moving from one element to its neighbours avoids costly global index lookups.
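To make the "direct pointer" idea concrete, here is a minimal in-memory sketch in Python. It is a toy illustration only, not Neo4j's storage format or API; the `Node` class, the `friends_of_friends` helper, and the sample names are all hypothetical.

```python
class Node:
    """A graph element that holds direct references to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.neighbours = []  # direct references, not keys into a global index

    def connect(self, other):
        # Record the relationship on both elements (an undirected edge).
        self.neighbours.append(other)
        other.neighbours.append(self)


def friends_of_friends(start):
    """Walk two hops out from `start` by following references only."""
    direct = {n.name for n in start.neighbours}
    found = set()
    for friend in start.neighbours:
        for candidate in friend.neighbours:
            # Skip the starting node itself and anyone already a direct friend.
            if candidate is not start and candidate.name not in direct:
                found.add(candidate.name)
    return found


# Hypothetical sample data: alice - bob - carol - dave
alice, bob, carol, dave = (Node(n) for n in ("alice", "bob", "carol", "dave"))
alice.connect(bob)
bob.connect(carol)
carol.connect(dave)

print(friends_of_friends(alice))
```

Each hop costs only as much as the number of neighbours actually visited, regardless of how many elements the graph holds overall. That is the property the paragraph above describes.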
But who would use a graph database? Anyone and everyone who has connected data. From a two-person bootstrapping social networking tech startup in someone’s shed, to Global 500 companies including HP and Cisco. Graph databases are being implemented by everyone from high-profile blue-chip organizations to what Dr. Webber refers to as “the startup end of the spectrum.”
When people ask why they would want to use a graph database like Neo4j over a more traditional database, they only need to look as far as Shutl. Shutl are a London-based technology start-up (recently acquired by eBay), offering same-day delivery within minutes of purchase, or delivery in any one-hour slot on a day of the customer’s choice.
From a service user’s point of view, the change of database for Shutl wasn’t about getting an answer 10 milliseconds faster; as Dr. Webber points out, in human time you can’t really tell the difference. But all of those additional milliseconds available to the database meant that customers got a far richer experience. Shutl used the extra headroom from working in a graph to offer customers 50 delivery options, compared to just one.