NoSQL: What’s the Buzz About Graph Databases?
Hired Brains is a research, advisory and consulting firm founded by Neil Raden in 1985. It specializes in analytics, data warehousing, business intelligence, big data, semantic technology, analytical architecture and tools, and decision management.
I attended the NoSQLNow conference in San Jose and had the opportunity to speak one-on-one with a number of principals of NoSQL database concerns, including Emil Eifrém of Neo Technology. For those of you who aren’t familiar with the concept, graph databases store data as nodes connected by edges (the relationships), with properties attached to both, rather than as rows and columns linked by primary and foreign keys. In practice, this lets them traverse connected information by following edges directly, which is often more efficient than reading pages of data and finding the rows that match the query.
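To make the nodes-edges-properties idea concrete, here is a minimal sketch in Python. It is not how Neo4j or any other product is implemented; the `Node` class, the `within_hops` helper and the "KNOWS" relationship type are all names I made up for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical graph node: a name, free-form properties, outgoing edges."""
    name: str
    properties: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (relationship_type, Node) pairs

    def relate(self, rel_type, other):
        """Add a directed, typed edge from this node to another node."""
        self.edges.append((rel_type, other))


def within_hops(start, rel_type, max_hops):
    """Breadth-first traversal: names reachable via rel_type in <= max_hops."""
    seen = {start.name}
    found = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for kind, neighbor in node.edges:
            if kind == rel_type and neighbor.name not in seen:
                seen.add(neighbor.name)
                found.append(neighbor.name)
                queue.append((neighbor, depth + 1))
    return found


# Made-up sample data: a small chain of acquaintances.
alice = Node("Alice", {"city": "San Jose"})
bob = Node("Bob")
carol = Node("Carol")
dave = Node("Dave")
alice.relate("KNOWS", bob)
bob.relate("KNOWS", carol)
carol.relate("KNOWS", dave)

print(within_hops(alice, "KNOWS", 2))  # ['Bob', 'Carol']
```

The traversal only ever touches the nodes it walks through; how large the rest of the data set is does not enter into it, which is the efficiency argument in a nutshell.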
Interestingly, Graph Theory (Euler, 1736) predates Set Theory (Cantor/Dedekind, 1870s), on which the relational model is based, by well over a century. Of historical interest, the relational database at IBM was conceived as a method to get data out of databases, not to get data in. This turned out, in the early 1970s, to be a problem for IBM, so they redirected Ted Codd’s efforts to making relational databases fast transaction processors. Enter the concept of “normal form,” a horribly misleading term that has sidetracked a zillion projects in which data modelers with a thin understanding of the concept insisted on “normal” purity no matter the cost. The rest is history. The whole DSS/BI/Analytics movement grew out of the fact that relational databases were poor performers at anything other than transaction processing.
According to the NoSQL movement, and I’m not entirely convinced of this but I’m listening, the rigidity of a physical schema needed in relational databases is their undoing in an era of agility, speed and volume. Here is a quote from Wikipedia:
Compared with relational databases, graph databases are often faster for associative data sets, and map more directly to the structure of object-oriented applications. They can scale more naturally to large data sets as they do not typically require expensive join operations. As they depend less on a rigid schema, they are more suitable to manage ad-hoc and changing data with evolving schemas. Conversely, relational databases are typically faster at performing the same operation on large numbers of data elements.
The key characteristic of graph databases is the notion of index-free adjacency, meaning that each node knows the location of its adjacent nodes, so no index lookup is needed to move from one node to the next. A semantic interpretation of this is that the graph itself is a representation of relationships. Paradoxically, there are no relationships in a “relational” database; they are applied at run time by the query.
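A rough way to see the difference, sketched in Python with made-up data: in the relational style, the relationship exists only as matching key values that the query has to line up, while in the graph style each node carries direct references to its neighbors. The table layout, the `Person` class and both function names are my own assumptions for illustration, not any vendor's API.

```python
# Relational style: relationships are pairs of keys in a separate table,
# resolved at query time by matching values (in effect, a join).
people = [(1, "Alice"), (2, "Bob"), (3, "Carol")]
knows = [(1, 2), (1, 3), (2, 3)]  # (person_id, friend_id) foreign-key pairs


def friends_by_join(person_id):
    """Scan the link table, then look up each matching row by its key."""
    names_by_id = dict(people)  # stands in for a primary-key index
    return [names_by_id[dst] for src, dst in knows if src == person_id]


# Graph style: each node holds direct references to its adjacent nodes.
class Person:
    def __init__(self, name):
        self.name = name
        self.knows = []  # references to adjacent Person objects


alice, bob, carol = Person("Alice"), Person("Bob"), Person("Carol")
alice.knows += [bob, carol]
bob.knows.append(carol)


def friends_by_adjacency(person):
    """Follow the stored references; no lookup table is consulted."""
    return [friend.name for friend in person.knows]


print(friends_by_join(1))           # ['Bob', 'Carol']
print(friends_by_adjacency(alice))  # ['Bob', 'Carol']
```

Both functions return the same answer. The first has to work through the link table (or an index on it) every time it is asked, while the second simply walks the pointers already sitting on the node.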