When NoSQL Makes Sense
Doug Henschen of InformationWeek Software talks about NoSQL databases, how they work, and when to use them, including Neo4j.
Scalability and flexibility. These are the two key attributes of NoSQL databases, the ones that have made them big data darlings. NoSQL databases haven’t quite reached the hype heights of the Hadoop data management framework, but they’re drawing a lot of attention and experimentation. Choose wisely among the many and varied NoSQL options, or the trade-offs needed to get scalability and flexibility might be your project’s undoing.
The label NoSQL covers a diverse collection of databases that tend to have at least two elements in common: distributed computing architectures and schemaless design. The databases are scalable because they were built to store and manage data distributed across (typically) x86 commodity server clusters that can be easily scaled out by adding more machines. They’re flexible because, unlike relational databases, NoSQL databases don’t require a predefined schema (a.k.a. data model) that dictates how data must be organized into columns and rows. In relational databases, those data models get ever more difficult to change as the database grows. That rigid data model becomes a problem if a company’s evolving business model requires it to use data in a way it never anticipated.
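To make the two attributes concrete, here is a minimal Python sketch rather than a picture of any particular product. It assumes an in-memory SQLite table stands in for a relational database, a plain dictionary stands in for a schemaless store, and a simple hash-modulo rule stands in for data placement across a cluster. The table, the field names, the node names and the `node_for` helper are all hypothetical, and real NoSQL systems use more elaborate placement schemes than this one.

```python
import hashlib
import sqlite3

# Relational side: the schema (columns and types) is declared up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (id, name) VALUES (?, ?)", (1, "Ada"))


def insert_with_new_field():
    """Try to store a field the original schema never anticipated."""
    try:
        conn.execute(
            "INSERT INTO customers (id, name, loyalty_tier) VALUES (?, ?, ?)",
            (2, "Grace", "gold"),
        )
        return "accepted"
    except sqlite3.OperationalError:
        # The table has no loyalty_tier column, so the schema must be
        # altered before this row can be stored.
        return "rejected"


relational_result = insert_with_new_field()

# Schemaless side: each record carries its own fields, so a new
# attribute can appear on one record without touching the others.
documents = {
    "1": {"name": "Ada"},
    "2": {"name": "Grace", "loyalty_tier": "gold"},
}


def node_for(key, nodes):
    """Pick a server for a key by hashing it (hypothetical helper).

    Hash-modulo placement is an assumption made for illustration only.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


cluster = ["node-a", "node-b", "node-c"]  # hypothetical commodity servers
placement = {key: node_for(key, cluster) for key in documents}
```

The relational insert fails until someone changes the table definition, while the dictionary accepts the new field immediately. The cost is that nothing in the schemaless store guarantees every record has the same fields, which is one of the trade-offs to weigh.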
Whereas relational databases are general-purpose platforms, NoSQL databases have been developed to tackle particular, often extreme challenges. Amazon.com in 2007 came up with the Dynamo database to keep its massive, global e-commerce site always up and running. (It now sells DynamoDB as an Amazon Web Services online service.) Dynamo helped inspire Facebook’s development of Cassandra, which it then contributed to open source. Relational databases just weren’t designed to handle the quantity of data, number of users and ever-changing data requirements of outfits such as Amazon and Facebook.
Today, there are four important classes of NoSQL databases: key-value stores such as Riak and Redis; document databases such as MongoDB and Couchbase; wide-column databases such as Cassandra and HBase (the latter is part of the Hadoop framework); and graph databases such as Neo4j and AllegroGraph. All of those databases are well known among Internet startups and established Web-scale companies.
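The four classes differ mainly in the shape of the data they store. The sketch below approximates each shape with ordinary Python structures. It illustrates the data models only, not the API of Riak, MongoDB, Cassandra, Neo4j or any other product, and every key, field and relationship name in it is invented for the example.

```python
# Key-value store: an opaque value looked up by a single key.
key_value = {"session:42": b"serialized-session-bytes"}

# Document database: a self-describing, possibly nested record.
document = {
    "_id": "order-1001",
    "customer": "Ada",
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
}

# Wide-column database: row key -> column family -> column -> value.
# Different rows are free to carry different columns.
wide_column = {
    "user:1": {"profile": {"name": "Ada"}, "activity": {"last_login": "mon"}},
    "user:2": {"profile": {"name": "Grace", "city": "NYC"}},
}

# Graph database: nodes plus typed relationships between them.
graph_nodes = {
    "ada": {"label": "Person"},
    "grace": {"label": "Person"},
    "acme": {"label": "Company"},
}
graph_edges = [
    ("ada", "KNOWS", "grace"),
    ("grace", "WORKS_AT", "acme"),
]


def neighbors(node_id, rel_type):
    """Follow relationships of one type outward from a node."""
    return [dst for src, rel, dst in graph_edges
            if src == node_id and rel == rel_type]


def total_quantity(doc):
    """Sum item quantities inside one nested document."""
    return sum(item["qty"] for item in doc["items"])
```

For instance, `neighbors("ada", "KNOWS")` walks a relationship directly, the kind of question a graph model is built around, while `total_quantity(document)` reads everything it needs from a single nested record.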