Neo4j Scales for the Web and the Enterprise
Thousands of companies, including more than 30 of the Global 2000, rely on Neo4j to enable mission-critical applications.
Whether serving up online recommendations to millions of web users, managing master data hierarchies, or routing millions of packages per day in real time, Neo4j is built to perform at scale. Neo4j’s native graph architecture combines blazing-fast connected queries with near-linear read scalability, high-availability clustering, and robust transactional guarantees.
With Neo4j Enterprise, you get:
- High-Performance Cache with queries running up to 10x faster than the Community Edition under high query workloads and with large graphs
- Horizontal Scalability with Neo4j Clustering provides predictable scalability by guaranteeing that every query runs in-instance, with maximum performance
- High Availability and Online Backups safeguard your data so that your business can keep running under a wide range of scenarios
- Cache-based Sharding using Neo4j’s clustering technology, which lets you shard your graph in memory without sharding your data on disk
- Advanced Monitoring provides operational metrics not available in the Community Edition, making it easier to manage your system and keep it healthy
“Minutes to Milliseconds” Performance
Neo4j handles connected data queries up to a thousand times faster than other kinds of databases. Even on very modest hardware, Neo4j can handle millions of traversals per second between nodes in a graph on a single machine, and many thousands of transactional writes per second. This extreme speed is the result of an architecture that is natively engineered to store and process graph data.
Instead of chaining together index lookups to relate data for each traversal, the Neo4j query engine leverages a unique feature of Neo4j’s native graph storage engine known as “index-free adjacency”. This highly optimized means of retrieving connected data lets the database traverse a graph using direct pointer lookups for each hop, bypassing indexes altogether. This is not only orders of magnitude faster than joining with an index; it also provides constant-time traversal characteristics for the majority of data. Unlike in relational and other databases, most queries don’t slow down as the database gets bigger. Further, because each instance has access to all of the data in the graph, query time remains fast, predictable, and constant as you add instances to the cluster.
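To make the contrast concrete, here is a minimal Python sketch of the idea, not Neo4j’s storage engine or API. All names in it (`Node`, `traverse_pointers`, `traverse_with_index`, `EDGES`) are hypothetical. The first traversal follows direct references held on each node; the second answers the same question by searching a sorted edge list on every hop, which stands in for a join through an index whose lookup cost grows with the size of the data.

```python
import bisect


class Node:
    """Hypothetical node that holds direct references to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.neighbours = []  # direct references: no index is consulted per hop


def build_pointer_graph(edges):
    """Build Node objects wired together by direct references."""
    nodes = {}
    for src, dst in edges:
        a = nodes.setdefault(src, Node(src))
        b = nodes.setdefault(dst, Node(dst))
        a.neighbours.append(b)
    return nodes


def traverse_pointers(start, hops):
    """Follow direct references for a fixed number of hops."""
    frontier = {start}
    for _ in range(hops):
        frontier = {n for node in frontier for n in node.neighbours}
    return sorted(n.name for n in frontier)


def traverse_with_index(sorted_edges, start_name, hops):
    """Same traversal, but every hop binary-searches a sorted edge list.

    The sorted list plays the role of an index: each lookup costs
    O(log E), so hops get slower as the total number of edges grows.
    """
    frontier = {start_name}
    for _ in range(hops):
        next_frontier = set()
        for name in frontier:
            i = bisect.bisect_left(sorted_edges, (name,))
            while i < len(sorted_edges) and sorted_edges[i][0] == name:
                next_frontier.add(sorted_edges[i][1])
                i += 1
        frontier = next_frontier
    return sorted(frontier)


# Illustrative data only.
EDGES = [("alice", "bob"), ("bob", "carol"), ("bob", "dave"), ("carol", "erin")]
nodes = build_pointer_graph(EDGES)

two_hops_by_pointer = traverse_pointers(nodes["alice"], 2)
two_hops_by_index = traverse_with_index(sorted(EDGES), "alice", 2)
```

Both functions return the same neighbours; the difference is that the pointer version does work proportional only to the part of the graph it touches, while the index version pays a lookup on every hop.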
Neo4j Clustering Architecture
Neo4j’s horizontal scaling architecture offers the best of several worlds. It is finely honed to target the perfect sweet spot of low latency and horizontally-scalable read performance with consistent query response times as the cluster grows, leveraging 10+ years of experience with real-world graph applications. The result is an architecture that optimizes for scalability, reliability, and transactional performance, for the vast majority of graphs in existence today.
Neo4j Enterprise Edition lets you shard your cache without sharding your data.
When you load-balance your transactions, each instance keeps a different part of the graph in cache. This allows applications to carry out distributed in-memory processing across very large graphs, with each instance having access to the data it needs in memory. This ensures that queries remain both fast and predictable. Thanks to this approach, Neo4j users are able to achieve massive near-linear read scalability for clusters into the dozens of instances. Neo4j’s cluster coordinator is one of the few distributed technologies outside of Google’s data centers to use the Paxos algorithm to ensure smooth cluster operation even in the most demanding of environments.
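As a rough illustration of cache sharding, the Python sketch below routes each request to an instance chosen from a routing key, so that each instance ends up caching a different slice of the same full data set. This is a simplified model under stated assumptions: the text above does not say how requests are routed, so hashing a routing key is an assumption, and `Instance` and `pick_instance` are hypothetical names rather than anything in Neo4j.

```python
import hashlib


class Instance:
    """Hypothetical cluster member: full data set available, small in-memory cache."""

    def __init__(self, name, store):
        self.name = name
        self.store = store    # every instance can reach all of the data
        self.cache = {}       # only what this instance has actually served
        self.cache_hits = 0

    def query(self, key):
        """Serve a key, loading it into this instance's cache on first use."""
        if key in self.cache:
            self.cache_hits += 1
        else:
            self.cache[key] = self.store[key]
        return self.cache[key]


def pick_instance(routing_key, instances):
    """Map a routing key to one instance deterministically (assumed strategy)."""
    digest = hashlib.sha256(routing_key.encode("utf-8")).digest()
    return instances[int.from_bytes(digest[:8], "big") % len(instances)]


# Illustrative data: 20 keys, three instances sharing one store.
store = {f"user{i}": i for i in range(20)}
instances = [Instance(name, store) for name in ("a", "b", "c")]

# Two passes over the same keys: the first warms the caches, the second hits them.
for _ in range(2):
    for key in store:
        pick_instance(key, instances).query(key)
```

Because the same key always lands on the same instance, no key is cached twice across the cluster, and the combined cache covers more of the data than any single instance could hold on its own.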
Also, Enterprise Edition features such as online backup and high-performance cache further help administrators to ensure availability and performance with large graphs. Neo4j clustering supports near-linear read scaling and constant-time query performance across large clusters.
The applicability of graph databases is amazingly diverse. Neo’s production customers include several of the world’s largest telecommunications companies, the world’s second-largest professional social network, a web site that has over half of Facebook’s social graph, one of the world’s largest logistics companies (which routes over 5M packages per day in real time, with peaks of 3000 routing operations per second), and many more. Graphs are everywhere.
Neo4j offers a variety of features to enable high levels of availability and horizontal (as well as vertical) scaling for your graph.
Download » Understanding Neo4j Scalability
Hardware Sizing Calculator
Get a rough machine sizing estimate for your Neo4j database, taking into account factors such as the size of your graph and anticipated access patterns.
Inside the Neo4j Labs
The Linked Data Benchmark Council (LDBC) is a project funded by the European Union. Last year, LDBC embarked on the development of standard industry benchmarks for graphs, similar to the TPC benchmarks in the relational world.
Performance in the Cloud
How does Neo4j run in the Cloud? A number of Neo4j customers are running Neo4j on platforms like EC2 and Azure. FiftyThree’s “Made with Paper” stream supports loads of 10K queries per second on EC2. Watch Evolving Neo4j for the Cloud to see this customer talk about getting Neo4j to run in a massive-scale distributed cluster spanning three Amazon global regions: North America, Europe, and Asia.