Seed a cluster
Introduction
Regardless of whether you are just playing around with Neo4j or setting up a production environment, you likely have some existing data that you want to transfer into your newly created Causal cluster. Neo4j supports seeding a cluster from a database dump, a database backup, or from another data source (with the Import tool). For more information about the different backup options and how to use the Neo4j Import tool, see Backup and restore options and Neo4j Admin tool.
It is possible to seed a cluster with a single database or multiple, including a full DBMS.
Any seeding that includes restoring the system database needs to be done offline, but any other databases can be seeded online.
The databases that you want to seed and the Neo4j cluster must be of the same version.
The process for seeding a cluster is essentially the same for clusters with Single and Read Replica instances as for clusters with Core (and optional Read Replica) instances. However, using a designated seeder is only applicable to clusters with Core instances. Seeding is usually performed on primary instances only. It is also possible to seed a Read Replica instance, but this is only worthwhile for performance reasons.
Seed a cluster from a database dump (offline)
The dump can be an offline backup from a standalone Neo4j instance or from a cluster member (e.g., an existing Read Replica instance).
The following example seeds a newly created cluster with an example DBMS, consisting of the system database and the default database, neo4j, from a dump.
If you want to seed a single user database, follow the steps in Seed a cluster from a database backup (online) further on.
This scenario is useful in disaster recovery where some servers have retained their data during a catastrophic event.
Moving files and directories manually in or out of a Neo4j installation is not recommended and considered unsupported.
-
Create a new Neo4j Core-only cluster following the instructions in Configure a cluster with Core instances but do not start any of the members. (If you have started any of the cluster members, stop and unbind each started member.)
-
Use neo4j-admin load to seed each of the Core members in the cluster. The examples assume that you are restoring one user database with the default name of neo4j and the system database, containing the replicated configuration state. Modify the command line arguments to match your exact setup.
neo4j-01$ ./bin/neo4j-admin load --from=/path/to/system.dump --database=system
neo4j-01$ ./bin/neo4j-admin load --from=/path/to/neo4j.dump --database=neo4j
neo4j-02$ ./bin/neo4j-admin load --from=/path/to/system.dump --database=system
neo4j-02$ ./bin/neo4j-admin load --from=/path/to/neo4j.dump --database=neo4j
neo4j-03$ ./bin/neo4j-admin load --from=/path/to/system.dump --database=system
neo4j-03$ ./bin/neo4j-admin load --from=/path/to/neo4j.dump --database=neo4j
-
Start each cluster member.
neo4j-01$ ./bin/neo4j start
neo4j-02$ ./bin/neo4j start
neo4j-03$ ./bin/neo4j start
The cluster forms and the replicated Neo4j DBMS deployment comes online.
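The load step above repeats the same two commands on every Core member, so it can help to generate the command list once and review it before running anything. The following is a minimal dry-run sketch: it only prints the commands and executes nothing. The function name print_load_commands is hypothetical, and the host names and the /path/to directory are the placeholders from the example above.

```shell
# Print the neo4j-admin load commands for every Core member (dry run only).
# Usage: print_load_commands <dump-dir> <host>...
# Runs in a subshell so that its variables do not leak into the caller.
print_load_commands() (
  dump_dir="$1"
  shift
  for host in "$@"; do
    # Same order as in the example above: system first, then neo4j.
    for db in system neo4j; do
      printf '%s$ ./bin/neo4j-admin load --from=%s/%s.dump --database=%s\n' \
        "$host" "$dump_dir" "$db" "$db"
    done
  done
)

# Placeholders from the example above; adjust them to your own setup.
print_load_commands /path/to neo4j-01 neo4j-02 neo4j-03
```

Printing the commands instead of executing them keeps the sketch safe to try on any machine. How the commands are then run on each member (for example, over SSH) depends on your environment and is not covered here.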
The system database contains information about the databases that should exist in your Neo4j DBMS.
If a database does not exist in your system database (because it has not been created previously), it must be created with |
Seed a cluster from a database backup (online)
These scenarios are useful when you want to restore a database in a running cluster.
If you have a running Neo4j database that you want to seed in a running cluster, use neo4j-admin backup to create a database backup.
This could be a backup from a standalone Neo4j instance or another cluster member (e.g., an existing Read Replica).
Neo4j supports two types of seeding in a running cluster.
You can either transfer the database backup to each Core instance or transfer it only to one Core instance and then use the CREATE DATABASE Cypher command to seed the cluster.
For more information on the CREATE DATABASE syntax and options, see Cypher Manual → Creating databases.
Moving files and directories manually in or out of a Neo4j installation is not recommended and considered unsupported.
Restore a database on each Core instance
Restore the database backup on each Core instance in the cluster using the neo4j-admin restore command and then use CREATE DATABASE to create the database.
This example uses a user database called movies1.
-
To ensure that the movies1 database does not exist in the cluster, on one of the Core members, use Cypher Shell and run DROP DATABASE movies1. Use the system database to connect. The command is automatically routed to the appropriate Core instance and from there to the other cluster members.
DROP DATABASE movies1;
Dropping a database also deletes the users and roles associated with it.
If you cannot drop the database because your seeds include the system database (which cannot be dropped), you must run neo4j-admin unbind. However, this removes the cluster state of the Core instance, and the instance must then be restarted in order to rejoin the cluster. Thus, you are no longer restoring a database in a running cluster. See Seed a cluster from a database dump (offline) instead for instructions on how to seed an offline cluster.
-
Restore the database on each Core member in the cluster.
neo4j@core1$ ./bin/neo4j-admin restore --from=/path/to/movies1-backup-dir --database=movies1
neo4j@core2$ ./bin/neo4j-admin restore --from=/path/to/movies1-backup-dir --database=movies1
neo4j@core3$ ./bin/neo4j-admin restore --from=/path/to/movies1-backup-dir --database=movies1
However, restoring a database does not automatically create it.
-
On one of the Core instances, run CREATE DATABASE movies1 against the system database to create the movies1 database. The command is automatically routed to the appropriate Core instance and from there to the other cluster members.
CREATE DATABASE movies1;
0 rows ready to start consuming query after 701 ms, results consumed after another 0 ms
-
Verify that the movies1 database is online on all members.
SHOW DATABASES;
+---------------------------------------------------------------------------------------------------------------------------+
| name     | aliases | access       | address      | role       | requestedStatus | currentStatus | error | default | home  |
+---------------------------------------------------------------------------------------------------------------------------+
| "neo4j"  | []      | "read-write" | "core1:7687" | "leader"   | "online"        | "online"      | ""    | TRUE    | TRUE  |
| "neo4j"  | []      | "read-write" | "core3:7687" | "follower" | "online"        | "online"      | ""    | TRUE    | TRUE  |
| "neo4j"  | []      | "read-write" | "core2:7687" | "follower" | "online"        | "online"      | ""    | TRUE    | TRUE  |
| "movies1"| []      | "read-write" | "core1:7687" | "leader"   | "online"        | "online"      | ""    | FALSE   | FALSE |
| "movies1"| []      | "read-write" | "core3:7687" | "follower" | "online"        | "online"      | ""    | FALSE   | FALSE |
| "movies1"| []      | "read-write" | "core2:7687" | "follower" | "online"        | "online"      | ""    | FALSE   | FALSE |
| "system" | []      | "read-write" | "core1:7687" | "follower" | "online"        | "online"      | ""    | FALSE   | FALSE |
| "system" | []      | "read-write" | "core3:7687" | "follower" | "online"        | "online"      | ""    | FALSE   | FALSE |
| "system" | []      | "read-write" | "core2:7687" | "leader"   | "online"        | "online"      | ""    | FALSE   | FALSE |
+---------------------------------------------------------------------------------------------------------------------------+
9 rows available after 3 ms, consumed after another 1 ms
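Checking a table like the one above by eye gets tedious as the number of members grows. The sketch below counts how many rows report a given database with a currentStatus of "online". It assumes the boxed table layout and column order shown above (currentStatus as the seventh column), and the function name count_online is hypothetical.

```shell
# Count the rows of a SHOW DATABASES table (read from stdin) in which the
# given database has currentStatus "online".
# Assumes the column order shown above: name is column 1, currentStatus is column 7.
count_online() (
  db="$1"
  awk -F'|' -v db="\"$db\"" '
    {
      # Splitting on "|" leaves an empty first field, so column N is field N+1.
      name = $2; status = $8
      gsub(/ /, "", name); gsub(/ /, "", status)
    }
    name == db && status == "\"online\"" { n++ }
    END { print n + 0 }
  '
)

# Possible usage (address, user, and password are placeholders):
#   cypher-shell -a core1:7687 -u neo4j -p <password> -d system \
#     --format verbose 'SHOW DATABASES;' | count_online movies1
```

For the three-member cluster in this example, a result of 3 would indicate that movies1 is online on every Core instance. The --format verbose option is used in the usage comment because Cypher Shell otherwise switches to a plain, non-boxed format when its output is piped.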
Restore a database using a designated seeder
With a seeder, you transfer the database backup to one Core instance in the cluster using the neo4j-admin restore command.
Then you use that member as a designated seeder to create the backed-up database on the other cluster members.
This example uses a user database called movies1 and a cluster that consists of three Core instances.
The movies1 database does not exist on any of the cluster members.
If a database with the same name as your backup already exists in your cluster, see step 1 in Restore a database on each Core instance for details on how to drop it.
-
Restore the movies1 database on one of the Core instances. In this example, the core1 member is used.
neo4j@core1$ ./bin/neo4j-admin restore --from=/path/to/movies1-backup-dir --database=movies1
-
Find the server ID of core1 by logging in to Cypher Shell and running dbms.cluster.overview(). Use any database to connect.
CALL dbms.cluster.overview();
+----------------------------------------------------------------------------------------------------------------------------------------+
| id                                     | addresses                                  | databases                               | groups |
+----------------------------------------------------------------------------------------------------------------------------------------+
| "8e07406b-90b3-4311-a63f-85c45af63583" | ["bolt://core1:7687", "http://core1:7474"] | {neo4j: "LEADER", system: "FOLLOWER"}   | []     |
| "aeb6debe-d3ea-4644-bd68-304236f3813b" | ["bolt://core3:7687", "http://core3:7474"] | {neo4j: "FOLLOWER", system: "FOLLOWER"} | []     |
| "b99ff25e-dc64-4c9c-8a50-ebc1aa0053cf" | ["bolt://core2:7687", "http://core2:7474"] | {neo4j: "FOLLOWER", system: "LEADER"}   | []     |
+----------------------------------------------------------------------------------------------------------------------------------------+
-
On one of the Core instances, use the system database and create the database movies1 using the server ID of core1. The command is automatically routed to the appropriate Core instance and from there to the other cluster members. If the movies1 database is of considerable size, the execution of the command can take some time.
CREATE DATABASE movies1 OPTIONS {existingData: 'use', existingDataSeedInstance: '8e07406b-90b3-4311-a63f-85c45af63583'};
0 rows ready to start consuming query after 701 ms, results consumed after another 0 ms
-
Verify that the movies1 database is online on all cluster members.
SHOW DATABASES;
+---------------------------------------------------------------------------------------------------------------------------+
| name     | aliases | access       | address      | role       | requestedStatus | currentStatus | error | default | home  |
+---------------------------------------------------------------------------------------------------------------------------+
| "neo4j"  | []      | "read-write" | "core1:7687" | "leader"   | "online"        | "online"      | ""    | TRUE    | TRUE  |
| "neo4j"  | []      | "read-write" | "core3:7687" | "follower" | "online"        | "online"      | ""    | TRUE    | TRUE  |
| "neo4j"  | []      | "read-write" | "core2:7687" | "follower" | "online"        | "online"      | ""    | TRUE    | TRUE  |
| "movies1"| []      | "read-write" | "core1:7687" | "leader"   | "online"        | "online"      | ""    | FALSE   | FALSE |
| "movies1"| []      | "read-write" | "core3:7687" | "follower" | "online"        | "online"      | ""    | FALSE   | FALSE |
| "movies1"| []      | "read-write" | "core2:7687" | "follower" | "online"        | "online"      | ""    | FALSE   | FALSE |
| "system" | []      | "read-write" | "core1:7687" | "follower" | "online"        | "online"      | ""    | FALSE   | FALSE |
| "system" | []      | "read-write" | "core3:7687" | "follower" | "online"        | "online"      | ""    | FALSE   | FALSE |
| "system" | []      | "read-write" | "core2:7687" | "leader"   | "online"        | "online"      | ""    | FALSE   | FALSE |
+---------------------------------------------------------------------------------------------------------------------------+
9 rows available after 3 ms, consumed after another 1 ms
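Copying the server ID by hand into the CREATE DATABASE statement is easy to get wrong. As an illustration, the sketch below builds the statement from a database name and a server ID, and rejects IDs that do not have the same shape as the ones in the cluster overview shown earlier. The function name seed_statement is hypothetical, and the format check is an assumption based only on the example IDs in this section.

```shell
# Build the CREATE DATABASE statement for a designated seeder.
# Usage: seed_statement <database> <server-id>
# Fails (non-zero status) if the ID does not look like the example IDs above.
seed_statement() (
  db="$1"
  id="$2"
  pattern='^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
  if ! printf '%s\n' "$id" | grep -Eq "$pattern"; then
    echo "unexpected server ID format: $id" >&2
    exit 1   # leaves only the function's subshell
  fi
  printf "CREATE DATABASE %s OPTIONS {existingData: 'use', existingDataSeedInstance: '%s'};\n" \
    "$db" "$id"
)

# Server ID of core1 from the example above.
seed_statement movies1 8e07406b-90b3-4311-a63f-85c45af63583
```

The printed statement can then be run against the system database in Cypher Shell, as described in the steps above.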
Seed a cluster using the import tool
To create a cluster based on imported data, it is recommended to first import the data into a standalone Neo4j DBMS and then use an offline backup to seed the cluster.
-
Import the data.
  -
  Deploy a standalone Neo4j DBMS.
  -
  Import the data using the import tool.
-
Use neo4j-admin dump to create an offline backup of the neo4j database.
-
Seed a new cluster using the instructions in Seed a cluster from a database dump (offline). Skip the system database in this scenario since it is not needed.
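The import workflow above can be sketched as a command plan that is printed for review rather than executed. The function name print_import_seed_plan, the standalone prompt, and the CSV file paths are hypothetical; the actual import arguments depend on your data and are described in the import tool documentation.

```shell
# Print the commands for importing into a standalone DBMS, dumping the neo4j
# database, and loading the dump on each Core member (dry run only).
# Usage: print_import_seed_plan <dump-file> <core-host>...
print_import_seed_plan() (
  dump_file="$1"
  shift
  # Placeholder CSV paths; replace them with your own node and relationship files.
  printf '%s\n' 'standalone$ ./bin/neo4j-admin import --database=neo4j --nodes=/path/to/nodes.csv --relationships=/path/to/relationships.csv'
  printf 'standalone$ ./bin/neo4j-admin dump --database=neo4j --to=%s\n' "$dump_file"
  for host in "$@"; do
    # Only the neo4j database is loaded; the system database is skipped here.
    printf '%s$ ./bin/neo4j-admin load --from=%s --database=neo4j\n' "$host" "$dump_file"
  done
)

print_import_seed_plan /path/to/neo4j.dump neo4j-01 neo4j-02 neo4j-03
```

The plan assumes that the dump file is made available at the same path on every Core member before the load commands are run.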