Neo4j deployments automation on Google Cloud Platform (GCP)

Automate Neo4j deployment when you want to integrate Neo4j into your CI/CD pipeline to be able to create/destroy instances temporarily, or to spin up a sample instance.

Prerequisites

Google Cloud Deployment Manager

Neo4j provides Deployment Manager templates for Neo4j Causal Cluster (highly available clusters), and VM images for Neo4j Enterprise standalone. Deployment Manager is a recipe that tells GCP how to deploy a whole set of interrelated resources. By deploying all of this as a stack you can keep all of your resources together, and delete just one thing when you are done.

Creating a Deployment Manager stack

Depending on what Neo4j edition you want to deploy, you create a Deployment Manager stack by running a bash script.

Each script contains the following configurations:

  • The URL of the Neo4j stack template that tells GCP what to deploy.

  • Various parameters that control how much hardware you want to use.

  • MACHINE - the GCP machine type you want to launch, which controls how much hardware you will be giving to your database.

  • DISK_TYPE and DISK_SIZE- controls whether Neo4j uses standard spinning magnetic platters (pd-standard) or SSD disks (pd-ssd), and how many GB of storage you want to allocate. Note that with some disk sizes, GCP warns that the root partition type may need to be resized if the underlying OS does not support the disk size. This warning can be ignored, because the underlying OS will recognize any disk size.

  • ZONE - specifies where to deploy Neo4j.

  • PROJECT - the project ID you want to deploy on GCP.

Deploying Neo4j Enterprise Edition with a Causal Cluster

To deploy Neo4j Enterprise Edition with a Causal Cluster, use the Causal Cluster template.

You indicate how many core servers and read replicas you want in your cluster by configuring the CORES and READ_REPLICAS parameters.
#!/bin/bash
export NAME=neo4j-cluster
PROJECT=my-gcp-project-ID
MACHINE=n1-standard-2
DISK_TYPE=pd-ssd
DISK_SIZE=64
ZONE=us-east1-b
CORES=3
READ_REPLICAS=0
NEO4J_VERSION=4.4.34
TEMPLATE_URL=https://storage.googleapis.com/neo4j-deploy/$NEO4J_VERSION/causal-cluster/neo4j-causal-cluster.jinja
OUTPUT=$(gcloud deployment-manager deployments create $NAME \
   --project $PROJECT \
   --template "$TEMPLATE_URL" \
   --properties "zone:'$ZONE',clusterNodes:'$CORES',readReplicas:'$READ_REPLICAS',bootDiskSizeGb:$DISK_SIZE,bootDiskType:'$DISK_TYPE',machineType:'$MACHINE'")
echo $OUTPUT
PASSWORD=$(echo $OUTPUT | perl -ne 'm/password\s+([^\s]+)/; print $1;')
IP=$(echo $OUTPUT | perl -ne 'm/vm1URL\s+https:\/\/([^\s]+):/; print $1; ')
echo NEO4J_URI=bolt+routing://$IP
echo NEO4J_PASSWORD=$PASSWORD
echo STACK_NAME=$NAME

After you configure the parameters of what you are deploying, you call to gcloud deployment-manager deployments create to do the work. The variable OUTPUT contains all the information about your deployment. Then, you use perl to pull out the password and IP address of your new deployment, because it will have a strong randomly assigned password.

This command blocks and does not succeed until the entire stack is deployed and ready. This means that by the time you get the IP address back, your Neo4j is up. If you lose these stack outputs (IP, password, and so on), you can find them in your Deployment Manager window within the GCP console.

To delete your deployment, take note of the STACK_NAME and use the utility script:

#!/bin/bash
PROJECT=my-google-project-id
if [ -z $1 ] ; then
   echo "Usage: call me with deployment name"
   exit 1
fi
gcloud -q deployment-manager deployments delete $1 --project $PROJECT
# OPTIONAL!  Destroy the disk
# gcloud --quiet compute disks delete $(gcloud compute disks list --project $PROJECT --filter="name~'$1'" --uri)
When you delete Neo4j stacks on GCP, the GCP disks are left behind, to make it hard for you to accidentally destroy your valuable data. To completely clean up your disks, uncomment the last line of the script.

Deploying Neo4j Enterprise (or Community) Edition in standalone mode

To deploy Neo4j Enterprise Edition in standalone mode, create a simple VM and configure its firewall/security rules. It will not have high-availability failover capabilities, but it is a very fast way to get started.

You choose a random password by running some random bytes through a hash. The script also provides an example of polling and waiting until the VM service comes up, and then changing the Neo4j default password.

The launcher-public project on GCP hosts Neo4j’s VM images for GCP. In the example script, neo4j-enterprise-1–3–5–3-apoc is used, but other versions are also available. By substituting a different image name here, you can use this same technique to run Neo4j Community Edition in standalone mode.

#!/bin/bash
export PROJECT=my-gcp-project-id
export MACHINE=n1-standard-2
export DISK_TYPE=pd-ssd
export DISK_SIZE=64GB
export ZONE=us-east1-b
export NEO4J_VERSION=4.4.34
export PASSWORD=$(head -n 20 /dev/urandom | md5)
export STACK_NAME=neo4j-standalone
export IMAGE=neo4j-enterprise-1-3-5-3-apoc
# Setup firewalling.
echo "Creating firewall rules"
gcloud compute firewall-rules create "$STACK_NAME" \
    --allow tcp:7473,tcp:7687 \
    --source-ranges 0.0.0.0/0 \
    --target-tags neo4j \
    --project $PROJECT
if [ $? -ne 0 ] ; then
   echo "Firewall creation failed.  Bailing out"
   exit 1
fi
echo "Creating instance"
OUTPUT=$(gcloud compute instances create $STACK_NAME \
   --project $PROJECT \
   --image $IMAGE \
   --tags neo4j \
   --machine-type $MACHINE \
   --boot-disk-size $DISK_SIZE \
   --boot-disk-type $DISK_TYPE \
   --image-project launcher-public)
echo $OUTPUT
# Pull out the IP addresses, and toss out the private internal one (10.*)
IP=$(echo $OUTPUT | grep -oE '((1?[0-9][0-9]?|2[0-4][0-9]|25[0-5])\.){3}(1?[0-9][0-9]?|2[0-4][0-9]|25[0-5])' | grep --invert-match "^10\.")
echo "Discovered new machine IP at $IP"
tries=0
while true ; do
   OUTPUT=$(echo "CALL dbms.changePassword('$PASSWORD');" | cypher-shell -a $IP -u neo4j -p "neo4j" 2>&1)
   EC=$?
   echo $OUTPUT

   if [ $EC -eq 0 ]; then
     echo "Machine is up ... $tries tries"
   break
fi
  if [ $tries -gt 30 ] ; then
    echo STACK_NAME=$STACK_NAME
    echo "Machine is not coming up, giving up"
    exit 1
  fi
  tries=$(($tries+1))
  echo "Machine is not up yet ... $tries tries"
  sleep 1;
done
echo NEO4J_URI=bolt://$IP:7687
echo NEO4J_PASSWORD=$PASSWORD
echo STACK_NAME=$STACK_NAME
exit 0

To delete your deployment, take note of the STACK_NAME and use the utility script:

#!/bin/bash
export PROJECT=my-google-project-id
if [ -z $1 ] ; then
   echo "Missing argument"
   exit 1
fi
echo "Deleting instance and firewall rules"
gcloud compute instances delete --quiet "$1" --project "$PROJECT" && gcloud compute firewall-rules --quiet delete "$1" --project "$PROJECT"
exit $?