Set up the Gremlin console to connect to a Neptune DB instance
The Gremlin Console allows you to experiment with TinkerPop graphs and queries in a REPL (read-eval-print loop) environment.
Installing the Gremlin console and connecting to it in the usual way
You can use the Gremlin Console to connect to a remote graph database. The following section walks you through installing and configuring the Gremlin Console to connect remotely to a Neptune DB instance. You must follow these instructions from an Amazon EC2 instance in the same virtual private cloud (VPC) as your Neptune DB instance.
For help connecting to Neptune with SSL/TLS (which is required), see SSL/TLS configuration.
Note
If you have IAM authentication enabled on your Neptune DB cluster, follow the instructions in Connecting to Neptune Using the Gremlin Console with Signature Version 4 Signing to install the Gremlin console rather than the instructions here.
To install the Gremlin Console and connect to Neptune
-
The Gremlin Console binaries require Java 8 or Java 11. These instructions assume the use of Java 11. You can install Java 11 on your EC2 instance as follows:
-
If you're using Amazon Linux 2 (AL2):
sudo amazon-linux-extras install java-openjdk11
-
If you're using Amazon Linux 2023 (AL2023):
sudo yum install java-11-amazon-corretto-devel
-
For other distributions, use whichever of the following is appropriate:
sudo yum install java-11-openjdk-devel
or:
sudo apt-get install openjdk-11-jdk
-
Enter the following to set Java 11 as the default runtime on your EC2 instance.
sudo /usr/sbin/alternatives --config java
When prompted, enter the number for Java 11.
-
Download the appropriate version of the Gremlin Console from the Apache website. To determine which Gremlin version your Neptune engine supports, check the engine release page for the engine version you are currently running. For example, for version 3.6.5, you can download the Gremlin Console from the Apache TinkerPop website onto your EC2 instance like this:
wget https://archive.apache.org/dist/tinkerpop/3.6.5/apache-tinkerpop-gremlin-console-3.6.5-bin.zip
-
Unzip the Gremlin Console zip file.
unzip apache-tinkerpop-gremlin-console-3.6.5-bin.zip
-
Change directories into the unzipped directory.
cd apache-tinkerpop-gremlin-console-3.6.5
-
In the conf subdirectory of the extracted directory, create a file named neptune-remote.yaml with the following text. Replace your-neptune-endpoint with the hostname or IP address of your Neptune DB instance. The square brackets ([ ]) are required.
Note
For information about finding the hostname of your Neptune DB instance, see the Connecting to Amazon Neptune Endpoints section.
hosts: [ your-neptune-endpoint ]
port: 8182
connectionPool: { enableSsl: true }
serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1,
              config: { serializeResultToString: true }}
-
In a terminal, navigate to the Gremlin Console directory (apache-tinkerpop-gremlin-console-3.6.5), and then enter the following command to run the Gremlin Console.
bin/gremlin.sh
You should see the following output:
         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin>
You are now at the gremlin> prompt. You will enter the remaining steps at this prompt.
-
At the gremlin> prompt, enter the following to connect to the Neptune DB instance.
:remote connect tinkerpop.server conf/neptune-remote.yaml
-
At the gremlin> prompt, enter the following to switch to remote mode. This sends all Gremlin queries to the remote connection.
:remote console
-
Enter the following to send a query to the Gremlin Graph.
g.V().limit(1)
-
When you are finished, enter the following to exit the Gremlin Console.
:exit
Note
Use a semicolon (;) or a newline character (\n) to separate each statement.
Each traversal preceding the final traversal must end in next() to be executed. Only the data from the final traversal is returned.
For more information on the Neptune implementation of Gremlin, see Gremlin standards compliance in Amazon Neptune.
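The configuration file from the procedure above can also be generated from a script. The following is a minimal sketch, not part of the documented procedure, that writes conf/neptune-remote.yaml with the same settings. The variable name NEPTUNE_ENDPOINT is an assumption introduced for this example, and its default is the your-neptune-endpoint placeholder, which you still need to replace. Run it from the extracted Gremlin Console directory.

```shell
#!/bin/sh
# Sketch: generate conf/neptune-remote.yaml for the Gremlin Console.
# NEPTUNE_ENDPOINT is a hypothetical variable name used only in this example;
# set it to the hostname or IP address of your Neptune DB instance.
set -e

NEPTUNE_ENDPOINT="${NEPTUNE_ENDPOINT:-your-neptune-endpoint}"

# An extracted console already has a conf directory; -p makes this a no-op there.
mkdir -p conf

# Unquoted heredoc delimiter so that ${NEPTUNE_ENDPOINT} is expanded.
cat > conf/neptune-remote.yaml <<EOF
hosts: [ ${NEPTUNE_ENDPOINT} ]
port: 8182
connectionPool: { enableSsl: true }
serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1,
              config: { serializeResultToString: true }}
EOF

echo "Wrote conf/neptune-remote.yaml for ${NEPTUNE_ENDPOINT}"
```

For example, running the script as NEPTUNE_ENDPOINT=(your hostname) sh (script name) fills in the hosts entry while keeping the required square brackets.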
An alternate way to connect to the Gremlin console
Drawbacks of the normal connection approach
The most common way to connect to the Gremlin console is the one explained above, using commands like this at the gremlin> prompt:
gremlin> :remote connect tinkerpop.server conf/(file name).yaml
gremlin> :remote console
This works well, and lets you send queries to Neptune. However, it takes the Groovy script engine out of the loop, so Neptune treats all queries as pure Gremlin. This means that the following query forms fail:
gremlin> 1 + 1
gremlin> x = g.V().count()
The closest you can get to using a variable when connected this way is to use the result variable maintained by the console and send the query using :>, like this:
gremlin> :remote console
==>All scripts will now be evaluated locally - type ':remote console' to return to remote mode for Gremlin Server - [krl-1-cluster.cluster-ro-cm9t6tfwbtsr.us-east-1.neptune.amazonaws.com/172.31.19.217:8182]
gremlin> :> g.V().count()
==>4249
gremlin> println(result)
[result{object=4249 class=java.lang.Long}]
gremlin> println(result['object'])
[4249]
A different way to connect
You can also connect to the Gremlin console in a different way, which you may find nicer, like this:
gremlin> g = traversal().withRemote('conf/neptune.properties')
Here neptune.properties takes this form:
gremlin.remote.remoteConnectionClass=org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection
gremlin.remote.driver.clusterFile=conf/my-cluster.yaml
gremlin.remote.driver.sourceName=g
The my-cluster.yaml file should look like this:
hosts: [ my-cluster-abcdefghijk.us-east-1.neptune.amazonaws.com ]
port: 8182
serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1,
              config: { serializeResultToString: false } }
connectionPool: { enableSsl: true }
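Because this approach needs two files that must agree on the cluster file path, it can help to create them together. The following is a rough sketch under stated assumptions: the variable name CLUSTER_ENDPOINT is hypothetical, and its default is the example hostname shown above, not a real cluster. Run it from the Gremlin Console directory so that the conf/ paths resolve.

```shell
#!/bin/sh
# Sketch: create the two files used by traversal().withRemote('conf/neptune.properties').
# CLUSTER_ENDPOINT is a hypothetical variable name used only in this example.
set -e

CLUSTER_ENDPOINT="${CLUSTER_ENDPOINT:-my-cluster-abcdefghijk.us-east-1.neptune.amazonaws.com}"

mkdir -p conf

# Quoted delimiter: nothing in the properties file needs shell expansion.
cat > conf/neptune.properties <<'EOF'
gremlin.remote.remoteConnectionClass=org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection
gremlin.remote.driver.clusterFile=conf/my-cluster.yaml
gremlin.remote.driver.sourceName=g
EOF

# Unquoted delimiter so that ${CLUSTER_ENDPOINT} is expanded.
# serializeResultToString is false here, unlike in neptune-remote.yaml.
cat > conf/my-cluster.yaml <<EOF
hosts: [ ${CLUSTER_ENDPOINT} ]
port: 8182
serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1,
              config: { serializeResultToString: false } }
connectionPool: { enableSsl: true }
EOF

echo "Wrote conf/neptune.properties and conf/my-cluster.yaml"
```

The clusterFile entry in neptune.properties is a relative path, so the console has to be started from the directory that contains conf/ for the reference to resolve.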
Configuring the Gremlin console connection like that lets you make the following kinds of queries successfully:
gremlin> 1+1
==>2
gremlin> x=g.V().count().next()
==>4249
gremlin> println("The answer was ${x}")
The answer was 4249
You can avoid displaying the result, like this:
gremlin> x=g.V().count().next();[]
gremlin> println(x)
4249
All the usual ways of querying (without the terminal step) continue to work. For example:
gremlin> g.V().count()
==>4249
You can even use the g.io().read()