Set up the Gremlin console to connect to a Neptune DB instance - Amazon Neptune

Set up the Gremlin console to connect to a Neptune DB instance

The Gremlin Console allows you to experiment with TinkerPop graphs and queries in a REPL (read-eval-print loop) environment.

Installing the Gremlin console and connecting to it in the usual way

You can use the Gremlin Console to connect to a remote graph database. The following section walks you through installing and configuring the Gremlin Console to connect remotely to a Neptune DB instance. You must follow these instructions from an Amazon EC2 instance in the same virtual private cloud (VPC) as your Neptune DB instance.

For help connecting to Neptune with SSL/TLS (which is required), see SSL/TLS configuration.

Note

If you have IAM authentication enabled on your Neptune DB cluster, follow the instructions in Connecting to Neptune Using the Gremlin Console with Signature Version 4 Signing to install the Gremlin console rather than the instructions here.

To install the Gremlin Console and connect to Neptune
  1. The Gremlin Console binaries require Java 8 or Java 11. These instructions assume Java 11. You can install Java 11 on your EC2 instance as follows:

    • If you're using Amazon Linux 2 (AL2):

      sudo amazon-linux-extras install java-openjdk11
    • If you're using Amazon Linux 2023 (AL2023):

      sudo yum install java-11-amazon-corretto-devel
    • For other distributions, use whichever of the following is appropriate:

      sudo yum install java-11-openjdk-devel

      or:

      sudo apt-get install openjdk-11-jdk
  2. Enter the following to set Java 11 as the default runtime on your EC2 instance.

    sudo /usr/sbin/alternatives --config java

    When prompted, enter the number for Java 11.

  3. Download the appropriate version of the Gremlin Console from the Apache website. To determine which Gremlin version your Neptune engine supports, check the engine release page for the engine version you are currently running. For example, for version 3.6.5, you can download the Gremlin Console from the Apache TinkerPop archive onto your EC2 instance like this:

    wget https://archive.apache.org/dist/tinkerpop/3.6.5/apache-tinkerpop-gremlin-console-3.6.5-bin.zip
  4. Unzip the Gremlin Console zip file.

    unzip apache-tinkerpop-gremlin-console-3.6.5-bin.zip
  5. Change directories into the unzipped directory.

    cd apache-tinkerpop-gremlin-console-3.6.5
  6. In the conf subdirectory of the extracted directory, create a file named neptune-remote.yaml with the following text. Replace your-neptune-endpoint with the hostname or IP address of your Neptune DB instance. The square brackets ([ ]) are required.

    Note

    For information about finding the hostname of your Neptune DB instance, see the Connecting to Amazon Neptune Endpoints section.

    hosts: [your-neptune-endpoint]
    port: 8182
    connectionPool: { enableSsl: true }
    serializer: { className: org.apache.tinkerpop.gremlin.util.ser.GraphBinaryMessageSerializerV1,
                  config: { serializeResultToString: true }}
    Note

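    Serializers were moved from the gremlin-driver module to the new gremlin-util module in version 3.7.0, and the package changed from org.apache.tinkerpop.gremlin.driver.ser to org.apache.tinkerpop.gremlin.util.ser. If your Gremlin Console version is earlier than 3.7.0 (such as the 3.6.5 version used in these steps), use the org.apache.tinkerpop.gremlin.driver.ser package in the className instead.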

  7. In a terminal, navigate to the Gremlin Console directory (apache-tinkerpop-gremlin-console-3.6.5), and then enter the following command to run the Gremlin Console.

    bin/gremlin.sh

    You should see the following output:

             \,,,/
             (o o)
    -----oOOo-(3)-oOOo-----
    plugin activated: tinkerpop.server
    plugin activated: tinkerpop.utilities
    plugin activated: tinkerpop.tinkergraph
    gremlin>

    You are now at the gremlin> prompt. You will enter the remaining steps at this prompt.

  8. At the gremlin> prompt, enter the following to connect to the Neptune DB instance.

    :remote connect tinkerpop.server conf/neptune-remote.yaml
  9. At the gremlin> prompt, enter the following to switch to remote mode. This sends all Gremlin queries to the remote connection.

    :remote console
  10. Enter the following to send a query to the Gremlin Graph.

    g.V().limit(1)
  11. When you are finished, enter the following to exit the Gremlin Console.

    :exit
Note

Use a semicolon (;) or a newline character (\n) to separate each statement.

Each traversal preceding the final traversal must end in next() to be executed. Only the data from the final traversal is returned.

For more information on the Neptune implementation of Gremlin, see Gremlin standards compliance in Amazon Neptune.
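
The Java requirement in step 1 and the configuration file in step 6 can be scripted. The following is a minimal sketch, not part of the Gremlin Console or of Neptune: the function names `java_major_version`, `is_supported_java`, and `write_neptune_remote_yaml` are hypothetical, and the optional third argument assumes you may need the pre-3.7.0 serializer package described in the note under step 6.

```shell
#!/usr/bin/env bash
# Sketch only: these helper names are hypothetical, not part of the Gremlin Console.

# Print the major version for a Java version string.
# Java 8 reports "1.8.0_xxx"; Java 9 and later report "11.0.x", "17.0.x", and so on.
java_major_version() {
  case "$1" in
    1.*) echo "$1" | cut -d. -f2 ;;
    *)   echo "$1" | cut -d. -f1 ;;
  esac
}

# Succeed only for the Java versions the console binaries require (8 or 11).
is_supported_java() {
  case "$(java_major_version "$1")" in
    8|11) return 0 ;;
    *)    return 1 ;;
  esac
}

# Write conf/neptune-remote.yaml under the extracted console directory.
#   $1 = Neptune endpoint (hostname or IP address)
#   $2 = extracted Gremlin Console directory (default: current directory)
#   $3 = serializer package (default: the package shown in step 6)
write_neptune_remote_yaml() {
  local endpoint="$1"
  local console_dir="${2:-.}"
  local ser_pkg="${3:-org.apache.tinkerpop.gremlin.util.ser}"
  mkdir -p "${console_dir}/conf"
  cat > "${console_dir}/conf/neptune-remote.yaml" <<EOF
hosts: [${endpoint}]
port: 8182
connectionPool: { enableSsl: true }
serializer: { className: ${ser_pkg}.GraphBinaryMessageSerializerV1,
              config: { serializeResultToString: true }}
EOF
}
```

For example, `write_neptune_remote_yaml your-neptune-endpoint apache-tinkerpop-gremlin-console-3.6.5` would create the file from step 6 in that directory's conf subdirectory. To check the installed runtime, you could pass `is_supported_java` the quoted version string that `java -version` prints.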

An alternate way to connect to the Gremlin console

Drawbacks of the normal connection approach

The most common way to connect to the Gremlin console is the one explained above, using commands like this at the gremlin> prompt:

gremlin> :remote connect tinkerpop.server conf/(file name).yaml
gremlin> :remote console

This works well, and lets you send queries to Neptune. However, it takes the Groovy script engine out of the loop, so Neptune treats all queries as pure Gremlin. This means that the following query forms fail:

gremlin> 1 + 1
gremlin> x = g.V().count()

The closest you can get to using a variable when connected this way is to toggle the console back to local mode, send the query using :>, and then read the result variable that the console maintains, like this:

gremlin> :remote console
==>All scripts will now be evaluated locally - type ':remote console' to return to remote mode for Gremlin Server - [krl-1-cluster.cluster-ro-cm9t6tfwbtsr.us-east-1.neptune.amazonaws.com/172.31.19.217:8182]
gremlin> :> g.V().count()
==>4249
gremlin> println(result)
[result{object=4249 class=java.lang.Long}]
gremlin> println(result['object'])
[4249]

A different way to connect

You can also connect the Gremlin console to Neptune in a different way that keeps local Groovy evaluation available, like this:

gremlin> g = traversal().withRemote('conf/neptune.properties')

Here neptune.properties takes this form:

gremlin.remote.remoteConnectionClass=org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection
gremlin.remote.driver.clusterFile=conf/my-cluster.yaml
gremlin.remote.driver.sourceName=g

The my-cluster.yaml file should look like this:

hosts: [my-cluster-abcdefghijk.us-east-1.neptune.amazonaws.com]
port: 8182
serializer: { className: org.apache.tinkerpop.gremlin.util.ser.GraphBinaryMessageSerializerV1,
              config: { serializeResultToString: false } }
connectionPool: { enableSsl: true }
Note

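Serializers were moved from the gremlin-driver module to the new gremlin-util module in version 3.7.0, and the package changed from org.apache.tinkerpop.gremlin.driver.ser to org.apache.tinkerpop.gremlin.util.ser. If your Gremlin Console version is earlier than 3.7.0, use the org.apache.tinkerpop.gremlin.driver.ser package in the className instead.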
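
The two files for this connection style can also be generated from the shell. This is a minimal sketch under stated assumptions: the function name `write_traversal_remote_config` is hypothetical, the file names match the ones used above, and the optional third argument lets you substitute the pre-3.7.0 serializer package.

```shell
#!/usr/bin/env bash
# Sketch only: write_traversal_remote_config is a hypothetical helper name.
#   $1 = Neptune cluster endpoint hostname
#   $2 = extracted Gremlin Console directory (default: current directory)
#   $3 = serializer package (default: the package shown in my-cluster.yaml above)
write_traversal_remote_config() {
  local endpoint="$1"
  local console_dir="${2:-.}"
  local ser_pkg="${3:-org.apache.tinkerpop.gremlin.util.ser}"
  mkdir -p "${console_dir}/conf"

  # Properties file passed to traversal().withRemote(...).
  # The quoted 'EOF' keeps the content literal (no shell expansion).
  cat > "${console_dir}/conf/neptune.properties" <<'EOF'
gremlin.remote.remoteConnectionClass=org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection
gremlin.remote.driver.clusterFile=conf/my-cluster.yaml
gremlin.remote.driver.sourceName=g
EOF

  # Cluster file referenced by clusterFile above.
  # serializeResultToString is false here, as in the file shown above.
  cat > "${console_dir}/conf/my-cluster.yaml" <<EOF
hosts: [${endpoint}]
port: 8182
serializer: { className: ${ser_pkg}.GraphBinaryMessageSerializerV1,
              config: { serializeResultToString: false } }
connectionPool: { enableSsl: true }
EOF
}
```

Because clusterFile is a relative path, this sketch assumes you start bin/gremlin.sh from the console directory that you pass as the second argument.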

Configuring the Gremlin console connection like that lets you make the following kinds of queries successfully:

gremlin> 1+1
==>2
gremlin> x=g.V().count().next()
==>4249
gremlin> println("The answer was ${x}")
The answer was 4249

You can avoid displaying the result, like this:

gremlin> x=g.V().count().next();[]
gremlin> println(x)
4249

All the usual ways of querying (without the terminal step) continue to work. For example:

gremlin> g.V().count()
==>4249

You can even use the g.io().read() step to load a file with this kind of connection.