
Gremlin node classification queries in Neptune ML

For Gremlin node classification in Neptune ML:

  • The model is trained on one property of the vertices. The set of unique values of this property is referred to as a set of node classes, or simply, classes.

  • The node classification model can infer the node class, that is, the categorical value of a vertex property. This is useful when the property is not already attached to the vertex.

  • To fetch one or more classes from a node classification model, use the with() step with the predicate Neptune#ml.classification to configure the properties() step. The output format is similar to what you would expect if the classes were vertex properties.

Note

Node classification only works with string property values. That means that numerical property values such as 0 or 1 are not supported, although the string equivalents "0" and "1" are. Similarly, the Boolean property values true and false don't work, but "true" and "false" do.
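Because only string values are supported, numeric or Boolean values have to be converted to strings before they can serve as node classes. The following is a minimal Python sketch of such a conversion. The helper name to_class_label is hypothetical and not part of any Neptune API, and it assumes the lowercase spellings "true" and "false" used in the note above.

```python
def to_class_label(value):
    """Convert a raw property value to a string usable as a node class.

    Hypothetical preprocessing helper, not part of any Neptune API.
    """
    # bool must be checked before int, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        return value
    raise TypeError(f"unsupported property value type: {type(value).__name__}")
```

Note that Python's own str(True) yields "True" with a capital T, which is why the Boolean case is handled separately.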

Here is a sample node classification query:

g.with( "Neptune#ml.endpoint","node-classification-movie-lens-endpoint" )
  .with( "Neptune#ml.iamRoleArn","arn:aws:iam::0123456789:role/sagemaker-role" )
  .with( "Neptune#ml.limit", 2 )
  .with( "Neptune#ml.threshold", 0.5D )
  .V( "movie_1", "movie_2", "movie_3" )
  .properties("genre").with("Neptune#ml.classification")

The output of this query would look something like the following:

==>vp[genre->Action]
==>vp[genre->Crime]
==>vp[genre->Comedy]
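If you capture this console-style output as text, it can be split into key and value pairs for further processing. The following Python sketch assumes the exact ==>vp[key->value] line format shown above. The function name parse_vp_lines is hypothetical, and a Gremlin client library would normally return structured results rather than text.

```python
import re

# Matches one console result line such as "==>vp[genre->Action]".
_VP_LINE = re.compile(r"^==>vp\[(?P<key>.+?)->(?P<value>.*)\]$")


def parse_vp_lines(text):
    """Return a list of (key, value) tuples from console-style output.

    Lines that do not match the assumed format are skipped.
    """
    pairs = []
    for line in text.splitlines():
        match = _VP_LINE.match(line.strip())
        if match:
            pairs.append((match.group("key"), match.group("value")))
    return pairs
```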

In the query above, the V() and properties() steps are used as follows:

The V() step contains the set of vertices for which you want to fetch the classes from the node-classification model:

.V( "movie_1", "movie_2", "movie_3" )

The properties() step contains the key on which the model was trained, and has .with("Neptune#ml.classification") to indicate that this is a node classification ML inference query.

Multiple property keys are not currently supported in a properties().with("Neptune#ml.classification") step. For example, the following query results in an exception:

g.with("Neptune#ml.endpoint", "node-classification-movie-lens-endpoint")
  .with("Neptune#ml.iamRoleArn","arn:aws:iam::0123456789:role/sagemaker-role")
  .V( "movie_1", "movie_2", "movie_3" )
  .properties("genre", "other_label").with("Neptune#ml.classification")

For the specific error message, see the list of Neptune ML exceptions.
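When queries are assembled programmatically, the single-key restriction can be enforced before the query is sent. The following Python sketch builds the query text from its parts. The function build_classification_query and its parameters are hypothetical, and the quoting helper simply rejects characters that would need escaping rather than attempting to escape them.

```python
def _quote(text):
    """Wrap text in double quotes, rejecting characters that would need escaping."""
    if any(ch in text for ch in '"\\$\n'):
        raise ValueError(f"unsupported character in {text!r}")
    return f'"{text}"'


def build_classification_query(endpoint, role_arn, vertex_ids, property_key,
                               include_score=False):
    """Assemble a node classification query string (hypothetical helper)."""
    if not isinstance(property_key, str):
        # The step accepts one trained property key, not a list of keys.
        raise TypeError("property_key must be a single string")
    keys = [_quote(property_key)]
    if include_score:
        # The score predicate may accompany the trained property key.
        keys.append(_quote("Neptune#ml.score"))
    key_list = ", ".join(keys)
    id_list = ", ".join(_quote(v) for v in vertex_ids)
    return (
        f'g.with("Neptune#ml.endpoint", {_quote(endpoint)})'
        f'.with("Neptune#ml.iamRoleArn", {_quote(role_arn)})'
        f".V({id_list})"
        f".properties({key_list})"
        '.with("Neptune#ml.classification")'
    )
```

Passing a list such as ["genre", "other_label"] as property_key raises a TypeError locally, so the unsupported query is never submitted.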

A properties().with("Neptune#ml.classification") step can be used in combination with any of the following steps:

  • value()

  • value().is()

  • hasValue()

  • has(value,"")

  • key()

  • key().is()

  • hasKey()

  • has(key,"")

  • path()

Other node-classification queries

If both the inference endpoint and the corresponding IAM role have been saved in your DB cluster parameter group, a node-classification query can be as simple as this:

g.V("movie_1", "movie_2", "movie_3").properties("genre").with("Neptune#ml.classification")

You can mix vertex properties and classes in a query using the union() step:

g.with("Neptune#ml.endpoint","node-classification-movie-lens-endpoint")
  .with("Neptune#ml.iamRoleArn","arn:aws:iam::0123456789:role/sagemaker-role")
  .V( "movie_1", "movie_2", "movie_3" )
  .union(
    properties("genre").with("Neptune#ml.classification"),
    properties("genre")
  )

You can also make an unbounded query such as this:

g.with("Neptune#ml.endpoint","node-classification-movie-lens-endpoint")
  .with("Neptune#ml.iamRoleArn","arn:aws:iam::0123456789:role/sagemaker-role")
  .V()
  .properties("genre").with("Neptune#ml.classification")

You can retrieve the node classes together with their vertices by combining the select() and as() steps:

g.with("Neptune#ml.endpoint","node-classification-movie-lens-endpoint")
  .with("Neptune#ml.iamRoleArn","arn:aws:iam::0123456789:role/sagemaker-role")
  .V( "movie_1", "movie_2", "movie_3" ).as("vertex")
  .properties("genre").with("Neptune#ml.classification").as("properties")
  .select("vertex","properties")

You can also filter on node classes, as illustrated in these examples:

g.with("Neptune#ml.endpoint", "node-classification-movie-lens-endpoint")
  .with("Neptune#ml.iamRoleArn","arn:aws:iam::0123456789:role/sagemaker-role")
  .V( "movie_1", "movie_2", "movie_3" )
  .properties("genre").with("Neptune#ml.classification")
  .has(value, "Horror")

g.with("Neptune#ml.endpoint","node-classification-movie-lens-endpoint")
  .with("Neptune#ml.iamRoleArn","arn:aws:iam::0123456789:role/sagemaker-role")
  .V( "movie_1", "movie_2", "movie_3" )
  .properties("genre").with("Neptune#ml.classification")
  .has(value, P.eq("Action"))

g.with("Neptune#ml.endpoint","node-classification-movie-lens-endpoint")
  .with("Neptune#ml.iamRoleArn","arn:aws:iam::0123456789:role/sagemaker-role")
  .V( "movie_1", "movie_2", "movie_3" )
  .properties("genre").with("Neptune#ml.classification")
  .has(value, P.within("Action", "Horror"))

You can get a node classification confidence score using the Neptune#ml.score predicate:

g.with("Neptune#ml.endpoint","node-classification-movie-lens-endpoint")
  .with("Neptune#ml.iamRoleArn","arn:aws:iam::0123456789:role/sagemaker-role")
  .V( "movie_1", "movie_2", "movie_3" )
  .properties("genre", "Neptune#ml.score").with("Neptune#ml.classification")

The response would look like this:

==>vp[genre->Action]
==>vp[Neptune#ml.score->0.01234567]
==>vp[genre->Crime]
==>vp[Neptune#ml.score->0.543210]
==>vp[genre->Comedy]
==>vp[Neptune#ml.score->0.10101]
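In this response, each class entry is followed by its score entry. The following Python sketch groups such alternating entries into one record per prediction. It assumes that strict class-then-score ordering, takes the entries as (key, value) tuples, and uses the hypothetical function name pair_classes_with_scores.

```python
def pair_classes_with_scores(pairs, score_key="Neptune#ml.score"):
    """Group alternating class and score entries into (key, class, score) records.

    Assumes every class entry is immediately followed by its score entry.
    """
    records = []
    pending = None  # most recent class entry still waiting for its score
    for key, value in pairs:
        if key == score_key:
            if pending is None:
                raise ValueError("score entry without a preceding class entry")
            records.append((pending[0], pending[1], float(value)))
            pending = None
        else:
            if pending is not None:
                raise ValueError("class entry without a following score entry")
            pending = (key, value)
    if pending is not None:
        raise ValueError("class entry without a following score entry")
    return records
```

The resulting records can then be filtered on the client side, for example by keeping only predictions whose score exceeds a chosen cutoff.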

Using inductive inference in a node classification query

Suppose you add a new node to an existing graph from a Jupyter notebook, like this:

%%gremlin
g.addV('label1').property(id,'101').as('newV')
  .V('1').as('oldV1')
  .V('2').as('oldV2')
  .addE('eLabel1').from('newV').to('oldV1')
  .addE('eLabel2').from('oldV2').to('newV')

You could then use an inductive inference query to get a genre and confidence score that reflect the new node:

%%gremlin
g.with("Neptune#ml.endpoint", "nc-ep")
  .with("Neptune#ml.iamRoleArn", "arn:aws:iam::123456789012:role/NeptuneMLRole")
  .V('101').properties("genre", "Neptune#ml.score")
  .with("Neptune#ml.classification")
  .with("Neptune#ml.inductiveInference")

If you ran the query several times, however, you might get somewhat different results:

# First time
==>vp[genre->Action]
==>vp[Neptune#ml.score->0.12345678]

# Second time
==>vp[genre->Action]
==>vp[Neptune#ml.score->0.21365921]

You could make the same query deterministic:

%%gremlin
g.with("Neptune#ml.endpoint", "nc-ep")
  .with("Neptune#ml.iamRoleArn", "arn:aws:iam::123456789012:role/NeptuneMLRole")
  .V('101').properties("genre", "Neptune#ml.score")
  .with("Neptune#ml.classification")
  .with("Neptune#ml.inductiveInference")
  .with("Neptune#ml.deterministic")

In that case, the results would be roughly the same every time:

# First time
==>vp[genre->Action]
==>vp[Neptune#ml.score->0.12345678]

# Second time
==>vp[genre->Action]
==>vp[Neptune#ml.score->0.12345678]