Gremlin node regression queries in Neptune ML - Amazon Neptune

Node regression is similar to node classification, except that the value inferred from the regression model for each node is numeric. You can use the same Gremlin queries for node regression as for node classification except for the following differences:

  • Again, in Neptune ML, nodes refer to vertices.

  • The properties() step takes the form properties().with("Neptune#ml.regression") instead of properties().with("Neptune#ml.classification").

  • The "Neptune#ml.limit" and "Neptune#ml.threshold" predicates are not applicable.

  • When you filter on the value, you have to specify a numeric value.

Here is a sample node regression query:

g.with("Neptune#ml.endpoint","node-regression-movie-lens-endpoint")
 .with("Neptune#ml.iamRoleArn", "arn:aws:iam::0123456789:role/sagemaker-role")
 .V("movie_1","movie_2","movie_3")
 .properties("revenue").with("Neptune#ml.regression")
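If you generate such queries from application code, the query text can be assembled from its four variable parts: the endpoint, the IAM role ARN, the vertex IDs, and the property name. The following Python sketch is only an illustration. The helper names gremlin_str and node_regression_query are hypothetical and not part of any Neptune or TinkerPop library, and the sketch assumes the resulting string is then submitted through whatever Gremlin client you already use.

```python
def gremlin_str(value):
    """Render a Python string as a double-quoted Gremlin (Groovy) literal.

    Simplifying assumption: values containing characters that would need
    escaping are rejected rather than escaped.
    """
    if not isinstance(value, str):
        raise TypeError("expected a string")
    if any(ch in value for ch in '"\\$\n'):
        raise ValueError(f"unsupported character in {value!r}")
    return f'"{value}"'


def node_regression_query(endpoint, iam_role_arn, vertex_ids, prop):
    """Build the text of a node regression query (hypothetical helper)."""
    ids = ",".join(gremlin_str(v) for v in vertex_ids)
    return (
        f'g.with("Neptune#ml.endpoint",{gremlin_str(endpoint)})'
        f'.with("Neptune#ml.iamRoleArn",{gremlin_str(iam_role_arn)})'
        f".V({ids})"
        # The regression predicate replaces "Neptune#ml.classification".
        f'.properties({gremlin_str(prop)}).with("Neptune#ml.regression")'
    )


query = node_regression_query(
    "node-regression-movie-lens-endpoint",
    "arn:aws:iam::0123456789:role/sagemaker-role",
    ["movie_1", "movie_2", "movie_3"],
    "revenue",
)
print(query)
```

The helper never emits "Neptune#ml.limit" or "Neptune#ml.threshold", because those predicates do not apply to regression.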

You can filter on the value inferred using a regression model, as illustrated in the following examples:

g.with("Neptune#ml.endpoint","node-regression-movie-lens-endpoint")
 .with("Neptune#ml.iamRoleArn","arn:aws:iam::0123456789:role/sagemaker-role")
 .V("movie_1","movie_2","movie_3")
 .properties("revenue").with("Neptune#ml.regression")
 .value().is(P.gte(1600000))

g.with("Neptune#ml.endpoint","node-regression-movie-lens-endpoint")
 .with("Neptune#ml.iamRoleArn","arn:aws:iam::0123456789:role/sagemaker-role")
 .V("movie_1","movie_2","movie_3")
 .properties("revenue").with("Neptune#ml.regression")
 .hasValue(P.lte(1600000D))
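Because the filter value must be numeric, it can help to validate it before the query text is built. The following Python sketch shows one way to do that for the .value().is(...) form. The function regression_value_filter is a hypothetical helper, and the set of predicate names is limited to the single-value comparisons of TinkerPop's P class.

```python
import math

# Single-value comparison predicates from TinkerPop's P class.
_COMPARISONS = {"eq", "neq", "lt", "lte", "gt", "gte"}


def regression_value_filter(comparison, threshold):
    """Return the filter steps to append to a node regression query.

    Hypothetical helper: rejects anything that is not a finite int or float,
    since a regression filter needs a numeric value.
    """
    if comparison not in _COMPARISONS:
        raise ValueError(f"unsupported comparison: {comparison!r}")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(threshold, bool) or not isinstance(threshold, (int, float)):
        raise TypeError("node regression filters need a numeric value")
    if not math.isfinite(threshold):
        raise ValueError("threshold must be finite")
    return f".value().is(P.{comparison}({threshold!r}))"


suffix = regression_value_filter("gte", 1600000)
print(suffix)
```

Passing a string such as "1600000" raises a TypeError instead of producing a query that compares the inferred number against text.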

Using inductive inference in a node regression query

Supposing you were to add a new node to an existing graph, in a Jupyter notebook, like this:

%%gremlin
g.addV('label1').property(id,'101').as('newV')
 .V('1').as('oldV1')
 .V('2').as('oldV2')
 .addE('eLabel1').from('newV').to('oldV1')
 .addE('eLabel2').from('oldV2').to('newV')

You could then use an inductive inference query to get a rating that took into account the new node:

%%gremlin
g.with("Neptune#ml.endpoint", "nr-ep")
 .with("Neptune#ml.iamRoleArn", "arn:aws:iam::123456789012:role/NeptuneMLRole")
 .V('101').properties("rating")
 .with("Neptune#ml.regression")
 .with("Neptune#ml.inductiveInference")

Because inductive inference samples the new node's neighborhood randomly by default, the query is not deterministic and might return somewhat different results each time you run it:

# First time
==>vp[rating->9.1]

# Second time
==>vp[rating->8.9]

If you need more consistent results, you could make the query deterministic:

%%gremlin
g.with("Neptune#ml.endpoint", "nr-ep")
 .with("Neptune#ml.iamRoleArn", "arn:aws:iam::123456789012:role/NeptuneMLRole")
 .V('101').properties("rating")
 .with("Neptune#ml.regression")
 .with("Neptune#ml.inductiveInference")
 .with("Neptune#ml.deterministic")

Now the results will be roughly the same every time:

# First time
==>vp[rating->9.1]

# Second time
==>vp[rating->9.1]
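To judge how much repeated runs differ, you can compare the inferred values across runs. The following Python sketch assumes the results are available as text lines in the console format shown in this section (for example, ==>vp[rating->9.1]). A Gremlin client library would normally return objects rather than text, so treat the parsing step as an illustration only. The function names predicted_values and spread are hypothetical.

```python
import re

# Matches the console rendering of a vertex property, e.g. vp[rating->9.1].
_RESULT = re.compile(r"vp\[(?P<key>[^\]]+?)->(?P<value>-?\d+(?:\.\d+)?)\]")


def predicted_values(lines, prop):
    """Extract the numeric values inferred for one property from result lines."""
    values = []
    for line in lines:
        match = _RESULT.search(line)
        if match and match.group("key") == prop:
            values.append(float(match.group("value")))
    return values


def spread(values):
    """Difference between the largest and smallest inferred value."""
    return max(values) - min(values)


non_deterministic = predicted_values(
    ["==>vp[rating->9.1]", "==>vp[rating->8.9]"], "rating"
)
deterministic = predicted_values(
    ["==>vp[rating->9.1]", "==>vp[rating->9.1]"], "rating"
)
print(spread(non_deterministic), spread(deterministic))
```

A spread near zero across several runs suggests the results are stable enough for your use case.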