The Amazon Neptune alternative query engine (DFE)
Amazon Neptune has an alternative query engine known as the DFE that uses DB instance resources such as CPU cores, memory, and I/O more efficiently than the original Neptune engine.
Note
With large data sets, the DFE engine may not run well on t3 instances.
The DFE engine runs SPARQL, Gremlin and openCypher queries, and supports a wide variety of plan types, including left-deep, bushy, and hybrid ones. Plan operators can invoke both compute operations, which run on a reserved set of compute cores, and I/O operations, each of which runs on its own thread in an I/O thread pool.
The DFE uses pre-generated statistics about your Neptune graph data to make informed decisions about how to structure queries. See DFE statistics for information about how these statistics are generated.
The plan type and the number of compute threads are chosen automatically, based on pre-generated statistics and on the resources available in the Neptune head node. The order of results is not predetermined for plans that have internal compute parallelism.
Controlling where the Neptune DFE engine is used
By default, the neptune_dfe_query_engine instance parameter is set to viaQueryHint, which causes the DFE engine to be used only for openCypher queries and for Gremlin and SPARQL queries that explicitly include the useDFE query hint set to true.
You can fully enable the DFE engine so that it is used wherever possible by setting the neptune_dfe_query_engine instance parameter to enabled.
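As a rough illustration of changing this parameter programmatically, the sketch below builds the request for the boto3 Neptune client's modify_db_parameter_group call. It assumes the instance uses a custom DB parameter group (the group name is hypothetical) and that pending-reboot is an acceptable apply method; neither detail comes from this page, so check both against your own setup.

```python
from typing import Any, Dict

# The two values for neptune_dfe_query_engine that this section names.
VALID_DFE_MODES = {"viaQueryHint", "enabled"}


def build_dfe_parameter_request(parameter_group: str, mode: str) -> Dict[str, Any]:
    """Build keyword arguments for modify_db_parameter_group.

    parameter_group is a hypothetical custom DB parameter group name.
    """
    if mode not in VALID_DFE_MODES:
        raise ValueError(f"unsupported DFE mode: {mode!r}")
    return {
        "DBParameterGroupName": parameter_group,
        "Parameters": [
            {
                "ParameterName": "neptune_dfe_query_engine",
                "ParameterValue": mode,
                # Assumption: apply on the next reboot, which is accepted
                # for both static and dynamic parameters.
                "ApplyMethod": "pending-reboot",
            }
        ],
    }


def apply_dfe_mode(parameter_group: str, mode: str) -> Dict[str, Any]:
    """Send the request. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above runs without it

    client = boto3.client("neptune")
    return client.modify_db_parameter_group(
        **build_dfe_parameter_request(parameter_group, mode)
    )
```

Keeping the request builder separate from the network call makes it possible to inspect or log the change before applying it.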
You can also disable the DFE for a particular Gremlin or SPARQL query by including the useDFE query hint, set to false, in that query. This prevents the DFE from executing that particular query.
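A minimal sketch of how the useDFE hint might be attached to query text before it is sent to Neptune. The hint spellings used here (Neptune#useDFE for Gremlin, and hint:Query hint:useDFE with the Neptune QueryHints prefix for SPARQL) are taken from the general Neptune query-hint conventions rather than from this page, so treat them as assumptions and confirm them in the query hint reference. The helper names are hypothetical.

```python
# Assumed Neptune SPARQL query-hint namespace (not stated in this section).
SPARQL_HINT_PREFIX = (
    "PREFIX hint: <http://aws.amazon.com/neptune/vocab/v01/QueryHints#>"
)


def gremlin_with_dfe_hint(traversal: str, use_dfe: bool) -> str:
    """Insert a useDFE hint right after the traversal source 'g'."""
    if not traversal.startswith("g."):
        raise ValueError("expected a traversal that starts with 'g.'")
    flag = "true" if use_dfe else "false"
    return f"g.with('Neptune#useDFE', {flag})." + traversal[2:]


def sparql_with_dfe_hint(where_body: str, use_dfe: bool) -> str:
    """Wrap a WHERE-clause body in a SELECT that carries a useDFE hint."""
    flag = "true" if use_dfe else "false"
    return (
        f"{SPARQL_HINT_PREFIX}\n"
        "SELECT * WHERE {\n"
        f"  hint:Query hint:useDFE {flag} .\n"
        f"  {where_body}\n"
        "}"
    )


if __name__ == "__main__":
    # Opt a single Gremlin query out of the DFE.
    print(gremlin_with_dfe_hint("g.V().out()", use_dfe=False))
    # Opt a single SPARQL query in to the DFE.
    print(sparql_with_dfe_hint("?s ?p ?o .", use_dfe=True))
```

With the default viaQueryHint setting, passing use_dfe=True is what opts a Gremlin or SPARQL query in; with the parameter set to enabled, use_dfe=False is what opts a query out.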
You can determine whether or not the DFE is enabled in an instance using an Instance Status call, like this:
curl -G https://your-neptune-endpoint:port/status
The status response then specifies whether the DFE is enabled or not:
{
  "status":"healthy",
  "startTime":"Wed Dec 29 02:29:24 UTC 2021",
  "dbEngineVersion":"development",
  "role":"writer",
  "dfeQueryEngine":"viaQueryHint",
  "gremlin":{"version":"tinkerpop-3.5.2"},
  "sparql":{"version":"sparql-1.1"},
  "opencypher":{"version":"Neptune-9.0.20190305-1.0"},
  "labMode":{
    "ObjectIndex":"disabled",
    "ReadWriteConflictDetection":"enabled"
  },
  "features":{
    "ResultCache":{"status":"disabled"},
    "IAMAuthentication":"disabled",
    "Streams":"disabled",
    "AuditLog":"disabled"
  },
  "settings":{"clusterQueryTimeoutInMs":"120000"}
}
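The status body can also be inspected programmatically. The sketch below reads the dfeQueryEngine field from a status response that has already been fetched; the function names are hypothetical, and the sample body is a trimmed copy of the response shown above.

```python
import json


def dfe_mode_from_status(status_body: str) -> str:
    """Return the dfeQueryEngine value from a /status response body.

    Falls back to "unknown" when the field is missing.
    """
    status = json.loads(status_body)
    return status.get("dfeQueryEngine", "unknown")


def dfe_used_without_hints(mode: str) -> bool:
    """True only when the DFE is fully enabled, per the settings above."""
    return mode == "enabled"


if __name__ == "__main__":
    # Trimmed copy of the sample status response.
    sample = '{"status":"healthy","role":"writer","dfeQueryEngine":"viaQueryHint"}'
    mode = dfe_mode_from_status(sample)
    print(mode, dfe_used_without_hints(mode))
```

A check like this could run at application startup to decide whether Gremlin and SPARQL queries need an explicit useDFE hint.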
The Gremlin explain and profile results tell you whether a query is being executed by the DFE. See Information contained in a Gremlin explain report for explain and DFE profile reports for profile.
Similarly, SPARQL explain tells you whether a SPARQL query is being executed by the DFE. See Example of SPARQL explain output when the DFE is enabled and DFENode operator for more details.
Query constructs supported by the Neptune DFE
Currently, the Neptune DFE supports a subset of SPARQL and Gremlin query constructs.
For SPARQL, this is the subset of conjunctive basic graph patterns.
For Gremlin, it is generally the subset of queries that contain a chain of traversals which do not contain some of the more complex steps.
You can find out whether one of your queries is being executed in whole or in part by the DFE as follows:
- In Gremlin, explain and profile results tell you what parts of a query are being executed by the DFE, if any. See Information contained in a Gremlin explain report for explain and DFE profile reports for profile. Also, see Tuning Gremlin queries using explain and profile. Details about Neptune engine support for individual Gremlin steps are documented in Gremlin step support.
- Similarly, SPARQL explain tells you whether a SPARQL query is being executed by the DFE. See Example of SPARQL explain output when the DFE is enabled and DFENode operator for more details.