Neptune Best Practices Using SPARQL - Amazon Neptune
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

Neptune Best Practices Using SPARQL

Follow these best practices when using the SPARQL query language with Neptune. For information about using SPARQL in Neptune, see Accessing the Neptune graph with SPARQL.

Querying All Named Graphs by Default

Amazon Neptune associates every triple with a named graph. The default graph is defined as the union of all named graphs.

If you submit a SPARQL query without explicitly specifying a graph via the GRAPH keyword or constructs such as FROM NAMED, Neptune always considers all triples in your DB instance. For example, the following query returns all triples from a Neptune SPARQL endpoint:

SELECT * WHERE { ?s ?p ?o }

Triples that appear in more than one graph are returned only once.

For information about the default graph specification, see the RDF Dataset section of the SPARQL 1.1 Query Language specification.

Specifying a Named Graph for Load

Amazon Neptune associates every triple with a named graph. If you don't specify a named graph when loading, inserting, or updating triples, Neptune uses the fallback named graph defined by the URI, http://aws.amazon.com/neptune/vocab/v01/DefaultNamedGraph.

If you are using the Neptune bulk loader, you can specify the named graph to use for all triples (or quads with the fourth position blank) by using the parserConfiguration: namedGraphUri parameter. For information about the Neptune loader Load command syntax, see Neptune Loader Command.

Choosing Between FILTER, FILTER...IN, and VALUES in Your Queries

There are three basic ways to inject values in SPARQL queries:   FILTER,   FILTER...IN,   and   VALUES.

For example, suppose that you want to look up the friends of multiple people within a single query. Using FILTER, you might structure your query as follows:

PREFIX ex: <https://www.example.com/> PREFIX foaf : <http://xmlns.com/foaf/0.1/> SELECT ?s ?o WHERE {?s foaf:knows ?o. FILTER (?s = ex:person1 || ?s = ex:person2)}

This returns all the triples in the graph that have ?s bound to ex:person1 or ex:person2 and have an outgoing edge labeled foaf:knows.

You can also create a query using FILTER...IN that returns equivalent results:

PREFIX ex: <https://www.example.com/> PREFIX foaf : <http://xmlns.com/foaf/0.1/> SELECT ?s ?o WHERE {?s foaf:knows ?o. FILTER (?s IN (ex:person1, ex:person2))}

You can also create a query using VALUES that in this case also returns equivalent results:

PREFIX ex: <https://www.example.com/> PREFIX foaf : <http://xmlns.com/foaf/0.1/> SELECT ?s ?o WHERE {?s foaf:knows ?o. VALUES ?s {ex:person1 ex:person2}}

Although in many cases these queries are semantically equivalent, there are some cases where the two FILTER variants differ from the VALUES variant:

  • The first case is when you inject duplicate values, such as injecting the same person twice. In that case, the VALUES query includes the duplicates in your result. You can explicitly eliminate such duplicates by adding a DISTINCT to the SELECT clause. But there might be situations where you actually want duplicates in the query results for redundant value injection.

    However, the FILTER and FILTER...IN versions extract the value only once when the same value appears multiple times.

  • The second case is related to the fact that VALUES always performs an exact match, whereas FILTER might apply type promotion and do fuzzy matching in some cases.

    For instance, when you include a literal such as "2.0"^^xsd:float in your values clause, a VALUES query exactly matches this literal, including literal value and data type.

    By contrast, FILTER produces a fuzzy match for these numeric literals. The matches could include literals with the same value but different numeric data types, such as xsd:double.

    Note

    There is no difference between the FILTER and VALUES behavior when enumerating string literals or URIs.

The differences between FILTER and VALUES can affect optimization and the resulting query evaluation strategy. Unless your use case requires fuzzy matching, we recommend using VALUES because it avoids looking at special cases related to type casting. As a result, VALUES often produces a more efficient query that runs faster and is less expensive.