Bulk add vertices and edges in batches
Every query sent to a Neptune database runs in the scope of a single transaction unless you use a session. When you need to insert a large amount of data with Gremlin queries, grouping 50 to 100 additions into each query improves performance by reducing the number of transactions created for the load.
For example, the following query adds five vertices to the database in one transaction:
// Create a GraphTraversalSource for the remote connection
final GraphTraversalSource g = traversal().withRemote(DriverRemoteConnection.using(cluster));

// Add 5 vertices in a single query
g.addV("Person").property(T.id, "P1")
 .addV("Person").property(T.id, "P2")
 .addV("Person").property(T.id, "P3")
 .addV("Person").property(T.id, "P4")
 .addV("Person").property(T.id, "P5").iterate();
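For loads larger than a handful of vertices, the chained query is usually generated rather than written by hand. The following is a minimal sketch of that idea, not part of the Neptune or TinkerPop APIs: the class `VertexBatchSketch` and its methods `partition`, `quote`, and `buildAddVertexQuery` are hypothetical names. It splits a list of vertex IDs into batches and produces one Gremlin query string per batch. It assumes the queries are submitted as text and that the server iterates each submitted query, so no terminal step is appended.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
final class VertexBatchSketch {

    // Split ids into consecutive sublists of at most batchSize elements.
    static List<List<String>> partition(List<String> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    // Escape backslashes and single quotes so a value can sit inside a
    // single-quoted Gremlin string literal.
    static String quote(String value) {
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    // Build one chained query that adds every vertex in the batch.
    // Assumption: the query is sent as text, so no .iterate() is appended.
    static String buildAddVertexQuery(String label, List<String> batch) {
        StringBuilder query = new StringBuilder("g");
        for (String id : batch) {
            query.append(".addV(").append(quote(label)).append(")")
                 .append(".property(id, ").append(quote(id)).append(")");
        }
        return query.toString();
    }

    public static void main(String[] args) {
        List<String> ids = new ArrayList<>();
        for (int i = 1; i <= 120; i++) {
            ids.add("P" + i);
        }
        // One query, and therefore one transaction, per batch of up to 50.
        for (List<String> batch : partition(ids, 50)) {
            System.out.println(buildAddVertexQuery("Person", batch));
        }
    }
}
```

Each generated string corresponds to one transaction, so a batch size of 50 turns 120 single-vertex transactions into three.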
Similarly, you can batch-add edges using addE. Use V() to reference existing vertices as the source and target of each edge:
// Add edges in a single batched query.
// V(...) inside to() is the anonymous traversal __.V, statically imported.
g.V("P1").addE("knows").to(V("P2"))
 .V("P2").addE("knows").to(V("P3"))
 .V("P3").addE("knows").to(V("P4"))
 .V("P4").addE("knows").to(V("P5")).iterate();
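The same pattern can be generated from a list of source and target IDs. The sketch below is illustrative only: `EdgeBatchSketch`, `EdgeSpec`, and `buildAddEdgeQueries` are hypothetical names, and it again assumes the queries are submitted as text. It writes the anonymous traversal explicitly as `__.V(...)` and emits one query string per batch of edges.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only.
final class EdgeBatchSketch {

    // Source and target vertex IDs for one edge.
    static final class EdgeSpec {
        final String fromId;
        final String toId;

        EdgeSpec(String fromId, String toId) {
            this.fromId = fromId;
            this.toId = toId;
        }
    }

    // Escape backslashes and single quotes for a single-quoted Gremlin string.
    static String quote(String value) {
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    // Build one chained addE query per batch of at most batchSize edges.
    static List<String> buildAddEdgeQueries(String label, List<EdgeSpec> edges, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> queries = new ArrayList<>();
        for (int start = 0; start < edges.size(); start += batchSize) {
            int end = Math.min(start + batchSize, edges.size());
            StringBuilder query = new StringBuilder("g");
            for (EdgeSpec edge : edges.subList(start, end)) {
                query.append(".V(").append(quote(edge.fromId)).append(")")
                     .append(".addE(").append(quote(label)).append(")")
                     .append(".to(__.V(").append(quote(edge.toId)).append("))");
            }
            queries.add(query.toString());
        }
        return queries;
    }

    public static void main(String[] args) {
        List<EdgeSpec> edges = Arrays.asList(
                new EdgeSpec("P1", "P2"),
                new EdgeSpec("P2", "P3"),
                new EdgeSpec("P3", "P4"));
        // A batch size of 2 is used here only to keep the printed output short.
        for (String query : buildAddEdgeQueries("knows", edges, 2)) {
            System.out.println(query);
        }
    }
}
```

Because the steps are chained, a source vertex that does not exist leaves the traversal with no traversers, and the later edges in that query are not created. It is worth confirming that every referenced vertex exists before batching edges this way.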
You can also combine vertex and edge creation in a single batch. Use as() to label newly created vertices so you can reference them when adding edges in the same traversal:
// Add vertices and edges together in a single query
g.addV("Person").property(T.id, "P1").as("p1")
 .addV("Person").property(T.id, "P2").as("p2")
 .addV("Person").property(T.id, "P3").as("p3")
 .addE("knows").from("p1").to("p2")
 .addE("knows").from("p2").to("p3").iterate();