Parameters used to control the Neptune export process - Amazon Neptune
Parameters used to control the Neptune export process

Whether you use the Neptune-Export service or the neptune-export command line utility, the parameters that control the export are mostly the same. They take the form of a JSON object passed to the Neptune-Export endpoint or to neptune-export on the command line.

The object passed in to the export process has up to five top-level fields:

-d '{
      "command" : "(either export-pg or export-rdf)",
      "outputS3Path" : "s3://(your Amazon S3 bucket)/(path to the folder for exported data)",
      "jobSize" : "(for Neptune-Export service only)",
      "params" : { (a JSON object that contains export-process parameters) },
      "additionalParams": { (a JSON object that contains parameters for training configuration) }
    }'
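As a rough illustration, the object described above could be assembled and serialized in Python before it is passed to the endpoint or the command line. This is only a sketch: the helper name build_export_request, the bucket name, and the folder path are hypothetical placeholders. Only the five top-level field names come from this page.

```python
import json

def build_export_request(output_s3_path, command="export-pg", job_size=None,
                         params=None, additional_params=None):
    """Assemble the top-level export object (hypothetical helper)."""
    request = {"command": command, "outputS3Path": output_s3_path}
    # Optional fields are added only when the caller supplies them.
    if job_size is not None:
        # Spelled as in "The jobSize parameter" section of this page.
        request["jobSize"] = job_size
    if params is not None:
        request["params"] = params
    if additional_params is not None:
        request["additionalParams"] = additional_params
    return request

# Placeholder bucket and folder, for illustration only.
body = build_export_request("s3://example-bucket/neptune-export",
                            job_size="small", params={})
print(json.dumps(body, indent=2))
```

Leaving optional fields out entirely, rather than sending empty values, keeps the serialized object close to the "up to five top-level fields" shape shown above.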

The command parameter

The command top-level parameter determines whether to export property-graph data or RDF data. If you omit the command parameter, the export process defaults to exporting property-graph data.

  • export-pg   –   Export property-graph data.

  • export-rdf   –   Export RDF data.
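The defaulting rule described above can be mirrored on the client side, for example to log which kind of export a request will trigger. The sketch below is an assumption about how a caller might do this, not a description of the service's own code. The function name resolve_command is hypothetical.

```python
VALID_COMMANDS = ("export-pg", "export-rdf")

def resolve_command(request):
    """Return the effective command for a request dict (hypothetical helper)."""
    # An omitted command means a property-graph export, as stated above.
    command = request.get("command", "export-pg")
    if command not in VALID_COMMANDS:
        raise ValueError(f"unsupported command: {command!r}")
    return command

print(resolve_command({"outputS3Path": "s3://example-bucket/out"}))
```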

The outputS3Path parameter

The outputS3Path top-level parameter is required, and must contain the URI of an Amazon S3 location to which the exported files can be published:

"outputS3Path" : "s3://(your Amazon S3 bucket)/(path to output folder)"

The value must begin with s3://, followed by a valid bucket name and optionally a folder path within the bucket.
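A simple pre-flight check of the outputS3Path value might look like the following sketch. It verifies only the s3:// prefix and that a bucket name is present. It does not enforce the full Amazon S3 bucket naming rules. The function name split_s3_uri and the example values are hypothetical.

```python
def split_s3_uri(uri):
    """Split an s3:// URI into (bucket, folder path); hypothetical helper."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError("outputS3Path must begin with s3://")
    # Everything up to the first "/" is the bucket; the rest is the folder path.
    bucket, _, folder = uri[len(prefix):].partition("/")
    if not bucket:
        raise ValueError("outputS3Path is missing a bucket name")
    return bucket, folder

print(split_s3_uri("s3://example-bucket/exports/run1"))
```

The folder path comes back as an empty string when the URI names only a bucket, which matches the statement that the folder path is optional.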

The jobSize parameter

The jobSize top-level parameter is optional and is used only with the Neptune-Export service, not with the neptune-export command line utility. It lets you characterize the size of the export job you are starting, which helps determine the amount of compute resources devoted to the job and its maximum concurrency level.

"jobSize" : "(one of four size descriptors)"

The four valid size descriptors are:

  • small   –   Maximum concurrency: 8. Suitable for storage volumes up to 10 GB.

  • medium   –   Maximum concurrency: 32. Suitable for storage volumes up to 100 GB.

  • large   –   Maximum concurrency: 64. Suitable for storage volumes over 100 GB but less than 1 TB.

  • xlarge   –   Maximum concurrency: 96. Suitable for storage volumes over 1 TB.

By default, an export initiated on the Neptune-Export service runs as a small job.
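The size guidance above can be turned into a small lookup when job requests are generated by a script. The sketch below rests on two assumptions that this page does not state: 1 TB is treated as 1000 GB, and a volume of exactly 1 TB is mapped to xlarge. The function name suggest_job_size is hypothetical.

```python
def suggest_job_size(storage_gb):
    """Map a storage volume size in GB to a size descriptor (hypothetical helper)."""
    if storage_gb <= 10:
        return "small"
    if storage_gb <= 100:
        return "medium"
    # Assumption: 1 TB == 1000 GB, and exactly 1 TB counts as xlarge.
    if storage_gb < 1000:
        return "large"
    return "xlarge"

for size_gb in (5, 80, 500, 2000):
    print(size_gb, suggest_job_size(size_gb))
```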

The performance of an export depends not only on the jobSize setting, but also on the number of database instances that you're exporting from, the size of each instance, and the effective concurrency level of the job.

For property-graph exports, you can configure the number of database instances using the cloneClusterReplicaCount parameter, and you can configure the job's effective concurrency level using the concurrency parameter.
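To show how these settings relate, the sketch below builds a fragment for a property-graph export and checks the requested concurrency against the maximum listed for each job size. It makes two assumptions that this page does not confirm: that cloneClusterReplicaCount and concurrency are fields of the params object, and that a client-side check is useful at all, since the page does not say how the service treats a value above the maximum. The function name build_pg_params is hypothetical.

```python
# Maximum concurrency per size descriptor, taken from the list above.
MAX_CONCURRENCY = {"small": 8, "medium": 32, "large": 64, "xlarge": 96}

def build_pg_params(job_size, concurrency, clone_replica_count):
    """Build a params fragment for a property-graph export (hypothetical helper)."""
    limit = MAX_CONCURRENCY[job_size]
    # Client-side sanity check only; the service's own behavior is not described here.
    if concurrency > limit:
        raise ValueError(
            f"concurrency {concurrency} exceeds the {job_size} maximum of {limit}")
    # Assumption: both fields are placed inside the params object.
    return {"concurrency": concurrency,
            "cloneClusterReplicaCount": clone_replica_count}

print(build_pg_params("medium", 16, 2))
```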

The params object

The params top-level parameter is a JSON object that contains the parameters you use to control the export process itself, as explained in "Export parameter fields in the params top-level JSON object". Some of the fields in the params object are specific to property-graph exports, and others to RDF exports.

The additionalParams object

The additionalParams top-level parameter is a JSON object that contains parameters you can use to control actions that are applied to the data after it has been exported. At present, additionalParams is used only for exporting training data for Neptune ML.