Using the Neptune-Export service to export Neptune data

You can use the following steps to export data from your Neptune DB cluster to Amazon S3 using the Neptune-Export service:

Installing the Neptune-Export service

Use an Amazon CloudFormation template to create the stack:

To install the Neptune-Export service

  1. Launch the Amazon CloudFormation stack on the Amazon CloudFormation console by choosing one of the Launch Stack buttons in the following table:

    Region                       View    View in Designer    Launch
    US East (N. Virginia)        View    View in Designer    Launch Stack
    US East (Ohio)               View    View in Designer    Launch Stack
    US West (N. California)      View    View in Designer    Launch Stack
    US West (Oregon)             View    View in Designer    Launch Stack
    Canada (Central)             View    View in Designer    Launch Stack
    South America (São Paulo)    View    View in Designer    Launch Stack
    Europe (Stockholm)           View    View in Designer    Launch Stack
    Europe (Ireland)             View    View in Designer    Launch Stack
    Europe (London)              View    View in Designer    Launch Stack
    Europe (Paris)               View    View in Designer    Launch Stack
    Europe (Frankfurt)           View    View in Designer    Launch Stack
    Middle East (Bahrain)        View    View in Designer    Launch Stack
    Africa (Cape Town)           View    View in Designer    Launch Stack
    Asia Pacific (Hong Kong)     View    View in Designer    Launch Stack
    Asia Pacific (Tokyo)         View    View in Designer    Launch Stack
    Asia Pacific (Seoul)         View    View in Designer    Launch Stack
    Asia Pacific (Singapore)     View    View in Designer    Launch Stack
    Asia Pacific (Sydney)        View    View in Designer    Launch Stack
    Asia Pacific (Mumbai)        View    View in Designer    Launch Stack
    China (Beijing)              View    View in Designer    Launch Stack
    China (Ningxia)              View    View in Designer    Launch Stack
    Amazon GovCloud (US-West)    View    View in Designer    Launch Stack
    Amazon GovCloud (US-East)    View    View in Designer    Launch Stack

  2. On the Select Template page, choose Next.

  3. On the Specify Details page, set the following template parameters:

    • VPC   –   The easiest way to set up the Neptune-Export service is to install it in the same Amazon VPC as your Neptune database. If you want to install it in a separate VPC, you can use VPC peering to establish connectivity between the Neptune DB cluster's VPC and the Neptune-Export service VPC.

    • Subnet1   –   The Neptune-Export service must be installed in a subnet in your VPC that allows outbound IPv4 HTTPS traffic from the subnet to the internet. This is so that the Neptune-Export service can call the Amazon Batch API to create and run an export job.

      If you created your Neptune cluster using the CloudFormation template on the Create a DB cluster page in the Neptune documentation, you can use the PrivateSubnet1 and PrivateSubnet2 outputs from that stack to populate this and the next parameter.

    • Subnet2   –   A second subnet in the VPC that allows outbound IPv4 HTTPS traffic from the subnet to the internet.

    • EnableIAM   –   Set this to true to secure the Neptune-Export API using Amazon Identity and Access Management (IAM). We recommend that you do so.

      If you enable IAM authentication, you must sign all HTTPS requests to the endpoint using Signature Version 4 (SigV4). You can use a tool such as awscurl to sign requests on your behalf.

    • VPCOnly   –   Set this to true to make the export endpoint VPC-only, so that the Neptune-Export API can be called only from within the VPC where the Neptune-Export service is installed.

      We recommend that you set VPCOnly to true.

    • NumOfFilesULimit   –   Specify a value between 10,000 and 1,000,000 for nofile in the ulimits container property. The default is 10,000, and we recommend keeping the default unless your graph contains a large number of unique labels.

    • PrivateDnsEnabled (Boolean)   –   Indicates whether to associate a private hosted zone with the specified VPC. The default value is true.

      When a VPC endpoint is created with this flag enabled, all API Gateway traffic is routed through the VPC endpoint, and calls to the public API Gateway endpoint are disabled. If you set PrivateDnsEnabled to false, the public API Gateway endpoint is enabled, but the Neptune-Export service cannot be reached through the private DNS endpoint. You can then use a public DNS endpoint for the VPC endpoint to call the export service, as detailed here.

  4. Choose Next.

  5. On the Options page, choose Next.

  6. On the Review page, select the first check box to acknowledge that Amazon CloudFormation will create IAM resources. Select the second check box to acknowledge CAPABILITY_AUTO_EXPAND for the new stack.

    Note

    CAPABILITY_AUTO_EXPAND explicitly acknowledges that macros will be expanded when creating the stack, without prior review. Users often create a change set from a processed template so that the changes made by macros can be reviewed before actually creating the stack. For more information, see the Amazon CloudFormation CreateStack API.

    Then choose Create.
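If you prefer to script the stack creation rather than use the console, the parameters from step 3 can be assembled programmatically. The following is a minimal sketch, not the official installation method: the function name and the resource IDs in the usage note are hypothetical, and it only builds and validates the parameter list in the ParameterKey/ParameterValue shape that the CloudFormation CreateStack API accepts.

```python
def build_stack_parameters(vpc_id, subnet1, subnet2, enable_iam=True,
                           vpc_only=True, num_files_ulimit=10_000,
                           private_dns_enabled=True):
    """Build the Neptune-Export template parameters as a CloudFormation parameter list.

    The defaults mirror the recommendations in step 3 (EnableIAM and VPCOnly
    set to true, NumOfFilesULimit left at 10,000).
    """
    # The documented range for NumOfFilesULimit is 10,000 to 1,000,000.
    if not 10_000 <= num_files_ulimit <= 1_000_000:
        raise ValueError("NumOfFilesULimit must be between 10,000 and 1,000,000")

    def as_cfn_bool(flag):
        # CloudFormation parameter values are passed as strings.
        return "true" if flag else "false"

    values = {
        "VPC": vpc_id,
        "Subnet1": subnet1,
        "Subnet2": subnet2,
        "EnableIAM": as_cfn_bool(enable_iam),
        "VPCOnly": as_cfn_bool(vpc_only),
        "NumOfFilesULimit": str(num_files_ulimit),
        "PrivateDnsEnabled": as_cfn_bool(private_dns_enabled),
    }
    return [{"ParameterKey": key, "ParameterValue": value}
            for key, value in values.items()]
```

For example, build_stack_parameters("vpc-0123", "subnet-aaaa", "subnet-bbbb") returns seven entries that could be passed as the Parameters argument of a CreateStack call, together with the capabilities acknowledged in step 6.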

Enable access to Neptune from Neptune-Export

After the Neptune-Export Amazon CloudFormation stack has been created, its Outputs tab includes a NeptuneExportSecurityGroup ID. Update your Neptune VPC security group to allow inbound access from this Neptune-Export security group.
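As an illustration of what that security group update might look like, the sketch below builds an inbound rule that references the Neptune-Export security group as the traffic source. It assumes your cluster listens on Neptune's default port, 8182; the function name and the group IDs in the usage note are hypothetical.

```python
NEPTUNE_DEFAULT_PORT = 8182  # assumption: the cluster uses Neptune's default port


def neptune_ingress_rule(export_security_group_id, port=NEPTUNE_DEFAULT_PORT):
    """Return an IpPermissions list allowing TCP traffic from the export security group."""
    return [{
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        # Referencing a security group (rather than a CIDR range) admits only
        # traffic that originates from resources attached to that group.
        "UserIdGroupPairs": [{
            "GroupId": export_security_group_id,
            "Description": "Neptune-Export access",
        }],
    }]
```

With boto3, the result could be applied with something like ec2.authorize_security_group_ingress(GroupId="sg-neptune", IpPermissions=neptune_ingress_rule("sg-export")), where both IDs are placeholders for your own values.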

Enable access to the Neptune-Export endpoint from a VPC-based EC2 instance

If you make your Neptune-Export endpoint VPC-only, you can only access it from within the VPC in which the Neptune-Export service is installed. To make Neptune-Export API calls from an Amazon EC2 instance in that VPC, attach the NeptuneExportSecurityGroup created by the Amazon CloudFormation stack to the instance.
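One detail to watch when attaching the group programmatically: the EC2 ModifyInstanceAttribute operation replaces an instance's entire security group list, so the existing groups need to be carried over. A small sketch of that merge follows; the helper name and IDs are hypothetical.

```python
def groups_with_export_access(current_group_ids, export_security_group_id):
    """Return the instance's existing security groups plus the export group, without duplicates."""
    merged = list(current_group_ids)  # copy so the caller's list is not mutated
    if export_security_group_id not in merged:
        merged.append(export_security_group_id)
    return merged
```

The merged list could then be passed to boto3 as ec2.modify_instance_attribute(InstanceId="i-0123", Groups=merged), again with a placeholder instance ID.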

Run a Neptune-Export job using the Neptune-Export API

The Outputs tab of the Amazon CloudFormation stack also includes the NeptuneExportApiUri. Use this URI whenever you send a request to the Neptune-Export endpoint.

Run an export job

  • Be sure that the user or role under which the export runs has been granted execute-api:Invoke permission.

  • If you set the EnableIAM parameter to true in the Amazon CloudFormation stack when you installed Neptune-Export, you must sign all requests to the Neptune-Export API using Signature Version 4 (SigV4). We recommend using awscurl to make requests to the API. All the examples here assume that IAM auth is enabled.

  • If you set the VPCOnly parameter to true in the Amazon CloudFormation stack when you installed Neptune-Export, you must call the Neptune-Export API from within the VPC, typically from an Amazon EC2 instance located in the VPC.
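When IAM authentication is enabled, a signed request can be issued by handing the work to awscurl. The sketch below only assembles the argument list for such a call; it assumes awscurl is installed separately and can find your credentials, and the function name, URI, and Region in the usage note are placeholders. The service name execute-api is the API Gateway signing name that matches the execute-api:Invoke permission mentioned above.

```python
import json


def awscurl_args(api_uri, payload, region):
    """Build an awscurl argument list that POSTs a SigV4-signed JSON payload to the API."""
    return [
        "awscurl",
        "--service", "execute-api",  # signing name for API Gateway endpoints
        "--region", region,
        "-X", "POST",
        "-H", "Content-Type: application/json",
        "-d", json.dumps(payload),
        api_uri,
    ]
```

The list can be run with subprocess.run(awscurl_args(uri, payload, "us-east-1"), capture_output=True, text=True). Passing arguments as a list avoids shell quoting problems with the JSON body.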

To start exporting data, send a POST request to the NeptuneExportApiUri endpoint. The request body contains the command and outputS3Path request parameters, plus an endpoint export parameter nested under params.

The following is an example of a request that exports property-graph data from Neptune and publishes it to Amazon S3:

curl \
  (your NeptuneExportApiUri) \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{
        "command": "export-pg",
        "outputS3Path": "s3://(your Amazon S3 bucket)/neptune-export",
        "params": {
          "endpoint": "(your Neptune endpoint DNS name)"
        }
      }'

Similarly, here is an example of a request that exports RDF data from Neptune to Amazon S3:

curl \
  (your NeptuneExportApiUri) \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{
        "command": "export-rdf",
        "outputS3Path": "s3://(your Amazon S3 bucket)/neptune-export",
        "params": {
          "endpoint": "(your Neptune endpoint DNS name)"
        }
      }'

If you omit the command request parameter, by default Neptune-Export attempts to export property-graph data from Neptune.

If the request succeeds, the response looks like this:

{ "jobName": "neptune-export-abc12345-1589808577790", "jobId": "c86258f7-a9c9-4f8c-8f4c-bbfe76d51c8f" }
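The request body and response shown above can also be handled from a script. The following sketch builds the JSON body and extracts the job ID from the response. The function names are hypothetical, and the validation covers only the two commands shown in this section; it is an assumption that no other command values are needed for your use case.

```python
import json

# Only the commands that appear in this section; treat this set as an assumption.
KNOWN_COMMANDS = {"export-pg", "export-rdf"}


def build_export_request(output_s3_path, neptune_endpoint, command="export-pg"):
    """Return the request body for an export job as a dict.

    The default command mirrors the service default (property-graph export).
    """
    if command not in KNOWN_COMMANDS:
        raise ValueError(f"unexpected command: {command!r}")
    if not output_s3_path.startswith("s3://"):
        raise ValueError("outputS3Path must be an s3:// URI")
    return {
        "command": command,
        "outputS3Path": output_s3_path,
        "params": {"endpoint": neptune_endpoint},
    }


def parse_job_id(response_text):
    """Extract the jobId field from the JSON returned when a job is started."""
    return json.loads(response_text)["jobId"]
```

Serialize the dict with json.dumps before sending it, and keep the returned job ID: it is needed to monitor or cancel the job.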

Monitor the export job you just started

To monitor a running job, append its jobId to your NeptuneExportApiUri, like this:

curl \
  (your NeptuneExportApiUri)(the job ID)

If the service has not yet started the export job, the response looks like this:

{ "jobId": "c86258f7-a9c9-4f8c-8f4c-bbfe76d51c8f", "status": "pending" }

When you repeat the command after the export job has started, the response looks something like this:

{ "jobId": "c86258f7-a9c9-4f8c-8f4c-bbfe76d51c8f", "status": "running", "logs": "https://us-east-1.console.aws.amazon.com/cloudwatch/home?..." }

If you open the logs in CloudWatch Logs using the URI provided by the status call, you can then monitor the progress of the export in detail:

[Figure: screenshot of the CloudWatch Logs display.]
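Repeating the status call by hand can be replaced with a simple polling loop. The sketch below is one possible approach, under two assumptions: the job ID forms the last path segment of the status URL, and any status other than "pending" or "running" (the two values shown above) means the job has stopped. The function names are hypothetical, and fetch_status stands for whatever signed or unsigned HTTP call you use to retrieve the status JSON as a dict.

```python
import time

IN_PROGRESS = {"pending", "running"}  # the two statuses shown in this section


def job_status_url(api_uri, job_id):
    """Append the job ID to the API URI, tolerating a missing or extra trailing slash."""
    return api_uri.rstrip("/") + "/" + job_id


def wait_for_job(fetch_status, job_id, poll_seconds=30, max_polls=120,
                 sleep=time.sleep):
    """Poll until the job leaves the pending/running states, then return the last response.

    fetch_status is a caller-supplied function that takes a job ID and
    returns the parsed status response.
    """
    for _ in range(max_polls):
        response = fetch_status(job_id)
        if response.get("status") not in IN_PROGRESS:
            return response
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} still in progress after {max_polls} polls")
```

Injecting fetch_status and sleep keeps the loop independent of the HTTP client and makes it easy to exercise without network access.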

Cancel a running export job

To cancel a running export job using the Amazon Web Services Management Console

  1. Open the Amazon Batch console at https://console.amazonaws.cn/batch/.

  2. Choose Jobs.

  3. Locate the running job that you want to cancel, based on its job ID.

  4. Select Cancel job.

To cancel a running export job using the Neptune-Export API

Send an HTTP DELETE request to the NeptuneExportApiUri with the jobId appended, like this:

curl -X DELETE \
  (your NeptuneExportApiUri)(the job ID)
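The same cancellation can be expressed with Python's standard library. This sketch only constructs the DELETE request and assumes the job ID is the last path segment of the URL. It does not sign the request, so as written it would apply only when IAM authentication is disabled; the function name and URI are hypothetical.

```python
import urllib.request


def build_cancel_request(api_uri, job_id):
    """Build an unsigned HTTP DELETE request for the given export job."""
    url = api_uri.rstrip("/") + "/" + job_id
    return urllib.request.Request(url, method="DELETE")
```

To send it, pass the request to urllib.request.urlopen. With IAM authentication enabled, use a SigV4-capable client such as awscurl with -X DELETE instead.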