Using the Neptune-Export service to export Neptune data
You can use the following steps to export data from your Neptune DB cluster to Amazon S3 using the Neptune-Export service:
Installing the Neptune-Export service
Use an Amazon CloudFormation template to create the stack:
To install the Neptune-Export service
- Launch the Amazon CloudFormation stack on the Amazon CloudFormation console by choosing one of the Launch Stack buttons in the following table:

  Region (each row provides View, View in Designer, and Launch Stack links)
  US East (N. Virginia)
  US East (Ohio)
  US West (N. California)
  US West (Oregon)
  Canada (Central)
  South America (São Paulo)
  Europe (Stockholm)
  Europe (Ireland)
  Europe (London)
  Europe (Paris)
  Europe (Frankfurt)
  Middle East (Bahrain)
  Middle East (UAE)
  Israel (Tel Aviv)
  Africa (Cape Town)
  Asia Pacific (Hong Kong)
  Asia Pacific (Tokyo)
  Asia Pacific (Seoul)
  Asia Pacific (Singapore)
  Asia Pacific (Sydney)
  Asia Pacific (Mumbai)
  China (Beijing)
  China (Ningxia)
  Amazon GovCloud (US-West)
  Amazon GovCloud (US-East)

- On the Select Template page, choose Next.
- On the Specify Details page, set the following template parameters:
  - VPC – The easiest way to set up the Neptune-Export service is to install it in the same Amazon VPC as your Neptune database. If you want to install it in a separate VPC, you can use VPC peering to establish connectivity between the Neptune DB cluster's VPC and the Neptune-Export service VPC.
  - Subnet1 – The Neptune-Export service must be installed in a subnet in your VPC that allows outbound IPv4 HTTPS traffic from the subnet to the internet, so that the service can call the Amazon Batch API to create and run an export job. If you created your Neptune cluster using the CloudFormation template on the Create a DB cluster page in the Neptune documentation, you can use the PrivateSubnet1 and PrivateSubnet2 outputs from that stack to populate this and the next parameter.
  - Subnet2 – A second subnet in the VPC that allows outbound IPv4 HTTPS traffic from the subnet to the internet.
  - EnableIAM – Set this to true to secure the Neptune-Export API using Amazon Identity and Access Management (IAM). We recommend that you do so.
    If you do enable IAM authentication, you must sign all HTTPS requests to the endpoint with Signature Version 4 (Sigv4). You can use a tool such as awscurl to sign requests on your behalf.
  - VPCOnly – Setting this to true makes the export endpoint VPC-only, so that you can access the Neptune-Export API only from within the VPC where the Neptune-Export service is installed.
    We recommend that you set VPCOnly to true.
  - NumOfFilesULimit – Specify a value between 10,000 and 1,000,000 for nofile in the ulimits container property. The default is 10,000, and we recommend keeping the default unless your graph contains a large number of unique labels.
  - PrivateDnsEnabled (Boolean) – Indicates whether to associate a private hosted zone with the specified VPC. The default value is true.
    When a VPC endpoint is created with this flag enabled, all API Gateway traffic is routed through the VPC endpoint, and calls to the public API Gateway endpoint are disabled. If you set PrivateDnsEnabled to false, the public API Gateway endpoint is enabled, but the Neptune export service cannot be reached through the private DNS endpoint. You can then use a public DNS endpoint for the VPC endpoint to call the export service, as detailed here.
- Choose Next.
- On the Options page, choose Next.
- On the Review page, select the first check box to acknowledge that Amazon CloudFormation will create IAM resources. Select the second check box to acknowledge CAPABILITY_AUTO_EXPAND for the new stack.

  Note: CAPABILITY_AUTO_EXPAND explicitly acknowledges that macros will be expanded when creating the stack, without prior review. Users often create a change set from a processed template so that the changes made by macros can be reviewed before actually creating the stack. For more information, see the Amazon CloudFormation CreateStack API.

  Then choose Create.
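If you would rather script the installation than click through the console, the template parameters described above could be collected in a parameters file for the AWS CLI. The following is only a sketch under stated assumptions: it assumes the parameter keys match the names listed above, the helper name write_export_params is hypothetical, and the VPC and subnet IDs in the usage comment are placeholders. The template URL is not given on this page, so it is left as a placeholder too.

```shell
# Print a CloudFormation parameters file (JSON) for the Neptune-Export template.
# Arguments: VPC ID, first subnet ID, second subnet ID.
# EnableIAM and VPCOnly are set to true, as recommended above;
# NumOfFilesULimit and PrivateDnsEnabled keep their documented defaults.
# Assumption: the parameter keys are exactly the names listed in the procedure.
write_export_params() {
  cat <<EOF
[
  {"ParameterKey": "VPC", "ParameterValue": "$1"},
  {"ParameterKey": "Subnet1", "ParameterValue": "$2"},
  {"ParameterKey": "Subnet2", "ParameterValue": "$3"},
  {"ParameterKey": "EnableIAM", "ParameterValue": "true"},
  {"ParameterKey": "VPCOnly", "ParameterValue": "true"},
  {"ParameterKey": "NumOfFilesULimit", "ParameterValue": "10000"},
  {"ParameterKey": "PrivateDnsEnabled", "ParameterValue": "true"}
]
EOF
}

# Usage (hypothetical IDs; replace them with values from your own account):
#   write_export_params vpc-0123456789abcdef0 subnet-0aaa1111 subnet-0bbb2222 \
#     > neptune-export-params.json
#
# The file could then be passed to the AWS CLI, for example:
#   aws cloudformation create-stack \
#     --stack-name neptune-export \
#     --template-url (the template URL for your Region) \
#     --parameters file://neptune-export-params.json \
#     --capabilities CAPABILITY_IAM CAPABILITY_AUTO_EXPAND
```

The two values passed to --capabilities correspond to the two check boxes on the Review page. Whether CAPABILITY_IAM is sufficient for this template is an assumption; if stack creation reports a missing capability, use the one named in the error message.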
Enable access to Neptune from Neptune-Export
After the Neptune-Export installation has completed, update your Neptune VPC security group to allow access
from Neptune-Export. When the Neptune-Export Amazon CloudFormation stack has been created, the Outputs
tab includes a NeptuneExportSecurityGroup
ID. Update your Neptune VPC security
group to allow access from this Neptune-Export security group.
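As an alternative to editing the security group in the console, the same rule could be added with the AWS CLI. This is a minimal sketch, not the documented procedure: the function name allow_export_access is hypothetical, and it assumes your cluster listens on the Neptune default port, 8182, unless you pass a different port.

```shell
# Allow inbound connections to the Neptune cluster from the Neptune-Export
# security group.
# Arguments: Neptune security group ID, NeptuneExportSecurityGroup ID,
# and optionally the Neptune port (assumed to be the default, 8182).
allow_export_access() {
  aws ec2 authorize-security-group-ingress \
    --group-id "$1" \
    --protocol tcp \
    --port "${3:-8182}" \
    --source-group "$2"
}

# Usage (placeholders, not real IDs):
#   allow_export_access (your Neptune security group ID) (the NeptuneExportSecurityGroup ID)
```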
Enable access to the Neptune-Export endpoint from a VPC-based EC2 instance
If you make your Neptune-Export endpoint VPC-only, you can only access it from within
the VPC in which the Neptune-Export service is installed. To allow connectivity from an
Amazon EC2 instance in the VPC from which you can make Neptune-Export API calls, attach the
NeptuneExportSecurityGroup
created by the Amazon CloudFormation stack to that Amazon EC2 instance.
Run a Neptune-Export job using the Neptune-Export API
The Outputs tab of the Amazon CloudFormation stack also includes the NeptuneExportApiUri. Use this URI whenever you send a request to the Neptune-Export endpoint.
Run an export job
Be sure that the user or role under which the export runs has been granted execute-api:Invoke permission.
If you set the EnableIAM parameter to true in the Amazon CloudFormation stack when you installed Neptune-Export, you need to sign all requests to the Neptune-Export API with Signature Version 4 (Sigv4). We recommend using awscurl to make requests to the API. All the examples here assume that IAM auth is enabled.
If you set the VPCOnly parameter to true in the Amazon CloudFormation stack when you installed Neptune-Export, you must call the Neptune-Export API from within the VPC, typically from an Amazon EC2 instance located in the VPC.
To start exporting data, send a request to the NeptuneExportApiUri endpoint with command and outputS3Path request parameters and an endpoint export parameter.
The following is an example of a request that exports property-graph data from Neptune and publishes it to Amazon S3:
curl \
  (your NeptuneExportApiUri) \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{
        "command": "export-pg",
        "outputS3Path": "s3://(your Amazon S3 bucket)/neptune-export",
        "params": {
          "endpoint": "(your Neptune endpoint DNS name)"
        }
      }'
Similarly, here is an example of a request that exports RDF data from Neptune to Amazon S3:
curl \
  (your NeptuneExportApiUri) \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{
        "command": "export-rdf",
        "outputS3Path": "s3://(your Amazon S3 bucket)/neptune-export",
        "params": {
          "endpoint": "(your Neptune endpoint DNS name)"
        }
      }'
If you omit the command request parameter, Neptune-Export attempts to export property-graph data from Neptune by default.
If the previous command ran successfully, the output would look like this:
{ "jobName": "neptune-export-abc12345-1589808577790", "jobId": "c86258f7-a9c9-4f8c-8f4c-bbfe76d51c8f" }
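When you script these requests, it can help to build the request body in one place and to capture the jobId from the response for later status calls. The sketch below shows one way to do that; the function names build_export_payload and extract_job_id are hypothetical, and the awscurl invocation in the comments assumes the us-east-1 Region. It does not escape special characters in its arguments, so it is only suitable for simple values like the ones shown above.

```shell
# Build the JSON body for an export request.
# Arguments: command (export-pg or export-rdf), S3 output path, Neptune endpoint.
build_export_payload() {
  printf '{"command":"%s","outputS3Path":"%s","params":{"endpoint":"%s"}}' \
    "$1" "$2" "$3"
}

# Read a JSON response on standard input and print its jobId value.
extract_job_id() {
  sed -n 's/.*"jobId"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example with IAM authentication enabled (awscurl signs the request).
# The Region is an assumption; the parenthesized values are placeholders:
#   payload=$(build_export_payload export-pg \
#     "s3://(your Amazon S3 bucket)/neptune-export" \
#     "(your Neptune endpoint DNS name)")
#   job_id=$(awscurl --service execute-api --region us-east-1 \
#     -X POST -H 'Content-Type: application/json' -d "$payload" \
#     "(your NeptuneExportApiUri)" | extract_job_id)
```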
Monitor the export job you just started
To monitor a running job, append its jobID to your NeptuneExportApiUri, something like this:
curl \
  (your NeptuneExportApiUri)/(the job ID)
If the service has not yet started the export job, the response looks like this:
{ "jobId": "c86258f7-a9c9-4f8c-8f4c-bbfe76d51c8f", "status": "pending" }
When you repeat the command after the export job has started, the response looks something like this:
{ "jobId": "c86258f7-a9c9-4f8c-8f4c-bbfe76d51c8f", "status": "running", "logs": "https://us-east-1.console.aws.amazon.com/cloudwatch/home?..." }
If you open the logs in CloudWatch Logs using the URI provided by the status call, you can then monitor the progress of the export in detail.
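Rather than repeating the status call by hand, you could poll it in a loop. The sketch below is one possible approach, not part of the Neptune-Export service: the function names are hypothetical, and because only the pending and running statuses are shown above, it treats any other status as the end of the job. An empty status, for example from a failed request, also ends the loop.

```shell
# Read a JSON status response on standard input and print its status value.
extract_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Poll until the job is no longer pending or running, then print the last status.
# Arguments: a command that prints the status response, and an optional
# polling interval in seconds (default 30).
wait_for_export() {
  while true; do
    status=$("$1" | extract_status)
    case "$status" in
      pending|running) sleep "${2:-30}" ;;
      *) echo "$status"; return 0 ;;
    esac
  done
}

# Usage with a hypothetical fetch command that wraps the status call shown above:
#   fetch_status() { curl -s "(your NeptuneExportApiUri)/(the job ID)"; }
#   wait_for_export fetch_status 60
```

Passing the fetch command as an argument keeps the loop independent of how the request is signed, so the same function works with curl or awscurl.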
Cancel a running export job
To cancel a running export job using the Amazon Web Services Management Console
- Open the Amazon Batch console at https://console.amazonaws.cn/batch/.
- Choose Jobs.
- Locate the running job that you want to cancel, based on its jobID.
- Select Cancel job.
To cancel a running export job using the Neptune export API:
Send an HTTP DELETE request to the NeptuneExportApiUri with the jobID appended, like this:
curl -X DELETE \
  (your NeptuneExportApiUri)/(the job ID)