Oracle Data Pump and PostgreSQL pg_dump and pg_restore
With Amazon DMS, you can migrate data from source databases to target databases using Oracle Data Pump and PostgreSQL pg_dump and pg_restore. Oracle Data Pump is a utility for transferring data between Oracle databases, while PostgreSQL pg_dump and pg_restore create a backup of a PostgreSQL database.
| Feature compatibility | Amazon SCT / Amazon DMS automation level | Amazon SCT action code index | Key differences |
|---|---|---|---|
|
|
N/A |
N/A |
Non-compatible tool. |
Oracle usage
Oracle Data Pump is a utility for exporting and importing data from/to an Oracle database. It can be used to copy an entire database, entire schemas, or specific objects in a schema. Oracle Data Pump is commonly used as a part of a backup strategy for restoring individual database objects (specific records, tables, views, stored procedures, and so on) as opposed to snapshots or Oracle RMAN, which provides backup and recovery capabilities at the database level. By default (without using the sqlfile parameter during export), the dump file generated by Oracle Data Pump is binary (it can’t be opened using a text editor).
Oracle Data Pump supports:
-
Export Data from an Oracle database — The Data Pump
EXPDPcommand creates a binary dump file containing the exported database objects. Objects can be exported with data or with metadata only. Exports can be performed for specific timestamps or Oracle SCNs to ensure cross-object consistency. -
Import Data to an Oracle database — The Data Pump
IMPDPcommand imports objects and data from a specific dump file created with theEXPDPcommand. TheIMPDPcommand can filter on import (for example, only import certain objects) and remap object and schema names during import.
The term “Logical backup” refers to a dump file created by Oracle Data Pump.
Both EXPDP and IMPDP can only read/write dump files from file system paths that were pre-configured in the Oracle database as directories. During export/import, users must specify the logical directory name where the dump file should be created; not the actual file system path.
Examples
Use EXPDP to export the HR schema.
$ expdp system/**** directory=expdp_dir schemas=hr dumpfile=hr.dmp logfile=hr.log
The command contains the credentials to run Data Pump, the logical Oracle directory name for the dump file location (which maps in the database to a physical file system location), the schema name to export, the dump file name, and log file name.
Use IMPDP to import the HR a schema and rename to HR_COPY.
$ impdp system/**** directory=expdp_dir schemas=hr dumpfile=hr.dmp logfile=hr.log REMAP_SCHEMA=hr:hr_copy
The command contains the database credentials to run Data Pump, the logical Oracle directory for where the export dump file is located, the dump file name, the schema to export, the name for the dump file, the log file name, and the REMAP_SCHEMA parameter
For more information, see Oracle Data Pump
PostgreSQL usage
PostgreSQL provides native utilities — pg_dump and pg_restore can be used to perform logical database exports and imports with a degree of comparable functionality to the Oracle Data Pump utility. Such as for moving data between two databases and creating logical database backups.
-
pg_dumpis an equivalent to Oracle expdp -
pg_restoreis an equivalent to Oracle impdp
Amazon Aurora PostgreSQL supports data export and import using both pg_dump and pg_restore, but the binaries for both utilities will need to be placed on your local workstation or on an Amazon EC2 server as part of the PostgreSQL client binaries.
PostgreSQL dump files created using pg_dump can be copied, after export, to an Amazon S3 bucket as cloud backup storage or for maintaining the desired backup retention policy. Later, when dump files are needed for database restore, the dump files should be copied back to the desktop / server that has a PostgreSQL client (such as your workstation or an Amazon EC2 server) to issue the pg_restore command.
Starting with PostgreSQL 10, the following capabilities were added:
-
A schema can be excluded in
pg_dumporpg_restorecommands. -
Can create dumps with no blobs.
-
Allow to run
pg_dumpallby non-superusers, using the--no-role-passwordsoption. -
Create additional integrity option to ensure that the data is stored in disk using the
fsync()method.
Starting with PostgreSQL 11, pg_dump and pg_restore can export and import relationships between extensions and database objects established with ALTER … DEPENDS ON EXTENSION, which allows these objects to be dropped when extension is dropped with CASCADE option.
Note
pg_dump will create consistent backups even if the database is being used concurrently. pg_dump doesn’t block other users accessing the database (readers or writers).pg_dump only exports a single database, in order to backup global objects that are common to all databases in a cluster, such as roles and tablespaces, use pg_dumpall. PostgreSQL dump files can be both plain-text and custom format files.
Another option to export and import data from PostgreSQL database is to use COPY TO and COPY FROM commands. Starting with PostgreSQL 12 the COPY FROM command, that can be used to load data into DB, has support for filtering incoming rows with WHERE condition.
CREATE TABLE tst_copy(v TEXT); COPY tst_copy FROM '/home/postgres/file.csv' WITH (FORMAT CSV) WHERE v LIKE '%apple%';
Examples
Export data using pg_dump: Use a workstation or server with the PostgreSQL client installed in order to connect to the Aurora PostgreSQL instance in Amazon; providing the hostname (-h), database user name (-U) and database name (-d) while issuing the pg_dump command.
$ pg_dump -h hostname.rds.amazonaws.com -U username -d db_name -f dump_file_name.sql
The output file, dump_file_name.sql, will be stored on the server where the pg_dump command runs. You can later copy the output file to an Amazon S3 Bucket, if needed.
Run pg_dump and copy the backup file to an Amazon S3 bucket using pipe and the Amazon CLI.
$ pg_dump -h hostname.rds.amazonaws.com -U username -d db_name -f dump_file_name.sql | aws s3 cp - s3://pg-backup/pg_bck-$(date"+%Y-%m-%d-%H-%M-%S")
Restore data with pg_restore. Use a workstation or server with the PostgreSQL client installed to connect to the Aurora PostgreSQL instance providing the hostname (-h), database user name (-U), database name (-d) and the dump file to restore from while issuing the pg_restore command.
$ pg_restore -h hostname.rds.amazonaws.com -U username -d dbname_restore dump_file_name.sql
Copy the output file from the local server to an Amazon S3 Bucket using the Amazon CLI. Upload the dump file to Amazon S3 bucket.
$ aws s3 cp /usr/Exports/hr.dmp s3://my-bucket/backup-$(date "+%Y-%m-%d-%H-%M-%S")
The {-$(date "+%Y-%m-%d-%H-%M-%S")} format is valid on Linux servers only.
Download the output file from the Amazon S3 bucket.
$ aws s3 cp s3://my-bucket/backup-2017-09-10-01-10-10 /usr/Exports/hr.dmp
You can create a copy of an existing database without having to use pg_dump or pg_restore. Instead, use the template keyword to signify the database used as the source.
CREATE DATABASE mydb_copy TEPLATE mydb;
Summary
| Description | Oracle Data Pump | PostgreSQL Dump |
|---|---|---|
|
Export data to a local file |
expdp system/*** schemas=hr dumpfile=hr.dmp logfile=hr.log |
pg_dump -F c -h hostname.rds.amazonaws.com -U username -d hr -p 5432 > c:\Export\hr.dmp |
|
Export data to a remote file |
Create Oracle directory on remote storage mount or NFS directory called EXP_DIR. Use the export command: expdp system/*** schemas=hr directory=EXP_DIR dumpfile=hr.dmp logfile=hr.log |
Export: pg_dump -F c -h hostname.rds.amazonaws.com -U username -d hr -p 5432 > c:\Export\hr.dmp Upload to Amazon S3 aws s3 cp c:\Export\hr.dmp s3://my-bucket/backup-$(date"+%Y-%m-%d-%H-%M-%S") |
|
Import data to a new database with a new name |
impdp system/*** schemas=hr dumpfile=hr.dmp logfile=hr.log REMAP_SCHEMA=hr:hr_copy TRANSFORMM=OID:N |
pg_restore -h hostname.rds.amazonaws.com -U hr -d hr_restore -p 5432 c:\Expor\hr.dmp |
|
Exclude schemas |
expdp system/*** FULL=Y directory=EXP_DIR dumpfile=hr.dmp logfile=hr.log exclude=SCHEMA:"HR" |
pg_dump -F c -h hostname.rds.amazonaws.com -U username -d hr -p 5432 -N 'log_schema' c:\Export\hr_nolog.dmp |
For more information, see SQL Dump