Reading from Salesforce entities - Amazon Glue
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

Reading from Salesforce entities

Prerequisite

A Salesforce sObject you would like to read from. You will need the object name such as Account or Case or Opportunity.

Example:

salesforce_read = glueContext.create_dynamic_frame.from_options( connection_type="salesforce", connection_options={ "connectionName": "connectionName", "ENTITY_NAME": "Account", "API_VERSION": "v60.0" }

Partitioning queries

You can provide the additional Spark options PARTITION_FIELD, LOWER_BOUND, UPPER_BOUND, and NUM_PARTITIONS if you want to utilize concurrency in Spark. With these parameters, the original query would be split into NUM_PARTITIONS number of sub-queries that can be executed by Spark tasks concurrently.

  • PARTITION_FIELD: the name of the field to be used to partition the query.

  • LOWER_BOUND: an inclusive lower bound value of the chosen partition field.

    For timestamp field, we accept the Spark timestamp format used in Spark SQL queries.

    Examples of valid values:

    "TIMESTAMP \"1707256978123\"" "TIMESTAMP ’2024-02-06 22:02:58.123 UTC'" "TIMESTAMP \"2018-08-08 00:00:00 Pacific/Tahiti\" "TIMESTAMP \"2018-08-08 00:00:00\"" "TIMESTAMP \"-123456789\" Pacific/Tahiti" "TIMESTAMP \"1702600882\""
  • UPPER_BOUND: an exclusive upper bound value of the chosen partition field.

  • NUM_PARTITIONS: the number of partitions.

Example:

salesforce_read = glueContext.create_dynamic_frame.from_options( connection_type="salesforce", connection_options={ "connectionName": "connectionName", "ENTITY_NAME": "Account", "API_VERSION": "v60.0", "PARTITION_FIELD": "SystemModstamp" "LOWER_BOUND": "TIMESTAMP '2021-01-01 00:00:00 Pacific/Tahiti'" "UPPER_BOUND": "TIMESTAMP '2023-01-10 00:00:00 Pacific/Tahiti'" "NUM_PARTITIONS": "10" }