Reading from Salesforce entities
Prerequisite
A Salesforce sObject you would like to read from. You will need the object name, such as Account, Case, or Opportunity.
Example:
salesforce_read = glueContext.create_dynamic_frame.from_options(
    connection_type="salesforce",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "Account",
        "API_VERSION": "v60.0"
    }
)
Partitioning queries
You can provide the additional Spark options PARTITION_FIELD, LOWER_BOUND, UPPER_BOUND, and NUM_PARTITIONS if you want to use concurrency in Spark. With these parameters, the original query is split into NUM_PARTITIONS sub-queries that Spark tasks can execute concurrently.
PARTITION_FIELD: the name of the field used to partition the query.
LOWER_BOUND: an inclusive lower bound value of the chosen partition field. For timestamp fields, the Spark timestamp format used in Spark SQL queries is accepted.
Examples of valid values:
"TIMESTAMP \"1707256978123\""
"TIMESTAMP '2024-02-06 22:02:58.123 UTC'"
"TIMESTAMP \"2018-08-08 00:00:00 Pacific/Tahiti\""
"TIMESTAMP \"2018-08-08 00:00:00\""
"TIMESTAMP \"-123456789\" Pacific/Tahiti"
"TIMESTAMP \"1702600882\""
UPPER_BOUND: an exclusive upper bound value of the chosen partition field.
NUM_PARTITIONS: the number of partitions.
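To make the bound semantics concrete, the following Python sketch divides a range with an inclusive lower bound and an exclusive upper bound into equal, contiguous sub-ranges. It is an illustration only: this page does not describe how the connector derives its sub-queries, so the even split and the helper name split_range are assumptions, not the connector's actual logic.

```python
from datetime import datetime


def split_range(lower, upper, num_partitions):
    """Split [lower, upper) into num_partitions contiguous half-open ranges.

    Hypothetical helper for illustration; an even split is assumed.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    step = (upper - lower) / num_partitions
    # Derive every boundary from the original lower bound so that
    # rounding does not accumulate; the last boundary is exactly upper.
    bounds = [lower + step * i for i in range(num_partitions)] + [upper]
    # Pair neighbouring boundaries: each range includes its start
    # and excludes its end, so no value falls into two ranges.
    return list(zip(bounds[:-1], bounds[1:]))


if __name__ == "__main__":
    for start, end in split_range(datetime(2021, 1, 1), datetime(2021, 1, 11), 10):
        print(f"{start} <= value < {end}")
```

Because each sub-range excludes its own upper end, a value that sits exactly on a boundary belongs to one sub-range only, which is why an inclusive lower bound pairs naturally with an exclusive upper bound.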
Example:
salesforce_read = glueContext.create_dynamic_frame.from_options(
    connection_type="salesforce",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "Account",
        "API_VERSION": "v60.0",
        "PARTITION_FIELD": "SystemModstamp",
        "LOWER_BOUND": "TIMESTAMP '2021-01-01 00:00:00 Pacific/Tahiti'",
        "UPPER_BOUND": "TIMESTAMP '2023-01-10 00:00:00 Pacific/Tahiti'",
        "NUM_PARTITIONS": "10"
    }
)
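The option values in the example above can also be assembled from Python datetime objects rather than typed by hand. The sketch below is one possible way to do that; timestamp_literal and partitioned_read_options are hypothetical helper names, not part of the AWS Glue API, and the literal format simply mirrors the example values shown on this page.

```python
from datetime import datetime


def timestamp_literal(moment, zone=None):
    """Format a datetime as a Spark SQL style timestamp literal string.

    Hypothetical helper; zone is an optional time zone name appended as-is.
    """
    text = moment.strftime("%Y-%m-%d %H:%M:%S")
    if zone:
        text = f"{text} {zone}"
    return f"TIMESTAMP '{text}'"


def partitioned_read_options(connection_name, entity_name, api_version,
                             partition_field, lower, upper,
                             num_partitions, zone=None):
    """Build a connection_options dict with the four partitioning keys.

    Hypothetical helper; it only validates and formats the inputs.
    """
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    return {
        "connectionName": connection_name,
        "ENTITY_NAME": entity_name,
        "API_VERSION": api_version,
        "PARTITION_FIELD": partition_field,
        "LOWER_BOUND": timestamp_literal(lower, zone),  # inclusive
        "UPPER_BOUND": timestamp_literal(upper, zone),  # exclusive
        # The example on this page passes the partition count as a string.
        "NUM_PARTITIONS": str(num_partitions),
    }


options = partitioned_read_options(
    "connectionName", "Account", "v60.0", "SystemModstamp",
    datetime(2021, 1, 1), datetime(2023, 1, 10), 10, zone="Pacific/Tahiti",
)
```

The resulting options dict has the same keys and values as the example above and could be passed as connection_options to glueContext.create_dynamic_frame.from_options.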