Reading from SAP OData entities
Prerequisite
An SAP OData object you want to read from. You will need the object/EntitySet name, for example, /sap/opu/odata/sap/API_SALES_ORDER_SRV/A_SalesOrder.
Example:
```python
sapodata_read = glueContext.create_dynamic_frame.from_options(
    connection_type="SAPOData",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
    },
    transformation_ctx=key,
)
```
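The snippet above assumes that a `glueContext` and a transformation-context `key` already exist. For context, a minimal job skeleton around the read might look like the following sketch; the connection name and entity path are placeholders you replace with your own:

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Standard AWS Glue job setup: resolve job arguments and build a GlueContext.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glueContext = GlueContext(SparkContext.getOrCreate())
job = Job(glueContext)
job.init(args["JOB_NAME"], args)

# Read the SAP OData entity; "connectionName" and the entity path are placeholders.
sapodata_read = glueContext.create_dynamic_frame.from_options(
    connection_type="SAPOData",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "/sap/opu/odata/sap/API_SALES_ORDER_SRV/A_SalesOrder",
    },
    transformation_ctx="sapodata_read",
)

# Inspect the inferred schema and row count before building transformations on top.
sapodata_read.printSchema()
print(sapodata_read.count())

job.commit()
```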
SAP OData entity and field details:
Entity | Data type | Supported operators
---|---|---
Tables (dynamic entities) | String | =, !=, >, >=, <, <=, BETWEEN, LIKE
 | Integer | =, !=, >, >=, <, <=, BETWEEN, LIKE
 | Long | =, !=, >, >=, <, <=, BETWEEN, LIKE
 | Double | =, !=, >, >=, <, <=, BETWEEN, LIKE
 | Date | =, !=, >, >=, <, <=, BETWEEN, LIKE
 | DateTime | =, !=, >, >=, <, <=, BETWEEN, LIKE
 | Boolean | =, !=
 | Struct | =, !=, >, >=, <, <=, BETWEEN, LIKE
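These operators describe the comparisons available per field type. As a rough illustration of exercising them after a read, the following sketch filters the resulting frame on the Spark side; the field names (SalesOrderType, CreationDate, SoldToParty) are hypothetical, and standard Spark SQL comparison semantics stand in for the connector's operators:

```python
from pyspark.sql import functions as F

# Convert the DynamicFrame from the earlier example into a Spark DataFrame
# so the usual comparison operators (=, !=, BETWEEN, LIKE, ...) apply.
df = sapodata_read.toDF()

# Hypothetical fields: equality on a String column, BETWEEN on a Date column,
# and LIKE on another String column.
filtered = df.filter(
    (F.col("SalesOrderType") == "OR")
    & F.col("CreationDate").between("2020-01-01", "2020-12-31")
    & F.col("SoldToParty").like("10%")
)
filtered.show(5)
```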
Partitioning queries
Field-based partitioning:
You can provide the additional Spark options PARTITION_FIELD, LOWER_BOUND, UPPER_BOUND, and NUM_PARTITIONS if you want to utilize concurrency in Spark. With these parameters, the original query is split into NUM_PARTITIONS sub-queries that Spark tasks can execute concurrently. Integer, Date, and DateTime fields support field-based partitioning in the SAP OData connector.
- PARTITION_FIELD: the name of the field used to partition the query.
- LOWER_BOUND: an inclusive lower-bound value of the chosen partition field. For DateTime fields, we accept the Spark timestamp format used in Spark SQL queries, for example "2000-01-01T00:00:00.000Z".
- UPPER_BOUND: an exclusive upper-bound value of the chosen partition field.
- NUM_PARTITIONS: the number of partitions.
- PARTITION_BY: the type of partitioning to be performed. Pass "FIELD" for field-based partitioning.
Example:
```python
sapodata = glueContext.create_dynamic_frame.from_options(
    connection_type="sapodata",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "/sap/opu/odata/sap/SEPM_HCM_SCENARIO_SRV/EmployeeSet",
        "PARTITION_FIELD": "validStartDate",
        "LOWER_BOUND": "2000-01-01T00:00:00.000Z",
        "UPPER_BOUND": "2020-01-01T00:00:00.000Z",
        "NUM_PARTITIONS": "10",
        "PARTITION_BY": "FIELD",
    },
    transformation_ctx=key,
)
```
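As a quick sanity check that the options above actually spread the read across Spark tasks, you can count the partitions of the resulting frame. This is a sketch, assuming the `sapodata` frame from the example:

```python
# With "NUM_PARTITIONS": "10", the read is split into 10 sub-queries,
# so the underlying RDD should report roughly that many partitions.
print(sapodata.toDF().rdd.getNumPartitions())
```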
Record-based partitioning:
The original query is split into NUM_PARTITIONS sub-queries that Spark tasks can execute concurrently.
Record-based partitioning is supported only for non-ODP entities, because pagination in ODP entities is handled through the next token/skip token.
- NUM_PARTITIONS: the number of partitions.
- PARTITION_BY: the type of partitioning to be performed. Pass "COUNT" for record-based partitioning.
Example:
```python
sapodata = glueContext.create_dynamic_frame.from_options(
    connection_type="sapodata",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "/sap/opu/odata/sap/SEPM_HCM_SCENARIO_SRV/EmployeeSet",
        "NUM_PARTITIONS": "10",
        "PARTITION_BY": "COUNT",
    },
    transformation_ctx=key,
)
```