
Reading from SAP OData entities

Prerequisite

An SAP OData object you would like to read from. You will need the object/EntitySet name, for example, /sap/opu/odata/sap/API_SALES_ORDER_SRV/A_SalesOrder.

Example:

sapodata_read = glueContext.create_dynamic_frame.from_options(
    connection_type="SAPOData",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName"
    },
    transformation_ctx=key
)
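
The call above assumes an existing GlueContext. A minimal sketch of the surrounding job script, using a placeholder connection name and the EntitySet from the prerequisite, might look like the following:

import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glueContext = GlueContext(sc)
job = Job(glueContext)
job.init(args["JOB_NAME"], args)

# "my-sapodata-connection" is a placeholder; use the name of your
# Amazon Glue connection to the SAP OData source.
sapodata_read = glueContext.create_dynamic_frame.from_options(
    connection_type="SAPOData",
    connection_options={
        "connectionName": "my-sapodata-connection",
        "ENTITY_NAME": "/sap/opu/odata/sap/API_SALES_ORDER_SRV/A_SalesOrder"
    },
    transformation_ctx="sapodata_read"
)

# Inspect the inferred schema and a sample of records.
sapodata_read.printSchema()
sapodata_read.show(5)

job.commit()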

SAP OData entity and field details:

Entity                      Data type   Supported operators
Tables (dynamic entities)   String      =, !=, >, >=, <, <=, BETWEEN, LIKE
                            Integer     =, !=, >, >=, <, <=, BETWEEN, LIKE
                            Long        =, !=, >, >=, <, <=, BETWEEN, LIKE
                            Double      =, !=, >, >=, <, <=, BETWEEN, LIKE
                            Date        =, !=, >, >=, <, <=, BETWEEN, LIKE
                            DateTime    =, !=, >, >=, <, <=, BETWEEN, LIKE
                            Boolean     =, !=
                            Struct      =, !=, >, >=, <, <=, BETWEEN, LIKE
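
The operators above describe the comparisons available for each data type. Continuing from the read example above, equivalent comparisons can also be applied downstream with the Glue Filter transform once the entity is in a DynamicFrame. This is a sketch only; the TotalNetAmount field name is hypothetical:

from awsglue.transforms import Filter

# Keep only rows whose (hypothetical) TotalNetAmount field is at least 1000.
high_value_orders = Filter.apply(
    frame=sapodata_read,
    f=lambda row: row["TotalNetAmount"] is not None and float(row["TotalNetAmount"]) >= 1000.0,
    transformation_ctx="high_value_orders"
)
high_value_orders.show(5)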

Partitioning queries

Field-based partitioning:

You can provide the additional Spark options PARTITION_FIELD, LOWER_BOUND, UPPER_BOUND, and NUM_PARTITIONS if you want to utilize concurrency in Spark. With these parameters, the original query is split into NUM_PARTITIONS sub-queries that Spark tasks can execute concurrently. Integer, Date, and DateTime fields support field-based partitioning in the SAP OData connector.

  • PARTITION_FIELD: the name of the field to be used to partition the query.

  • LOWER_BOUND: an inclusive lower bound value of the chosen partition field.

    For DateTime fields, the connector accepts the Spark timestamp format used in Spark SQL queries.

    Example of a valid value:

    "2000-01-01T00:00:00.000Z"
  • UPPER_BOUND: an exclusive upper bound value of the chosen partition field.

  • NUM_PARTITIONS: the number of partitions.

  • PARTITION_BY: the type of partitioning to be performed. Pass "FIELD" for field-based partitioning.

Example:

sapodata = glueContext.create_dynamic_frame.from_options(
    connection_type="SAPOData",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "/sap/opu/odata/sap/SEPM_HCM_SCENARIO_SRV/EmployeeSet",
        "PARTITION_FIELD": "validStartDate",
        "LOWER_BOUND": "2000-01-01T00:00:00.000Z",
        "UPPER_BOUND": "2020-01-01T00:00:00.000Z",
        "NUM_PARTITIONS": "10",
        "PARTITION_BY": "FIELD"
    },
    transformation_ctx=key
)

Record-based partitioning:

With record-based partitioning, the original query is split into NUM_PARTITIONS sub-queries that Spark tasks can execute concurrently.

Record-based partitioning is only supported for non-ODP entities, as pagination in ODP entities is supported through the next token/skip token.

  • PARTITION_BY: the type of partitioning to be performed. Pass "COUNT" for record-based partitioning.

Example:

sapodata = glueContext.create_dynamic_frame.from_options(
    connection_type="SAPOData",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "/sap/opu/odata/sap/SEPM_HCM_SCENARIO_SRV/EmployeeSet",
        "NUM_PARTITIONS": "10",
        "PARTITION_BY": "COUNT"
    },
    transformation_ctx=key
)
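
The resulting DynamicFrame can then be consumed like any other, for example by writing it out to Amazon S3. The bucket path below is a placeholder:

# Write the partitioned read result out as Parquet; the S3 path is a placeholder.
glueContext.write_dynamic_frame.from_options(
    frame=sapodata,
    connection_type="s3",
    connection_options={"path": "s3://amzn-s3-demo-bucket/sapodata/"},
    format="parquet",
    transformation_ctx="sapodata_write"
)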