Reading from Google Analytics 4 entities - Amazon Glue
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

Reading from Google Analytics 4 entities

Prerequisites

  • A Google Analytics 4 object you would like to read from. Refer to the supported entities table below to check the available entities.

Supported entities

| Entity           | Can be Filtered | Supports Limit | Supports Order By | Supports Select * | Supports Partitioning |
|------------------|-----------------|----------------|-------------------|-------------------|-----------------------|
| Real-Time Report | Yes             | Yes            | Yes               | Yes               | No                    |
| Core Report      | Yes             | Yes            | Yes               | Yes               | Yes                   |

Example

googleAnalytics4_read = glueContext.create_dynamic_frame.from_options(
    connection_type="GoogleAnalytics4",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "API_VERSION": "v1beta"
    }
)

Google Analytics 4 entity and field details

| Entity           | Field                                     | Data Type | Supported Operators      |
|------------------|-------------------------------------------|-----------|--------------------------|
| Core Report      | Dynamic Fields                            | –         | –                        |
| Core Report      | Dimension Fields                          | String    | LIKE, =                  |
| Core Report      | Dimension Fields                          | Date      | LIKE, =                  |
| Core Report      | Metric Fields                             | String    | >, <, >=, <=, =, BETWEEN |
| Core Report      | Custom Dimension and Custom Metric Fields | String    | N/A                      |
| Real-Time Report | appVersion                                | String    | LIKE, =                  |
| Real-Time Report | audienceId                                | String    | LIKE, =                  |
| Real-Time Report | audienceName                              | String    | LIKE, =                  |
| Real-Time Report | city                                      | String    | LIKE, =                  |
| Real-Time Report | cityId                                    | String    | LIKE, =                  |
| Real-Time Report | country                                   | String    | LIKE, =                  |
| Real-Time Report | countryId                                 | String    | LIKE, =                  |
| Real-Time Report | deviceCategory                            | String    | LIKE, =                  |
| Real-Time Report | eventName                                 | String    | LIKE, =                  |
| Real-Time Report | minutesAgo                                | String    | LIKE, =                  |
| Real-Time Report | platform                                  | String    | LIKE, =                  |
| Real-Time Report | streamId                                  | String    | LIKE, =                  |
| Real-Time Report | streamName                                | String    | LIKE, =                  |
| Real-Time Report | unifiedScreenName                         | String    | LIKE, =                  |
| Real-Time Report | activeUsers                               | String    | >, <, >=, <=, =, BETWEEN |
| Real-Time Report | conversions                               | String    | >, <, >=, <=, =, BETWEEN |
| Real-Time Report | eventCount                                | String    | >, <, >=, <=, =, BETWEEN |
| Real-Time Report | screenPageViews                           | String    | >, <, >=, <=, =, BETWEEN |
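The operator table above can be encoded as a small lookup. The helper below is a hypothetical plain-Python sketch for validating a predicate before building a query; `is_supported` and the dictionaries are illustrative names, not part of the Amazon Glue connector API:

```python
# Supported filter operators per Real-Time Report field, taken from the
# table above. Dimension-style fields accept LIKE and =; metric-style
# fields accept comparison operators and BETWEEN.
DIMENSION_OPS = {"LIKE", "="}
METRIC_OPS = {">", "<", ">=", "<=", "=", "BETWEEN"}

REALTIME_FIELDS = {
    **{f: DIMENSION_OPS for f in (
        "appVersion", "audienceId", "audienceName", "city", "cityId",
        "country", "countryId", "deviceCategory", "eventName",
        "minutesAgo", "platform", "streamId", "streamName",
        "unifiedScreenName")},
    **{f: METRIC_OPS for f in (
        "activeUsers", "conversions", "eventCount", "screenPageViews")},
}

def is_supported(field, operator):
    """Return True if the operator may be used to filter the field."""
    return operator in REALTIME_FIELDS.get(field, set())
```

For example, `is_supported("city", "LIKE")` is true, while `is_supported("eventName", ">")` is false, matching the table.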

Partitioning queries

  1. Filter-based partition

    Additional Spark options PARTITION_FIELD, LOWER_BOUND, UPPER_BOUND, and NUM_PARTITIONS can be provided if you want to utilize concurrency in Spark. With these parameters, the original query is split into NUM_PARTITIONS sub-queries that Spark tasks can execute concurrently.

    • PARTITION_FIELD: the name of the field used to partition the query.

    • LOWER_BOUND: an inclusive lower bound value of the chosen partition field.

      For date fields, we accept the Spark date format used in Spark SQL queries. Example of a valid value: "2024-02-06".

    • UPPER_BOUND: an exclusive upper bound value of the chosen partition field.

    • NUM_PARTITIONS: number of partitions.

    Example

    googleAnalytics4_read = glueContext.create_dynamic_frame.from_options(
        connection_type="GoogleAnalytics4",
        connection_options={
            "connectionName": "connectionName",
            "ENTITY_NAME": "entityName",
            "API_VERSION": "v1beta",
            "PARTITION_FIELD": "date",
            "LOWER_BOUND": "2022-01-01",
            "UPPER_BOUND": "2024-01-02",
            "NUM_PARTITIONS": "10"
        }
    )
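As a mental model only, the split performed by filter-based partitioning on a date field can be sketched in plain Python. The function name and the even-split strategy are assumptions for illustration, not the connector's internal algorithm:

```python
from datetime import date, timedelta

def split_date_range(lower_bound, upper_bound, num_partitions):
    """Split [lower_bound, upper_bound) into num_partitions contiguous
    sub-ranges, mirroring how a filter-based partition scheme could
    divide a query on a date field. Illustrative sketch only."""
    lower = date.fromisoformat(lower_bound)
    upper = date.fromisoformat(upper_bound)
    base, extra = divmod((upper - lower).days, num_partitions)
    ranges = []
    start = lower
    for i in range(num_partitions):
        # Spread any remainder days over the first `extra` partitions
        end = start + timedelta(days=base + (1 if i < extra else 0))
        ranges.append((start.isoformat(), end.isoformat()))
        start = end
    return ranges

# Using the bounds from the example above yields 10 contiguous sub-ranges,
# one per sub-query.
parts = split_date_range("2022-01-01", "2024-01-02", 10)
```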
  2. Record-based partition

    The additional Spark option NUM_PARTITIONS can be provided if you want to utilize concurrency in Spark. With this parameter, the original query is split into NUM_PARTITIONS sub-queries that Spark tasks can execute concurrently.

    • NUM_PARTITIONS: number of partitions.

    Example

    googleAnalytics4_read = glueContext.create_dynamic_frame.from_options(
        connection_type="GoogleAnalytics4",
        connection_options={
            "connectionName": "connectionName",
            "ENTITY_NAME": "entityName",
            "API_VERSION": "v1beta",
            "NUM_PARTITIONS": "10"
        }
    )
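For intuition, record-based partitioning divides the result set by row count rather than by a field value. A plain-Python sketch follows; the offset/limit windowing shown here is an assumption for illustration, not the connector's documented internal behavior:

```python
def record_windows(total_records, num_partitions):
    """Split a record count into near-equal (offset, limit) windows,
    one per partition. Illustrative sketch only."""
    base, extra = divmod(total_records, num_partitions)
    windows = []
    offset = 0
    for i in range(num_partitions):
        # Spread any remainder records over the first `extra` windows
        limit = base + (1 if i < extra else 0)
        windows.append((offset, limit))
        offset += limit
    return windows

# 105 records across 10 partitions: the first 5 windows take 11 records
# each and the rest take 10, covering every record exactly once.
windows = record_windows(105, 10)
```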