Processors that you can use
This section contains information about each processor that you can use in a log event transformer. The processors can be categorized into parsers, string mutators, JSON mutators, and date processors.
Configurable parser-type processors
parseJSON
The parseJSON processor parses JSON log events and
inserts extracted JSON key-value pairs under the destination. If you don't
specify a destination, the processor places the key-value pair under the root
node. When using parseJSON
as the first processor, you must parse
the entire log event using @message
as the source field. After the
initial JSON parsing, you can then manipulate specific fields in subsequent
processors.
The original @message content is not changed; the new keys are added to the message.
Field | Description | Required? | Default | Limits |
---|---|---|---|---|
source | Path to the field in the log event that will be parsed. Use dot notation to access child fields. For example, store.book | No |  | Maximum length: 128; Maximum nested key depth: 3 |
destination | The destination field of the parsed JSON | No |  | Maximum length: 128; Maximum nested key depth: 3 |
Example
Suppose an ingested log event looks like this:
{ "outer_key": { "inner_key": "inner_value" } }
Then if we have this parseJSON processor:
[ { "parseJSON": { "destination": "new_key" } } ]
The transformed log event would be the following.
{ "new_key": { "outer_key": { "inner_key": "inner_value" } } }
grok
Use the grok processor to parse and structure unstructured data using pattern matching. This processor can also extract fields from log messages.
Field | Description | Required? | Default | Limits | Notes |
---|---|---|---|---|---|
source | Path of the field to apply Grok matching on | No |  | Maximum length: 128; Maximum nested key depth: 3 |  |
match | The grok pattern to match against the log event | Yes |  | Maximum length: 512; Maximum grok patterns: 20 | Some grok pattern types have individual usage limits. Any combination of the following patterns can be used as many as five times: {URI, URIPARAM, URIPATHPARAM, SPACE, DATA, GREEDYDATA, GREEDYDATA_MULTILINE}. Grok patterns don't support type conversions. For common log format patterns (APACHE_ACCESS_LOG, NGINX_ACCESS_LOG, SYSLOG5424), only DATA, GREEDYDATA, or GREEDYDATA_MULTILINE patterns are supported to be included after the common log pattern. |
Structure of a Grok Pattern
This is the supported grok pattern structure:
%{PATTERN_NAME:FIELD_NAME}
- PATTERN_NAME: Refers to a pre-defined regular expression for matching a specific type of data. Only predefined grok patterns from the supported grok patterns list are supported. Creating custom patterns is not allowed.
- FIELD_NAME: Assigns a name to the extracted value. FIELD_NAME is optional, but if you don't specify this value, the extracted data is dropped from the transformed log event. If FIELD_NAME uses dotted notation (for example, "parent.child"), it is treated as a JSON path.
- Type conversion: Explicit type conversions are not supported. Use the typeConverter processor to convert the datatype of any value extracted by grok.
To create more complex matching expressions, you can combine several grok
patterns. As many as 20 grok patterns can be combined to match a log event. For
example, this combination of patterns %{NUMBER:timestamp} [%{NUMBER:db}
%{IP:client_ip}:%{NUMBER:client_port}] %{GREEDYDATA:data}
can be used
to extract fields from a Redis slow log entry like this:
1629860738.123456 [0 127.0.0.1:6379] "SET" "key1" "value1"
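As a sketch, applying that combined pattern as a transformer to the sample entry above could look like the following. The literal square brackets are escaped with \\ in the JSON configuration, following the regex-escaping guidance used elsewhere in this section, and the output values are inferred from the sample entry rather than taken from product documentation.
[ { "grok": { "match": "%{NUMBER:timestamp} \\[%{NUMBER:db} %{IP:client_ip}:%{NUMBER:client_port}\\] %{GREEDYDATA:data}" } } ]
Expected output:
{ "timestamp": "1629860738.123456", "db": "0", "client_ip": "127.0.0.1", "client_port": "6379", "data": "\"SET\" \"key1\" \"value1\"" }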
Grok examples
Example 1: Use grok to extract a field from unstructured logs
Sample log:
293750 server-01.internal-network.local OK "[Thread-000] token generated"
Transformer used:
[ { "grok": { "match": "%{NUMBER:version} %{HOSTNAME:hostname} %{NOTSPACE:status} %{QUOTEDSTRING:logMsg}" } } ]
Output:
{ "version": "293750", "hostname": "server-01.internal-network.local", "status": "OK", "logMsg": "[Thread-000] token generated" }
Sample log:
23/Nov/2024:10:25:15 -0900 172.16.0.1 200
Transformer used:
[ { "grok": { "match": "%{HTTPDATE:timestamp} %{IPORHOST:clientip} %{NUMBER:response_status}" } } ]
Output:
{ "timestamp": "23/Nov/2024:10:25:15 -0900", "clientip": "172.16.0.1", "response_status": "200" }
Example 2: Use grok in combination with parseJSON to extract fields from a JSON log event
Sample log:
{ "timestamp": "2024-11-23T16:03:12Z", "level": "ERROR", "logMsg": "GET /page.html HTTP/1.1" }
Transformer used:
[ { "parseJSON": {} }, { "grok": { "source": "logMsg", "match": "%{WORD:http_method} %{NOTSPACE:request} HTTP/%{NUMBER:http_version}" } } ]
Output:
{ "timestamp": "2024-11-23T16:03:12Z", "level": "ERROR", "logMsg": "GET /page.html HTTP/1.1", "http_method": "GET", "request": "/page.html", "http_version": "1.1" }
Example 3: Grok pattern with dotted annotation in FIELD_NAME
Sample log:
192.168.1.1 GET /index.html?param=value 200 1234
Transformer used:
[ { "grok": { "match": "%{IP:client.ip} %{WORD:method} %{URIPATHPARAM:request.uri} %{NUMBER:response.status} %{NUMBER:response.bytes}" } } ]
Output:
{ "client": { "ip": "192.168.1.1" }, "method": "GET", "request": { "uri": "/index.html?param=value" }, "response": { "status": "200", "bytes": "1234" } }
Supported grok patterns
The following tables list the patterns that are supported by the
grok
processor.
General grok patterns
Grok Pattern | Description | Maximum pattern limit |
---|---|---|
USERNAME or USER | Matches one or more characters that can include lowercase letters (a-z), uppercase letters (A-Z), digits (0-9), dots (.), underscores (_), or hyphens (-) | 20 |
INT | Matches an optional plus or minus sign followed by one or more digits | 20 |
BASE10NUM | Matches an integer or a floating-point number with optional sign and decimal point | 20 |
BASE16NUM | Matches decimal and hexadecimal numbers with an optional sign (+ or -) and an optional 0x prefix | 20 |
POSINT | Matches whole positive integers without leading zeros, consisting of one or more digits (1-9 followed by 0-9) | 20 |
NONNEGINT | Matches any whole number (consisting of one or more digits 0-9), including zero and numbers with leading zeros | 20 |
WORD | Matches whole words composed of one or more word characters (\w), including letters, digits, and underscores | 20 |
NOTSPACE | Matches one or more non-whitespace characters | 5 |
SPACE | Matches zero or more whitespace characters | 5 |
DATA | Matches any character (except newline) zero or more times, non-greedy | 5 |
GREEDYDATA | Matches any character (except newline) zero or more times, greedy | 5 |
GREEDYDATA_MULTILINE | Matches any character (including newline) zero or more times, greedy | 1 |
QUOTEDSTRING | Matches quoted strings (single or double quotes) with escaped characters | 20 |
UUID | Matches a standard UUID format: 8 hexadecimal characters, followed by three groups of 4 hexadecimal characters, and ending with 12 hexadecimal characters, all separated by hyphens | 20 |
URN | Matches URN (Uniform Resource Name) syntax | 20 |
Amazon grok patterns
Pattern | Description | Maximum pattern limit |
---|---|---|
ARN | Matches Amazon Resource Names (ARNs), capturing the partition | 5 |
Networking grok patterns
Grok Pattern | Description | Maximum pattern limit |
---|---|---|
CISCOMAC | Matches a MAC address in 4-4-4 hexadecimal format | 20 |
WINDOWSMAC | Matches a MAC address in hexadecimal format with hyphens | 20 |
COMMONMAC | Matches a MAC address in hexadecimal format with colons | 20 |
MAC | Matches one of the CISCOMAC, WINDOWSMAC, or COMMONMAC grok patterns | 20 |
IPV6 | Matches IPv6 addresses, including compressed forms and IPv4-mapped IPv6 addresses | 5 |
IPV4 | Matches an IPv4 address | 20 |
IP | Matches either IPv6 addresses as supported by %{IPV6} or IPv4 addresses as supported by %{IPV4} | 5 |
HOSTNAME or HOST | Matches domain names, including subdomains | 5 |
IPORHOST | Matches either a hostname or an IP address | 5 |
HOSTPORT | Matches an IP address or hostname as supported by the %{IPORHOST} pattern, followed by a colon and a port number, capturing the port as "PORT" in the output | 5 |
URIHOST | Matches an IP address or hostname as supported by the %{IPORHOST} pattern, optionally followed by a colon and a port number, capturing the port as "port" if present | 5 |
Path grok patterns
Grok Pattern | Description | Maximum pattern limit |
---|---|---|
UNIXPATH | Matches URL paths, potentially including query parameters | 20 |
WINPATH | Matches Windows file paths | 5 |
PATH | Matches either URL or Windows file paths | 5 |
TTY | Matches Unix device paths for terminals and pseudo-terminals | 20 |
URIPROTO | Matches letters, optionally followed by a plus (+) character and additional letters or plus (+) characters | 20 |
URIPATH | Matches the path component of a URI | 20 |
URIPARAM | Matches URL query parameters | 5 |
URIPATHPARAM | Matches a URI path optionally followed by query parameters | 5 |
URI | Matches a complete URI | 5 |
Date and time grok patterns
Grok Pattern | Description | Maximum pattern limit |
---|---|---|
MONTH | Matches full or abbreviated English month names as whole words | 20 |
MONTHNUM | Matches month numbers from 1 to 12, with an optional leading zero for single-digit months | 20 |
MONTHNUM2 | Matches two-digit month numbers from 01 to 12 | 20 |
MONTHDAY | Matches day of the month from 1 to 31, with an optional leading zero | 20 |
YEAR | Matches a year in two or four digits | 20 |
DAY | Matches full or abbreviated day names | 20 |
HOUR | Matches hour in 24-hour format with an optional leading zero, (0)0-23 | 20 |
MINUTE | Matches minutes (00-59) | 20 |
SECOND | Matches a number representing seconds (0)0-60, optionally followed by a decimal point or colon and one or more digits for fractional seconds | 20 |
TIME | Matches a time format with hours, minutes, and seconds in the format (H)H:mm:(s)s. Seconds include leap second (0)0-60 | 20 |
DATE_US | Matches a date in the format (M)M/(d)d/(yy)yy or (M)M-(d)d-(yy)yy | 20 |
DATE_EU | Matches a date in the format (d)d/(M)M/(yy)yy, (d)d-(M)M-(yy)yy, or (d)d.(M)M.(yy)yy | 20 |
ISO8601_TIMEZONE | Matches the UTC offset 'Z' or a time zone offset with an optional colon in the format [+-](H)H(:)mm | 20 |
ISO8601_SECOND | Matches a number representing seconds (0)0-60, optionally followed by a decimal point or colon and one or more digits for fractional seconds | 20 |
TIMESTAMP_ISO8601 | Matches ISO8601 datetime format (yy)yy-(M)M-(d)dT(H)H:mm:((s)s)(Z|[+-](H)H:mm) with optional seconds and time zone | 20 |
DATE | Matches either a date in the US format using %{DATE_US} or in the EU format using %{DATE_EU} | 20 |
DATESTAMP | Matches the %{DATE} pattern followed by the %{TIME} pattern, separated by a space or hyphen | 20 |
TZ | Matches common time zone abbreviations (PST, PDT, MST, MDT, CST, CDT, EST, EDT, UTC) | 20 |
DATESTAMP_RFC822 | Matches date and time in the format: Day MonthName (D)D (YY)YY (H)H:mm:(s)s Timezone | 20 |
DATESTAMP_RFC2822 | Matches RFC2822 date-time format: Day, (d)d MonthName (yy)yy (H)H:mm:(s)s Z|[+-](H)H:mm | 20 |
DATESTAMP_OTHER | Matches date and time in the format: Day MonthName (d)d (H)H:mm:(s)s Timezone (yy)yy | 20 |
DATESTAMP_EVENTLOG | Matches compact datetime format without separators: (yy)yyMM(d)d(H)Hmm(s)s | 20 |
Log grok patterns
Grok Pattern | Description | Maximum pattern limit |
---|---|---|
LOGLEVEL | Matches standard log levels in different capitalizations and abbreviations, including the following: Alert/ALERT, Trace/TRACE, Debug/DEBUG, Notice/NOTICE, Info/INFO, Warn/Warning/WARN/WARNING, Err/Error/ERR/ERROR, Crit/Critical/CRIT/CRITICAL, Fatal/FATAL, Severe/SEVERE, Emerg/Emergency/EMERG/EMERGENCY | 20 |
HTTPDATE | Matches the date and time format often used in log files. Format: (d)d/MonthName/(yy)yy:(H)H:mm:(s)s Timezone. MonthName matches full or abbreviated English month names (for example, "Jan" or "January"); Timezone matches the %{INT} grok pattern | 20 |
SYSLOGTIMESTAMP | Matches a date format with MonthName (d)d (H)H:mm:(s)s. MonthName matches full or abbreviated English month names (for example, "Jan" or "January") | 20 |
PROG | Matches a program name consisting of a string of letters, digits, dots, underscores, forward slashes, percent signs, and hyphens | 20 |
SYSLOGPROG | Matches the PROG grok pattern optionally followed by a process ID in square brackets | 20 |
SYSLOGHOST | Matches either a %{HOST} or %{IP} pattern | 5 |
SYSLOGFACILITY | Matches syslog priority in decimal format. The value should be enclosed in angle brackets (<>) | 20 |
Common log grok patterns
You can use pre-defined custom grok patterns to match Apache, NGINX, and Syslog Protocol (RFC 5424) log formats. When you use these specific patterns, they must be the first patterns in your matching configuration, and no other patterns can precede them. Also, you can follow them only with exactly one DATA, GREEDYDATA, or GREEDYDATA_MULTILINE pattern.
Grok pattern | Description | Maximum pattern limit |
---|---|---|
APACHE_ACCESS_LOG | Matches Apache access logs | 1 |
NGINX_ACCESS_LOG | Matches NGINX access logs | 1 |
SYSLOG5424 | Matches Syslog Protocol (RFC 5424) logs | 1 |
The following shows valid and invalid examples for using these common log format patterns.
"%{NGINX_ACCESS_LOG} %{DATA}" // Valid "%{SYSLOG5424}%{DATA:logMsg}" // Valid "%{APACHE_ACCESS_LOG} %{GREEDYDATA:logMsg}" // Valid "%{APACHE_ACCESS_LOG} %{SYSLOG5424}" // Invalid (multiple common log patterns used) "%{NGINX_ACCESS_LOG} %{NUMBER:num}" // Invalid (Only GREEDYDATA and DATA patterns are supported with common log patterns) "%{GREEDYDATA:logMsg} %{SYSLOG5424}" // Invalid (GREEDYDATA and DATA patterns are supported only after common log patterns)
Common log format examples
Apache log example
Sample log:
127.0.0.1 - - [03/Aug/2023:12:34:56 +0000] "GET /page.html HTTP/1.1" 200 1234
Transformer:
[ { "grok": { "match": "%{APACHE_ACCESS_LOG}" } } ]
Output:
{ "request": "/page.html", "http_method": "GET", "status_code": 200, "http_version": "1.1", "response_size": 1234, "remote_host": "127.0.0.1", "timestamp": "2023-08-03T12:34:56Z" }
NGINX log example
Sample log:
192.168.1.100 - Foo [03/Aug/2023:12:34:56 +0000] "GET /account/login.html HTTP/1.1" 200 42 "https://www.amazon.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.131 Safari/537.36"
Transformer:
[ { "grok": { "match": "%{NGINX_ACCESS_LOG}" } } ]
Output:
{ "request": "/account/login.html", "referrer": "https://www.amazon.com/", "agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.131 Safari/537.36", "http_method": "GET", "status_code": 200, "auth_user": "Foo", "http_version": "1.1", "response_size": 42, "remote_host": "192.168.1.100", "timestamp": "2023-08-03T12:34:56Z" }
Syslog Protocol (RFC 5424) log example
Sample log:
<165>1 2003-10-11T22:14:15.003Z mymachine.example.com evntslog - ID47 [exampleSDID@32473 iut="3" eventSource= "Application" eventID="1011"][examplePriority@32473 class="high"]
Transformer:
[ { "grok": { "match": "%{SYSLOG5424}" } } ]
Output:
{ "pri": 165, "version": 1, "timestamp": "2003-10-11T22:14:15.003Z", "hostname": "mymachine.example.com", "app": "evntslog", "msg_id": "ID47", "structured_data": "exampleSDID@32473 iut=\"3\" eventSource= \"Application\" eventID=\"1011\"", "message": "[examplePriority@32473 class=\"high\"]" }
csv
The csv processor parses comma-separated values (CSV) from the log events into columns.
Field | Description | Required? | Default | Limits |
---|---|---|---|---|
source | Path to the field in the log event that will be parsed | No |  | Maximum length: 128; Maximum nested key depth: 3 |
delimiter | The character used to separate each column in the original comma-separated value log event | No |  | Maximum length: 1 unless the value is \t |
quoteCharacter | Character used as a text qualifier for a single column of data | No |  | Maximum length: 1 |
columns | List of names to use for the columns in the transformed log event | No |  | Maximum CSV columns: 100; Maximum length: 128; Maximum nested key depth: 3 |
Setting delimiter to \t will separate each column on a tab character. To separate columns on a single space character, use a space as the delimiter.
Example
Suppose part of an ingested log event looks like this:
'Akua Mansa':28:'New York: USA'
Suppose we use only the csv processor:
[ { "csv": { "delimiter": ":", "quoteCharacter": "'" } } ]
The transformed log event would be the following.
{ "column_1": "Akua Mansa", "column_2": "28", "column_3": "New York: USA" }
parseKeyValue
Use the parseKeyValue processor to parse a specified field into key-value pairs. You can customize the processor to parse field information with the following options.
Field | Description | Required? | Default | Limits |
---|---|---|---|---|
source | Path to the field in the log event that will be parsed | No |  | Maximum length: 128; Maximum nested key depth: 3 |
destination | The destination field to put the extracted key-value pairs into | No |  | Maximum length: 128 |
fieldDelimiter | The field delimiter string that is used between key-value pairs in the original log events | No |  | Maximum length: 128 |
keyValueDelimiter | The delimiter string to use between the key and value in each pair in the transformed log event | No |  | Maximum length: 128 |
nonMatchValue | A value to insert into the value field in the result, when a key-value pair is not successfully split | No |  | Maximum length: 128 |
keyPrefix | If you want to add a prefix to all transformed keys, specify it here | No |  | Maximum length: 128 |
overwriteIfExists | Whether to overwrite the value if the destination key already exists | No |  |  |
Example
Take the following example log event:
key1:value1!key2:value2!key3:value3!key4
Suppose we use the following processor configuration:
[ { "parseKeyValue": { "destination": "new_key", "fieldDelimiter": "!", "keyValueDelimiter": ":", "nonMatchValue": "defaultValue", "keyPrefix": "parsed_" } } ]
The transformed log event would be the following.
{ "new_key": { "parsed_key1": "value1", "parsed_key2": "value2", "parsed_key3": "value3", "parsed_key4": "defaultValue" } }
Built-in processors for Amazon vended logs
parseWAF
Use this processor to parse Amazon WAF vended logs. It takes the contents of
httpRequest.headers
and creates JSON keys from each header
name, with the corresponding value. It also does the same for
labels
. These transformations can make querying Amazon WAF logs much
easier. For more information about Amazon WAF log format, see
Log examples for web ACL traffic.
This processor accepts only @message
as the input.
Important
If you use this processor, it must be the first processor in your transformer.
Example
Take the following example log event:
{ "timestamp": 1576280412771, "formatVersion": 1, "webaclId": "arn:aws:wafv2:ap-southeast-2:111122223333:regional/webacl/STMTest/1EXAMPLE-2ARN-3ARN-4ARN-123456EXAMPLE", "terminatingRuleId": "STMTest_SQLi_XSS", "terminatingRuleType": "REGULAR", "action": "BLOCK", "terminatingRuleMatchDetails": [ { "conditionType": "SQL_INJECTION", "sensitivityLevel": "HIGH", "location": "HEADER", "matchedData": ["10", "AND", "1"] } ], "httpSourceName": "-", "httpSourceId": "-", "ruleGroupList": [], "rateBasedRuleList": [], "nonTerminatingMatchingRules": [], "httpRequest": { "clientIp": "1.1.1.1", "country": "AU", "headers": [ { "name": "Host", "value": "localhost:1989" }, { "name": "User-Agent", "value": "curl/7.61.1" }, { "name": "Accept", "value": "*/*" }, { "name": "x-stm-test", "value": "10 AND 1=1" } ], "uri": "/myUri", "args": "", "httpVersion": "HTTP/1.1", "httpMethod": "GET", "requestId": "rid" }, "labels": [{ "name": "value" }] }
The processor configuration is this:
[ { "parseWAF": {} } ]
The transformed log event would be the following.
{ "httpRequest": { "headers": { "Host": "localhost:1989", "User-Agent": "curl/7.61.1", "Accept": "*/*", "x-stm-test": "10 AND 1=1" }, "clientIp": "1.1.1.1", "country": "AU", "uri": "/myUri", "args": "", "httpVersion": "HTTP/1.1", "httpMethod": "GET", "requestId": "rid" }, "labels": { "name": "value" }, "timestamp": 1576280412771, "formatVersion": 1, "webaclId": "arn:aws:wafv2:ap-southeast-2:111122223333:regional/webacl/STMTest/1EXAMPLE-2ARN-3ARN-4ARN-123456EXAMPLE", "terminatingRuleId": "STMTest_SQLi_XSS", "terminatingRuleType": "REGULAR", "action": "BLOCK", "terminatingRuleMatchDetails": [ { "conditionType": "SQL_INJECTION", "sensitivityLevel": "HIGH", "location": "HEADER", "matchedData": ["10", "AND", "1"] } ], "httpSourceName": "-", "httpSourceId": "-", "ruleGroupList": [], "rateBasedRuleList": [], "nonTerminatingMatchingRules": [] }
parsePostgres
Use this processor to parse Amazon RDS for PostgreSQL vended logs, extract fields, and convert them to JSON format. For more information about RDS for PostgreSQL log format, see RDS for PostgreSQL database log files.
This processor accepts only @message
as the input.
Important
If you use this processor, it must be the first processor in your transformer.
Example
Take the following example log event:
2019-03-10 03:54:59 UTC:10.0.0.123(52834):postgres@logtestdb:[20175]:ERROR: column "wrong_column_name" does not exist at character 8
The processor configuration is this:
[ { "parsePostgres": {} } ]
The transformed log event would be the following.
{ "logTime": "2019-03-10 03:54:59 UTC", "srcIp": "10.0.0.123(52834)", "userName": "postgres", "dbName": "logtestdb", "processId": "20175", "logLevel": "ERROR" }
parseCloudfront
Use this processor to parse Amazon CloudFront vended logs, extract fields, and convert them into JSON format. Encoded field values are decoded. Values that are integers and doubles are treated as such. For more information about Amazon CloudFront log format, see Configure and use standard logs (access logs).
This processor accepts only @message
as the input.
Important
If you use this processor, it must be the first processor in your transformer.
Example
Take the following example log event:
2019-12-04 21:02:31 LAX1 392 192.0.2.24 GET d111111abcdef8.cloudfront.net /index.html 200 - Mozilla/5.0%20(Windows%20NT%2010.0;%20Win64;%20x64)%20AppleWebKit/537.36%20(KHTML,%20like%20Gecko)%20Chrome/78.0.3904.108%20Safari/537.36 - - Hit SOX4xwn4XV6Q4rgb7XiVGOHms_BGlTAC4KyHmureZmBNrjGdRLiNIQ== d111111abcdef8.cloudfront.net https 23 0.001 - TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 Hit HTTP/2.0 - - 11040 0.001 Hit text/html 78 - -
The processor configuration is this:
[ { "parseCloudfront": {} } ]
The transformed log event would be the following.
{ "date": "2019-12-04", "time": "21:02:31", "x-edge-location": "LAX1", "sc-bytes": 392, "c-ip": "192.0.2.24", "cs-method": "GET", "cs(Host)": "d111111abcdef8.cloudfront.net", "cs-uri-stem": "/index.html", "sc-status": 200, "cs(Referer)": "-", "cs(User-Agent)": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36", "cs-uri-query": "-", "cs(Cookie)": "-", "x-edge-result-type": "Hit", "x-edge-request-id": "SOX4xwn4XV6Q4rgb7XiVGOHms_BGlTAC4KyHmureZmBNrjGdRLiNIQ==", "x-host-header": "d111111abcdef8.cloudfront.net", "cs-protocol": "https", "cs-bytes": 23, "time-taken": 0.001, "x-forwarded-for": "-", "ssl-protocol": "TLSv1.2", "ssl-cipher": "ECDHE-RSA-AES128-GCM-SHA256", "x-edge-response-result-type": "Hit", "cs-protocol-version": "HTTP/2.0", "fle-status": "-", "fle-encrypted-fields": "-", "c-port": 11040, "time-to-first-byte": 0.001, "x-edge-detailed-result-type": "Hit", "sc-content-type": "text/html", "sc-content-len": 78, "sc-range-start": "-", "sc-range-end": "-" }
parseRoute53
Use this processor to parse Amazon Route 53 Public Data Plane vended logs, extract fields, and convert them into JSON format. Encoded field values are decoded. This processor does not support Amazon Route 53 Resolver logs.
This processor accepts only @message
as the input.
Important
If you use this processor, it must be the first processor in your transformer.
Example
Take the following example log event:
1.0 2017-12-13T08:15:50.235Z Z123412341234 example.com AAAA NOERROR TCP IAD12 192.0.2.0 198.51.100.0/24
The processor configuration is this:
[ { "parseRoute53": {} } ]
The transformed log event would be the following.
{ "version": 1.0, "queryTimestamp": "2017-12-13T08:15:50.235Z", "hostZoneId": "Z123412341234", "queryName": "example.com", "queryType": "AAAA", "responseCode": "NOERROR", "protocol": "TCP", "edgeLocation": "IAD12", "resolverIp": "192.0.2.0", "ednsClientSubnet": "198.51.100.0/24" }
parseVPC
Use this processor to parse Amazon VPC vended logs, extract fields, and convert them into JSON format. Encoded field values are decoded.
Important
The parseVPC
processor works only on logs with the default
Amazon VPC flow log format. It does not work on custom VPC flow logs.
This processor accepts only @message
as the input.
Important
If you use this processor, it must be the first processor in your transformer.
Example
Take the following example log event:
2 123456789010 eni-abc123de 192.0.2.0 192.0.2.24 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK
The processor configuration is this:
[ { "parseVPC": {} } ]
The transformed log event would be the following.
{ "version": 2, "accountId": "123456789010", "interfaceId": "eni-abc123de", "srcAddr": "192.0.2.0", "dstAddr": "192.0.2.24", "srcPort": 20641, "dstPort": 22, "protocol": 6, "packets": 20, "bytes": 4249, "start": 1418530010, "end": 1418530070, "action": "ACCEPT", "logStatus": "OK" }
String mutate processors
lowerCaseString
The lowerCaseString
processor converts a string to its lowercase
version.
Field | Description | Required? | Default | Limits |
---|---|---|---|---|
withKeys | A list of keys to convert to lowercase | Yes |  | Maximum entries: 10 |
Example
Take the following example log event:
{ "outer_key": { "inner_key": "INNER_VALUE" } }
The transformer configuration is this, using lowerCaseString
with
parseJSON
:
[ { "parseJSON": {} }, { "lowerCaseString": { "withKeys":["outer_key.inner_key"] } } ]
The transformed log event would be the following.
{ "outer_key": { "inner_key": "inner_value" } }
upperCaseString
The upperCaseString
processor converts a string to its uppercase
version.
Field | Description | Required? | Default | Limits |
---|---|---|---|---|
withKeys | A list of keys to convert to uppercase | Yes |  | Maximum entries: 10 |
Example
Take the following example log event:
{ "outer_key": { "inner_key": "inner_value" } }
The transformer configuration is this, using upperCaseString
with
parseJSON
:
[ { "parseJSON": {} }, { "upperCaseString": { "withKeys":["outer_key.inner_key"] } } ]
The transformed log event would be the following.
{ "outer_key": { "inner_key": "INNER_VALUE" } }
splitString
The splitString
processor is a type of string mutate processor
which splits a field into an array using a delimiting character.
Field | Description | Required? | Default | Limits |
---|---|---|---|---|
entries | Array of entries. Each item in the array must contain source and delimiter fields. | Yes |  | Maximum entries: 10 |
source | The key of the field value to split | Yes |  | Maximum length: 128 |
delimiter | The delimiter string to split the field value on | Yes |  | Maximum length: 128 |
Example 1
Take the following example log event:
[ { "parseJSON": {} }, { "splitString": { "entries": [ { "source": "outer_key.inner_key", "delimiter": "_" } ] } } ]
The transformer configuration is this, using splitString
with
parseJSON
:
[ { "parseJSON": {} }, { "splitString": { "entries": [ { "source": "outer_key.inner_key", "delimiter": "_" } ] } } ]
The transformed log event would be the following.
{ "outer_key": { "inner_key": [ "inner", "value" ] } }
Example 2
The delimiter to split the string on can be multiple characters long.
Take the following example log event:
{ "outer_key": { "inner_key": "item1, item2, item3" } }
The transformer configuration is as follows:
[ { "parseJSON": {} }, { "splitString": { "entries": [ { "source": "outer_key.inner_key", "delimiter": ", " } ] } } ]
The transformed log event would be the following.
{ "outer_key": { "inner_key": [ "item1", "item2", "item3" ] } }
substituteString
The substituteString
processor is a type of string mutate
processor which matches a key’s value against a regular expression and replaces
all matches with a replacement string.
Field | Description | Required? | Default | Limits |
---|---|---|---|---|
entries | Array of entries. Each item in the array must contain source, from, and to fields. | Yes |  | Maximum entries: 10 |
source | The key of the field to modify | Yes |  | Maximum length: 128; Maximum nested key depth: 3 |
from | The regular expression string to be replaced. Special regex characters such as [ and ] must be escaped using \\ when using double quotes and with \ when using single quotes or when configured from the Amazon Web Services Management Console. For more information, see Class Pattern. You can wrap a pattern in | Yes |  | Maximum length: 128 |
to | The string to be substituted for each match of from. Backreferences to capturing groups can be used. Use the form $n for numbered groups, such as $1, and use ${group_name} for named groups, such as ${my_group}. | Yes |  | Maximum length: 128; Maximum number of backreferences: 10; Maximum number of duplicate backreferences: 2 |
Example 1
Take the following example log event:
{ "outer_key": { "inner_key1": "[]", "inner_key2": "123-345-567", "inner_key3": "A cat takes a catnap." } }
The transformer configuration is this, using substituteString
with parseJSON
:
[ { "parseJSON": {} }, { "substituteString": { "entries": [ { "source": "outer_key.inner_key1", "from": "\\[\\]", "to": "value1" }, { "source": "outer_key.inner_key2", "from": "[0-9]{3}-[0-9]{3}-[0-9]{3}", "to": "xxx-xxx-xxx" }, { "source": "outer_key.inner_key3", "from": "cat", "to": "dog" } ] } } ]
The transformed log event would be the following.
{ "outer_key": { "inner_key1": "value1", "inner_key2": "xxx-xxx-xxx", "inner_key3": "A dog takes a dognap." } }
Example 2
Take the following example log event:
{ "outer_key": { "inner_key1": "Tom, Dick, and Harry", "inner_key2": "arn:aws:sts::123456789012:assumed-role/MyImportantRole/MySession" } }
The transformer configuration is this, using substituteString
with parseJSON
:
[ { "parseJSON": {} }, { "substituteString": { "entries": [ { "source": "outer_key.inner_key1", "from": "(\w+), (\w+), and (\w+)", "to": "$1 and $3" }, { "source": "outer_key.inner_key2", "from": "^arn:aws:sts::(?P<account_id>\\d{12}):assumed-role/(?P<role_name>[\\w+=,.@-]+)/(?P<role_session_name>[\\w+=,.@-]+)$", "to": "${account_id}:${role_name}:${role_session_name}" } ] } } ]
The transformed log event would be the following.
{ "outer_key": { "inner_key1": "Tom and Harry", "inner_key2": "123456789012:MyImportantRole:MySession" } }
trimString
The trimString
processor removes whitespace from the beginning
and end of a key.
Field | Description | Required? | Default | Limits |
---|---|---|---|---|
withKeys | A list of keys to trim | Yes |  | Maximum entries: 10 |
Example
Take the following example log event:
{ "outer_key": { "inner_key": " inner_value " } }
The transformer configuration is this, using trimString
with
parseJSON
:
[ { "parseJSON": {} }, { "trimString": { "withKeys":["outer_key.inner_key"] } } ]
The transformed log event would be the following.
{ "outer_key": { "inner_key": "inner_value" } }
JSON mutate processors
addKeys
Use the addKeys
processor to add new key-value pairs to the log
event.
Field | Description | Required? | Default | Limits |
---|---|---|---|---|
entries | Array of entries. Each item in the array can contain key, value, and overwriteIfExists fields. | Yes |  | Maximum entries: 5 |
key | The key of the new entry to be added | Yes |  | Maximum length: 128; Maximum nested key depth: 3 |
value | The value of the new entry to be added | Yes |  | Maximum length: 256 |
overwriteIfExists | If you set this to true, the existing value is overwritten if key already exists in the event. The default value is false. | No | false | No limit |
Example
Take the following example log event:
{ "outer_key": { "inner_key": "inner_value" } }
The transformer configuration is this, using addKeys
with
parseJSON
:
[ { "parseJSON": {} }, { "addKeys": { "entries": [ { "source": "outer_key.new_key", "value": "new_value" } ] } } ]
The transformed log event would be the following.
{ "outer_key": { "inner_key": "inner_value", "new_key": "new_value" } }
deleteKeys
Use the deleteKeys
processor to delete fields from a log event.
These fields can include key-value pairs.
Field | Description | Required? | Default | Limits |
---|---|---|---|---|
withKeys | The list of keys to delete. | Yes | No limit | Maximum entries: 5 |
Example
Take the following example log event:
{ "outer_key": { "inner_key": "inner_value" } }
The transformer configuration is this, using deleteKeys
with
parseJSON
:
[ { "parseJSON": {} }, { "deleteKeys": { "withKeys":["outer_key.inner_key"] } } ]
The transformed log event would be the following.
{ "outer_key": {} }
moveKeys
Use the moveKeys
processor to move a key from one field to
another.
Field | Description | Required? | Default | Limits |
---|---|---|---|---|
entries | Array of entries. Each item in the array can contain source, target, and overwriteIfExists fields. | Yes |  | Maximum entries: 5 |
source | The key to move | Yes |  | Maximum length: 128; Maximum nested key depth: 3 |
target | The key to move to | Yes |  | Maximum length: 128; Maximum nested key depth: 3 |
overwriteIfExists | If you set this to true, the existing value is overwritten if key already exists in the event. The default value is false. | No | false | No limit |
Example
Take the following example log event:
{ "outer_key1": { "inner_key1": "inner_value1" }, "outer_key2": { "inner_key2": "inner_value2" } }
The transformer configuration is this, using moveKeys
with
parseJSON
:
[ { "parseJSON": {} }, { "moveKeys": { "entries": [ { "source": "outer_key1.inner_key1", "target": "outer_key2" } ] } } ]
The transformed log event would be the following.
{ "outer_key1": {}, "outer_key2": { "inner_key2": "inner_value2", "inner_key1": "inner_value1" } }
renameKeys
Use the renameKeys
processor to rename keys in a log event.
Field | Description | Required? | Default | Limits |
---|---|---|---|---|
entries | Array of entries. Each item in the array can contain key, target, and overwriteIfExists fields. | Yes | No limit | Maximum entries: 5 |
key | The key to rename | Yes | No limit | Maximum length: 128 |
target | The new key name | Yes | No limit | Maximum length: 128; Maximum nested key depth: 3 |
overwriteIfExists | If you set this to true, the existing value is overwritten if key already exists in the event. The default value is false. | No | false | No limit |
Example
Take the following example log event:
{ "outer_key": { "inner_key": "inner_value" } }
The transformer configuration is this, using renameKeys
with
parseJSON
:
[ { "parseJSON": {} }, { "renameKeys": { "entries": [ { "key": "outer_key", "target": "new_key" } ] } } ]
The transformed log event would be the following.
{ "new_key": { "inner_key": "inner_value" } }
copyValue
Use the copyValue
processor to copy values within a log event.
You can also use this processor to add metadata to log events, by copying the
values of the following metadata keys into the log events:
@logGroupName
, @logGroupStream
,
@accountId
, @regionName
. This is illustrated in
the following example.
Field | Description | Required? | Default | Limits |
---|---|---|---|---|
entries | Array of entries. Each item in the array can contain source, target, and overwriteIfExists fields. | Yes |  | Maximum entries: 5 |
source | The key to copy | Yes |  | Maximum length: 128; Maximum nested key depth: 3 |
target | The key to copy the value to | Yes | No limit | Maximum length: 128; Maximum nested key depth: 3 |
overwriteIfExists | If you set this to true, the existing value is overwritten if key already exists in the event. The default value is false. | No | false | No limit |
Example
Take the following example log event:
{ "outer_key": { "inner_key": "inner_value" } }
The transformer configuration is this, using copyValue
with
parseJSON
:
[ { "parseJSON": {} }, { "copyValue": { "entries": [ { "source": "outer_key.new_key", "target": "new_key" }, { "source": "@logGroupName", "target": "log_group_name" }, { "source": "@logGroupStream", "target": "log_group_stream" }, { "source": "@accountId", "target": "account_id" }, { "source": "@regionName", "target": "region_name" } ] } } ]
The transformed log event would be the following.
{ "outer_key": { "inner_key": "inner_value" }, "new_key": "inner_value", "log_group_name": "myLogGroupName", "log_group_stream": "myLogStreamName", "account_id": "012345678912", "region_name": "us-east-1" }
listToMap
The listToMap
processor takes a list of objects that contain key
fields, and converts them into a map of target keys.
Field | Description | Required? | Default | Limits |
---|---|---|---|---|
source | The key in the ProcessingEvent with a list of objects that will be converted to a map | Yes |  | Maximum length: 128; Maximum nested key depth: 3 |
key | The key of the fields to be extracted as keys in the generated map | Yes |  | Maximum length: 128 |
valueKey | If this is specified, the values that you specify in this parameter will be extracted from the source objects and put into the values of the generated map. Otherwise, original objects in the source list will be put into the values of the generated map. | No |  | Maximum length: 128 |
target | The key of the field that will hold the generated map | No | Root node | Maximum length: 128; Maximum nested key depth: 3 |
flatten | A Boolean value to indicate whether the list will be flattened into single items or if the values in the generated map will be lists. By default the values for the matching keys will be represented in an array. Set flatten to true to make these values single items. | No | false |  |
flattenedElement | If you set flatten to true, use flattenedElement to specify which element, first or last, to keep. | Required when flatten is set to true |  | Value can only be first or last |
Example
Take the following example log event:
{ "outer_key": [ { "inner_key": "a", "inner_value": "val-a" }, { "inner_key": "b", "inner_value": "val-b1" }, { "inner_key": "b", "inner_value": "val-b2" }, { "inner_key": "c", "inner_value": "val-c" } ] }
Transformer for use case 1:
flatten
is false
[ { "parseJSON": {} }, { "listToMap": { "source": "outer_key" "key": "inner_key", "valueKey": "inner_value", "flatten": false } } ]
The transformed log event would be the following.
{ "outer_key": [ { "inner_key": "a", "inner_value": "val-a" }, { "inner_key": "b", "inner_value": "val-b1" }, { "inner_key": "b", "inner_value": "val-b2" }, { "inner_key": "c", "inner_value": "val-c" } ], "a": [ "val-a" ], "b": [ "val-b1", "val-b2" ], "c": [ "val-c" ] }
Transformer for use case 2:
flatten
is true
and flattenedElement
is
first
[ { "parseJSON": {} }, { "listToMap": { "source": "outer_key" "key": "inner_key", "valueKey": "inner_value", "flatten": true, "flattenedElement": "first" } } ]
The transformed log event would be the following.
{ "outer_key": [ { "inner_key": "a", "inner_value": "val-a" }, { "inner_key": "b", "inner_value": "val-b1" }, { "inner_key": "b", "inner_value": "val-b2" }, { "inner_key": "c", "inner_value": "val-c" } ], "a": "val-a", "b": "val-b1", "c": "val-c" }
Transformer for use case 3:
flatten
is true
and flattenedElement
is
last
[ { "parseJSON": {} }, { "listToMap": { "source": "outer_key" "key": "inner_key", "valueKey": "inner_value", "flatten": true, "flattenedElement": "last" } } ]
The transformed log event would be the following.
{ "outer_key": [ { "inner_key": "a", "inner_value": "val-a" }, { "inner_key": "b", "inner_value": "val-b1" }, { "inner_key": "b", "inner_value": "val-b2" }, { "inner_key": "c", "inner_value": "val-c" } ], "a": "val-a", "b": "val-b2", "c": "val-c" }
Datatype converter processors
typeConverter
Use the typeConverter
processor to convert a value type
associated with the specified key to the specified type. It's a casting
processor that changes the types of the specified fields. Values can be
converted into one of the following datatypes: integer
,
double
, string
and boolean
.
Field | Description | Required? | Default | Limits |
---|---|---|---|---|
entries | Array of entries. Each item in the array must contain key and type fields. | Yes |  | Maximum entries: 10 |
key | The key with the value that is to be converted to a different type | Yes |  | Maximum length: 128; Maximum nested key depth: 3 |
type | The type to convert to. Valid values are integer, double, string, and boolean. | Yes |  |  |
Example
Take the following example log event:
{ "name": "value", "status": "200" }
The transformer configuration is this, using typeConverter
with
parseJSON
:
[ { "parseJSON": {} }, { "typeConverter": { "entries": [ { "key": "status", "type": "integer" } ] } } ]
The transformed log event would be the following.
{ "name": "value", "status": 200 }
datetimeConverter
Use the datetimeConverter
processor to convert a datetime string
into a format that you specify.
Field | Description | Required? | Default | Limits |
---|---|---|---|---|
source | The key to apply the date conversion to. | Yes |  | Maximum entries: 10 |
matchPatterns | A list of patterns to match against the source field | Yes |  | Maximum entries: 5 |
target | The JSON field to store the result in. | Yes |  | Maximum length: 128; Maximum nested key depth: 3 |
targetFormat | The datetime format to use for the converted data in the target field. | No |  | Maximum length: 64 |
sourceTimezone | The time zone of the source field. For a list of possible values, see Java Supported Zone Ids and Offsets. | No | UTC | Minimum length: 1 |
targetTimezone | The time zone of the target field. For a list of possible values, see Java Supported Zone Ids and Offsets. | No | UTC | Minimum length: 1 |
locale | The locale of the source field. For a list of possible values, see Locale getAvailableLocales() Method in Java with Examples. | Yes |  | Minimum length: 1 |
Example
Take the following example log event:
{"german_datetime": "Samstag 05. Dezember 1998 11:00:00"}
The transformer configuration is this, using dateTimeConverter
with parseJSON
:
[ { "parseJSON": {} }, { "dateTimeConverter": { "source": "german_datetime", "target": "target_1", "locale": "de", "matchPatterns": ["EEEE dd. MMMM yyyy HH:mm:ss"], "sourceTimezone": "Europe/Berlin", "targetTimezone": "America/New_York", "targetFormat": "yyyy-MM-dd'T'HH:mm:ss z" } } ]
The transformed log event would be the following.
{ "german_datetime": "Samstag 05. Dezember 1998 11:00:00", "target_1": "1998-12-05T17:00:00 MEZ" }