TimestampConverter
SMT that allows the user to specify the format of the timestamp inserted as a header. It also avoids the synchronization block requirement for converting to a string representation of the timestamp.
An adapted version of the TimestampConverter SMT. The SMT adds a few more features to the original:
- allows nested fields resolution (e.g. a.b.c)
- uses a _key or _value prefix to indicate whether the field to convert is part of the record Key or Value
- allows conversion from one string representation to another (e.g. yyyy-MM-dd HH:mm:ss to yyyy-MM-dd)
- allows conversion using a rolling window boundary (e.g. every 15 minutes, or one hour)
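The prefix and nested-path rules in the list above can be sketched in plain Java. This is a minimal illustration and not the SMT's implementation: `FieldPathSketch` and `resolve` are hypothetical names, nested `Map` objects stand in for Kafka Connect structures, and treating an unprefixed or empty path as referring to the record value is an assumption.

```java
import java.util.Map;

final class FieldPathSketch {

    /** Hypothetical helper: resolves a dotted path such as "_value.a.b.c" against a record. */
    static Object resolve(Object key, Object value, String path) {
        if (path.isEmpty()) {
            return value; // assumption: an empty path means the whole value is the timestamp
        }
        String[] parts = path.split("\\.");
        Object current;
        int start;
        if (parts[0].equals("_key")) {
            current = key;
            start = 1;
        } else if (parts[0].equals("_value")) {
            current = value;
            start = 1;
        } else {
            // Assumption: a path without a prefix is looked up in the record value.
            current = value;
            start = 0;
        }
        for (int i = start; i < parts.length; i++) {
            if (!(current instanceof Map)) {
                return null; // the path goes deeper than the data does
            }
            current = ((Map<?, ?>) current).get(parts[i]);
        }
        return current;
    }

    public static void main(String[] args) {
        Object value = Map.of("a", Map.of("b", Map.of("c", "2024-03-15 13:47:09")));
        Object key = Map.of("id", 42);
        System.out.println(resolve(key, value, "_value.a.b.c")); // nested field of the value
        System.out.println(resolve(key, value, "_key.id"));      // field of the key
    }
}
```

A path that cannot be followed returns `null` in this sketch; how the SMT itself reports a missing field is not covered here.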
Transform Type Class
Configuration
Name | Description | Type | Default | Valid Values |
---|---|---|---|---|
| The name of the header to insert the timestamp into. | String | ||
| The field path containing the timestamp, or empty if the entire value is a timestamp. Prefix the path with the literal string _key or _value to indicate whether the field belongs to the record Key or Value. | String | ||
| Sets the desired timestamp representation. | String | | string, unix, date, time, timestamp |
format.from.pattern | Sets the format of the timestamp when the input is a string. The format requires a Java DateTimeFormatter-compatible pattern. Multiple (fallback) patterns can be added, comma-separated. | String | ||
| Sets the format of the timestamp when the output is a string. The format requires a Java DateTimeFormatter-compatible pattern. | String | ||
| An optional parameter for the rolling time window type. When set it will adjust the output value according to the time window boundary. | String | none | none, hours, minutes, seconds |
| An optional positive integer parameter for the rolling time window size. When | Int | 15 | |
| The desired Unix precision for the timestamp. Used to generate the output when type=unix or used to parse the input if the input is a Long. This SMT will cause precision loss during conversions from, and to, values with sub-millisecond components. | String | milliseconds | seconds, milliseconds, microseconds, nanoseconds |
| Sets the timezone. It can be any valid Java timezone. Overwrite it when | String | UTC | |
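To make the rolling window and timezone options more concrete, here is a minimal sketch of flooring a timestamp to a window boundary with `java.time`. It assumes that a window starts at a multiple of the window size within the enclosing hour, minute, or day, evaluated in the configured timezone. `RollingWindowSketch` and `floorToWindow` are hypothetical names, and the SMT's own arithmetic may differ.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

final class RollingWindowSketch {

    /** Floors ts to the start of its window, evaluated in the given time zone. */
    static ZonedDateTime floorToWindow(Instant ts, ZoneId zone, ChronoUnit unit, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("window size must be positive");
        }
        ZonedDateTime local = ts.atZone(zone);
        int component;
        switch (unit) {
            case HOURS:   component = local.getHour();   break;
            case MINUTES: component = local.getMinute(); break;
            case SECONDS: component = local.getSecond(); break;
            default: throw new IllegalArgumentException("unsupported window unit: " + unit);
        }
        // Drop everything below the unit, then step back to the start of the window.
        return local.truncatedTo(unit).minus(component % size, unit);
    }

    public static void main(String[] args) {
        DateTimeFormatter out = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
        Instant ts = Instant.parse("2024-03-15T13:47:09Z");
        // 15-minute window in UTC: prints 2024-03-15 13:45:00
        System.out.println(floorToWindow(ts, ZoneId.of("UTC"), ChronoUnit.MINUTES, 15).format(out));
        // Hourly window in a +05:30 zone: prints 2024-03-15 19:00:00
        System.out.println(floorToWindow(ts, ZoneId.of("Asia/Kolkata"), ChronoUnit.HOURS, 1).format(out));
    }
}
```

The second call shows why the timezone matters: in a zone with a half-hour offset, the hourly boundary falls at 13:30 UTC rather than 13:00 UTC.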
Example
To convert to and from a string representation of the date and time in the format yyyy-MM-dd HH:mm:ss.SSS, use the following configuration:
To convert to and from a string representation while applying an hourly rolling window:
To convert to and from a string representation while applying an hourly rolling window and timezone:
To convert to and from a string representation while applying a 15-minute rolling window:
To convert to and from a Unix timestamp, use the following:
Here is an example using the record timestamp field:
Configuration for format.from.pattern
Configuring multiple format.from.pattern items requires careful thought as to ordering and may indicate that your Kafka topics or data processing techniques are not aligning with best practices. Ideally, each topic should have a single, consistent message format to ensure data integrity and simplify processing.
Multiple Patterns Support
The format.from.pattern field supports multiple DateTimeFormatter patterns in a comma-separated list to handle various timestamp formats. Patterns containing commas should be enclosed in double quotes. For example:
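As a rough illustration of the quoting rule, the sketch below splits a comma-separated pattern list while keeping commas inside double quotes. `PatternListSketch` and `splitPatterns` are hypothetical names, the sample value is made up for this example, and the SMT's actual parsing of the setting may differ.

```java
import java.util.ArrayList;
import java.util.List;

final class PatternListSketch {

    /** Splits on commas, except for commas that appear inside double quotes. */
    static List<String> splitPatterns(String raw) {
        List<String> patterns = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes; // quotes only delimit; they are not part of the pattern
            } else if (c == ',' && !inQuotes) {
                addIfNotBlank(patterns, current);
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        addIfNotBlank(patterns, current);
        return patterns;
    }

    private static void addIfNotBlank(List<String> out, StringBuilder sb) {
        String pattern = sb.toString().trim();
        if (!pattern.isEmpty()) {
            out.add(pattern);
        }
    }

    public static void main(String[] args) {
        // Illustrative value: the middle pattern contains a comma, so it is quoted.
        String raw = "yyyy-MM-dd'T'HH:mm:ss,\"MMM d, yyyy\",yyyy-MM-dd";
        splitPatterns(raw).forEach(System.out::println);
    }
}
```

The single quotes around `T` are DateTimeFormatter's own literal escaping and pass through untouched; only double quotes are treated as list-level delimiters here.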
Best Practices
While this flexibility can be useful, it is generally not recommended due to potential complexity and inconsistency. Ideally, a topic should have a single message format to align with Kafka best practices, ensuring consistency and simplifying data processing.
Configuration Order
The order of patterns in format.from.pattern matters. Less granular formats should follow more specific ones to avoid data loss. For example, place yyyy-MM-dd after yyyy-MM-dd'T'HH:mm:ss to ensure detailed timestamp information is preserved.
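The effect of ordering can be reproduced with a small first-match-wins parser. This sketch assumes that a pattern is accepted when it matches the leading part of the input, which is how `DateTimeFormatter.parse(CharSequence, ParsePosition)` behaves. Whether the SMT parses this way is an assumption, and `PatternOrderSketch` and `parseFirstMatch` are hypothetical names.

```java
import java.text.ParsePosition;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.temporal.TemporalAccessor;
import java.time.temporal.TemporalQueries;
import java.util.List;

final class PatternOrderSketch {

    /** Tries each pattern in order and returns the result of the first one that matches. */
    static LocalDateTime parseFirstMatch(String text, List<String> patterns) {
        for (String pattern : patterns) {
            DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern);
            try {
                // With a ParsePosition, trailing text after the match is not an error.
                TemporalAccessor parsed = formatter.parse(text, new ParsePosition(0));
                LocalDate date = parsed.query(TemporalQueries.localDate());
                LocalTime time = parsed.query(TemporalQueries.localTime());
                if (date == null) {
                    continue; // this sketch only handles patterns that yield a date
                }
                // A date-only match falls back to midnight: the time of day is lost.
                return date.atTime(time != null ? time : LocalTime.MIDNIGHT);
            } catch (DateTimeParseException e) {
                // This pattern does not fit; try the next one.
            }
        }
        throw new IllegalArgumentException("No pattern matched: " + text);
    }

    public static void main(String[] args) {
        String input = "2024-03-15T13:47:09";
        // Specific pattern first: the time of day is preserved.
        System.out.println(parseFirstMatch(input, List.of("yyyy-MM-dd'T'HH:mm:ss", "yyyy-MM-dd")));
        // Date-only pattern first: it matches the prefix and the time is dropped.
        System.out.println(parseFirstMatch(input, List.of("yyyy-MM-dd", "yyyy-MM-dd'T'HH:mm:ss")));
    }
}
```

With the specific pattern first, a date-only input such as 2024-03-15 still parses, because the first pattern fails and the fallback takes over.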