TimestampConverter
SMT that allows the user to specify the format of the timestamp inserted as a header. It also avoids the synchronization block requirement for converting to a string representation of the timestamp.
An adapted version of the TimestampConverter SMT. The SMT adds a few more features to the original:
- allows nested field resolution (e.g. a.b.c)
- uses _key or _value as a prefix to indicate whether the field to convert is part of the record Key or Value
- allows conversion from one string representation to another (e.g. yyyy-MM-dd HH:mm:ss to yyyy-MM-dd)
- allows conversion using a rolling window boundary (e.g. every 15 minutes, or one hour)
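To make the last two features concrete, the sketch below reformats a timestamp string and rounds a value down to the start of its window using plain java.time calls. It is an illustration only, not the SMT's implementation: the names TimestampFeatureSketch, reformat and floorToWindow are hypothetical, and rounding down to the window start is an assumption about how the boundary is chosen.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

// Hypothetical helper; it illustrates the behaviour, not the SMT's actual code.
final class TimestampFeatureSketch {

    // Parse the input with one pattern and render it with another.
    static String reformat(String input, String fromPattern, String toPattern) {
        LocalDateTime parsed = LocalDateTime.parse(input, DateTimeFormatter.ofPattern(fromPattern));
        return parsed.format(DateTimeFormatter.ofPattern(toPattern));
    }

    // Assumption: a value maps to the start of the window that contains it.
    static LocalDateTime floorToWindow(LocalDateTime ts, ChronoUnit unit, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        switch (unit) {
            case HOURS:
                return ts.truncatedTo(ChronoUnit.HOURS)
                         .withHour(ts.getHour() - ts.getHour() % size);
            case MINUTES:
                return ts.truncatedTo(ChronoUnit.MINUTES)
                         .withMinute(ts.getMinute() - ts.getMinute() % size);
            case SECONDS:
                return ts.truncatedTo(ChronoUnit.SECONDS)
                         .withSecond(ts.getSecond() - ts.getSecond() % size);
            default:
                throw new IllegalArgumentException("unsupported unit: " + unit);
        }
    }

    public static void main(String[] args) {
        // prints 2024-03-05
        System.out.println(reformat("2024-03-05 14:37:52", "yyyy-MM-dd HH:mm:ss", "yyyy-MM-dd"));
        // prints 2024-03-05T14:30
        System.out.println(floorToWindow(LocalDateTime.of(2024, 3, 5, 14, 37, 52), ChronoUnit.MINUTES, 15));
    }
}
```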
Transform Type Class
Configuration
header.name
The name of the header to insert the timestamp into.
Type: String

field
The field path containing the timestamp, or empty if the entire value is a timestamp. Prefix the path with the literal string _key, _value or _timestamp to specify whether the record Key, Value or Timestamp is used as the source. If no prefix is specified, _value is implied.
Type: String

target.type
Sets the desired timestamp representation.
Type: String
Valid values: string, unix, date, time, timestamp

format.from.pattern
Sets the format of the timestamp when the input is a string. The format requires a Java DateTimeFormatter-compatible pattern. Multiple (fallback) patterns can be added, comma-separated.
Type: String

format.to.pattern
Sets the format of the timestamp when the output is a string. The format requires a Java DateTimeFormatter-compatible pattern.
Type: String

rolling.window.type
An optional parameter for the rolling time window type. When set, the output value is adjusted according to the time window boundary.
Type: String
Default: none
Valid values: none, hours, minutes, seconds

rolling.window.size
An optional positive integer parameter for the rolling time window size. This setting is required when rolling.window.type is defined. The value is bounded by the rolling.window.type configuration: if the type is minutes or seconds, the value cannot be larger than 60; if the type is hours, the maximum value is 24.
Type: Int
Default: 15

unix.precision
The desired Unix precision for the timestamp. Used to generate the output when target.type is unix, or to parse the input if the input is a Long. This SMT causes precision loss when converting from or to values with sub-millisecond components.
Type: String
Default: milliseconds
Valid values: seconds, milliseconds, microseconds, nanoseconds

timezone
Sets the timezone. It can be any valid Java timezone. Override it only when target.type is set to date, time, or string; otherwise an exception is raised.
Type: String
Default: UTC
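The precision loss noted under unix.precision can be pictured with a small sketch. It assumes the timestamp is held at millisecond resolution between parsing and output, which is consistent with the note above but is not confirmed by this page. The names UnixPrecisionSketch, fromUnix and toUnix are hypothetical.

```java
import java.time.Instant;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; assumes a millisecond-resolution intermediate value.
final class UnixPrecisionSketch {

    // Interpret a Long input in the given precision; anything below a millisecond is dropped.
    static Instant fromUnix(long value, TimeUnit precision) {
        return Instant.ofEpochMilli(precision.toMillis(value));
    }

    // Render the instant in the given precision, starting from its millisecond value.
    static long toUnix(Instant instant, TimeUnit precision) {
        return precision.convert(instant.toEpochMilli(), TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) {
        long micros = 1_700_000_000_123_456L;
        Instant instant = fromUnix(micros, TimeUnit.MICROSECONDS);
        // prints 1700000000123000: the trailing 456 microseconds are gone
        System.out.println(toUnix(instant, TimeUnit.MICROSECONDS));
    }
}
```

Values expressed in seconds or milliseconds pass through such a round trip unchanged; only microsecond and nanosecond inputs lose digits.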
Example
To convert to and from a string representation of the date and time in the format yyyy-MM-dd HH:mm:ss.SSS, use the following configuration:
To convert to and from a string representation while applying an hourly rolling window:
To convert to and from a string representation while applying an hourly rolling window and timezone:
To convert to and from a string representation while applying a 15-minute rolling window:
To convert to and from a Unix timestamp, use the following:
Here is an example using the record timestamp field:
Configuration for format.from.pattern
Configuring multiple format.from.pattern items requires careful thought as to ordering and may indicate that your Kafka topics or data processing techniques are not aligning with best practices. Ideally, each topic should have a single, consistent message format to ensure data integrity and simplify processing.
Multiple Patterns Support
The format.from.pattern field supports multiple DateTimeFormatter patterns in a comma-separated list to handle various timestamp formats. Patterns containing commas should be enclosed in double quotes. For example:
Best Practices
While this flexibility can be useful, it is generally not recommended due to potential complexity and inconsistency. Ideally, a topic should have a single message format to align with Kafka best practices, ensuring consistency and simplifying data processing.
Configuration Order
The order of patterns in format.from.pattern matters. Less granular formats should follow more specific ones to avoid data loss. For example, place yyyy-MM-dd after yyyy-MM-dd'T'HH:mm:ss so that the time-of-day information is preserved.
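The ordering hazard can be reproduced with a short sketch. It assumes a parser that accepts a pattern as soon as it matches the start of the input and tries the patterns in the order given. Whether the SMT parses exactly this way is not stated on this page, and the names FallbackPatternSketch and parseFirstMatch are hypothetical.

```java
import java.text.ParsePosition;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.temporal.TemporalAccessor;
import java.time.temporal.TemporalQueries;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; assumes prefix matching and first-match-wins ordering.
final class FallbackPatternSketch {

    static LocalDateTime parseFirstMatch(String input, List<String> patterns) {
        for (String pattern : patterns) {
            try {
                // Parsing with a ParsePosition tolerates unparsed trailing text.
                TemporalAccessor parsed =
                    DateTimeFormatter.ofPattern(pattern).parse(input, new ParsePosition(0));
                LocalDate date = parsed.query(TemporalQueries.localDate());
                LocalTime time = parsed.query(TemporalQueries.localTime());
                if (date == null) {
                    continue; // pattern matched but carries no date; try the next one
                }
                // A date-only match silently falls back to midnight.
                return date.atTime(time == null ? LocalTime.MIDNIGHT : time);
            } catch (DateTimeParseException e) {
                // This pattern does not fit; fall through to the next one.
            }
        }
        throw new IllegalArgumentException("No pattern matched: " + input);
    }

    public static void main(String[] args) {
        String input = "2024-03-05T14:30:45";
        // Specific pattern first: prints 2024-03-05T14:30:45
        System.out.println(parseFirstMatch(input, Arrays.asList("yyyy-MM-dd'T'HH:mm:ss", "yyyy-MM-dd")));
        // Date-only pattern first: prints 2024-03-05T00:00, the time of day is lost
        System.out.println(parseFirstMatch(input, Arrays.asList("yyyy-MM-dd", "yyyy-MM-dd'T'HH:mm:ss")));
    }
}
```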