InsertWallclockHeaders
A Kafka Connect Single Message Transform (SMT) that reads the system clock and inserts the date, year, month, day, hour, minute and second as message headers.
The inserted headers are of type STRING. With this single SMT you can partition the data by, for example, yyyy-MM-dd/HH or yyyy/MM/dd/HH.
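The two layouts mentioned above can be illustrated with a short Java sketch. It is not part of the SMT: the class and method names are hypothetical, and it only shows how the header values (formatted with the default patterns) would combine into each layout.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical illustration only; this class is not part of the SMT.
class PartitionLayoutSketch {

    // Formats one component of the timestamp with the given pattern.
    static String part(ZonedDateTime t, String pattern) {
        return t.format(DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH));
    }

    // Layout yyyy-MM-dd/HH: the date header followed by the hour header.
    static String byDateAndHour(ZonedDateTime t) {
        return part(t, "yyyy-MM-dd") + "/" + part(t, "HH");
    }

    // Layout yyyy/MM/dd/HH: the year, month, day and hour headers.
    static String byYearMonthDayHour(ZonedDateTime t) {
        return String.join("/",
                part(t, "yyyy"), part(t, "MM"), part(t, "dd"), part(t, "HH"));
    }

    public static void main(String[] args) {
        ZonedDateTime now = Instant.now().atZone(ZoneOffset.UTC);
        System.out.println(byDateAndHour(now));
        System.out.println(byYearMonthDayHour(now));
    }
}
```

The first layout needs only the date and hour headers, while the second uses the separate year, month, day and hour headers.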
The following headers are inserted:
date
year
month
day
hour
minute
second
All headers can be prefixed with a custom prefix. For example, if the prefix is wallclock_, then the headers will be:
wallclock_date
wallclock_year
wallclock_month
wallclock_day
wallclock_hour
wallclock_minute
wallclock_second
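As a rough illustration of how one clock reading maps to the seven prefixed headers, here is a minimal Java sketch. It is not the SMT's source code: the class name, method name and parameters are hypothetical, and the patterns are the defaults listed in the configuration table.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; not the SMT's actual implementation.
class WallclockHeadersSketch {

    // Builds the seven header name/value pairs for a single instant.
    static Map<String, String> headers(Instant now, String prefix, ZoneId zone, Locale locale) {
        // Header name -> default pattern, in the order the headers are listed.
        Map<String, String> patterns = new LinkedHashMap<>();
        patterns.put("date", "yyyy-MM-dd");
        patterns.put("year", "yyyy");
        patterns.put("month", "MM");
        patterns.put("day", "dd");
        patterns.put("hour", "HH");
        patterns.put("minute", "mm");
        patterns.put("second", "ss");

        ZonedDateTime t = now.atZone(zone);
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : patterns.entrySet()) {
            DateTimeFormatter f = DateTimeFormatter.ofPattern(e.getValue(), locale);
            // Every value is a plain string, matching the STRING header type.
            out.put(prefix + e.getKey(), t.format(f));
        }
        return out;
    }

    public static void main(String[] args) {
        headers(Instant.now(), "wallclock_", ZoneId.of("UTC"), Locale.ENGLISH)
                .forEach((name, value) -> System.out.println(name + "=" + value));
    }
}
```

All seven values are derived from the same instant, so they stay consistent with each other even when the clock rolls over between reads.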
When used with the Lenses connectors for S3, GCS or Azure Data Lake, the headers can be used to partition the data. Assuming the headers have been prefixed with _, here are a few KCQL examples:
Transform Type Class
Configuration
| Name | Description | Type | Default | Importance |
|---|---|---|---|---|
| | Optional header prefix. | String | | Low |
| | Optional Java date time formatter. | String | yyyy-MM-dd | Low |
| | Optional Java date time formatter for the year component. | String | yyyy | Low |
| | Optional Java date time formatter for the month component. | String | MM | Low |
| | Optional Java date time formatter for the day component. | String | dd | Low |
| | Optional Java date time formatter for the hour component. | String | HH | Low |
| | Optional Java date time formatter for the minute component. | String | mm | Low |
| | Optional Java date time formatter for the second component. | String | ss | Low |
| | Optional. Sets the timezone. It can be any valid Java timezone. | String | UTC | Low |
| | Optional. Sets the locale. It can be any valid Java locale. | String | en | Low |
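How a formatter pattern, timezone and locale combine into a header value can be sketched with the standard java.time API. This is an illustration under assumptions, not the SMT's code: the class and method names are hypothetical, and parsing the locale with Locale.forLanguageTag is an assumption about how a value such as en might be interpreted.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical illustration only; not the SMT's actual implementation.
class FormatSettingsSketch {

    // Formats an instant with a pattern, a timezone ID and a locale tag.
    static String format(Instant now, String pattern, String timezone, String locale) {
        DateTimeFormatter formatter =
                DateTimeFormatter.ofPattern(pattern, Locale.forLanguageTag(locale));
        // The timezone decides which local date and hour the instant falls on.
        return now.atZone(ZoneId.of(timezone)).format(formatter);
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-03-05T20:00:00Z");
        // Same instant, two timezones: the local date and hour can differ.
        System.out.println(format(now, "yyyy-MM-dd", "UTC", "en"));
        System.out.println(format(now, "yyyy-MM-dd", "Asia/Kolkata", "en"));
        System.out.println(format(now, "HH", "Asia/Kolkata", "en"));
        // The locale only matters for textual patterns such as month names.
        System.out.println(format(now, "MMMM", "UTC", "en"));
    }
}
```

The timezone matters most for partitioning: an instant late in the UTC day already belongs to the next day in a timezone ahead of UTC, so the date and hour headers shift accordingly.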
Example
To store the epoch value, use the following configuration:
To prefix the headers with wallclock_, use the following:
To change the date format, use the following:
To use the timezone Asia/Kolkata, use the following:
To partition data in S3, GCS or Azure Data Lake using a Hive-like partition name format, such as date=yyyy-MM-dd / hour=HH, use the following SMT configuration:
Then, in the KCQL setting, use the headers as partitioning keys: