Rekeying data
This tutorial shows how to rekey data in a Kafka topic with Lenses SQL Processors.
Sometimes you have a topic that is almost exactly what you need, except that the key of the record requires a bit of massaging.
In Lenses SQL, you can use the special SELECT ... as _key syntax to quickly re-key your data.
In our example, we have a topic containing events coming from temperature sensors.
Each record contains the sensor’s measured temperature, the time of the measurement, and the id of the sensor. The key is a unique string (for example, a UUID) that the upstream system assigns to the event.
You can replicate the example by creating a topic in SQL Studio:
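A minimal sketch of such a statement follows. The topic name temperature_events, the field names and types, and the string key with an AVRO value are assumptions made for illustration; adjust them to match your own data.

```sql
-- Hypothetical topic for the sensor events described above.
-- Field names and types are assumptions, not taken from a real deployment.
CREATE TABLE temperature_events(
    sensor_id string
    , temperature int
    , event_time long
)
FORMAT (string, avro);  -- assumed: string key, AVRO value
```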
You can also insert some example data to experiment with:
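The statement below is a sketch of what that insert could look like, assuming the topic is called temperature_events and has the fields sensor_id, temperature, and event_time. The UUID keys, sensor ids, readings, and timestamps (taken here to be epoch seconds) are all made-up sample values.

```sql
-- Illustrative records only: every key and value here is invented.
INSERT INTO temperature_events(
    _key
    , sensor_id
    , temperature
    , event_time
) VALUES
("a1b2c3d4-0000-0000-0000-000000000001", "sensor-1", 12, 1591970345),
("a1b2c3d4-0000-0000-0000-000000000002", "sensor-2", 13, 1591970346),
("a1b2c3d4-0000-0000-0000-000000000003", "sensor-1", 14, 1591973947),
("a1b2c3d4-0000-0000-0000-000000000004", "sensor-2", 15, 1591973948);
```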
You can explore the topic in Lenses to check the content and the shape of what you just inserted.
Let’s say that what you need is that same stream of events, but the record key should be the sensor_id instead of the UUID.
With the special SELECT ... as _key syntax, a few lines are enough to define our new re-keying processor:
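A sketch of such a processor is shown here. It assumes the source topic is named temperature_events, writes to a hypothetical output topic events_by_sensor, and relies on the defaults.topic.autocreate setting so that the output topic does not have to exist beforehand.

```sql
-- Assumed setting: let the processor create the output topic if it is missing.
SET defaults.topic.autocreate=true;

-- events_by_sensor is a hypothetical output topic name.
INSERT INTO events_by_sensor
SELECT STREAM sensor_id as _key
FROM temperature_events;
```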
The query above will take the sensor_id from the value of the record and use it as the new key. All the value fields will remain untouched:
Maybe the sensor_id is not enough and, for some reason, you also need the hour of the measurement in the key. In this case, the key will become a composite object with two fields: the sensor_id and the event_hour:
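One way to write this is sketched below. It assumes the source topic is temperature_events, that event_time holds epoch seconds (so dividing by 3600 gives an hour bucket), and that nested key fields are addressed as _key.field_name. The output topic name events_by_sensor_and_hour is hypothetical.

```sql
SET defaults.topic.autocreate=true;

-- Each projection targets one field of the composite key.
-- event_time / 3600 assumes the timestamp is expressed in epoch seconds.
INSERT INTO events_by_sensor_and_hour
SELECT STREAM
    temperature_events.sensor_id as _key.sensor_id
    , temperature_events.event_time / 3600 as _key.event_hour
FROM temperature_events;
```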
As you can see, you can easily build composite objects in Lenses by listing all the structure’s fields, one after the other.
In the last example, the _key output storage format will be inferred automatically by the system as JSON. If you need more control, you can use the STORE AS clause before the SELECT.
The following example creates a topic like the previous one, but with the keys stored as AVRO:
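The sketch below assumes the clause takes the form STORE KEY AS AVRO and reuses the same assumptions as before: a source topic named temperature_events and an event_time field in epoch seconds. The output topic name events_by_sensor_and_hour_avro_key is hypothetical.

```sql
SET defaults.topic.autocreate=true;

-- STORE KEY AS AVRO (assumed form of the STORE AS clause) overrides
-- the inferred JSON format for the composite key.
INSERT INTO events_by_sensor_and_hour_avro_key
STORE KEY AS AVRO
SELECT STREAM
    temperature_events.sensor_id as _key.sensor_id
    , temperature_events.event_time / 3600 as _key.event_hour
FROM temperature_events;
```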
Happy re-keying!