Target Partition
Configures how records are distributed across partitions in the target topic.
The sink.partition key within a replication flow controls how records are distributed across partitions in the target topic. This setting is critical for managing data ordering and distribution, especially when the source and target topics have different partition counts. K2K provides two distinct strategies.
Preserve Source Partition
This strategy ensures that each record is written to the same partition number in the target topic as it originated from in the source topic. For example, a record from partition 5 of a source topic will be written to partition 5 of the target topic.
Use Cases:
This is the most common strategy for ensuring data fidelity.
It is essential for preserving the per-partition message ordering from the source cluster. If two messages were in order in a source partition, they will be in the same order in the corresponding target partition.
Example
To enable this, set the value of partition to the keyword source.
# sink configuration to preserve the source partition number
sink:
name: "partition-preserving-sink"
topic: source
partition: source
Prerequisite: For this strategy to succeed, the target topic must have at least as many partitions as the source topic. An attempt to write to a non-existent partition (e.g., from source partition 10 to a target topic with only 8 partitions) will result in a record production error.
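The partition-count prerequisite can be expressed as a minimal sketch; the helper name and error message below are invented for illustration and are not part of K2K itself:

```python
def check_preserve_source(source_partitions: int, target_partitions: int) -> None:
    """Validate the prerequisite for `partition: source`.

    With this strategy, a record from source partition p is written to
    target partition p unchanged, so every source partition index must
    exist in the target topic.
    """
    if target_partitions < source_partitions:
        raise ValueError(
            f"target topic has {target_partitions} partitions but the source "
            f"has {source_partitions}; records from source partitions "
            f">= {target_partitions} would fail to produce"
        )

# A 10-partition source replicated to an 8-partition target (the failing
# case described above) violates the constraint:
check_preserve_source(8, 8)    # OK: equal counts
check_preserve_source(8, 12)   # OK: target has more partitions
```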
Producer-Defined Partitioning
This strategy delegates the partitioning decision to K2K's internal Kafka producer client, which follows standard Apache Kafka partitioning logic.
Behavior:
If a record key is present, the producer's default partitioner hashes the key to consistently select a target partition. This preserves ordering for all records that share the same key.
If a record key is null, records are distributed across available partitions in batches (typically in a round-robin fashion) to ensure even load distribution.
This logic can be fully customized by setting the partitioner.class property in the target.kafka.producer configuration block.
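As a hedged sketch of what such an override might look like (the exact nesting of the target.kafka.producer block and the partitioner class name are illustrative assumptions, not K2K-verified syntax):

```yaml
# Hypothetical sketch: supplying a custom partitioner class
# to the producer used for the target cluster.
target:
  kafka:
    producer:
      partitioner.class: "com.example.MyCustomPartitioner"  # invented name
```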
Use Cases:
This strategy is required when the target topic has a different number of partitions than the source, especially when it has fewer.
It is useful for re-partitioning data on the target cluster based on the record key, rather than preserving the original source partitioning scheme.
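The two default-partitioner rules described above can be sketched in simplified form. Note this is only an illustration: the real Kafka producer hashes keys with murmur2 and batches null-key records onto a "sticky" partition, whereas this sketch uses a plain hash and per-record round-robin purely to show the two behaviors:

```python
import itertools
from typing import Optional

class SimplifiedDefaultPartitioner:
    """Illustrative (non-Kafka) sketch of default partitioning rules."""

    def __init__(self, num_partitions: int):
        self.num_partitions = num_partitions
        self._round_robin = itertools.cycle(range(num_partitions))

    def partition(self, key: Optional[bytes]) -> int:
        if key is not None:
            # Same key -> same partition, so per-key ordering is preserved.
            return hash(key) % self.num_partitions
        # Null key -> spread records across partitions for even load.
        return next(self._round_robin)
```

Records sharing a key always land on one partition, while null-key records rotate across all partitions.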
Example
To enable this, set the value of partition to the keyword producer.
# sink configuration to delegate partitioning to the Kafka producer
sink:
name: "producer-partitioned-sink"
topic: source
partition: producer