
Single Message Transforms

This page contains the release notes for Single Message Transforms.

1.3.2

Adds support for multiple "from" patterns.

This converts the format.from.pattern field in the following SMTs:

  • InsertFieldTimestampHeaders

  • InsertRollingFieldTimestampHeaders

  • TimestampConverter

into a comma-separated list, so that these SMTs can support multiple (fallback) DateTimeFormatter patterns when more than one timestamp format is in use.

Configuration Compatibility

Configurations are backwards-compatible with previous versions of the SMT, with one exception: when commas are used in the format.from.pattern string. When updating your configuration, enclose any pattern that contains commas in double quotes.

Old Configuration:

format.from.pattern=yyyy-MM-dd'T'HH:mm:ss,SSS

New Configuration:

format.from.pattern="yyyy-MM-dd'T'HH:mm:ss,SSS"

Multiple format.from.pattern values can now be configured; each pattern containing a comma can be enclosed in double quotes:

format.from.pattern=yyyyMMddHHmmssSSS,"yyyy-MM-dd'T'HH:mm:ss,SSS"

Configuration Order

When configuring format.from.pattern, the order matters; less granular formats should follow more specific ones to avoid data loss. For example, place yyyy-MM-dd after yyyy-MM-dd'T'HH:mm:ss to ensure detailed timestamp information isn't truncated.
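
Following this guidance, a configuration that lists the more specific pattern first might look like the following (the patterns themselves are illustrative):

format.from.pattern=yyyy-MM-dd'T'HH:mm:ss,yyyy-MM-dd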

1.3.1

Increase error information for debugging.

1.3.0

Adds support for adding metadata to Kafka Connect headers (for a source connector).

1.2.1

Workaround for the Connect runtime failing with an unexplained exception where it looks like the static fields of the parent class are not resolved properly.

1.2.0

Adds support for inserting time based headers using a Kafka message payload field.

1.1.2

Fix public visibility of rolling timestamp headers.

1.1.1

Don't make CTOR protected.

1.1.0

Introducing four new Single Message Transforms (SMTs) aimed at simplifying and streamlining the management of system or record timestamps, along with support for rolling windows. These SMTs are designed to significantly reduce the complexity associated with partitioning data in S3/Azure/GCS Sink based on time, offering a more efficient and intuitive approach to data organization. By leveraging these SMTs, users can seamlessly handle timestamp-based partitioning tasks, including optional rolling window functionality, paving the way for smoother data management workflows.


Release notes

This page describes the release notes for the Stream Reactor.

Secret Providers

This page contains the release notes for Connect Secret Providers.

2.3.0

  • Security: Write maven Descriptors on packaging to avoid incorrect dependencies being identified by security scanner tools. (Fixes CVE-2023-1370).

  • Security: Add dependency checking as part of the build process.

AES256 Provider:

  • Security: Change AES256 key to PASSWORD type to avoid logging secrets.

AWS Secrets Manager Provider:

  • New property : file.write Writes secrets to file on path. Required for Java trust stores, key stores, certs that need to be loaded from file. For ease of use for the secret provider, this is disabled by default.

  • New property : secret.default.ttl If no TTL is configured in AWS Secrets Manager, apply a default TTL (in milliseconds).

  • New property : aws.endpoint.override Add override for non-standard or compatible AWS endpoints.

  • New property : secret.type Specify the type of secrets stored in Secrets Manager. Defaults to JSON; to enable String secret values, change to STRING.

  • Enhancement : Ensuring secrets are cached within their TTL (same as Vault).

  • Enhancement : Upgraded dependencies to use AWS V2 Client.

  • Enhancement : Added AWS STS dependency to avoid the requirement of additional libraries for default authentication (eg. EKS).

  • Security: Don’t expose secret values in exception messages on JsonParseException.

  • Bugfix: Enable accessKey and secretKey to remain blank if using DEFAULT auth mode.
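
For illustration, the new AWS Secrets Manager properties above are supplied as config-provider parameters in the Kafka Connect worker configuration. A minimal sketch follows; the provider alias and the property values are illustrative, and the provider class name is an assumption here, so check the Secret Provider documentation for the exact class and any required authentication settings:

config.providers=aws
# Provider class name assumed for illustration; verify against the Secret Provider documentation.
config.providers.aws.class=io.lenses.connect.secrets.providers.AWSSecretProvider
config.providers.aws.param.file.write=false
config.providers.aws.param.secret.default.ttl=60000
config.providers.aws.param.aws.endpoint.override=http://localhost:4566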

Azure Secret Provider:

  • Bugfix: Recompute TTL values on each get so the timestamp of reschedule shrinks until TTL is reached.

  • Bugfix: Fix so that UTF-8 encodings in Azure are correctly mapped to the UTF8 encoding in the secret provider.

Hashicorp Vault Provider:

  • Bugfix: Files will be written to the correct directory.

  • New property: app.role.path Support vault approle custom mount path.

  • New property: kubernetes.auth.path Support vault custom auth path (with default value to be auth/kubernetes).

  • Security: vault-java-driver was no longer maintained; switched to use the community fork io.github.jopenlibs.

  • Add support for the Vault Database credential engine.

Stream Reactor

This page contains the release notes for the Stream Reactor.

11.1.0

DataLakes Sinks

This release resolves a gap in the S3, GCS, and Azure sinks when latest schema optimization is enabled (via connect.***.latest.schema.optimization.enabled). Previously, connectors would fail if an Avro field changed from an Enum to a Union of an Enum and String. This update fixes that problem.

    11.0.0

    This release upgrades the connectors to Apache Kafka 4.1. All connectors have been tested and verified against Kafka versions 4.0 and 4.1.

    Compatibility Notice

    Important: Kafka 3.x versions are no longer supported or tested with this release. While the connectors may function with Kafka 3.x environments, they are not recommended for production use without thorough testing and validation in your specific environment.

    Organizations currently running Kafka 3.x should plan to upgrade their Kafka infrastructure to version 4.0 or later before deploying these connectors in production.

    Connector Retirement

    Effective with version 11.0.0, the following sink connectors have been retired and are no longer available:

    • Redis Sink Connector

    • InfluxDB Sink Connector

    These connectors will not receive further updates, bug fixes, or security patches. Organizations currently using these connectors should evaluate alternative solutions and plan migration strategies accordingly.

    For questions regarding migration paths or alternative connector options, please consult the product documentation or contact support.

    10.0.3

    Datalakes sinks (AWS S3, GCS and Azure DataLake Gen2)

    The sink addresses a gap where the message offset was committed as-is rather than incremented by one. That behavior caused Kafka's underlying consumer to appear to always be one message behind on each partition. This isn't an issue under exactly-once semantics, as the state store in the data lake ensures message processing integrity, ignoring any already processed messages. Under at-least-once semantics, replaying a message is considered acceptable.

    10.0.2

    Azure DataLake Gen 2

    This patch release addresses two issues on the Azure DataLake sink

    • Fixed errors caused by malformed HTTP headers in ADLS Gen2 requests, such as:

      InvalidHeaderValue: The value for one of the HTTP headers is not in the correct format.
      x-ms-range-get-content-md5: true

    • Fixed failures when writing to nested directories in ADLS Gen2 with Hierarchical Namespace (HNS) enabled.

    Thank you to Christine Yost Kies and Igor Belikov for their fixes.

    10.0.1

    DataLakes

    This patch addresses several critical bugs and regressions introduced in version 9.0.0, affecting the Datalake connectors for S3, GCS, and Azure.

    • Connector Restart Logic: Fixed an issue that caused an error on restart if pending operations from a previous session had not completed.

    • S3 Connector: Corrected a bug where delete object requests incorrectly sent an empty versionId. This resolves API exceptions when using S3-compatible storage solutions, such as Pure Storage.

    • Addressed a regression where the connector name was omitted from the lock path for exactly-once processing. This fix prevents conflicts when multiple connectors read from the same Kafka topics and write to the same destination bucket.

    10.0.0

    New Apache Cassandra sink

    This new connector is compatible with any Apache Cassandra-compatible database from version 3 onwards. The previous connector is deprecated and will be removed in a future release.

    Azure CosmosDB sink

    Version 10.0.0 introduces breaking changes: the connector has been renamed from DocumentDB, uses the official CosmosDB SDK, and supports new bulk and key strategies.

    The CosmosDB Sink Connector is a Kafka Connect sink connector designed to write data from Kafka topics into Azure CosmosDB. It was formerly known as the DocumentDB Sink Connector and has been fully renamed and refactored with the following key changes:

    • Renamed from DocumentDB to CosmosDB across all code and documentation.

    • Updated package names to reflect the new CosmosDB naming.

    • Replaced the legacy DocumentDB client with the official CosmosDB Java SDK.

    • Introduced Bulk Processing Mode to enhance write throughput.

    • Enhanced key population strategies for mapping Kafka record keys to CosmosDB document IDs.

    New Features and Improvements

    • In non-envelope mode, DataLakes connector sinks now skip tombstone records, preventing connector failures.

    • HTTP Sink: Enable Import of All Headers in the Message Template

    9.0.2

    DataLake Sinks (S3, GCP Storage, Azure Data Lake)

    The previous commit that addressed the Exactly Once semantics fix inadvertently omitted support for the connect.xxxxxxx.indexes.name configuration. As a result, custom index filenames were not being applied as expected.

    This commit restores that functionality by ensuring the connect.xxxxxxx.indexes.name setting is properly respected during index file generation.

    9.0.1

    Google BigQuery Sink

    The latest release of Stream Reactor includes a fork of Confluent's Apache 2-licensed BigQuery connector, originally sourced from WePay.

    9.0.0

    All Modules

    • Dependency upgrades.

    • Project now builds for Kafka 3.9.1.

    • Exclude Kafka dependencies and certain other things (eg log frameworks) from the final jar.

    DataLake Sinks (S3, GCP Storage, Azure Data Lake)

    • Improved Exactly Once Semantics by introducing a per-task lock mechanism to prevent duplicate writes when Kafka Connect inadvertently runs multiple tasks for the same topic-partition.

      • Each task now creates a lock file per topic-partition (e.g., lockfile_topic_partition.yml) upon startup.

      • The lock file is used to ensure only one task (the most recent) can write to the target bucket, based on object eTag validation.

      • If the lock file is modified by another task, the current task detects the change via eTag mismatch and exits to avoid conflicts.

    MQTT Connector

    • Refactored MqttWriter to improve payload handling and support dynamic topics, with JSON conversion for structured data and removal of legacy converters and projections (#232).

    • Leading Slashes Fixes:

      • Leading slash from MQTT topic should be removed when converting to Kafka topic target.

      • If no leading slash exists in the MQTT topic, the first slash is not stripped; like the remaining slashes, it is replaced with an underscore.

      Examples:

      • MQTT topic /I/Love/Kafka → I_Love_Kafka

      • MQTT topic Foo/Bar/Baz → Foo_Bar_Baz (not FooBar_Baz)

    8.1.33

    DataLake Sinks (S3, GCP Storage, Azure Data Lake)

    This update introduces an optimization that reduces unnecessary data flushes when writing to Avro or Parquet formats. Specifically, it leverages schema compatibility to avoid flushing data when messages with older but backward-compatible schemas are encountered.

    How It Works

    When the input message schema changes in a backward-compatible way, the sink tracks the latest schema and continues using it for serialization. This means that incoming records with an older schema won't automatically trigger a flush, as long as the older schema is compatible with the current one.

    Example Flush Behavior

    Consider the following sequence of messages and their associated schemas:

      message1 -> schema1
      message2 -> schema1
        (No flush needed – same schema)

      message3 -> schema2
        (Flush occurs – new schema introduced)

      message4 -> schema2
        (No flush needed – same schema)

      message5 -> schema1
        Without optimization: would trigger a flush
        With optimization: no flush – schema1 is backward-compatible with schema2

      message6 -> schema2
      message7 -> schema2
        (No flush needed – same schema, it would happen based on the flush thresholds)

    How to enable it

    Set the following config to true:

    • S3: connect.s3.latest.schema.optimization.enabled

    • GCP: connect.gcpstorage.latest.schema.optimization.enabled

    • Azure DataLake: connect.datalake.latest.schema.optimization.enabled
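
    For example, enabling the optimization on the S3 sink (the GCP and Azure sinks follow the same pattern with their respective prefixes):

      connect.s3.latest.schema.optimization.enabled=true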

    8.1.32

    GCP Storage Sink

    Reduces log entries by moving the log entry confirming the object upload from INFO to DEBUG.

    8.1.31

    DataLake Sinks

    This release introduces improvements to Avro error handling, providing better diagnostics and insights into failures during data writing.

    8.1.30

    DataLake Sinks (S3, GCP Storage, Azure Data Lake)

    • Resolved dependency issues causing sink failures with Confluent 7.6.0. Confluent 7.6.0 introduced new Schema Registry rule modules that were force-loaded, even when unused, leading to the following error:

      java.util.ServiceConfigurationError:
         io.confluent.kafka.schemaregistry.rules.RuleExecutor:
         io.confluent.kafka.schemaregistry.encryption.FieldEncryptionExecutor not a subtype

      This update ensures compatibility by adjusting dependencies, preventing unexpected failures in affected sinks.

    HTTP Sink

    • Adjusting the following log line from INFO level to TRACE level:

      HttpWriterManager has no writers. Perhaps no records have been put to the sink yet.

    8.1.29

    DataLake Sinks (S3, GCP Storage, Azure Data Lake)

    Support has been added for writing to the same bucket from two different connectors located in different Kafka clusters, with both reading from the same topic name. To differentiate the object keys generated and prevent data loss, a new KCQL syntax has been introduced:

      INSERT INTO ..... PROPERTIES('key.suffix'='unique-id1')...

    HTTP Sink

    This feature offers fixed interval retries in addition to the Exponential option. To enable it, use the following configuration settings:

      connect.http.retry.mode=fixed
      connect.http.retry.fixed.interval.ms=10000
      connect.http.retries.max.retries=20

    A bug affecting the P99, P90, and P50 metrics has been resolved. Previously, failed HTTP requests were not included in the metrics calculation.

    8.1.28

    DataLake Sinks (S3, GCP Storage, Azure Data Lake)

    Enable Skipping of Null Records

    Added the following configuration property to prevent possible NullPointerException situations in the S3 Sink, Azure Datalake Sink, and GCP Storage Sink connectors:

    • S3 Sink: connect.s3.skip.null.values

    • GCP Storage Sink: connect.gcpstorage.skip.null.values

    • Azure Datalake Sink: connect.datalake.skip.null.values

    When set to true, the sinks will skip null value records instead of processing them. Defaults to false.

    If you expect null or tombstone records and are not using envelope mode, enabling this setting is recommended to avoid errors.
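
    For example, for the S3 sink this recommendation translates to:

      connect.s3.skip.null.values=true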

    HTTP Sink

    Fix MBean Registration Issue

    Fixed an issue where MBean registration failed when multiple tasks were used, causing exceptions due to duplicate instances. The MBean name now includes the task ID to ensure uniqueness, and MBeans are properly unregistered when tasks stop.

    8.1.27

    Reverted: Change to LastModified Ordering with Post-Process Actions (introduced in 8.1.23)

    We have reverted the change that avoided filtering to the latest result when ordering by LastModified with a post-process action. The original change aimed to prevent inconsistencies due to varying LastModified timestamps.

    However, this introduced a potential issue due to Kafka Connect’s commitRecord method being called asynchronously, meaning files may not be cleaned up before the next record is read. This could result in records being processed multiple times. The previous behaviour has been restored to ensure correctness.

    8.1.26

    Azure Service Bus source

    Source watermark is not stored anymore since it is not used when the task restarts

    GCP PubSub source

    A fix was made to enable attaching the headers to the resulting Kafka message.

    8.1.25

    All Modules

    Align Confluent lib dependency with Kafka 3.8.

    AWS S3 Source

    Customers have noticed that when the source deletes files on S3, the entire path is deleted. Whilst this is due to the way S3 works, the connector can now do something about it.

    This adds a new boolean KCQL property to the S3 source: post.process.action.retain.dirs (default value: false)

    If this is set to true, then upon moving or deleting files during source post-processing, a zero-byte object will first be created to ensure that the path is still represented on S3.
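
    As a sketch of how this might look in practice (bucket and topic names are placeholders, mirroring the post-processing examples elsewhere in these notes):

      INSERT INTO `my-bucket`
      SELECT * FROM `my-topic`
      PROPERTIES (
          'post.process.action'=`DELETE`,
          'post.process.action.retain.dirs'=true
      )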

    8.1.24

    HTTP

    JMX Metrics Support

    The HTTP Sink now publishes success counts and request averages, among other metrics, to JMX.

    Extractor Fix

    A bug was reported where referencing the entire record value yielded an exception when the payload is a complex type, in this case a HashMap.

    The change allows the value to be returned when the extractor has no field to extract (i.e. it extracts the full value).

    DataLake Sinks (S3, GCP Storage, Azure Data Lake)

    Optimised Schema Change

    This PR revamps the mechanisms governing schema change rollover, as they can lead to errors. New functionality is introduced to ensure improved compatibility.

    Removed Hadoop Shaded Dependency

    The hadoop-shaded-protobuf dependency has been removed as it was surplus to requirements and pulling in an old version of protobuf-java which introduced vulnerabilities to the project.

    Removed Properties for Schema Change Rollover:

    The properties $connectorPrefix.schema.change.rollover have been removed for the following connectors: connect.s3, connect.datalake, and connect.gcpstorage. This change eliminates potential errors and simplifies the schema change handling logic.

    New Property for Schema Change Detection:

    The property $connectorPrefix.schema.change.detector has been introduced for the following connectors: connect.s3, connect.datalake, and connect.gcpstorage.

    This property allows users to configure the schema change detection behavior with the following possible values:

    • default: Schemas are compared using object equality.

    • version: Schemas are compared by their version field.

    • compatibility: A more advanced mode that ensures schemas are compatible using Avro compatibility features.

    Packaging Changes

    The META-INF/maven directory is now built by the assembly process. This will ensure it reflects what is in the jar and that this directory is not just built from our component jars. This mirrors an approach we took for the secret-provider.

    8.1.23

    DataLakes (S3, GCP) source fixes

    Polling Backoff

    The connector incurs high costs when there is no data available in the buckets because it continuously polls the data lake in a tight loop, as controlled by Kafka Connect.

    From this version, a backoff queue is used by default, introducing a standard method for backing off calls to the underlying cloud platform.

    Avoid filtering by lastSeenFile where a post process action is configured

    When ordering by LastModified and a post-process action is configured, avoid filtering to the latest result.

    This change avoids bugs caused by inconsistent LastModified dates used for sorting. If LastModified sorting is used, ensure objects do not arrive late, or use a post-processing step to handle them.

    Add a flag to populate kafka headers with the watermark partition/offset

    • This adds a connector property for the GCP Storage and S3 Sources:

      connect.s3.source.write.watermark.header
      connect.gcpstorage.source.write.watermark.header

    If set to true then the headers in the source record produced will include details of the source and line number of the file.

    If set to false (the default) then the headers won't be set.

    Currently this does not apply when using the envelope mode.
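
    For example, to enable the watermark headers on the S3 source:

      connect.s3.source.write.watermark.header=true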

    8.1.22

    DataLakes (S3, GCP) source fixes

    This release addresses two critical issues:

    • Corrupted connector state when DELETE/MOVE is used: The connector is designed to store the last processed document and its location within its state for every message sent to Kafka. This mechanism ensures that the connector can resume processing from the correct point in case of a restart. However, when the connector is configured with a post-operation to move or delete processed objects within the data lake, it stores the last processed object in its state. If the connector restarts and the referenced object has been moved or deleted externally, the state points to a non-existent object, causing the connector to fail. The current workaround requires manually cleaning the state and restarting the connector, which is inefficient and error-prone.

    • Incorrect Handling of Move Location Prefixes: When configuring the move location within the data lake, if the prefix ends with a forward slash (/), it results in malformed keys like a//b. Such incorrect paths can break compatibility with query engines like Athena, which may not handle double slashes properly.

    8.1.21

    Azure Service Bus source

    Performance improvements in the source to handle a higher throughput. The code now leverages prefetch count, and disables the auto complete. The following connector configs were added

    • connect.servicebus.source.prefetch.count The number of messages to prefetch from ServiceBus

    • connect.servicebus.source.complete.retries.max The maximum number of retries to attempt while completing a message

    • connect.servicebus.source.complete.retries.min.backoff.ms The minimum duration in milliseconds for the first backoff

    • connect.servicebus.source.sleep.on.empty.poll.ms The duration in milliseconds to sleep when no records are returned from the poll. This avoids a tight loop in Connect.
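
    A sketch combining these settings (the values are illustrative, not recommendations):

      connect.servicebus.source.prefetch.count=2000
      connect.servicebus.source.complete.retries.max=5
      connect.servicebus.source.complete.retries.min.backoff.ms=100
      connect.servicebus.source.sleep.on.empty.poll.ms=250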

    8.1.20

    Datalakes extra logging

    When a NullPointerException is thrown, the sinks lose the stack trace. This patch enhances the logging.

    8.1.19

    Prevent ElasticSearch from Skipping Records After Tombstone

    Addresses a critical bug in ElasticSearch versions 6 (ES6) and 7 (ES7) where records following a tombstone are inadvertently skipped during the insertion process. The issue stemmed from an erroneous return statement that halted the processing of subsequent records.

    8.1.18

    🚀 New Features

    Sequential Message Sending for Azure Service Bus

    • Introduced a new KCQL property: batch.enabled (default: true).

    • Users can now disable batching to send messages sequentially, addressing specific scenarios with large message sizes (e.g., >1 MB).

    • Why this matters: Batching improves performance but can fail for large messages. Sequential sending ensures reliability in such cases.

    • How to use: Configure batch.enabled=false in the KCQL mapping to enable sequential sending.
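
    A minimal sketch of such a KCQL mapping (the Service Bus target and Kafka topic names are placeholders, and any other properties your mapping requires are omitted):

      INSERT INTO my-servicebus-queue
      SELECT * FROM my-kafka-topic
      PROPERTIES('batch.enabled'=false)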

    Post-Processing for Datalake Cloud Source Connectors

    • Added post-processing capabilities for AWS S3 and GCP Storage source connectors ( Azure Datalake Gen 2 support coming soon).

    • New KCQL properties:

      • post.process.action: Defines the action (DELETE or MOVE) to perform on source files after successful processing.

      • post.process.action.bucket: Specifies the target bucket for the MOVE action (required for MOVE).

      • post.process.action.prefix: Specifies a new prefix for the file’s location when using the MOVE action (required for MOVE).

    • Use cases:

      • Free up storage space by deleting files.

      • Archive or organize processed files by moving them to a new location.

    • Example 1: Delete files:

      INSERT INTO `my-bucket`
      SELECT * FROM `my-topic`
      PROPERTIES ('post.process.action'=`DELETE`)

    • Example 2: Move files to an archive bucket:

      INSERT INTO `my-bucket:archive/`
      SELECT * FROM `my-topic`
      PROPERTIES (
          'post.process.action'=`MOVE`,
          'post.process.action.bucket'=`archive-bucket`,
          'post.process.action.prefix'=`archive/`
      )

    🛠 Dependency Updates

    Updated Azure Service Bus Dependencies

    • azure-core updated to version 1.54.1.

    • azure-messaging-servicebus updated to version 7.17.6.

    These updates ensure compatibility with the latest Azure SDKs and improve stability and performance.

    Upgrade Notes

    • Review the new KCQL properties and configurations for Azure Service Bus and Datalake connectors.

    • Ensure compatibility with the updated Azure Service Bus dependencies if you use custom extensions or integrations.

    Thank you to all contributors! 🎉

    8.1.17

    1. Improves the ElasticSearch sinks by allowing message key fields or message headers to be used as part of the document primary key. Reference _key[.key_field_path] or _header.header_name to set the document primary key (see the sketch after this list).

    2. Fixes the data lake sinks' error message on flush.interval.
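
    A sketch of how these key references might look in a KCQL mapping (index, topic, field, and header names are placeholders; check the Elasticsearch connector documentation for the exact primary-key syntax):

      INSERT INTO my-index
      SELECT * FROM my-topic
      PK _key.customer_id, _header.tenant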

    8.1.16

    Improvements to the HTTP Sink

    1. Queue Limiting: We've set a limit on the queue size per topic to reduce the chances of an Out-of-Memory (OOM) issue. Previously the queue was unbounded; in a scenario where the HTTP calls are slow and the sink receives more records than it clears, this would eventually lead to OOM.

    2. Offering Timeout: The offering to the queue now includes a timeout. If there are records to be offered, but the timeout is exceeded, a retriable exception is thrown. Depending on the connector's retry settings, the operation will be attempted again. This helps avoid situations where the sink gets stuck processing a slow or unresponsive batch.

    3. Duplicate Record Handling: To prevent the same records from being added to the queue multiple times, we've introduced a Map[TopicPartition, Offset] to track the last processed offset for each topic-partition. This ensures that the sink does not attempt to process the same records repeatedly.

    4. Batch Failure Handling: The changes also address a situation where an HTTP call fails due to a specific input, but the batch is not removed from the queue. This could have led to the batch being retried indefinitely, which is now prevented.

    8.1.15

    The HTTP sink now ignores null records, avoiding an NPE if custom SMTs send null records to the pipeline.

    8.1.14

    Improvements to the HTTP Sink reporter

    8.1.13

    8.1.12

    Datalake sinks allow an optimisation configuration to avoid rolling the file on schema change. To be used only when the schema changes are backwards compatible

    Fixes for the HTTP sink batch.

    8.1.11

    HTTP sink improvements

    8.1.10

    Dependency version upgrades.

    Azure Service Bus Source

    Fixed timestamp of the messages and some of the fields in the payload to correctly use UTC millis.

    8.1.9

    Azure Service Bus Source

    Changed contentType in the message to be nullable (Optional String).

    8.1.8

    HTTP Sink

    Possible exception fix

    Stores success/failure HTTP response codes for the reporter.

    8.1.7

    HTTP Sink

    Fixes possible exception in Reporter code.

    8.1.6

    HTTP Sink

    Fixes and overall stability improvements.

    8.1.4

    Google Cloud Platform Storage

    Remove overriding transport options on GCS client.

    8.1.3

    HTTP Sink

    Bugfix: HTTP Kafka Reporter ClientID is not unique.

    8.1.2

    HTTP Sink & GCS Sink

    Improvements for handling HTTP sink null values & fix for NPE on GCS.

    8.1.1

    Removal of shading for Avro and Confluent jars.

    8.1.0

    HTTP Sink

    Addition of HTTP reporter functionality to route HTTP success and errors to a configurable topic.

    8.0.0

    HTTP Sink

    Breaking Configuration Change

    Configuration for the HTTP Sink changes from JSON to standard Kafka Connect properties. Please study the documentation if upgrading to 8.0.0.

    • Removed support for the JSON configuration format. Instead, please configure your HTTP Sink connector using Kafka Connect properties. HTTP Sink Configuration.

    • Introduced SSL Configuration. SSL Configuration.

    • Introduced OAuth Support. OAuth Configuration.

    7.4.5

    ElasticSearch (6 and 7)

    • Support for dynamic index names in KCQL. more info

    • Configurable tombstone behaviour using KCQL property. behavior.on.null.values more info

    • SSL Support using standard Kafka Connect properties. more info

    Http Sink

    • Adjust defaultUploadSyncPeriod to make connector more performant by default.

    Data Lake Sinks (AWS, Azure Datalake and GCP Storage)

    • Fix for issue causing sink to fail with NullPointerException due to invalid offset.

    7.4.4

    Azure Datalake & GCP Storage

    Dependency version upgrades

    Data Lake Sinks (AWS, Azure Datalake and GCP Storage)

    • Fixes a gap in the Avro/Parquet storage where Connect enums were converted to strings.

    • Adds support for an explicit "no partition" specification in KCQL, to enable topics to be written to the bucket and prefix without partitioning the data.

      • Syntax Example: INSERT INTO foo SELECT * FROM bar NOPARTITION

    7.4.3

    All Connectors

    Dependency version upgrades

    Data Lake Sinks (AWS, Azure Datalake and GCP Storage)

    This release introduces a new configuration option for three Kafka Connect Sink Connectors—S3, Data Lake, and GCP Storage—allowing users to disable exactly-once semantics. By default, exactly once is enabled, but with this update, users can choose to disable it, opting instead for Kafka Connect’s native at-least-once offset management.

    • S3 Sink Connector: connect.s3.exactly.once.enable

    • Data Lake Sink Connector: connect.datalake.exactly.once.enable

    • GCP Storage Sink Connector: connect.gcpstorage.exactly.once.enable

    Default Value: true

    Indexing is enabled by default to maintain exactly-once semantics. This involves creating an .indexes directory at the root of your storage bucket, with subdirectories dedicated to tracking offsets, ensuring that records are not duplicated.

    Users can now disable indexing by setting the relevant property to false. When disabled, the connector will utilise Kafka Connect’s built-in offset management, which provides at-least-once semantics instead of exactly-once.
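
    For example, to opt out of exactly-once semantics on the S3 sink and rely on Kafka Connect's native offset management:

      connect.s3.exactly.once.enable=false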


    7.4.2

    All Connectors

    Dependency version upgrades

    AWS S3 Source and GCP Storage Source Connectors

    File Extension Filtering

    Introduced new properties to allow users to filter the files to be processed based on their extensions.

    For AWS S3 Source Connector:

    • connect.s3.source.extension.excludes: A comma-separated list of file extensions to exclude from the source file search. If this property is not configured, all files are considered.

    • connect.s3.source.extension.includes: A comma-separated list of file extensions to include in the source file search. If this property is not configured, all files are considered.

    For GCP Storage Source Connector:

    • connect.gcpstorage.source.extension.excludes: A comma-separated list of file extensions to exclude from the source file search. If this property is not configured, all files are considered.

    • connect.gcpstorage.source.extension.includes: A comma-separated list of file extensions to include in the source file search. If this property is not configured, all files are considered.

    These properties provide more control over the files that the AWS S3 Source Connector and GCP Storage Source Connector process, improving efficiency and flexibility.
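
    For example, an S3 source limited to CSV and JSON files while ignoring temporary files might use (the extension lists are illustrative):

      connect.s3.source.extension.includes=csv,json
      connect.s3.source.extension.excludes=tmp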

    GCP Storage Source and Sink Connectors

    • Increases the default HTTP retry timeout from 250ms total to 3 minutes. The default consumer max.poll.interval.ms is 5 minutes, so the new timeout stays within that boundary and avoids a group rebalance.


    7.4.1

    GCP Storage Connector

    Adding retry delay multiplier as a configurable parameter (with default value) to Google Cloud Storage Connector. Main changes revolve around RetryConfig class and its translation to gax HTTP client config.

    Data Lake Sinks (AWS, Azure Datalake and GCP Storage)

    Making indexes directory configurable for both source and sink.

    Sinks

    Use the below properties to customise the indexes root directory:

    • connect.datalake.indexes.name

    • connect.gcpstorage.indexes.name

    • connect.s3.indexes.name

    See connector documentation for more information.

    Sources:

    • Use the below properties to exclude custom directories from being considered by the source.

    • connect.datalake.source.partition.search.excludes

    • connect.gcpstorage.source.partition.search.excludes

    • connect.s3.source.partition.search.excludes


    7.4.0

    NEW: Azure Service Bus Sink Connector

    • Azure Service Bus Sink Connector.

    Other Changes

    • Exception / Either Tweaks

    • Bump java dependencies (assertJ and azure-core)

    • Add logging on flushing

    • Fix: Java class java.util.Date support in cloud sink maps

    Full Changelog: https://github.com/lensesio/stream-reactor/compare/7.3.2...7.4.0


    7.3.2

    Data Lake Sinks (AWS, Azure Datalake and GCP Storage)

    • Support added for writing Date, Time and Timestamp data types to AVRO.


    7.3.1

    GCP Storage Connector

    • Invalid Protocol Configuration Fix


    7.3.0

    NEW: Azure Service Bus Source Connector

    • Azure Service Bus Source Connector Find out more.

    NEW: GCP Pub/Sub Source Connector

    • GCP Pub/Sub Source Connector Find out more.

    Data Lake Sinks (AWS, Azure Datalake and GCP Storage)

    • To back up the topics, the KCQL statement is:

      INSERT INTO bucket
      SELECT * FROM `*`
      ...

    When * is used the envelope setting is ignored.

    This change allows for the * to be taken into account as a default if the given message topic is not found.

    HTTP Sink

    Bug fix to ensure that, if specified as part of the template, the Content-Type header is correctly populated.

    All Connectors:

    • Update of dependencies.

    If upgrading from any version prior to 7.0.0, please see the release and upgrade notes for 7.0.0.


    7.2.0

    Enhancements

    1. Automated Skip for Archived Objects:

    • The S3 source now seamlessly bypasses archived objects, including those stored in Glacier and Deep Archive. This enhancement improves efficiency by automatically excluding archived data from processing and avoids the connector crash that would otherwise occur.

    2. Enhanced Key Storage in Envelope Mode:

    • Changes have been implemented to the stored key when using envelope mode. These modifications lay the groundwork for future functionality, enabling seamless replay of Kafka data stored in data lakes (S3, GCS, Azure Data Lake) from any specified point in time.

    Full Changelog: https://github.com/lensesio/stream-reactor/compare/7.1.0...7.2.0


    7.1.0

    Source Line-Start-End Functionality Enhancements

    We’ve rolled out enhancements to tackle a common challenge faced by users of the S3 source functionality. Previously, when an external producer abruptly terminated a file without marking the end message, data loss occurred.

    To address this, we’ve introduced a new feature: a property entry for KCQL to signal the handling of unterminated messages. Meet the latest addition, read.text.last.end.line.missing. When set to true, this property ensures that in-flight data is still recognized as a message even when EOF is reached but the end line marker is missing.
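
    A sketch of the property in a KCQL statement, following the elided style used elsewhere in these notes:

      INSERT INTO ... SELECT * FROM ... PROPERTIES('read.text.last.end.line.missing'=true)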

    If upgrading from any version prior to 7.0.0, please see the release and upgrade notes for 7.0.0.


    7.0.0

    Data-lakes Sink Connectors

    This release brings substantial enhancements to the data-lakes sink connectors, elevating their functionality and flexibility. The focal point of these changes is the adoption of the new KCQL syntax, designed to improve usability and resolve limitations inherent in the previous syntax.

    Key Changes

    • New KCQL Syntax: The data-lakes sink connectors now embrace the new KCQL syntax, offering users enhanced capabilities while addressing previous syntax constraints.

    • Data Lakes Sink Partition Name: This update ensures accurate preservation of partition names by avoiding the stripping of characters like \ and /. Consequently, SMTs can provide partition names as expected, leading to reduced configuration overhead and increased conciseness.

    KCQL Keywords Replaced

    Several keywords have been replaced with entries in the PROPERTIES section for improved clarity and consistency:

    • WITHPARTITIONER: Replaced by PROPERTIES ('partition.include.keys'=true/false). When set to true (the equivalent of WITHPARTITIONER KeysAndValues), the partition keys are included in the partition path. Otherwise, only the partition values are included.

    • WITH_FLUSH_SIZE: Replaced by PROPERTIES ('flush.size'=$VALUE).

    • WITH_FLUSH_COUNT: Replaced by PROPERTIES ('flush.count'=$VALUE).

    • WITH_FLUSH_INTERVAL: Replaced by PROPERTIES ('flush.interval'=$VALUE).
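
    As an illustration, flush settings and partition-key inclusion that previously relied on the keywords above are now expressed as PROPERTIES entries (bucket, prefix, topic, and values are placeholders):

      INSERT INTO my-bucket:my-prefix
      SELECT * FROM my-topic
      PROPERTIES(
        'partition.include.keys'=true,
        'flush.count'=5000
      )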

    Benefits

    The adoption of the new KCQL syntax enhances the flexibility of the data-lakes sink connectors, empowering users to tailor configurations more precisely to their requirements. By transitioning keywords to entries in the PROPERTIES section, potential misconfigurations stemming from keyword order discrepancies are mitigated, ensuring configurations are applied as intended.

    Upgrades

    Please note that the upgrades to the data-lakes sink connectors are not backward compatible with existing configurations. Users are required to update their configurations to align with the new KCQL syntax and PROPERTIES entries. This upgrade is necessary for any instances of the sink connector (S3, Azure, GCP) set up before version 7.0.0.

    To upgrade to the new version, users must follow these steps:

    1. Stop the sink connectors.

    2. Update the connector to version 7.0.0.

    3. Edit the sink connectors’ KCQL setting to translate to the new syntax and PROPERTIES entries.

    4. Restart the sink connectors.


    6.3.1

    This update specifically affects datalake sinks employing the JSON storage format. It serves as a remedy for users who have resorted to a less-than-ideal workaround: employing a Single Message Transform (SMT) to return a Plain Old Java Object (POJO) to the sink. In such cases, instead of utilizing the Connect JsonConverter to seamlessly translate the payload to JSON, reliance is placed solely on Jackson.

    However, it’s crucial to note that this adjustment is not indicative of a broader direction for future expansions. This is because relying on such SMT practices does not ensure an agnostic solution for storage formats (such as Avro, Parquet, or JSON).

    6.3.0

    NEW: HTTP Sink Connector

    • HTTP Sink Connector (Beta)

    NEW: Azure Event Hubs Source Connector

    • Azure Event Hubs Source Connector.

    All Connectors:

    • Update of dependencies, CVEs addressed.

    Please note that the Elasticsearch 6 connector will be deprecated in the next major release.

    6.2.0

    NEW: GCP Storage Source Connector

    • GCP Storage Source Connector (Beta) Find out more.

    AWS Source Connector

    • Important: The AWS Source Partition Search properties have changed for consistency of configuration. The properties that have changed for 6.2.0 are:

      • connect.s3.partition.search.interval changes to connect.s3.source.partition.search.interval.

      • connect.s3.partition.search.continuous changes to connect.s3.source.partition.search.continuous.

      • connect.s3.partition.search.recurse.levels changes to connect.s3.source.partition.search.recurse.levels.

    • If you use any of these properties, when you upgrade to the new version then your source will halt and the log will display an error message prompting you to adjust these properties. Be sure to update these properties in your configuration to enable the new version to run.

    • Dependency upgrade of Hadoop libraries version to mitigate against CVE-2022-25168.

    JMS Source and Sink Connector

    • Jakarta Dependency Migration: Switch to Jakarta EE dependencies in line with industry standards to ensure evolution under the Eclipse Foundation.

    All Connectors:

    • There has been some small tidy up of dependencies, restructuring and removal of unused code, and a number of connectors have a slimmer file size without losing any functionality.

    • The configuration directory has been removed as these examples are not kept up-to-date. Please see the connector documentation instead.

    • Dependency upgrades.

    6.1.0

    All Connectors:

    • In this release, all connectors have been updated to address an issue related to conflicting Antlr jars that may arise in specific environments.

    AWS S3 Source:

    • Byte Array Support: Resolved an issue where storing the Key/Value as an array of bytes caused compatibility problems due to the connector returning java.nio.ByteBuffer while the Connect framework’s ByteArrayConverter only works with byte[]. This update ensures seamless conversion to byte[] if the key/value is a ByteBuffer.

    JMS Sink:

    • Fix for NullPointerException: Addressed an issue where the JMS sink connector encountered a NullPointerException when processing a message with a null JMSReplyTo header value.

    JMS Source:

    • Fix for DataException: Resolved an issue where the JMS source connector encountered a DataException when processing a message with a JMSReplyTo header set to a queue.

    AWS S3 Sink/GCP Storage Sink (beta)/Azure Datalake Sink (beta):

    • GZIP Support for JSON Writing: Added support for GZIP compression when writing JSON data to AWS S3, GCP Storage, and Azure Datalake sinks.

    6.0.2

    GCP Storage Sink (beta):

    • Improve support for handling GCP naming conventions

    6.0.1

    AWS S3 / Azure Datalake (Beta) / GCP Storage (Beta)

    • Removed check preventing nested paths being used in the sink.

    • Avoid cast exception in GCP Storage connector when using Credentials mode.

    6.0.0

    New Connectors introduced in 6.0.0:

    • Azure Datalake Sink Connector (Beta)

    • GCP Storage Sink Connector (Beta)

    Deprecated Connectors removed in 6.0.0:

    • Kudu

    • Hazelcast

    • Hbase

    • Hive

    • Pulsar

    All Connectors:

    • Standardising package names. Connector class names and converters will need to be renamed in configuration.

    • Some clean up of unused dependencies.

    • Introducing cloud-common module to share code between cloud connectors.

    • Cloud sinks (AWS S3, Azure Data Lake and GCP Storage) now support BigDecimal and handle nullable keys.

    5.0.1

    New Connectors introduced in 5.0.1:

    • Consumer Group Offsets S3 Sink Connector

    AWS S3 Connector - S3 Source & Sink

    • Enhancement: BigDecimal Support

    Redis Sink Connector

    • Bug fix: Redis does not initialise the ErrorHandler

    5.0.0

    All Connectors

    • Test Fixes and E2E Test Clean-up: Improved testing with bug fixes and end-to-end test clean-up.

    • Code Optimization: Removed unused code and converted Java code and tests to Scala for enhanced stability and maintainability.

    • Ascii Art Loading Fix: Resolved issues related to ASCII art loading.

    • Build System Updates: Implemented build system updates and improvements.

    • Stream Reactor Integration: Integrated Kafka-connect-query-language inside of Stream Reactor for enhanced compatibility.

    • STOREAS Consistency: Ensured consistent handling of backticks with STOREAS.

    AWS S3 Connector - S3 Source & Sink

    The source and sink have been the focus of this release.

    • Full message backup. The S3 sink and source now support full message backup. This is enabled by adding PROPERTIES('store.envelope'=true) to the KCQL.

    • Removed the Bytes_*** storage formats. For users leveraging them, there is migration information below. To store the raw Kafka message, the storage format should be AVRO, PARQUET, or JSON (less ideal).

    • Introduced support for BYTES, storing a single message as raw binary. Storing images or videos is the typical use case for this. This is enabled by adding STOREAS BYTES to the KCQL.

    • Introduced support for PROPERTIES to drive new settings required for the connectors' behaviour. The KCQL looks like this: INSERT INTO ... SELECT ... FROM ... PROPERTIES(property=value, ...)
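
    Putting these together, an illustrative sink mapping that enables full message backup might look like this (bucket, prefix, and topic are placeholders):

      INSERT INTO my-bucket:my-prefix
      SELECT * FROM my-topic
      STOREAS `AVRO`
      PROPERTIES('store.envelope'=true)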

    Sink

    • Enhanced PARTITIONBY Support: expanded support for PARTITIONBY fields, now accommodating fields containing dots. For instance, you can use PARTITIONBY a, `field1.field2` for enhanced partitioning control.

    • Advanced Padding Strategy: a more advanced padding strategy configuration. By default, padding is now enforced, significantly improving compatibility with S3 Source.

    • Improved Error Messaging: Enhancements have been made to error messaging, providing clearer guidance, especially in scenarios with misaligned topic configurations (#978).

    • Commit Logging Refactoring: Refactored and simplified the CommitPolicy for more efficient commit logging (#964).

    • Comprehensive Testing: Added additional unit testing around configuration settings, removed redundancy from property names, and enhanced KCQL properties parsing to support Map structures.

    • Consolidated Naming Strategies: Merged naming strategies to reduce code complexity and ensure consistency. This effort ensures that both hierarchical and custom partition modes share similar code paths, addressing issues related to padding and the inclusion of keys and values within the partition name.

    • Optimized S3 API Calls: Switched from using deleteObjects to deleteObject for S3 API client calls (#957), enhancing performance and efficiency.

    • JClouds Removal: The update removes the use of JClouds, streamlining the codebase.

    • Legacy Offset Seek Removal: The release eliminates legacy offset seek operations, simplifying the code and enhancing overall efficiency

    Source

    • Expanded Text Reader Support: new text readers to enhance data processing flexibility, including:

      • Regex-Driven Text Reader: Allows parsing based on regular expressions.

      • Multi-Line Text Reader: Handles multi-line data.

      • Start-End Tag Text Reader: Processes data enclosed by start and end tags, suitable for XML content.

    • Improved Parallelization: enhancements enable parallelization based on the number of connector tasks and available data partitions, optimizing data handling.

    • Data Consistency: Resolved data loss and duplication issues when the connector is restarted, ensuring reliable data transfer.

    • Dynamic Partition Discovery: No more need to restart the connector when new partitions are added; runtime partition discovery streamlines operations.

    • Efficient Storage Handling: The connector now ignores the .indexes directory, allowing data storage in an S3 bucket without a prefix.

    • Increased Default Records per Poll: the default limit on the number of records returned per poll was changed from 1024 to 10000, improving data retrieval efficiency and throughput.

    • Ordered File Processing: Added the ability to process files in date order. This feature is especially useful when S3 files lack lexicographical sorting, and S3 API optimisation cannot be leveraged. Please note that it involves reading and sorting files in memory.

    • Parquet INT96 Compatibility: The connector now allows Parquet INT96 to be read as a fixed array, preventing runtime failures.

    Kudu and Hive

    • The Kudu and Hive connectors are now deprecated and will be removed in a future release.

    InfluxDB

    • Fixed a memory issue with the InfluxDB writer.

    • Upgraded to Influxdb2 client (note: doesn’t yet support Influxdb2 connections).

    S3 upgrade notes

    Upgrading from 5.0.0 (preview) to 5.0.0

    For installations that have been using the preview version of the S3 connector and are upgrading to the release, there are a few important considerations:

    Previously, default padding was enabled for both "offset" and "partition" values starting in June.

    However, in version 5.0, the decision was made to apply default padding to the "offset" value only, leaving the "partition" value without padding. This change was made to enhance compatibility with querying in Athena.

    If you have been using a build from the master branch since June, your connectors might have been configured with a different default padding setting.

    To maintain consistency and ensure your existing connector configuration remains valid, you will need to use KCQL configuration properties to customize the padding fields accordingly. For example:

      INSERT INTO $bucket[:$prefix]
      SELECT *
      FROM $topic
      ...
      PROPERTIES(
        'padding.length.offset'=12,
        'padding.length.partition'=12
      )

    Upgrading from 4.x to 5.0.0

    Starting with version 5.0.0, the following configuration keys have been replaced.

    Field / Old Property / New Property:

    • AWS Secret Key: aws.secret.key → connect.s3.aws.secret.key

    • Access Key: aws.access.key → connect.s3.aws.access.key

    • Auth Mode: aws.auth.mode → connect.s3.aws.auth.mode

    • Custom Endpoint: aws.custom.endpoint → connect.s3.custom.endpoint

    • VHost Bucket: aws.vhost.bucket → connect.s3.vhost.bucket

    Upgrading from 4.1.* and 4.2.0

    In version 4.1, padding options were available but were not enabled by default. At that time, the default padding length, if not specified, was set to 8 characters.

    However, starting from version 5.0, padding is now enabled by default, and the default padding length has been increased to 12 characters.

    Enabling padding has a notable advantage: it ensures that the files written are fully compatible with the Lenses Stream Reactor S3 Source, enhancing interoperability and data integration.

    Sinks created with 4.2.0 and 4.2.1 should retain the padding behaviour and should therefore disable padding:

      INSERT INTO $bucket[:$prefix]
      SELECT *
      FROM $topic
      ...
      PROPERTIES (
        'padding.type'=NoOp
      )

    If padding was enabled in 4.1, then the padding length should be specified in the KCQL statement:

      INSERT INTO $bucket[:$prefix]
      SELECT *
      FROM $topic
      ...
      PROPERTIES (
        'padding.length.offset'=12,
        'padding.length.partition'=12
      )

    Upgrading from 4.x to 5.0.0 only when STOREAS Bytes_*** is used

    The Bytes_*** storage format has been removed. If you are using this storage format, you will need to install the 5.0.0-deprecated connector and upgrade the connector instances by changing the class name:

    Source Before:

      class.name=io.lenses.streamreactor.connect.aws.s3.source.S3SourceConnector
      ...

    Source After:

      class.name=io.lenses.streamreactor.connect.aws.s3.source.S3SourceConnectorDeprecated
      ...

    Sink Before:

      class.name=io.lenses.streamreactor.connect.aws.s3.sink.S3SinkConnector
      ...

    Sink After:

      class.name=io.lenses.streamreactor.connect.aws.s3.sink.S3SinkConnectorDeprecated

    The deprecated connector won’t be developed any further and will be removed in a future release. If you want to talk to us about a migration plan, please get in touch with us at [email protected].

    Upgrade a connector configuration

    To migrate to the new configuration, please follow the following steps:

    • stop all running instances of the S3 connector

    • upgrade the connector to 5.0.0

    • update the configuration to use the new properties

    • resume the stopped connectors


    4.2.0

    • All

      • Ensure connector version is retained by connectors

      • Lenses branding ASCII art updates

    • AWS S3 Sink Connector

      • Improves the error in case the input on BytesFormatWriter is not binary

      • Support for ByteBuffer which may be presented by Connect as bytes


    4.1.0

    Note: From version 4.1.0, AWS S3 Connectors will use the AWS Client by default. You can revert to the jClouds version by setting the connect.s3.aws.client property.

    • All

      • Scala upgrade to 2.13.10

      • Dependency upgrades

      • Upgrade to Kafka 3.3.0

      • SimpleJsonConverter - Fixes mismatching schema error.

    • AWS S3 Sink Connector

      • Add connection pool config

      • Add Short type support

      • Support null values

    • AWS S3 Source Connector

      • Add connection pool config

      • Retain partitions from filename or regex

      • Switch to AWS client by default

    • MQTT Source Connector

      • Allow toggling the skipping of MQTT Duplicates

    • MQTT Sink Connector

      • Functionality to ensure unique MQTT Client ID is used for MQTT sink

    • Elastic6 & Elastic7 Sink Connectors

      • Fixing issue with missing null values


    4.0.0

    • All

      • Scala 2.13 Upgrade

      • Gradle to SBT Migration

      • Producing multiple artifacts supporting both Kafka 2.8 and Kafka 3.1.

      • Upgrade to newer dependencies to reduce CVE count

      • Switch e2e tests from Java to Scala.

    • AWS S3 Sink Connector

      • Optimal seek algorithm

      • Parquet data size flushing fixes.

      • Adding date partitioning capability

    • FTP Source Connector

      • Fixes to slice mode support.

    • Hazelcast Sink Connector

      • Upgrade to HazelCast 4.2.4. The configuration model has changed and now uses clusters instead of username and password configuration.

    • Hive Sink Connector

      • Update of parquet functionality to ensure operation with Parquet 1.12.2.

      • Support for Hive 3.1.3.

    • JMS Connector

      • Enable protobuf support.

    • Pulsar

      • Upgrade to Pulsar 2.10 and associated refactor to support new client API.


    3.0.1

    • All

      • Replace Log4j with Logback to overcome CVE-2021-44228

      • Bringing code from legacy dependencies inside of project

    • Cassandra Sink Connector

      • Ensuring the table name is logged on encountering an InvalidQueryException

    • HBase Sink Connector

      • Alleviate possible race condition


    3.0.0

    • All

      • Move to KCQL 2.8.9

      • Change sys.errors to ConnectExceptions

      • Additional testing with TestContainers

      • Licence scan report and status

    • AWS S3 Sink Connector

      • S3 Source Offset Fix

      • Fix JSON & Text newline detection when running in certain Docker images

      • Byte handling fixes

    • AWS S3 Source Connector

      • Change order of match to avoid scala.MatchError

      • S3 Source rewritten to be more efficient and use the natural ordering of S3 keys

      • Region is necessary when using the AWS client

    • Cassandra Sink & Source Connectors

      • Add connection and read client timeout

    • FTP Connector

      • Support for Secure File Transfer Protocol

    • Hive Sink Connector

      • Array Support

      • Kerberos debug flag added

    • Influx DB Sink

      • Bump influxdb-java from version 2.9 to 2.29

      • Added array handling support

    • MongoDB Sink Connector

      • Nested Fields Support

    • Redis Sink Connector

      • Fix Redis Pubsub Writer

      • Add support for json and json with schema


    2.1.3

    • Move to connect-common 2.0.5 that adds complex type support to KCQL


    2.1.2

    • AWS S3 Sink Connector

      • Prevent null pointer exception in converters when maps are presented with null values

      • Offset reader optimisation to reduce S3 load

      • Ensuring that commit only occurs after the preconfigured time interval when using WITH_FLUSH_INTERVAL

    • AWS S3 Source Connector (New Connector)

    • Cassandra Source Connector

      • Add Bucket Timeseries Mode

      • Reduction of logging noise

      • Proper handling of uninitialized connections on task stop()

    • Elasticsearch Sink Connector

      • Update default port

    • Hive Sink

      • Improve Orc format handling

      • Fixing issues with partitioning by non-string keys

    • Hive Source

      • Ensuring newly written files can be read by the hive connector by introduction of a refresh frequency configuration option.

    • Redis Sink

      • Correct Redis writer initialisation


    2.1.0

    • AWS S3 Sink Connector

    • Elasticsearch 7 Support


    2.0.1

    • Hive Source

      • Rename option connect.hive.hive.metastore to connect.hive.metastore

      • Rename option connect.hive.hive.metastore.uris to connect.hive.metastore.uris

    • Fix Elastic start up NPE

    • Fix to correct batch size extraction from KCQL on Pulsar


    2.0.0

    • Move to Scala 2.12

    • Move to Kafka 2.4.1 and Confluent 5.4

    • Deprecated:

      • Druid Sink (not scala 2.12 compatible)

      • Elastic Sink (not scala 2.12 compatible)

      • Elastic5 Sink(not scala 2.12 compatible)

    • Redis

      • Add support for Redis Streams

    • Cassandra

      • Add support for setting the LoadBalancer policy on the Cassandra Sink

    • ReThinkDB

      • Use an SSL connection when ReThink initialises tables if SSL is set

    • FTP Source

      • Respect connect.ftp.max.poll.records when reading slices

    • MQTT Source

      • Allow lookup of avro schema files with wildcard subscriptions

    connect.s3.padding.strategy=NoOp
    ...
    Enabling Compression Codecs for Avro and Parquet
  • Switch to AWS client by default

  • Add option to add a padding when writing files, so that files can be restored in order by the source

  • Enable wildcard syntax to support multiple topics without additional configuration.

  • Adding switch to use official AWS library
  • Add AWS STS dependency to ensure correct operation when assuming roles with a web identity token.

  • Provide better debugging in case of exceptions.

  • Partitioning of nested data

  • Error handling and retry logic

  • Handle preCommit with null currentOffsets

  • Remove bucket validation on startup

  • Enabled simpler management of default flush values.

  • Local write mode - build locally, then ship

  • Deprecating old properties, however rewriting them to the new properties to ensure backwards compatibility.

  • Adding the capability to specify properties in yaml configuration

  • Rework exception handling. Refactoring errors to use Either[X,Y] return types where possible instead of throwing exceptions.

  • Ensuring task can be stopped gracefully if it has not been started yet

  • ContextReader testing and refactor

  • Adding a simple state model to the S3Writer to ensure that states and transitions are kept consistent. This can be improved in time.
