This KIP proposes extracting the data format classes currently in the Kafka Connect API module into a standalone module.
Status
Current state: Under Discussion
Discussion thread: here
JIRA: here
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
In the current implementation, the data classes ship inside the connect-api module. Users outside the Kafka Connect world who want only these classes must pull in the whole module along with dependencies they do not need, which discourages broader adoption and integration. Moving these classes to a separate module offers the following benefits:
- More flexibility for developers
  - Developers can use these data classes outside of Kafka Connect
  - It becomes easier to create consistent data models across different applications
- Cleaner project structure
  - Allows independent versioning of the data format components
  - Creates a clearer separation between the different parts of the Kafka ecosystem
- Potential to become a standard for other tools and frameworks
  - It becomes easier for third-party tools to work with Kafka's data models
  - Supports a more generic data format
  - It helps different streaming platforms work together more smoothly
Public Interfaces
The package org.apache.kafka.connect.data and the classes in it will keep their names but will be moved to a separate module.
Proposed Changes
- Create a new module: connect-data
- Move all classes from the package org.apache.kafka.connect.data to this new module
- Move DataException, SchemaBuilderException, SchemaProjectorException from org.apache.kafka.connect.errors to org.apache.kafka.connect.data.errors in the connect-data module
  - DataException will no longer extend ConnectException, because connect-data can now also be used outside Kafka Connect
- Add the new connect-data module as a dependency of:
  - connect-api
  - connect-transforms
  - connect-json
  - connect-file
  - connect-runtime
  - connect-mirror
  - connect-test-plugins
  - jmh-benchmarks
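The exception change listed above (DataException no longer extending ConnectException) can be illustrated with a minimal, self-contained sketch. The classes below are stand-ins defined only for this example, not the real Kafka classes, and the choice of RuntimeException as the new parent is an assumption, since this KIP does not name the new superclass.

```java
// Sketch only: stand-in classes mirroring the hierarchy described in this KIP.
// None of these are the real Kafka classes.
public class ExceptionHierarchySketch {

    // Stand-in for org.apache.kafka.connect.errors.ConnectException.
    public static class ConnectException extends RuntimeException {
        public ConnectException(String message) { super(message); }
    }

    // Current layout: DataException is a ConnectException.
    public static class OldDataException extends ConnectException {
        public OldDataException(String message) { super(message); }
    }

    // Proposed layout: DataException no longer extends ConnectException.
    // Assumption: RuntimeException is the new parent; the KIP does not specify it.
    public static class NewDataException extends RuntimeException {
        public NewDataException(String message) { super(message); }
    }

    // Reports which catch clause handles the given exception.
    public static String caughtBy(RuntimeException toThrow) {
        try {
            throw toThrow;
        } catch (ConnectException e) {
            return "ConnectException";
        } catch (RuntimeException e) {
            return "RuntimeException";
        }
    }

    public static void main(String[] args) {
        System.out.println(caughtBy(new OldDataException("bad schema")));
        System.out.println(caughtBy(new NewDataException("bad schema")));
    }
}
```

Under these assumptions, the first call is handled by the ConnectException clause and the second falls through to the RuntimeException clause. Code that relies on catching ConnectException to handle data errors may therefore need review once the hierarchy changes.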
Compatibility, Deprecation, and Migration Plan
Nothing to manage since the change will be backward compatible.
Test Plan
The existing tests should be sufficient to detect any issues.
Rejected Alternatives
None