This KIP proposes extracting the data format classes currently in the Kafka Connect API module into a standalone module.

Status

Current state: Under Discussion

Discussion thread: here

JIRA: here

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

In the current implementation, the data classes are shipped within the connect-api module. Users outside Kafka Connect who want these classes must therefore pull in dependencies they do not need, which discourages broader adoption and integration. Moving these classes to a separate module would provide:

  • More flexibility for developers
    • Developers can use these data classes outside of Kafka Connect
    • It makes it easier to create consistent data models across different applications
  • Cleaner project structure
    • Allows independent versioning of data format components
    • Creates a clearer separation of different parts of the Kafka ecosystem
  • A potential standard for other tools and frameworks
    • It makes it easier for third-party tools to work with Kafka's data models
    • Supports a more generic data format
    • It helps different streaming platforms work together more smoothly

Public Interfaces

The package org.apache.kafka.connect.data and the classes in it will remain, but will be moved to a separate module.

Proposed Changes

  • Create a new module: connect-data
  • Move all classes from the package org.apache.kafka.connect.data to this new module
  • Move DataException, SchemaBuilderException, SchemaProjectorException from org.apache.kafka.connect.errors to org.apache.kafka.connect.data.errors in the connect-data module
  • DataException will no longer extend ConnectException, since connect-data can now also be used outside Kafka Connect
  • Add the new connect-data module as a dependency of:
    • connect-api
    • connect-transforms 
    • connect-json
    • connect-file
    • connect-runtime
    • connect-mirror
    • connect-test-plugins
    • jmh-benchmarks
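
The effect of detaching DataException from ConnectException can be illustrated with a minimal, self-contained sketch. The nested classes below are stand-ins with the same simple names, not the real Kafka classes, and the choice of RuntimeException as the new parent of DataException is an assumption, since this KIP does not name the new superclass. Under those assumptions, a catch clause for ConnectException no longer intercepts a DataException:

```java
// Illustrative stand-ins only; these are NOT the real Kafka Connect classes.
public class ExceptionHierarchySketch {

    /** Stand-in for org.apache.kafka.connect.errors.ConnectException. */
    static class ConnectException extends RuntimeException {
        ConnectException(String message) {
            super(message);
        }
    }

    /**
     * Stand-in for the relocated DataException.
     * Assumption: it extends RuntimeException directly; the KIP does not say.
     */
    static class DataException extends RuntimeException {
        DataException(String message) {
            super(message);
        }
    }

    /** Runs the action and reports which catch clause handled its exception. */
    static String handledBy(Runnable action) {
        try {
            action.run();
            return "none";
        } catch (ConnectException e) {
            return "ConnectException";
        } catch (DataException e) {
            // Reached only because DataException is no longer a ConnectException.
            return "DataException";
        }
    }

    public static void main(String[] args) {
        // prints "DataException"
        System.out.println(handledBy(() -> { throw new DataException("bad schema"); }));
        // prints "ConnectException"
        System.out.println(handledBy(() -> { throw new ConnectException("task failed"); }));
    }
}
```

Code that wants to handle both kinds of failure would need a separate catch clause for each type, or a multi-catch, as the sketch does.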

Compatibility, Deprecation, and Migration Plan

Nothing to manage since the change will be backward compatible.

Test Plan

The existing tests should be sufficient to detect any issues.

Rejected Alternatives

None
