- Status
- Motivation
- Proposed Changes
- Converters use case
- Compatibility, Deprecation, and Migration Plan
- Test Plan
- Rejected Alternatives
Status
Current state: "Under Discussion - Duplicate"
Discussion thread: here
JIRA: here
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
This KIP introduces support for the timestamp-micros logical type within Kafka Connect.
While formats such as Avro and Parquet support higher-precision timestamps, including microseconds (see https://avro.apache.org/docs/1.11.0/spec.html), Kafka Connect has been limited to millisecond precision via its Timestamp logical type.
As a result, timestamp data expressed in microseconds is truncated when passed between Kafka Connect sources and sinks, causing a loss of precision and data fidelity.
This change aims to extend the logical timestamp support in Kafka Connect to include timestamp-micros. By doing so, it ensures that the full precision of time data can be communicated accurately across the entire pipeline without any loss of information.
This change addresses the following:
Precision Loss: Many data storage formats, such as Avro, have long supported timestamp precision at the microsecond level. However, due to Kafka Connect's inability to handle timestamp-micros, this precision is lost when converting or sending data between Kafka and sink systems.
Increasing Adoption of High Precision Time: As data systems increasingly rely on high-precision timestamps (e.g., in systems like logs, event streams, or financial applications), it's crucial to maintain this precision throughout the entire data flow, especially when integrating with systems like Kafka Connect.
Supporting timestamp-micros in Kafka Connect will enhance compatibility with existing data storage formats, provide the ability to retain high-precision timestamps, and improve data quality.
Although KIP-808 introduced precision support for microseconds (micros) and nanoseconds (nanos), the Kafka Connect framework currently supports only a single timestamp logical type: Timestamp. Internally, this Timestamp type relies on java.util.Date, which is limited to millisecond precision. As a result, when Kafka Connect processes timestamps with microsecond or nanosecond precision, it inadvertently truncates the additional precision, treating the input as milliseconds. This leads to data corruption, as the original microsecond-level details are lost.
Proposed Changes
This KIP proposes adding a new logical type, TimestampMicros, to handle the microsecond-level precision supported by Avro and other formats. It uses java.time.Instant internally to represent microsecond epoch values.
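The sketch below illustrates the core conversion logic such a TimestampMicros type could use, modeled on the pattern of the existing org.apache.kafka.connect.data.Timestamp helper. The class and method names are illustrative, not final, and the Connect Schema/SchemaBuilder plumbing is omitted so the example stays self-contained:

```java
import java.time.Instant;

// Hedged sketch of a TimestampMicros logical type's conversion logic.
// The underlying Connect type would be int64: microseconds since the Unix epoch.
public class TimestampMicrosSketch {

    /** Instant -> microseconds since epoch (the serialized int64 value). */
    public static long fromLogical(Instant value) {
        return Math.addExact(
                Math.multiplyExact(value.getEpochSecond(), 1_000_000L),
                value.getNano() / 1_000L);
    }

    /** Microseconds since epoch -> Instant (the deserialized logical value). */
    public static Instant toLogical(long micros) {
        // floorDiv/floorMod keep pre-epoch (negative) values correct.
        return Instant.ofEpochSecond(Math.floorDiv(micros, 1_000_000L),
                                     Math.floorMod(micros, 1_000_000L) * 1_000L);
    }

    public static void main(String[] args) {
        Instant t = Instant.parse("2024-01-01T00:00:00.123456Z");
        long micros = fromLogical(t);
        System.out.println(micros);              // 1704067200123456
        System.out.println(toLogical(micros));   // 2024-01-01T00:00:00.123456Z
    }
}
```

Because java.time.Instant carries nanosecond resolution, the round trip through int64 micros is lossless for microsecond data, which java.util.Date cannot guarantee.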
This change only adds support for a new logical type; it does not modify any existing data types or logical types.
Converters use case
Currently, a converter such as the Avro converter does the following:
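As an illustration (a hedged sketch, not the actual AvroData source), the current dispatch recognizes only the millis logical type; the constant name mirrors the AVRO_LOGICAL_TIMESTAMP_MILLIS constant referenced in this section:

```java
// Illustrative sketch of the converter's current behavior: only
// timestamp-millis maps to a Connect logical type; everything else
// falls through to a plain int64 schema.
public class CurrentDispatchSketch {
    static final String AVRO_LOGICAL_TIMESTAMP_MILLIS = "timestamp-millis";

    /** Map an Avro logical type name to a Connect logical schema name, or null for plain int64. */
    public static String connectSchemaNameFor(String avroLogicalType) {
        if (AVRO_LOGICAL_TIMESTAMP_MILLIS.equals(avroLogicalType)) {
            return "org.apache.kafka.connect.data.Timestamp"; // existing millis logical type
        }
        return null; // anything else, including timestamp-micros, becomes plain int64 today
    }
}
```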
Here, because there is no class to handle the micros logical type, any logical type that does not match AVRO_LOGICAL_TIMESTAMP_MILLIS (for example, "timestamp-micros") simply falls back to a plain int64 builder. After this change, the micros case can be handled in the following manner:
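A hedged sketch of how the dispatch could branch once TimestampMicros exists (again illustrative, not the actual converter source; the AVRO_LOGICAL_TIMESTAMP_MICROS constant and the returned schema names are assumptions):

```java
// Illustrative sketch of the proposed dispatch: timestamp-micros gets its
// own branch instead of falling back to plain int64.
public class ProposedDispatchSketch {
    static final String AVRO_LOGICAL_TIMESTAMP_MILLIS = "timestamp-millis";
    static final String AVRO_LOGICAL_TIMESTAMP_MICROS = "timestamp-micros"; // assumed constant name

    /** Map an Avro logical type name to a Connect logical schema name, or null for plain int64. */
    public static String connectSchemaNameFor(String avroLogicalType) {
        if (AVRO_LOGICAL_TIMESTAMP_MILLIS.equals(avroLogicalType)) {
            return "org.apache.kafka.connect.data.Timestamp";        // existing millis path
        } else if (AVRO_LOGICAL_TIMESTAMP_MICROS.equals(avroLogicalType)) {
            return "org.apache.kafka.connect.data.TimestampMicros";  // proposed micros path
        }
        return null; // other int64 values remain plain int64, as today
    }
}
```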
Once this KIP is approved and merged, a follow-up fix will be applied to the KIP-808 changes in the following way:
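KIP-808 introduced the unix.precision configuration for the TimestampConverter SMT; with TimestampMicros available, the micros path could produce a full-precision value instead of truncating to java.util.Date. A minimal sketch under those assumptions (method and class names are illustrative, not the actual SMT source):

```java
import java.time.Instant;

// Hedged sketch: converting a unix epoch long to a full-precision Instant
// based on the KIP-808 unix.precision setting, so micros are no longer
// truncated to milliseconds.
public class UnixPrecisionSketch {
    public static Instant toInstant(long value, String unixPrecision) {
        switch (unixPrecision) {
            case "seconds":
                return Instant.ofEpochSecond(value);
            case "milliseconds":
                return Instant.ofEpochMilli(value);
            case "microseconds":
                return Instant.ofEpochSecond(Math.floorDiv(value, 1_000_000L),
                                             Math.floorMod(value, 1_000_000L) * 1_000L);
            case "nanoseconds":
                return Instant.ofEpochSecond(Math.floorDiv(value, 1_000_000_000L),
                                             Math.floorMod(value, 1_000_000_000L));
            default:
                throw new IllegalArgumentException("Unknown precision: " + unixPrecision);
        }
    }
}
```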
Compatibility, Deprecation, and Migration Plan
No breaking changes.
Users upgrading an existing MirrorMaker 2 (MM2) cluster do not need to change any configurations.
The change is backward compatible: existing data will continue to function as expected.
Users who want to use TimestampMicros simply need to upgrade their Kafka Connect client.
Test Plan
Unit Tests: Ensure correctness of TimestampMicros conversions.
Rejected Alternatives
Reusing the existing Timestamp logical type: this would lead to data corruption and precision loss when handling microsecond timestamps. Since java.util.Date only supports milliseconds, the microsecond data would be lost.