...
| Use case | Goal | Solution with LogAppendTime index | Solution with CreateTime index | Comparison |
|---|---|---|---|---|
| 1. Search by timestamp | Do not lose messages. | If a user wants to search for a message with CreateTime CT, they can use CT to search the LogAppendTime index, because LogAppendTime > CT for the same message (assuming no clock skew). If the clocks are skewed, they can search with CT - X, where X is the maximum skew. If a user wants to search for a message with LogAppendTime LAT, they can simply search with LAT and get millisecond accuracy. | The user can simply search with CT and get an offset at minute-level granularity. | If the latency in the pipeline is greater than one minute, the user might consume fewer messages by using the CreateTime index; otherwise the LogAppendTime index is probably preferred. Consider a user who wants to search with CT after having consumed m2: they will have to re-consume from m1, and depending on how large LAT2 - LAT1 is, the number of messages to re-consume can be very large. |
| 2. Search by timestamp (bootstrap) | | In the bootstrap case, all the LATs would be close. For example, if a user wants to process the data from the last 3 days and bulk-loads it into Kafka, the LogAppendTime index does not help much; the user needs to filter out data older than 3 days before dumping it into Kafka. | In the bootstrap case, the CreateTime does not change if the user follows the same procedure described for the LogAppendTime index, so searching by timestamp will work. | The LogAppendTime index needs further attention from the user. |
| 3. Failover from cluster 1 to cluster 2 | | Similar to search by timestamp: the user can use the CT or the LAT of cluster 1 to search on cluster 2. Searching with CT - MaxLatencyOfCluster provides a strong guarantee of not losing messages, but may produce some duplicates depending on the difference in latency between cluster 1 and cluster 2. | The user can search with CT and get minute-level granularity; duplicates are still unavoidable. There are some tricky cases here. Consider the following [1]: m1 is created before m2, but due to latency differences m1 arrives at cluster 1 before m2 does, while m2 arrives at cluster 2 before m1 does. If a consumer consumed m2 on cluster 2 and fails over to cluster 1, simply searching by CT2 will miss m1, because m1 has a larger offset than m2 on cluster 2 but a smaller offset than m2 on cluster 1. So the same trick of searching with CT - MaxLatencyOfCluster is still needed. | In the cross-cluster failover case, both solutions can provide a strong guarantee of not losing messages, but both depend on knowledge of MaxLatencyOfCluster. |
| 4. Get lag for consumers by time | Know how long a consumer is lagging, in time. | With LogAppendTime in the message, the consumer can easily find out its lag in time and estimate how long it might take to catch up to the log end. | Not supported. | |
| 5. Broker-side latency metric | Let the broker report the latency of each topic, i.e. LAT - CT. | The latency can simply be reported as LAT - CT. | The latency can be reported as System.currentTimeMillis() - CT. | The two solutions are equivalent. This latency information can be used as MaxLatencyOfCluster in use case 3. |
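The back-off trick from use case 1 can be sketched in a few lines. This is a minimal illustration, not Kafka code: `lat_index` is a hypothetical stand-in for the broker's time index, modeled as a sorted list of (LogAppendTime, offset) pairs.

```python
from bisect import bisect_left

def search_by_create_time(lat_index, ct, max_skew_ms=0):
    """Find the first offset that is safe to consume from when searching a
    LogAppendTime index with a CreateTime CT.

    Because LAT >= CT for the same message (up to clock skew), backing the
    search target off by the maximum skew guarantees no message is missed.
    `lat_index` is a sorted list of (log_append_time_ms, offset) pairs.
    """
    target = ct - max_skew_ms
    # First index entry whose LogAppendTime is >= target.
    i = bisect_left(lat_index, (target,))
    if i == len(lat_index):
        return None  # every indexed message was appended before the target
    return lat_index[i][1]
```

Searching with a LogAppendTime LAT is just the special case `max_skew_ms=0`.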
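The failover pitfall in use case 3 can be simulated directly. The logs below are hypothetical: each cluster's log is a list of (message id, CreateTime) pairs in arrival (offset) order, and `consume_from_create_time` stands in for a CreateTime index search.

```python
def consume_from_create_time(log, ts):
    """Find the first offset whose CreateTime is >= ts and consume
    everything from there on (a simplified CreateTime index search)."""
    for offset, (msg_id, ct) in enumerate(log):
        if ct >= ts:
            return [m for m, _ in log[offset:]]
    return []

# m1 (CT=100) is created before m2 (CT=110); due to pipeline latency,
# m1 reaches cluster 1 first, so it sits at the smaller offset there.
cluster1 = [("m1", 100), ("m2", 110)]

# A consumer that last saw m2 on cluster 2 fails over to cluster 1.
# Resuming at CT2 alone skips m1:
missed = consume_from_create_time(cluster1, 110)      # ["m2"]
# Backing off by MaxLatencyOfCluster (assumed 20 ms here) recovers it:
safe = consume_from_create_time(cluster1, 110 - 20)   # ["m1", "m2"]
```

The duplicates mentioned in the table show up here too: backing off replays m2, so the consumer must tolerate re-delivery.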
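Use cases 4 and 5 reduce to simple arithmetic on the two timestamps. A sketch (plain helper functions, not broker code; the running maximum is one hypothetical way to derive the MaxLatencyOfCluster value needed in use case 3):

```python
import time

def append_latency_ms(lat_ms, ct_ms):
    """Per-message pipeline latency as seen by the broker: LAT - CT."""
    return lat_ms - ct_ms

def consumer_time_lag_ms(last_consumed_lat_ms, now_ms=None):
    """How far behind the log end a consumer is, in wall-clock time.
    (With only CreateTime, the broker would use current time - CT instead.)"""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms - last_consumed_lat_ms

def max_latency_ms(messages):
    """Running maximum of LAT - CT over (lat_ms, ct_ms) pairs,
    usable as an estimate of MaxLatencyOfCluster."""
    return max(append_latency_ms(lat, ct) for lat, ct in messages)
```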
...