Status
Current state: Accepted
Discussion thread: https://lists.apache.org/thread/zg5jbs4ogqgv7d9qwzvb5rp5vd5y2soc
Vote thread: https://lists.apache.org/thread/s3xo06ow8xz1vsg71lwkjn04qbklny3w
JIRA: KAFKA-17057
Released:
Motivation
With KAFKA-16508, we changed Kafka Streams to call the ProductionExceptionHandler for one special case: a retriable TimeoutException thrown for an output topic that may be missing (we cannot know for certain, because metadata propagation is asynchronous). The change breaks an otherwise infinite retry loop.
However, this behavior is not very flexible, because some users might want to keep retrying instead.
Public Interfaces
Add a new return option RETRY to the existing ProductionExceptionHandlerResponse:
public interface ProductionExceptionHandler extends Configurable {
    enum ProductionExceptionHandlerResponse {
        // existing options
        /* continue processing */
        CONTINUE(0, "CONTINUE"),
        /* fail processing */
        FAIL(1, "FAIL"),
        // newly added option
        /* retry the operation -- might imply throwing a TaskCorruptedException and retrying from the last committed offset;
           only valid to return this option if the passed in exception is a RetriableException;
           if returned for a non-retriable exception, it will be interpreted as FAIL */
        RETRY(2, "RETRY");
    }
}
Proposed Changes
We propose to add a new option ProductionExceptionHandlerResponse.RETRY that a production exception handler can return for RetriableException. If this option is returned for a non-retriable exception, it will be interpreted as FAIL.
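The interpretation rule above can be illustrated with a minimal, self-contained sketch. It does not use the Kafka classes: Response, RetriableException, and resolve() are hypothetical stand-ins introduced only for this illustration, and the actual runtime logic may be structured differently.

```java
// Illustration only: stand-in types, NOT the Kafka Streams classes.
public class RetryResolutionSketch {

    // Hypothetical stand-in for ProductionExceptionHandlerResponse.
    enum Response { CONTINUE, FAIL, RETRY }

    // Hypothetical stand-in for org.apache.kafka.common.errors.RetriableException.
    static class RetriableException extends RuntimeException {
        RetriableException(String message) {
            super(message);
        }
    }

    // Downgrades RETRY to FAIL when the exception is not retriable;
    // every other response is passed through unchanged.
    static Response resolve(Response handlerResponse, Exception exception) {
        if (handlerResponse == Response.RETRY && !(exception instanceof RetriableException)) {
            return Response.FAIL;
        }
        return handlerResponse;
    }

    public static void main(String[] args) {
        // Retriable error: RETRY is honored.
        System.out.println(resolve(Response.RETRY, new RetriableException("timeout")));
        // Non-retriable error: RETRY is interpreted as FAIL.
        System.out.println(resolve(Response.RETRY, new IllegalStateException("fatal")));
        // Other responses are unaffected.
        System.out.println(resolve(Response.CONTINUE, new IllegalStateException("fatal")));
    }
}
```

Running main prints RETRY, FAIL, and CONTINUE, in that order.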
We further propose to update the logic of the existing (and default) DefaultProductionExceptionHandler to check for retriable exceptions and return RETRY instead of FAIL. While we consider the change of KAFKA-16508 a bug fix, updating the existing handler preserves backward compatibility and seems to provide a better default behavior.
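A rough sketch of the proposed default decision follows. The real handler method takes additional arguments (such as the failed record), which are omitted here. All types in the sketch are hypothetical stand-ins rather than the Kafka classes, and the actual implementation may differ.

```java
// Illustration only: stand-in types, NOT the Kafka Streams classes.
public class DefaultHandlerSketch {

    // Hypothetical stand-in for ProductionExceptionHandlerResponse.
    enum Response { CONTINUE, FAIL, RETRY }

    // Hypothetical stand-in for org.apache.kafka.common.errors.RetriableException.
    static class RetriableException extends RuntimeException {
        RetriableException(String message) {
            super(message);
        }
    }

    // Hypothetical stand-in for the retriable TimeoutException.
    static class TimeoutException extends RetriableException {
        TimeoutException(String message) {
            super(message);
        }
    }

    // Proposed default: retry retriable errors, fail on everything else.
    static Response handle(Exception exception) {
        return exception instanceof RetriableException ? Response.RETRY : Response.FAIL;
    }

    public static void main(String[] args) {
        // A timeout (for example, for a possibly missing output topic) is retried.
        System.out.println(handle(new TimeoutException("topic metadata not available")));
        // Any non-retriable error still fails, as before.
        System.out.println(handle(new IllegalArgumentException("record too large")));
    }
}
```

Running main prints RETRY followed by FAIL. A user who prefers to stop on such timeouts could still plug in a custom handler that returns FAIL.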
Compatibility, Deprecation, and Migration Plan
We only add a new return option, and thus no backward compatibility concerns arise.
Test Plan
Regular unit and integration testing is sufficient.
Documentation Plan
Update relevant JavaDocs and the web page docs.
Rejected Alternatives
We propose to interpret RETRY as FAIL for non-retriable exceptions. An alternative would be to add a new method to ProductionExceptionHandler that is called for retriable errors only, together with a new RetriableResponse enum, so that the RETRY option is offered only on the new enum (which the newly added method of ProductionExceptionHandler would return). While this alternative might express the semantics a little more strictly, it seems overkill to expand the API surface area, and the proposed interpretation of RETRY as FAIL seems sound.