Status
Current state: "Under Discussion"
Discussion thread: here
JIRA: here
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
Currently, the ProducerPerformance tool supports transactional producers but cannot randomly abort transactions during testing. This prevents developers from evaluating how transactional producers behave under failure scenarios.
This proposal targets four critical areas:
- Measuring the overhead costs of aborting transactions;
- Assessing how failed transactions affect producer throughput;
- Validating transaction recovery processes;
- Benchmarking transactional producers under realistic fault conditions.
Public Interfaces
A single new command-line option is introduced: --transaction-abort-ratio
| Property | Description |
|---|---|
| Type | Double |
| Range | 0.0 to 1.0 |
| Default | 0.0 (no transactions aborted) |
| Dependency | Only valid when transactions are enabled |
- At `1.0`: all transactions are aborted;
- At `0.5`: approximately half of the transactions are aborted;
- At `0.0` (default): behavior is identical to the current version.
Proposed Changes
1. Argument parsing
Add the --transaction-abort-ratio argument after the existing --transaction-duration-ms argument:
parser.addArgument("--transaction-abort-ratio")
        .action(store())
        .required(false)
        .type(Double.class)
        .metavar("TRANSACTION-ABORT-RATIO")
        .dest("transactionAbortRatio")
        .setDefault(0.0)
        .help("The ratio of transactions to abort during the test. "
                + "The value should be between 0.0 and 1.0. "
                + "This option is only valid when transactions are enabled.");
2. Configuration validation
Add a new field double transactionAbortRatio with the following validations:
- Throw `ArgumentParserException` if the value is outside the [0.0, 1.0] range;
- Throw `ArgumentParserException` if `transactionAbortRatio` is greater than 0.0 but transactions are not enabled, since aborting without transactions is meaningless.
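The two checks above could be sketched roughly as follows. This is a minimal, dependency-free illustration rather than the tool's actual code: the class `AbortRatioValidation`, the method `validate`, and the `transactionsEnabled` flag are hypothetical names, and `IllegalArgumentException` stands in for argparse4j's `ArgumentParserException` so the snippet compiles without extra libraries.

```java
// Hypothetical helper for illustration only; not part of ProducerPerformance.
// IllegalArgumentException stands in for ArgumentParserException.
final class AbortRatioValidation {
    private AbortRatioValidation() { }

    static double validate(double transactionAbortRatio, boolean transactionsEnabled) {
        // Written in negated form so that NaN, which fails every comparison,
        // is rejected together with out-of-range values.
        if (!(transactionAbortRatio >= 0.0 && transactionAbortRatio <= 1.0)) {
            throw new IllegalArgumentException(
                "--transaction-abort-ratio must be between 0.0 and 1.0, got "
                    + transactionAbortRatio);
        }
        // A non-zero ratio is meaningless when there are no transactions to abort.
        if (transactionAbortRatio > 0.0 && !transactionsEnabled) {
            throw new IllegalArgumentException(
                "--transaction-abort-ratio requires transactions to be enabled");
        }
        return transactionAbortRatio;
    }
}
```

Note that the default of 0.0 passes both checks whether or not transactions are enabled, which keeps existing invocations valid.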
3. Abort logic
There are currently two places where commitTransaction() is called:
- In-loop: when the transaction duration exceeds `transactionDurationMs`
- Post-loop: to handle remaining uncommitted records
At both commit points, use the existing SplittableRandom instance to randomly decide whether to commit or abort based on the ratio:
if (random.nextDouble() < config.transactionAbortRatio) {
    producer.abortTransaction();
} else {
    producer.commitTransaction();
}
Note: The current code uses a fixed seed (new SplittableRandom(0)). The abort decision reuses the same instance, so for a given ratio and record count, results are deterministic and reproducible.
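To illustrate the reproducibility claim, the sketch below records the abort decisions drawn from a seeded `SplittableRandom`. The class `AbortDecisionSketch` and its methods are hypothetical names introduced here. The sketch assumes the generator is used only for abort decisions; any other draws made from a shared instance would shift which values the abort decisions see, although the run as a whole would remain reproducible as long as the order of draws is the same.

```java
import java.util.SplittableRandom;

// Hypothetical illustration; not part of ProducerPerformance.
final class AbortDecisionSketch {
    private AbortDecisionSketch() { }

    // Same comparison as in the proposal. nextDouble() returns a value in
    // [0.0, 1.0), so a ratio of 0.0 never aborts and a ratio of 1.0 always aborts.
    static boolean shouldAbort(SplittableRandom random, double transactionAbortRatio) {
        return random.nextDouble() < transactionAbortRatio;
    }

    // Returns one decision per transaction: true means abort, false means commit.
    static boolean[] decisions(long seed, double transactionAbortRatio, int transactions) {
        SplittableRandom random = new SplittableRandom(seed);
        boolean[] result = new boolean[transactions];
        for (int i = 0; i < transactions; i++) {
            result[i] = shouldAbort(random, transactionAbortRatio);
        }
        return result;
    }
}
```

Calling `decisions(0L, 0.5, n)` twice yields the same sequence, because both generators start from the same seed and make the same draws.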
4. Warmup phase behavior
Transactions during the warmup phase are also subject to the --transaction-abort-ratio. The warmup phase (introduced in KAFKA-17645) is designed to bring the system into a steady state before collecting performance statistics. If the abort ratio were only applied during the steady-state phase, the system would transition from an all-commit warmup to a mixed commit/abort steady state, introducing an additional settling period that undermines the purpose of warmup. Applying the same abort ratio during warmup ensures the system has already stabilized under the target conditions when steady-state measurement begins.
Compatibility, Deprecation, and Migration Plan
This feature is fully backward compatible:
- When --transaction-abort-ratio is not specified, the tool behaves exactly as before;
- No existing configuration options are changed or removed;
- All existing functionality remains intact.
Test Plan
Unit tests will cover the following scenarios:
- Argument parsing: Verify that `--transaction-abort-ratio 0.5` is parsed correctly
- Range validation: Verify that values outside [0.0, 1.0] (e.g., -0.1, 1.5) throw an error
- Transaction dependency validation: Verify that setting a non-zero abort ratio without enabling transactions throws an error
- Abort logic:
  - With ratio 0.0, all transactions are committed (`commitTransaction()` is called)
  - With ratio 1.0, all transactions are aborted (`abortTransaction()` is called)
  - With intermediate ratios, verify that the number of commit and abort calls matches the expected proportion
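One possible shape for the abort-logic checks is sketched below, using a counting fake in place of a real producer. Everything here is an assumption made for illustration: the `TransactionEnder` interface, the `CountingEnder` fake, the `run` helper, and the tolerance-based comparison are hypothetical, and the actual unit tests may well use a mocking framework instead. Because an intermediate ratio only yields an approximate proportion, the sketch compares the observed abort fraction against the ratio within a tolerance rather than requiring an exact count.

```java
import java.util.SplittableRandom;

// Hypothetical test scaffolding; names are illustrative, not from the Kafka code base.
final class AbortRatioProportionSketch {
    private AbortRatioProportionSketch() { }

    // Stand-in for the two producer calls made at each commit point.
    interface TransactionEnder {
        void commitTransaction();
        void abortTransaction();
    }

    // Fake that only counts how often each call is made.
    static final class CountingEnder implements TransactionEnder {
        int commits;
        int aborts;

        @Override
        public void commitTransaction() { commits++; }

        @Override
        public void abortTransaction() { aborts++; }
    }

    // Same decision as in the proposal's abort logic.
    static void endTransaction(TransactionEnder ender, SplittableRandom random, double ratio) {
        if (random.nextDouble() < ratio) {
            ender.abortTransaction();
        } else {
            ender.commitTransaction();
        }
    }

    // Ends the given number of transactions with a fixed seed of 0.
    static CountingEnder run(double ratio, int transactions) {
        SplittableRandom random = new SplittableRandom(0);
        CountingEnder ender = new CountingEnder();
        for (int i = 0; i < transactions; i++) {
            endTransaction(ender, random, ratio);
        }
        return ender;
    }

    // True when the observed abort fraction is within the tolerance of the ratio.
    static boolean withinTolerance(CountingEnder ender, double ratio, double tolerance) {
        double observed = (double) ender.aborts / (ender.commits + ender.aborts);
        return Math.abs(observed - ratio) <= tolerance;
    }
}
```

With a large number of transactions (for example 10,000) a tolerance of a few percentage points leaves a wide margin, while the boundary ratios 0.0 and 1.0 can be asserted exactly.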
Rejected Alternatives
None