...

Note that we are not making any change to heartbeat.interval.ms. Although the increased session timeout allows for less frequent heartbeats, the heartbeat also serves the purpose of discovering that a rebalance is in progress. 
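As a rough illustration of how the two settings relate, the sketch below counts how many heartbeat intervals fit inside one session timeout. The values 45000 and 3000 are assumptions chosen for illustration and are not stated in this section; substitute the values from your own consumer configuration. The helper name heartbeats_per_session is hypothetical.

```python
# Illustrative consumer settings; the numeric values are assumptions.
consumer_config = {
    "session.timeout.ms": 45_000,
    "heartbeat.interval.ms": 3_000,
}


def heartbeats_per_session(config: dict) -> int:
    """Number of whole heartbeat intervals that fit in one session timeout."""
    return config["session.timeout.ms"] // config["heartbeat.interval.ms"]


print(heartbeats_per_session(consumer_config))  # 15
```

Raising only the session timeout increases this ratio, so a member tolerates more missed heartbeats before expiring, while the unchanged heartbeat interval still determines how often the consumer has a chance to notice a rebalance.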

Proposed Changes

There are no changes here beyond the change to the default session timeout for the consumer.

Compatibility, Deprecation, and Migration Plan

We don't foresee any complications with compatibility. New clients will take the new default and old clients will continue to use the old value.

Rejected Alternatives

An earlier iteration of this proposal considered a change to the behavior of group.min.session.timeout.ms and group.max.session.timeout.ms. Today, if a consumer attempts to join a group with a session timeout outside the range allowed by these configurations, the broker returns INVALID_SESSION_TIMEOUT in the JoinGroup response, which is treated as a fatal error. The idea was to let the coordinator automatically adjust the session timeout provided by the client to be within this range by rounding it to the nearest limit. For example, if the session timeout were less than group.min.session.timeout.ms, the coordinator would ignore the client-provided value and use group.min.session.timeout.ms. Both configurations would also have been made dynamic so that they could be changed without restarting brokers.

The motivation was to give operators more graceful options to restrict session timeout behavior. Today, if an operator wants to change either of these settings, there is no safe way to do it without potentially causing existing applications to fail. Automatic adjustment would give operators a way to change the allowed session timeout range without causing existing clients to fail.

Unfortunately, this does not work gracefully with all clients. In particular, librdkafka-based consumers enforce the session timeout locally. If the session timeout is reached without a response from the coordinator, then partitions are automatically revoked and the consumer rejoins the group as a new member. The coordinator, on the other hand, would still enforce the adjusted session timeout, which means that the rebalance would be delayed until the old member could be expired.
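
The rejected adjustment and the resulting mismatch can be sketched as follows. This is a minimal illustration, not broker code: the function names and the numeric limits are hypothetical, and the limits are not necessarily the broker defaults.

```python
def adjust_session_timeout(requested_ms: int, min_ms: int, max_ms: int) -> int:
    """Round the client-provided session timeout to the nearest allowed limit."""
    return max(min_ms, min(requested_ms, max_ms))


def expiry_lag_ms(client_timeout_ms: int, adjusted_timeout_ms: int) -> int:
    """How long the coordinator would keep the old member after a client that
    enforces its own timeout locally has already given up and rejoined."""
    return max(0, adjusted_timeout_ms - client_timeout_ms)


# Illustrative limits only (assumed values).
GROUP_MIN_MS = 6_000
GROUP_MAX_MS = 300_000

requested = 3_000  # below the assumed minimum
adjusted = adjust_session_timeout(requested, GROUP_MIN_MS, GROUP_MAX_MS)
lag = expiry_lag_ms(requested, adjusted)
print(adjusted, lag)  # 6000 3000
```

In this example the client would revoke its partitions and rejoin after 3 seconds, while the coordinator would wait the full 6 seconds before expiring the old member, so the rebalance is held up for the difference. The sketch only models the case where the timeout is adjusted upward.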