This feature can be enabled by setting the property “follower.fetch.pending.reads.insync.enable” to true. The default value is false, which preserves backward compatibility.
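As an illustration only, the flag could be read with a default of false along the following lines. This is a minimal sketch: the class `FollowerFetchConfigSketch` and the helper `isEnabled` are hypothetical names, and plain `java.util.Properties` is used as a stand-in for however the broker actually parses its configuration.

```java
import java.util.Properties;

// Hypothetical sketch; not part of the Kafka code base.
final class FollowerFetchConfigSketch {
    // Property name as proposed in the text above.
    static final String PENDING_READS_INSYNC = "follower.fetch.pending.reads.insync.enable";

    // Returns true only when the property is explicitly set to "true";
    // an absent property falls back to the backward-compatible default, false.
    static boolean isEnabled(Properties brokerProps) {
        return Boolean.parseBoolean(brokerProps.getProperty(PENDING_READS_INSYNC, "false"));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(isEnabled(props)); // prints "false" (default)

        props.setProperty(PENDING_READS_INSYNC, "true");
        System.out.println(isEnabled(props)); // prints "true"
    }
}
```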

This approach reduces the occurrence of offline partitions, but it does not eliminate them. Offline partitions can still occur when requests are queued up and the fetch requests already being processed in the IO threads are taking longer than expected. The subsequent requests may get stuck in the queue and may not be served before the ISR Expiration Task considers the corresponding followers out of sync, which eventually causes offline partitions.


This is an extension of Solution 1, with the leader additionally relinquishing leadership if a fetch request takes longer than expected. The leader also moves its broker id to the last position in the ISR sequence when sending the AlterISRRequest to ZooKeeper or the Controller. This prevents the controller from immediately choosing this broker as the leader again. This approach mitigates the case of requests piling up in the request queue, as mentioned earlier. A config for the timeout, with a default value, can be introduced.
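The ISR reordering step can be sketched as follows. This is only an illustration under stated assumptions: the class `IsrReorderSketch` and the method `moveToLast` are hypothetical names, broker ids are modeled as plain integers, and the sketch does not show how the reordered list would be carried in the AlterISRRequest.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not part of the Kafka code base.
final class IsrReorderSketch {
    // Returns a copy of the ISR with brokerId moved to the last position.
    // The relative order of the other replicas is preserved. If brokerId
    // is not in the ISR, the list is returned unchanged (as a copy).
    static List<Integer> moveToLast(List<Integer> isr, int brokerId) {
        List<Integer> reordered = new ArrayList<>(isr.size());
        boolean present = false;
        for (int id : isr) {
            if (id == brokerId) {
                present = true;
            } else {
                reordered.add(id);
            }
        }
        if (present) {
            reordered.add(brokerId);
        }
        return reordered;
    }

    public static void main(String[] args) {
        // Broker 1 is the slow leader giving up leadership.
        System.out.println(moveToLast(List.of(1, 2, 3), 1)); // prints "[2, 3, 1]"
    }
}
```

With the relinquishing broker last in the sequence, a controller that picks the first eligible replica from the ISR would select one of the other brokers first.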

The main disadvantage with this approach is that ISR thrashing may occur when PreferredLeaderElection is enabled and the currently affected leader is a preferred leader, or when there are intermittent issues across brokers.

Example:

<Topic, partition>: <events, 0>