Attendees:

  • JV, Bobby, Kishore, Matteo, Francesco, Zhimerg, Sam, Enrico, Charan, Yiming

 Agenda:

  • Discuss Netty 4
  • Discuss: Write/AddEntry failures beyond the Ack Quorum responses are ignored by the client. Replication scans to detect these are expensive and NOT bulletproof. The scan runs only once a week and checks only the first and last entry in the segment, which may not be enough. This is especially true if the bookie is up but a ledger dir on the bookie was deleted, or in some other scenario where an under-replicated hole can go undetected.
  • Testing: Existing JUnit tests don't come close to giving confidence for a distributed storage system. What kinds of destructive, consistency, and constraint-guarantee tests are done at Twitter/Yahoo? As a community, what else can we do? Jepsen? Runway? What does the community think? How can we improve this area?
  • Discuss next release plan.
  • (BOOKKEEPER-1040) Use separate log for compaction - Salesforce used different techniques to address this problem; we can discuss.

  • Salesforce started looking into having multiple entrylogs.
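
The agenda item on replication scans says that checking only the first and last entry of a segment may miss an under-replicated hole. The sketch below illustrates that point only. It is not BookKeeper code: the helper names (spot_check, full_check) and the model of a replica as a set of entry ids are assumptions made for illustration.

```python
def spot_check(stored, first, last):
    """Hypothetical check that looks only at the first and last entry ids."""
    return first in stored and last in stored


def full_check(stored, first, last):
    """Hypothetical full scan: list every entry id in [first, last] that is missing."""
    return [e for e in range(first, last + 1) if e not in stored]


# A replica (modelled as a set of entry ids) that lost entries 40..59
# but still holds both ends of the segment.
stored = set(range(0, 100)) - set(range(40, 60))

print(spot_check(stored, 0, 99))       # True: the hole goes undetected
print(len(full_check(stored, 0, 99)))  # 20: the full scan finds the missing entries
```

The full scan reads every entry, which is why the notes describe such scans as expensive.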

 

Open JIRA issues for 4.5.0:


 

Open Pull Requests:

https://github.com/apache/bookkeeper/pulls

 

Discussion:

 

  • Netty 4: problems with pause/restore handlers. Kishore will look into the Yahoo/Matteo changes in order to get the tests working. Some flaky tests are failing due to lost entries/acks (especially SlowBookieTest).
  • Replicator: problem with long-lived ledgers. We are now aware of lost entries due to corrupted disks. There is no way to detect these failures.
    • (Yiming) Twitter: lost-disk -> drop the bookie -> the replicator will substitute the bookie
  • JV: Test frameworks for corruption/failures:
    • Twitter: dedicated clusters for tests, inject failures at pub/sub level and at SO/TCP level
    • Yahoo: inject failures at pub/sub level
    • MagNews: inject failures at application level
  • Sam at Salesforce: fault injection at application level. Create dedicated clusters of machines to run system/integration tests. Sam started working on
  • JV: I wonder if anyone could volunteer to study Jepsen and Runway and how to use them for BookKeeper
  • Bobby: it is just a matter of resources. In the short term we have to finish the 'pushback'
  • Enrico will look at it, but not in the next few days; let's start a mailing list thread
  • Sam: it is important to establish a common way (foundation) to set up integration/system tests
  • Charan asks for reviews of his pending PRs, in particular BOOKKEEPER-1028 and BOOKKEEPER-1029: https://github.com/apache/bookkeeper/pull/127
    • Matteo suggests disabling autoreplication during the upgrade of a single bookie machine
    • JV: currently, to disable replication you have to disable it on every bookie
    • Matteo: replication triggered by fault detection is different from a manual upgrade, which is a controlled operation
    • Charan: replication can be disabled at the ZooKeeper level, so it is simple to turn off replication for the whole cluster
    • Matteo: that is not ideal. It would be better to mark the single bookie to be skipped by the replicator
    • Charan: Please merge this pull request. Enrico is going to do it tomorrow.
    • Matteo: We should have a better way to decommission a bookie. We need a way to inform the auditor which bookie is going out, so that the auditor generates under-replication extents only for that bookie. Right now, all we are doing is triggering the bookieDelay node in ZK, which asks the auditor to override the delay and start scanning immediately.
    • JV: We all agreed that we need a better way to handle delay/bookie and to ask the auditor to start replicating a particular bookie as part of the decommissioning workflow. But what Salesforce/Charan did is not bad and is a great interim solution, so we will move forward with it.

 
