Hadoop 3 release status updates

2017-12-14

Hadoop 3.0.0 has been released! Thanks to our many contributors, and congratulations to the community on this milestone.

Thus ends our 3.0.0 release updates. The 3.0.x release series will continue to be maintained, and 3.1.0 is planned for the first half of 2018.

2017-12-01

Haven't written one of these in a month. I had high hopes for RC0, but it failed due to HADOOP-15058 (create-release site build outputs dummy shaded jars due to skipShade), which Sangjin found, and a number of other blockers were found shortly after that.

We're back to blocker burndown. My new (realistic) goal is to get 3.0.0 out before Christmas. We could always use more help with reviews; most things are patch available.

 

Highlights:

Red flags:

Previously tracked blockers that have been resolved or dropped:

GA blockers:

  • HDFS-12840 - Creating a file with non-default EC policy in a EC zone is not correctly serialized in the editlog Resolved : Has gone through several rounds of review, looks close.
  • HADOOP-15080 - Aliyun OSS: update oss sdk from 2.8.1 to 2.8.3 to remove its dependency on Cat-x "json-lib" Resolved : New issue, waiting on LEGAL but we might need to pull this entire feature.
  • HADOOP-15059 - 3.0 deployment cannot work with old version MR tar ball which breaks rolling upgrade Resolved : Has gone through some review and has a +1 from Daryn, could use confirmation from Vinod and others
  • HADOOP-15058 - create-release site build outputs dummy shaded jars due to skipShade Resolved : Needs review, asked Allen but might need someone else to help.

GA criticals:

  • HDFS-12872 - EC Checksum broken when BlockAccessToken is enabled Resolved : Patch needs review
  • YARN-7381 - Enable the configuration: yarn.nodemanager.log-container-debug-info.enabled by default in yarn-default.xml Reopened : Has gone through some review and Wangda +1'd, could use confirmation from Ray and others

Features merged for GA:

  • Erasure coding
    • Testing is still ongoing at Cloudera, which resulted in HDFS-12840 (Creating a file with non-default EC policy in a EC zone is not correctly serialized in the editlog) and HDFS-12872 (EC Checksum broken when BlockAccessToken is enabled).
  • Classpath isolation (HADOOP-11656)
    • No change.
  • Compat guide (HADOOP-13714)
    • We slid a couple more changes into 3.0.0 after RC0 was cancelled, making this work more complete.
  • TSv2 alpha 2
    • No change.
  • API-based scheduler configuration  YARN-5734 - OrgQueue for easy CapacityScheduler queue configuration management Resolved
    • No change.
  • HDFS router-based federation HDFS-10467 - Router-based HDFS federation Resolved
    • No change.
  • Resource types  YARN-3926 - Extend the YARN resource model for easier resource-type management and profiles Resolved
    • Had some post-merge issues that were resolved, nothing outstanding.

2017-10-31

Lots of progress towards GA; we look on track for cutting RC0 this week. I ran the versions script to check that the branch matches up with JIRA and fixed things up, and also checked that the changelog and release notes look reasonable.

Highlights:

  • Resource types vote has passed and will be merged with branch-3.0 shortly.
  • Down to three blockers on the dashboard, all being actively revved.

Red flags:

  • Still need to validate that resource types is ready to go once it's merged.

Previously tracked GA blockers that have been resolved or dropped:

  • Change of ExecutionType
    • YARN-7178 - Add documentation for Container Update API Resolved : Arun got the patch in with reviews from Wangda and Haibo.
  • ReservationSystem
    • YARN-4827 - Document configuration of ReservationSystem for FairScheduler Resolved : Yufei and Subru got this in.
  • Rolling upgrade
    • YARN-6142 - Support rolling upgrade between 2.x and 3.x Resolved : Ray resolved this since we think it's sufficiently complete.
  • Erasure coding
    • HDFS-12686 - Erasure coding system policy state is not correctly saved and loaded during real cluster restart Resolved : Resolved this one to incorporate it in HDFS-12682

GA blockers:

  • Rolling upgrade
    • HDFS-11096 - Support rolling upgrade between 2.x and 3.x Patch Available : I asked Sean if we can downgrade this from blocker
  • Erasure coding
    • HDFS-12682 - ECAdmin -listPolicies will always show SystemErasureCodingPolicies state as DISABLED Resolved : Actively being worked on and reviewed, should be in soon
    • HDFS-11467 - Support ErasureCoding section in OIV XML/ReverseXML Resolved : Waiting on HDFS-12682, I asked if we can work concurrently

Features merged for GA:

  • Erasure coding
    • Testing is still ongoing at Cloudera, no new bugs found recently
    • Closing on remaining blockers for GA
  • Classpath isolation (HADOOP-11656)
    • HADOOP-13916 - Document how downstream clients should make use of the new shaded client artifacts Open : Seems unlikely to make it
  • Compat guide (HADOOP-13714)
    • HADOOP-14876 - Create downstream developer docs from the compatibility guidelines Resolved : Patch is being actively revved and reviewed, Robert +1'd, Anu posted a big review
    • HADOOP-14875 - Create end user documentation from the compatibility guidelines Patch Available : No patch yet
  • TSv2 alpha 2
    • This was merged, no problems thus far (smile)
  • API-based scheduler configuration  YARN-5734 - OrgQueue for easy CapacityScheduler queue configuration management Resolved
    • Merged, no problems thus far (smile)
  • HDFS router-based federation HDFS-10467 - Router-based HDFS federation Resolved
    • Merged, no problems thus far (smile)
  • Resource types  YARN-3926 - Extend the YARN resource model for easier resource-type management and profiles Resolved
    • Vote has passed, Daniel is currently doing the mechanics of merging
    • Need to also perform final validation post-merge

Dropping the "unmerged features" section since we're not letting in anything else at this point.

2017-10-20

Apologies for skipping the update last week. Here's how we're tracking for GA.

Highlights:

  • Merge of HDFS router-based federation and API-based scheduler configuration with no reported problems. Kudos to the contributors involved!

Red flags:

  • We're making a last-minute push to get resource types (but not resource profiles) in. Coming this late, it's a risk, but we decided it's worthwhile for this feature. See Daniel's yarn-dev email for the full rationale.
  • Still uncovering EC bugs from testing

Previously tracked GA blockers that have been resolved or dropped:

  • YARN-6623 - Add support to turn off launching privileged containers in the container-executor Resolved : Committed and resolved
  • Change of ExecutionType
    • YARN-7275 - NM Statestore cleanup for Container updates Resolved : Patch committed, resolved.
  • ReservationSystem
    • YARN-4859 - [Bug] Unable to submit a job to a reservation when using FairScheduler Resolved : Yufei tested this and found things mostly worked, and filed two non-blocker follow-ons: YARN-7347 (Fixe the bug in Fair scheduler to handle a queue named "root.root", Open) and YARN-7348 (Ignore the vcore in reservation request for fair policy queue, Open)

GA blockers:

  • Change of ExecutionType
    • YARN-7178 - Add documentation for Container Update API Resolved : Still no update from Arun, I pinged it.
  • ReservationSystem
    • YARN-4827 - Document configuration of ReservationSystem for FairScheduler Resolved : Yufei said he'd work on it as of 2 days ago
  • Rolling upgrade
    • YARN-6142 - Support rolling upgrade between 2.x and 3.x Resolved : I pinged this and asked for a status update
    • HDFS-11096 - Support rolling upgrade between 2.x and 3.x Patch Available : I pinged this and asked for a status update
  • Erasure coding
    • HDFS-12682 - ECAdmin -listPolicies will always show SystemErasureCodingPolicies state as DISABLED Resolved : New blocker filed this week, Xiao is working on it
    • HDFS-12686 - Erasure coding system policy state is not correctly saved and loaded during real cluster restart Resolved : New blocker filed this week, Sammi is on it
    • HDFS-12686 - Erasure coding system policy state is not correctly saved and loaded during real cluster restart Resolved : Old blocker, Huafeng is on it, waiting on review from Wei-Chiu or Sammi

Features merged for GA:

  • Erasure coding
    • Continued bug reporting and fixing based on testing at Cloudera.
    • Two new blockers filed this week, mentioned above.
    • Huafeng completed patch to reenable disabled EC tests
  • Classpath isolation (HADOOP-11656)
    • HADOOP-13916 - Document how downstream clients should make use of the new shaded client artifacts Open : I pinged it
  • Compat guide (HADOOP-13714)
    • HADOOP-14876 - Create downstream developer docs from the compatibility guidelines Resolved : Daniel has a patch up, revved based on Steve's review feedback, waiting on Steve's reply
    • HADOOP-14875 - Create end user documentation from the compatibility guidelines Patch Available : No patch yet
  • TSv2 alpha 2
    • This was merged, no problems thus far (smile)
  • API-based scheduler configuration  YARN-5734 - OrgQueue for easy CapacityScheduler queue configuration management Resolved
    • Merged, no problems thus far (smile)
  • HDFS router-based federation HDFS-10467 - Router-based HDFS federation Resolved
    • Merged, no problems thus far (smile)

Unmerged features:

  • Resource types / profiles (YARN-3926 and YARN-7069) (Wangda Tan)
    • We're going to try and get in resource types for 3.0.0 and leave resource profiles for 3.1.0. Daniel is spearheading this and other major contributors like Wangda and Sunil are onboard with the plan. Branch has been created and undergone testing, I expect a merge vote ASAP.
    • This is our biggest remaining risk.
  • YARN native services (YARN-5079) (Jian He)
    • YARN-7351 - Fix high CPU usage issue in RegistryDNS Resolved  is blocking
    • Allen is reviewing YARN-7127 with some design-level questions about the API and architecture
    • It doesn't look like this is getting in, based on the YARN-7127 discussion

2017-10-06

The beta1 RC0 vote passed, and beta1 is out! Now tracking GA features.

Highlights:

  • 3.0.0-beta1 has been released!
  • Router-based federation merge vote should be about to pass
  • API-based scheduler configuration merge vote is out, has the votes so far

Red flags:

  • Still need to nail down whether we're going to try and merge resource profiles. I've been emailing with Wangda and Daniel about this, we need to reach a decision ASAP (might already be too late).
  • Still waiting on Allen to review YARN native services feature.

Previously tracked GA blockers that have been resolved or dropped:

  • YARN-7134 - AppSchedulingInfo has a dependency on capacity scheduler Open :  Wangda downgraded this to "Major", dropping from list.

GA blockers:

  • YARN-6623 - Add support to turn off launching privileged containers in the container-executor Resolved : Actively being reviewed
  • Change of ExecutionType
    • YARN-7275 - NM Statestore cleanup for Container updates Resolved : Kartheek has posted a patch, waiting for review
    • YARN-7178 - Add documentation for Container Update API Resolved : No update from Arun, though it's just a docs patch
  • ReservationSystem
    • YARN-4859 - [Bug] Unable to submit a job to a reservation when using FairScheduler Resolved : Yufei has picked this up
    • YARN-4827 - Document configuration of ReservationSystem for FairScheduler Resolved : Yufei has picked this up, just a docs patch
  • Rolling upgrade
    • YARN-6142 - Support rolling upgrade between 2.x and 3.x Resolved : Ray is still going through JACC and proto output
    • HDFS-11096 - Support rolling upgrade between 2.x and 3.x Patch Available : Sean has revved the patch and is waiting on reviews from Ray, Allen

Features merged for GA:

  • Erasure coding
    • Continued bug reporting and fixing based on testing at Cloudera.
    • Still need to finish the 3.0 must-do's
  • Classpath isolation (HADOOP-11656)
    • HADOOP-14771 is still floating, along with adding documentation.
  • Compat guide (HADOOP-13714)
    • Synced with Daniel, he plans to wrap up the remaining stuff next week
  • TSv2 alpha 2
    • This was merged, no problems thus far (smile)

Unmerged features:

  • Resource types / profiles (YARN-3926 and YARN-7069) (Wangda Tan)
    • This has been merged for 3.1.0, YARN-7069 tracks follow on work
    • Wangda said that he's okay waiting for 3.1.0 for this, we're waiting on Daniel. I synced with Daniel earlier this week, and he wants to try and get some of it into 3.0.0. Waiting on an update.
    • I still need a JIRA query for tracking the state of this.
  • HDFS router-based federation (HDFS-10467) (Inigo Goiri and Chris Douglas)
    • Merge vote should close any minute now
  • API-based scheduler configuration (Jonathan Hung)
    • Merge vote is out, will close next week
  • YARN native services (YARN-5079) (Jian He)
    • Subtasks were filed to address Allen's review comments from the previous merge vote, only one pending 
    • We need to confirm with Allen that this is ready to go, he hasn't been reviewing

2017-09-29

After about a month of slip, RC0 has been sent out for a VOTE. Focus now turns to GA, where we will attempt to keep the original beta1 target date (early November).

Highlights:

  • RC0 vote was sent out on Thursday, two binding +1's so far.

Red flags:

  • Resource profiles still has a number of pending subtasks, which is concerning from a schedule perspective. I emailed Wangda about this, and we need to discuss with other key contributors.
  • Native services has one pending subtask but we haven't gotten follow-on reviews from Allen (who -1'd the earlier merge vote). Need to confirm that we've satisfied his feedback.

Previously tracked beta1 blockers that have been resolved or dropped:

  • YARN-6623 was pushed out of beta1 to GA, has been committed so we can drop it from tracking.
  • HADOOP-14897 (Loosen compatibility guidelines for native dependencies): Patch committed!

beta1 blockers:

  • None, RC0 is out

GA blockers:

  • YARN-7134 - AppSchedulingInfo has a dependency on capacity scheduler Open  : this one popped out of nowhere, I don't have an update yet.
  • YARN-7178 - Add documentation for Container Update API Resolved : this also popped out of nowhere, no update yet.
  • YARN-7275 - NM Statestore cleanup for Container updates Resolved : Ditto
  • YARN-4859 - [Bug] Unable to submit a job to a reservation when using FairScheduler Resolved : Ditto
  • YARN-4827 - Document configuration of ReservationSystem for FairScheduler Resolved : Ditto

Features merged for GA:

  • Erasure coding
    • People are looking more at the flaky tests and nice-to-haves
    • Some bugs reported and being fixed based on testing at Cloudera
    • Need to finish the 3.0 must-do's.
  • Addressing incompatible changes (YARN-6142 and HDFS-11096)
    • Sean has posted a new rev of the rolling upgrade script
    • Some YARN PB backward compat issues that we decided weren't blockers and are scheduled for GA
  • Classpath isolation (HADOOP-11656)
    • HADOOP-13917 (Ensure nightly builds run the integration tests for the shaded client): Resolved, Sean retriggered and determined that this works.
    • HADOOP-14771 is still floating, along with adding documentation.
  • Compat guide (HADOOP-13714)
    • A few subtasks are targeted at GA
  • TSv2 alpha 2
    • This was merged, no problems thus far (smile)

Unmerged features:

  • Resource profiles (YARN-3926 and YARN-7069) (Wangda Tan)
    • This has been merged for 3.1.0, YARN-7069 tracks follow on work
    • ~7 patch available subtasks, I asked Wangda to set up a JIRA query for tracking this
  • HDFS router-based federation (HDFS-10467) (Inigo Goiri and Chris Douglas)
    • Inigo sent out the merge vote
  • API-based scheduler configuration (Jonathan Hung)
    • Jonathan sent out a discuss thread for merge, thinking is early next week. Larry did a security-oriented review.
  • YARN native services (YARN-5079) (Jian He)
    • Subtasks were filed to address Allen's review comments from the previous merge vote, only one pending 
    • We need to confirm with Allen that this is ready to go, he hasn't been reviewing

2017-09-22

We've had some late breaking blockers related to Docker support that are delaying the release. We're on a day-by-day slip at this point.

 

Highlights:

  • I did a successful test create-release earlier this week.

Red flags:

  • Docker work resulted in some last minute blockers

Previously tracked beta1 blockers that have been resolved or dropped:

  • HADOOP-14771 (hadoop-client does not include hadoop-yarn-client): Dropped this from the blocker list as it's mainly for documentation purposes
  • HDFS-12247 (Rename AddECPolicyResponse to AddErasureCodingPolicyResponse) was committed.

beta1 blockers:

  • YARN-6623 (Add support to turn off launching privileged containers in the container-executor): This is a newly escalated blocker related to the Docker work in YARN. Patch is up but we're still waiting on a commit.
  • HADOOP-14897 (Loosen compatibility guidelines for native dependencies): Raised by Chris Douglas, Daniel will post a patch soon.

beta1 features:

  • Erasure coding
    • Resolved last must-do for beta1!
    • People are looking more at the flaky tests and nice-to-haves
    • Eddy continues to make improvements to block reconstruction codepaths
  • Addressing incompatible changes (YARN-6142 and HDFS-11096)
    • Ray has gone through almost all the YARN protos and thinks we're okay to move forwards.
    • I think we'll move forward without this committed, given that Sean has run it successfully.
  • Classpath isolation (HADOOP-11656)
    • HADOOP-13917 (Ensure nightly builds run the integration tests for the shaded client): Sean wants to get this in before beta1 if there's time, it's already catching issues. Relies on YETUS-543 which I reviewed, waiting on Allen.
    • HADOOP-14771 might be squeezed in if there's time.
  • Compat guide (HADOOP-13714)
    • HADOOP-14897 Above mentioned blocker filed by Chris Douglas.
  • TSv2 alpha 2
    • This was merged, no problems thus far (smile)

GA features:

  • Resource profiles (Wangda Tan)
    • Merge vote was sent out. Since branch-3.0 has been cut, this can be merged to trunk (3.1.0) and then backported once we've completed testing.
  • HDFS router-based federation (Chris Douglas)
    • This is like YARN federation, very separate and doesn't add new APIs, run in production at MSFT.
    • If it passes Cloudera internal integration testing, I'm fine putting this in for GA.
  • API-based scheduler configuration (Jonathan Hung)
    • Jonathan mentioned that his main goal is to get this in for 2.9.0, which seems likely to go out after 3.0.0 GA since there hasn't been any serious release planning yet. Jonathan said that delaying this until 3.1.0 is fine.
  • YARN native services
    • Still not 100% clear when this will land.

2017-09-19

Sorry for the late update. We're down to one blocker and one EC must-do! We made great progress over the last week and a bit.

We will likely cut RC0 this week.

Highlights:

  • Down to just two blocker issues!

Red flags:

  • HDFS unit tests are quite flaky. Some blockers were filed and then resolved or downgraded. More work to do here.

Previously tracked beta1 blockers that have been resolved or dropped:

  • HADOOP-14738 (Remove S3N and obsolete bits of S3A; rework docs): Committed!
  • HADOOP-14284 (Shade Guava everywhere): We resolved this since we decided it was unnecessary for beta1.
  • YARN-7162 (Remove XML excludes file format): Robert committed after review from Junping.
  • HADOOP-14847 (Remove Guava Supplier and change to java Supplier in AMRMClient and AMRMClientAysnc): Committed!
  • HADOOP-14238 (Rechecking Guava's object is not exposed to user-facing API): We dropped this off the blocker list in the absence of other known issues
  • HADOOP-14835 (mvn site build throws SAX errors): I committed after further discussion and review with Sean Mackrory and Allen. Planning to switch to japicmp for later releases.
  • HDFS-12218 (Rename split EC / replicated block metrics in BlockManager): Committed.

 

beta1 blockers:

  • HADOOP-14771 (hadoop-client does not include hadoop-yarn-client): This was committed but then reverted since it broke the build. Haibo and Sean are actively pressing towards a correct fix.


beta1 features:

  • Erasure coding
    • Resolved a number of must-dos
      • HDFS-7859 (fsimage changes) was committed!
      • HDFS-12395 (edit log changes) was also committed!
      • HDFS-12218 is discussed above.
    • Remaining blockers:
      • HDFS-12447 is to refactor some of the fsimage code, Andrew needs to review
    • There has also been progress cleaning up the flaky unit tests, still more to do
  • Addressing incompatible changes (YARN-6142 and HDFS-11096)
    • Ray has gone through almost all the YARN protos and thinks we're okay to move forwards.
    • I think we'll move forward without this committed, given that Sean has run it successfully.
  • Classpath isolation (HADOOP-11656)
    • We have just HADOOP-14771 left.
  • Compat guide (HADOOP-13714)
    • This was committed! Some follow-on work filed for GA.
  • TSv2 alpha 2
    • This was merged, no problems thus far (smile)

GA features:

  • Resource profiles (Wangda Tan)
    • Merge vote was sent out. Since branch-3.0 has been cut, this can be merged to trunk (3.1.0) and then backported once we've completed testing.
  • HDFS router-based federation (Chris Douglas)
    • This is like YARN federation, very separate and doesn't add new APIs, run in production at MSFT.
    • If it passes Cloudera internal integration testing, I'm fine putting this in for GA.
  • API-based scheduler configuration (Jonathan Hung)
    • Jonathan mentioned that his main goal is to get this in for 2.9.0, which seems likely to go out after 3.0.0 GA since there hasn't been any serious release planning yet. Jonathan said that delaying this until 3.1.0 is fine.
  • YARN native services
    • Still not 100% clear when this will land.

2017-09-07

Slightly early update since I'll be out tomorrow. We're one week out, and focus is on blocker burndown.

Highlights:

  • 3.1.0 release planning is underway, led by Wangda. Target release date is in January.

Red flags:

  • YARN native services merge vote got a -1 for beta1, I recommended we drop it from beta1 and retarget for a later release.
  • 11 blockers on the dashboard, one more than last week (sad)

Previously tracked beta1 blockers that have been resolved or dropped:

  • HADOOP-14826 was duped to HADOOP-14738.
  • YARN-5536 (Multiple format support (JSON, etc.) for exclude node file in NM graceful decommission with timeout): Downgraded in priority in favor of YARN-7162 which Robert has posted a patch for.
  • MAPREDUCE-6941 (The default setting doesn't work for MapReduce job): I resolved this and Junping confirmed this is fine.

beta1 blockers:

  • HADOOP-14738 (Remove S3N and obsolete bits of S3A; rework docs): Steve has been actively revving this with our new committer Aaron Fabbri ready to review. The scope has expanded from HADOOP-14826, so it's not just a doc update.
  • HADOOP-14284 (Shade Guava everywhere): No change since last week. This is an umbrella JIRA.
  • HADOOP-14771 (hadoop-client does not include hadoop-yarn-client): Patch up, needs review, still waiting on Busbey. Bharat gave it a review.
  • YARN-7162 (Remove XML excludes file format): Robert has posted a patch and is waiting for a review.
  • HADOOP-14238 (Rechecking Guava's object is not exposed to user-facing API): Bharat took this up and turned it into an umbrella.
    • HADOOP-14847 (Remove Guava Supplier and change to java Supplier in AMRMClient and AMRMClientAysnc): Bharat posted a patch on a subtask to fix the known Guava Supplier issue in AMRMClient. Needs a review.
  • HADOOP-14835 (mvn site build throws SAX errors): I'm working on this. Debugged it and have a proposed patch up, discussing with Allen.
  • HDFS-12218 (Rename split EC / replicated block metrics in BlockManager): I'm working on this, just need to commit it, already have a +1 from Eddy.

beta1 features:

  • Erasure coding
    • There are three must-dos, all being actively worked on.
    • HDFS-7859 is being actively reviewed and revved by Sammi and Kai and Eddy.
    • HDFS-12395 was split out of HDFS-7859 to do the edit log changes.
    • HDFS-12218 is discussed above.
  • Addressing incompatible changes (YARN-6142 and HDFS-11096)
    • Ray and Allen reviewed Sean's HDFS rolling upgrade scripts.
    • Sean did a run through of the HDFS JACC report and it looked fine.
  • Classpath isolation (HADOOP-11656)
    • Sean has retriaged the subtasks and has been posting patches.
  • Compat guide (HADOOP-13714)
    • Daniel has been collecting feedback on dev lists, but still needs a detailed review of the patch.
  • YARN native services
    • Jian sent out the merge vote, but it's been -1'd for beta1 by Allen. I propose we drop this from beta1 scope and retarget.
  • TSv2 alpha 2
    • This was merged, no problems thus far (smile)

GA features:

  • Resource profiles (Wangda Tan)
    • Merge vote was sent out. Since branch-3.0 has been cut, this can be merged to trunk (3.1.0) and then backported once we've completed testing.
  • HDFS router-based federation (Chris Douglas)
    • This is like YARN federation, very separate and doesn't add new APIs, run in production at MSFT.
    • If it passes Cloudera internal integration testing, I'm fine putting this in for GA.
  • API-based scheduler configuration (Jonathan Hung)
    • Jonathan mentioned that his main goal is to get this in for 2.9.0, which seems likely to go out after 3.0.0 GA since there hasn't been any serious release planning yet. Jonathan said that delaying this until 3.1.0 is fine.

2017-09-01

We're two weeks out from beta1, focus is on blocker burndown.

Highlights:

  • S3Guard merged!
  • TSv2 alpha2 merged!
  • branch-3.0 has been cut after discussion on dev lists.

Red flags:

  • 10 blockers on the dashboard, closed and bumped some but new ones appeared.
  • Still need to land YARN native services and fix some S3Guard doc issues for beta1.
  • Rolling upgrade JIRAs for YARN and HDFS are not making any visible progress

Previously tracked beta1 blockers that have been resolved:

  • HADOOP-13363 (Upgrade to protobuf 3): I dropped this from beta1 since it's simply not going to happen in time.
  • YARN-7076: This was quickly resolved! Thanks Jian, Junping, Jason for the action.
  • YARN-7094 (Document that server-side graceful decom is currently not recommended): Patch committed!

beta1 blockers:

  • HADOOP-14826 (review S3 docs prior to 3.0.0-beta1): New blocker with S3Guard merged. Should just be a quick doc update.
  • HADOOP-14284 (Shade Guava everywhere): Agreement to shade yarn-client at HADOOP-14771. Shading hadoop-hdfs is still being discussed?
  • HADOOP-14771 (hadoop-client does not include hadoop-yarn-client): Patch up, needs review, waiting on Busbey
  • YARN-5536 (Multiple format support (JSON, etc.) for exclude node file in NM graceful decommission with timeout): We're waiting on input from Junping.
  • MAPREDUCE-6941 (The default setting doesn't work for MapReduce job): Ray thinks this is a Won't Fix, waiting on Junping to confirm.
  • HADOOP-14238 (Rechecking Guava's object is not exposed to user-facing API): This relates to HADOOP-14771, I left a JIRA comment.

beta1 features:

  • Erasure coding
    • There are three must-dos. Two have patches, one might not be a must-do.
    • HDFS-11882 has been revved and reviewed, seems close
    • HDFS-11467 and HDFS-7859 are related, Sammi/Eddy/Kai are discussing, Sammi thinks we can still make beta1.
  • Addressing incompatible changes (YARN-6142 and HDFS-11096)
    • Sean has HDFS rolling upgrade scripts up, waiting on Ray to add some YARN/MR coverage too.
    • Need to do a final runthrough of the JACC reports for YARN and HDFS.
  • Classpath isolation (HADOOP-11656)
    • Sean has retriaged the subtasks and has been posting patches.
  • Compat guide (HADOOP-13714)
    • New patch is up, but needs review. Daniel asked Chris Douglas and Steve Loughran.
  • YARN native services
    • Jian sent out the merge vote
  • TSv2 alpha 2
    • This was merged, no problems thus far (smile)

GA features:

  • Resource profiles (Wangda Tan)
    • Merge vote was sent out. Since branch-3.0 has been cut, this can be merged to trunk (3.1.0) and then backported once we've completed testing.
  • HDFS router-based federation (Chris Douglas)
    • This is like YARN federation, very separate and doesn't add new APIs, run in production at MSFT.
    • If it passes Cloudera internal integration testing, I'm fine putting this in for GA.
  • API-based scheduler configuration (Jonathan Hung)
    • Jonathan mentioned that his main goal is to get this in for 2.9.0, which seems likely to go out after 3.0.0 GA since there hasn't been any serious release planning yet. Jonathan said that delaying this until 3.1.0 is fine.

2017-08-25

Another month flew by without an update. This is a big one.

Red flags:

  • 11 blockers still on the dashboard, with some filed recently. Need to burn these down.
  • There are many branch merge proposals flying around for features that were not originally being tracked for beta1 and GA. Introducing new code always comes with risk, so I'm working with the different contributors involved to discuss target versions, confirm readiness, and define quality bars for merge.

Miscellaneous blockers:

  • HADOOP-14284 (Shade Guava everywhere): We have agreement to shade the yarn client JAR. Shading hadoop-hdfs is still being discussed. 
  • HADOOP-13363 (Upgrade to protobuf 3): Waiting on the Guava shading first.
  • YARN-7076: New blocker, we need an assignee.
  • YARN-7094 (Document that server-side graceful decom is currently not recommended): Robert has a patch up, needs review. This is a stopgap for the old blocker YARN-5464.
  • YARN-5536 (Multiple format support (JSON, etc.) for exclude node file in NM graceful decommission with timeout): Robert has a proposal that needs to be pushed on.

beta1 features:

  • Erasure coding
    • There are three must-dos. Two have patches, one might not be a must-do.
    • I pinged the pluggable policy JIRA to see if metadata and API compatibility is complete.
  • Addressing incompatible changes (YARN-6142 and HDFS-11096)
    • Sean has HDFS rolling upgrade scripts up, waiting on Ray to add some YARN/MR coverage too.
    • Need to do a final runthrough of the JACC reports for YARN and HDFS.
  • Classpath isolation (HADOOP-11656)
    • We're down to the wire on this, I pinged Sean for an update.
  • Compat guide (HADOOP-13714)
    • I pinged the JIRA on this too, no updated patch since May

Features under discussion:

I discussed these features, which were previously not on my radar, with a number of lead contributors.

3.0.0-beta1:

  • YARN native services (Jian He)
    • I was convinced that this is very separate from the core. I'll get someone from Cloudera to run it through our integration tests to verify it doesn't break anything downstream, then happy to merge.
  • TSv2 alpha 2 (Vrushali C)
    • Despite being called "alpha 2", this is more like "beta" in terms of readiness. Twitter is planning to roll it out to production. Seems quite done.
    • I double checked with Haibo, and he successfully ran it through our internal integration testing.

3.0.0 GA:

  • Resource profiles (Wangda Tan)
    • Alpha feature, APIs are not stable yet. Has some compatible PB changes, will verify rolling upgrade from branch-2. Touches some core parts of YARN.
    • Decided that it's too close to beta1 for this. We're going to test it thoroughly and make sure it's ready for 3.0.0 GA.
  • HDFS router-based federation (Chris Douglas)
    • Like YARN federation, this is very separate from the core and doesn't add new APIs. It is already run in production at Microsoft.
    • If it passes Cloudera internal integration testing, I'm fine putting this in for GA.

3.1.0:

  • Storage Policy Satisfier (Uma Gangumalla)
    • We're resolving some design discussions on JIRA. The plan is to do some MVP work on the API to get this into 3.1, and if we're happy with the second phase, consider it for 3.0 GA.
  • HDFS tiered storage (Chris Douglas)
    • This touches some core stuff, and the write path is still being worked on. It is still somewhat useful with just the read path. Targeting 3.1.0 gives enough time to wrap this up.

2017-07-28

It has been a long time since the previous update. In the meantime, we released alpha4! Onward to beta1.

Red flags:

  • Beta1 is targeted for 09/15, and there are a number of blockers we kicked down the road that need to be resolved.

Miscellaneous blockers

  • HADOOP-14284 (Shade Guava everywhere): This one is stalled, which is concerning.
  • HADOOP-13363 (Upgrade to protobuf 3): Waiting on the Guava shading first.

beta1 features:

  • Erasure coding
    • Need to wrap up pluggable EC work and other must-dos, reasonable confidence since these are actively being worked on
  • Addressing incompatible changes
    • YARN-6142 (compatibility with 2.x): Still need automated testing for this
    • HDFS-11096 (compatibility with 2.x): Sean has revived efforts here and has a rolling upgrade script under review, closing on some incompats we found too.
  • Classpath isolation (HADOOP-11656)
    • The subtasks were not completed for alpha4. Sean has revived efforts and is committed to finishing this for beta1.
  • Compat guide (HADOOP-13714)
    • Daniel posted an initial patch for discussion, but it has stalled out.

2017-04-21

Red flags:

  • Less than a month out and some new blockers just surfaced, need to keep pushing.

Miscellaneous blockers

  • HADOOP-14284 (Shade Guava everywhere): Progressing, though trying to resolve Curator issues
  • HADOOP-13363 (Upgrade to protobuf 3): Waiting on the Guava shading first
  • HADOOP-14330 (Kerby breaks multiple SPN support): uncovered from secure testing with Oozie, unknown if other Kerby bugs are lurking
  • YARN-5894 (fst licensing): no assignee yet

alpha3 features:

  • Erasure coding
    • Good burndown of remaining blockers for alpha3, seems on track
    • HDFS-11643 (Balancer fencing fails when writing erasure coded lock file): Close to commit
    • HDFS-11644 (DFSStripedOutputStream should not implement Syncable): API-related issue that affects YARN and HBase

beta1 features:

  • Addressing incompatible changes
    • YARN-6142 (compatibility with 2.x): Still need automated testing for this
    • HDFS-11096 (compatibility with 2.x): Still need automated testing for this
  • Classpath isolation (HADOOP-11656)
    • Sean has not yet resumed work, but still plans to complete API-related subtasks before alpha3
  • Compat guide (HADOOP-13714)
    • Daniel has started working on this

2017-02-21

Red flags:

  • Not much movement on EC blockers since last update. As a result, moved the planned alpha3 date out from mid-April to mid-May.

alpha3 features:

  • Erasure coding
    • Some discussion around configuration of enabled EC policies.
    • Manoj and Andrew are working on EC CLI improvements (HDFS-11405, HDFS-11426, HDFS-11427, HDFS-11428)
    • Manoj continues to work on OIV improvements (HDFS-10983)
    • Kai Sasaki revving patch on HDFS-8196 for webUI additions after reviews from Takanobu Asanuma and Andrew
  • Tomcat to Jetty conversions
    • HDFS-10860 (HttpFS): This was committed! Will remove section for next update.

beta1 features:

  • Addressing incompatible changes
    • YARN-6142 (compatibility with 2.x): Subtask at YARN-6143 has made good progress thanks to Sunil and Wangda, seems close to commit.
    • HDFS-11096 (compatibility with 2.x): Nothing further pending on this JIRA, but would like automated test runs set up to catch future breakages.
  • Classpath isolation (HADOOP-11656)
    • Sean plans to work on this in March/April, which lines up with the new alpha3 schedule

2017-02-03

3.0.0-alpha2 was released last week, onward to alpha3!

Notable release blockers:

  • Since alpha3 is planned as the last alpha before beta, the goal is to be feature complete by alpha3. beta1 can be used to stage any remaining incompatible changes, or nice-to-have feature-related work.
  • No other release blockers are filed.

alpha3 features:

  • Erasure coding
    • All must-do JIRAs are now marked as blockers for alpha3, goal is for EC to be feature complete and API stable by alpha3.
    • Light development activity. Andrew did some reviews, Manoj is working on OIV improvements.
  • Tomcat to Jetty conversions
    • HDFS-10860 (HttpFS): Patch has gone through multiple revs and has a +1 pending from Xiao

beta1 features:

  • Addressing incompatible changes
    • YARN-6142 (compatibility with 2.x): Tracking JIRA filed and assigned to Ray
    • HDFS-11096 (compatibility with 2.x): Nothing further pending on this JIRA, but would like automated test runs set up to catch future breakages.
  • Classpath isolation (HADOOP-11656)
    • Andrew has asked Sean for an ETA on completing the remaining subtasks, ideally by alpha3 but by beta1 at the latest.

2017-01-19

branch-3.0.0-alpha2 has been created in git. Still on track for planned alpha2 release at the end of January.

Notable release blockers:

  • YARN-5667 (Move HBase backend code in ATS v2 into its separate module): One subtask left at YARN-5928, looks close to commit.
  • Also resolved blockers like HADOOP-13996, YARN-6071, YARN-5646, YARN-6072, HADOOP-13961, HDFS-11312, YARN-6068, HADOOP-13780 since last update. Wow!

Development activity

  • Addressing incompatible changes
    • HDFS-11096 (compatibility with 2.x): Sean filed and fixed HDFS-11312, which was a PB incompatibility. Seems like this work is complete on the HDFS side, but we still need YARN and MR help.
  • Erasure coding
    • Andrew went through and re-triaged all the JIRAs on HDFS-8031, using the "hdfs-ec-3.0-must-do" and "hdfs-ec-3.0-nice-to-have" labels. HDFS-8031 more closely reflects what we want to get done for 3.0, and the labels are now tracked on the JIRA dashboard.
  • Classpath isolation (HADOOP-11656)
    • Arun added a Maven variable to skip shading for faster build times. No progress on subtasks required for beta1.
  • Tomcat to Jetty conversions
    • HADOOP-13597 (KMS): this patch went in
    • HDFS-10860 (HttpFS): Patch is up and has gone through initial review

2017-01-04

Notable release blockers:

Down to just four release blockers, three of which have patches posted and seem to be close.

  • HADOOP-13780 (L&N updates): Patch is posted and +1'd, likely will be committed soon.
  • HADOOP-13896 (distribution tarball missing jars): Tiny patch is posted, pending review and precommit.
  • YARN-6022 (Revert changes of AbstractResourceRequest): Patch is posted and has been reviewed, looks close.
  • YARN-5667 (Move HBase backend code in ATS v2 into its separate module): One subtask left at YARN-5928, PA but no progress recently.

Development activity

  • Addressing incompatible changes
    • MAPREDUCE-6704: resolved!
    • YARN-5882 (issuing YARN delegation tokens even in insecure deployments): Jian reverted YARN-4126 pending further discussion, so no longer a blocker.
    • HDFS-11096 (compatibility with 2.x): Sean reviewed the JACC output and posted a big comment on possible source/binary incompatibilities.
  • Erasure coding
    • Still slowly burning down must-dos from the JIRA query.
  • Classpath isolation (HADOOP-11656)
    • HADOOP-11804 (shaded client jars): Initial patch has been committed, Sean also filed follow-on JIRAs for related work we need to clean up before beta1.
  • Tomcat to Jetty conversions
    • HADOOP-13597 (KMS): +1 pending a few nits
    • HDFS-10860 (HttpFS): Patch is up and has gone through initial review

2016-12-09

Notable release blockers:

  • HADOOP-13780: L&N updates from Jetty bump. Xiao has picked this up, no patch yet though.

Development activity

  • Addressing incompatible changes
    • MAPREDUCE-6704: still seems close after examination of Docker issue
    • YARN-5882 (issuing YARN delegation tokens even in insecure deployments): Andrew proposed we revert YARN-4126 to unblock the release while discussion continues
    • YARN-5559: Resolved!
    • YARN-5184: Resolved!
    • HDFS-11096 (compatibility with 2.x): Discussion still underway. Kihwal and Eric worked to resolve HDFS-11207 which reverted some incompatible DN/NN changes that broke rolling upgrade.
  • Erasure coding
    • Still slowly burning down must-dos from the JIRA query.
  • Classpath isolation (HADOOP-11656)
    • HADOOP-11804 (shaded client jars): Sean added integration tests for WebHDFS and things seem close.
  • Tomcat to Jetty conversion (HADOOP-13597, HDFS-10860)
    • HADOOP-13597 (KMS): patch is under review
    • HDFS-10860 (HttpFS): no patch yet

2016-11-28

Notable release blockers:

  • org.json:json: legal-discuss has allowed grandfathering in releases until April 30, 2017 for usages of org.json:json. HADOOP-13050 bumping the problematic AWS SDK dependency has been committed. We're in the clear.
  • HADOOP-13780: L&N updates from Jetty bump. Still unassigned.

Development activity

  • Addressing incompatible changes
    • MAPREDUCE-6704: still seems close after examination of Docker issue
    • YARN-5882: issuing YARN delegation tokens even in insecure deployments, discussion underway
    • YARN-5559: found via JDiff analysis, fix some accidental incompats
    • YARN-5184: found via downstream testing. New abstract methods added to public class.
    • HDFS-11096 (compatibility with 2.x): Summary of a discussion between Andrew and Karthik was posted, seeking additional community feedback.
  • Erasure coding
    • Slowly burning down the must-dos from the JIRA query.
  • Classpath isolation (HADOOP-11656)
    • HADOOP-11804 (shaded client jars): POC testing with Avro and HBase seems to work, waiting on additional testing and reviews.
  • Tomcat to Jetty conversion (HADOOP-13597, HDFS-10860)
    • Initial patch posted on HADOOP-13597 for KMS conversion.

2016-11-16

Notable release blockers:

  • HADOOP-11694: Need to upgrade AWS SDK dependency, org.json:json fallout
  • HADOOP-13780: L&N updates from Jetty bump

Development activity

  • Addressing incompatible changes
    • MAPREDUCE-6704: make the default pseudo-cluster work without additional env var configuration, seems close?
    • YARN-5882: issuing YARN delegation tokens even in insecure deployments, discussion underway
    • YARN-5559: found via JDiff analysis, fix some accidental incompats
    • YARN-5184: found via downstream testing. New abstract methods added to public class.
  • Erasure coding
    • Work continues on API refinements and supportability-related JIRAs.
  • Classpath isolation (HADOOP-11656)
    • HADOOP-11804 (shaded client jars): Multiple revs have been posted based on review feedback.

 














