...

Thus, we encourage running the Helix controller from the Pinot code base; we will provide options in the Pinot controller settings so that the cluster admin can choose among Pinot-only controller mode, Helix-only mode, or dual controller mode.

Github issue: https://github.com/apache/incubator-pinot/issues/3957

Design

Lead Controller Resource

...

A Pinot-only controller focuses only on Pinot’s workloads. It will do the periodic workloads for a subset of the tables when it becomes master of certain partitions in the new resource.

...

  1. Refactor existing controller code to run Helix only controller (https://github.com/apache/incubator-pinot/pull/3864). 
  2. Refactor the existing code so that Pinot controller can have a unique interface for all the periodic workloads (https://github.com/apache/incubator-pinot/pull/3264).
  3. Add logic to create the new resource but disable it in HelixSetupUtils class. The rebalance mode can be set as FULL-AUTO (https://github.com/apache/incubator-pinot/pull/4047).
  4. Add a controller config on the Pinot controller side to choose which mode to use (i.e. Pinot-only mode, Helix-only mode, or dual mode, which is the default).
  5. Add logic on the controller side to check whether the new resource is enabled. The Pinot controller will cache the partition number once it becomes master of a partition. While the lead controller resource is still disabled, the controller won’t get any state transition messages.
    i. When there’s a state transition from Slave to Master for Partition_X: cache partition number X in the Pinot controller.
    ii. When there’s a state transition from Master to Slave for Partition_X: remove partition number X from the cache in the Pinot controller.
    iii. When a periodic task is run, or a real-time segment completion request is received: 

  6. Add logic on the server side to check whether the new resource is enabled and to look at it when the server is disconnected from the Helix controller. Currently the server-side logic caches the previous lead controller. With this new feature, the caching logic stays on, and the new checks happen only when the server is disconnected or gets a not_leader message back. Since the Pinot server fetches the external view only once and then caches the new leader information, ZK reads do not increase by much.
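The caching behaviour in steps 5 and 6 can be sketched roughly as follows. This is a minimal illustration, not Pinot's actual implementation: the class names `LeadPartitionCache` and `CachedLeaderLocator`, their method names, and the `Supplier` standing in for an external view read from ZK are all assumptions made for this example. How a table maps to a partition is not covered here.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical controller-side helper: remembers which partitions of the
// lead controller resource this controller currently masters.
final class LeadPartitionCache {
  private final Set<Integer> masteredPartitions = ConcurrentHashMap.newKeySet();

  // Slave -> Master transition for Partition_X: cache X.
  void onBecomeMasterFromSlave(int partition) {
    masteredPartitions.add(partition);
  }

  // Master -> Slave transition for Partition_X: remove X from the cache.
  void onBecomeSlaveFromMaster(int partition) {
    masteredPartitions.remove(partition);
  }

  // Assumption: a periodic task or segment completion handler could call
  // this to decide whether this controller is responsible for a partition.
  boolean isMasterOf(int partition) {
    return masteredPartitions.contains(partition);
  }
}

// Hypothetical server-side helper: keeps the last known lead controller and
// re-reads it only after being told the cached value may be stale.
final class CachedLeaderLocator {
  // Stands in for fetching the external view from ZK (assumption).
  private final Supplier<String> externalViewLookup;
  private String cachedLeader;

  CachedLeaderLocator(Supplier<String> externalViewLookup) {
    this.externalViewLookup = externalViewLookup;
  }

  // Returns the cached leader, reading the external view only on a cache miss.
  synchronized String getLeader() {
    if (cachedLeader == null) {
      cachedLeader = externalViewLookup.get();
    }
    return cachedLeader;
  }

  // Call when the server is disconnected or receives a not_leader response.
  synchronized void invalidate() {
    cachedLeader = null;
  }
}
```

Because `getLeader()` only calls the lookup after `invalidate()`, repeated requests between leadership changes cost no extra external view reads, which matches the intent of keeping ZK traffic low.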

 

Deployment Plan

The deployment plan consists of 3 steps. 

...

Roll out all the code changes and don’t enable the new resource yet. We won’t make any code changes after this step. 

Step 1 

Have a proper number of dual-mode controllers running in the cluster. Add as many new controllers to the cluster as the number of Helix controllers you need. Be sure to include redundancy for failures/upgrades. We suggest three new controllers. Start them in dual mode. These dual-mode controllers will become Pinot-only controllers once the rollout is complete.

Step 2

Enable the new resource. All the dual-mode controllers will be immediately registered as masters/slaves in the new resource. Periodic tasks and real-time segment completion will immediately be distributed. Let it bake for several weeks. During this time, we can test the robustness of this feature by disabling and re-enabling the resource, running stress tests such as simulating node connection loss/failures, or upgrading to a compatible Helix version.

...

The following criteria must be met before we move on to the next step, and they might take several weeks to achieve:

  1. All LLC and HLC tables have completed at least one segment and started new ones.
  2. All tables are accounted for in all the periodic tasks (no table is ignored).
  3. At least one round of rolling restart of Pinot controllers is done, and criteria 1 and 2 are verified after the restart.
  4. The lead controller resource is disabled, and criteria 1 and 2 are again verified. This is important in case we need to roll back due to a problem.

Step 3 

After verifying that everything works fine, we can add 2 to 3 Helix-only controllers to the cluster, so that they can be candidates for Helix cluster leadership. Then, switch all the dual-mode controllers to Pinot-only mode one by one. After doing so, only a Helix-only controller can be the Helix leader, and all the Pinot-only controllers work only on Pinot’s workloads. The rollout is then finished.
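The division of responsibilities among the three modes at the end of the rollout can be summarized in a small sketch. The enum and method names below are illustrative assumptions for this document, not necessarily the identifiers used in the Pinot code base.

```java
// Hypothetical enum capturing what each controller mode is allowed to do.
enum ControllerMode {
  // Can be Helix leader and also works on Pinot's workloads.
  DUAL(true, true),
  // Works only on Pinot's workloads; never a Helix leader candidate.
  PINOT_ONLY(false, true),
  // Only a candidate for Helix cluster leadership.
  HELIX_ONLY(true, false);

  private final boolean helixLeaderCandidate;
  private final boolean pinotWorkloads;

  ControllerMode(boolean helixLeaderCandidate, boolean pinotWorkloads) {
    this.helixLeaderCandidate = helixLeaderCandidate;
    this.pinotWorkloads = pinotWorkloads;
  }

  boolean canBeHelixLeader() {
    return helixLeaderCandidate;
  }

  boolean runsPinotWorkloads() {
    return pinotWorkloads;
  }
}
```

Once every dual-mode controller has been switched to `PINOT_ONLY`, the only controllers for which `canBeHelixLeader()` holds are the `HELIX_ONLY` ones, which is the end state Step 3 describes.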


Rollback Plan

The rollback plan is the reverse of the rollout plan:

Step 1 

Restart all the Pinot-only controllers as dual-mode controllers.

Step 2 

Shut down the Helix-only controllers.

Step 3

Disable the lead controller resource. All the controller workload will then be done by the Helix leader.


Test plans and schedule

...