This Confluence has been LDAP enabled, if you are an ASF Committer, please use your LDAP Credentials to login. Any problems file an INFRA jira ticket please.

Page tree
Skip to end of metadata
Go to start of metadata

There are a lot of questions around how rebalance works, why certain things happen, and how to tune it. 

Why does a rebalance simulate take so long?

Currently a simulate is calculated off the correct bucket size.  This means that if simulate occurs during a disk recovery, it will wait until the recovery is completed and gii is finished to get the correct size of the bucket entry and byte size. The size information is used to calculate where buckets should be moved to in simulate and rebalance. Also, if during the simulate (same as rebalance) there is a member join or depart event, it will rebuild the model and it means starting a new simulate/rebalance again.

From our limited testing, simulate takes much less time than rebalance without a disk recovery in progress at the time of the simulate. In a small dunit test, simulate took 1.5 seconds and an actual rebalance took 39 seconds.

Improvements: Could log (or error) when simulate has started but recovery hasn’t completed.  We could also log (or error) if members join or depart during a simulation. 


 

How does threading impact behavior, especially now if set to a high value (could it be reset to 1 if threads configured to be greater than the number of PRs?

There is one thread setting we find that can be used in rebalance.

 

gemfire.resource.manager.threads is used to achieve the parallel rebalance of multiple partitioned regions. For each region, the rebalance operation submits a future task to a thread pool. Multiple threads could then rebalance each PR. The setting could be any number, if the thread setting is more than the number of the PRs, it means some threads will not have work to do. If set to one, it means only one thread will do the rebalance work for all the PRs. The default on develop is 1.

gemfire.MAX_PARALLEL_BUCKET_RECOVERIES determines how many threads will be used to recover redundancy when a new node starts up. Current default is 8. Note that we observed that this only affected redundancy recovery, not rebalancing.

 

 

Why are multiple rebalances needed?

We are still investigating this.