
Query reexecution provides a facility to re-run a query multiple times when an unfortunate event causes it to fail.


Introduced in Hive 3.0 (HIVE-17626)

Big picture

A query that fails because of an unfortunate event can be re-run, multiple times if needed, with different settings or a better plan for each reexecution.

Reexecution strategies


Overlay


The overlay strategy changes Hive settings for all reexecutions. It works by adding a configuration subtree (reexec.overlay.*) as an overlay on top of the actual Hive settings.

Code Block
set zzz=1;
set reexec.overlay.zzz=2;

set hive.query.reexecution.enabled=true;
set hive.query.reexecution.strategies=overlay;

create table t(a int);
insert into t values (1);
select assert_true(${hiveconf:zzz} > a) from t group by a;

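Every Hive setting prefixed with "reexec.overlay." is applied to all reexecutions. In the example above, the first attempt fails because zzz=1 is not greater than a=1; the reexecution runs with zzz=2, so the assertion holds.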

A more realistic example is to disable automatic join conversion for all reexecutions:

Code Block
set reexec.overlay.hive.auto.convert.join=false;

Reoptimize

During query execution, the actual number of rows passing through every operator is tracked. This information is reused during re-planning, which can result in a better plan.

Situations in which this helps:

  • missing statistics
  • incorrect statistics
  • many joins

It is not easy to craft queries that lead to OOM situations, but the reoptimize strategy can be enabled with:

Code Block
set hive.query.reexecution.strategies=overlay,reoptimize;

Operator Matching

Operator-level statistics are matched to the new plan using operator subtree matching. This also makes it possible to apply the information to other queries that have "similar" parts.
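
As a rough illustration of what "similar" parts can mean, the two queries below share the same filtered scan of t (the table created in the overlay example). This is only a sketch: whether runtime statistics observed for the first query are actually reused for the second depends on how the statistics are persisted and on the plans Hive produces, neither of which is guaranteed here.

```sql
-- Sketch: two queries whose plans contain the same scan-and-filter subtree on t.
select count(*) from t where a > 0;

-- The scan of t with the filter "a > 0" matches the subtree of the first query,
-- so the row counts observed there are candidates for reuse in this plan.
select a, count(*) from t where a > 0 group by a;
```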


Configuration

| Configuration | Default | Description |
|---|---|---|
| hive.query.reexecution.enabled | true | Feature enabler. |
| hive.query.reexecution.strategies | | Reexecution plugins; currently overlay and reoptimize are supported. |
| hive.query.reexecution.stats.persist.scope | | Scope in which runtime statistics are persisted: query (only used during the reexecution), hiveserver (persisted during the lifetime of the HiveServer), or metastore (persisted in the metastore and loaded on HiveServer startup). |
| hive.query.reexecution.max.count | | Number of reexecutions that may happen. |
| hive.query.reexecution.always.collect.operator.stats | | Enables gathering runtime statistics on all queries. |
| hive.query.reexecution.stats.cache.batch.size | -1 | If runtime stats are stored in the metastore, the maximal batch size per round during load. |
| hive.query.reexecution.stats.cache.size | 100000 | Size of the runtime statistics cache, in OperatorStat entries; a query plan consists of about 100. |
| runtime.stats.clean.frequency | 3600s | Frequency at which the timer task runs to remove outdated runtime stat entries. |
| runtime.stats.max.age | 3 days | Stat entries older than this are removed. |
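
Putting several of these settings together, a session setup might look like the sketch below. The values are illustrative only, not recommendations. hive.auto.convert.join is the standard Hive switch for automatic join conversion and is brought in here as an assumption about what one would want to overlay; whether the cache size takes effect per session or has to be set in the server configuration is not stated on this page.

```sql
-- Sketch only: illustrative values, not tuned recommendations.
set hive.query.reexecution.enabled=true;
set hive.query.reexecution.strategies=overlay,reoptimize;

-- Applied on every reexecution: retry without automatic join conversion.
set reexec.overlay.hive.auto.convert.join=false;

-- Room for roughly 2000 query plans at about 100 OperatorStat entries each.
-- Assumption: this may need to go into the HiveServer configuration instead.
set hive.query.reexecution.stats.cache.size=200000;
```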