
...

Code Block
EXPLAIN [EXTENDED|AST|DEPENDENCY|AUTHORIZATION|LOCKS|VECTORIZATION] query

AUTHORIZATION is supported from Hive 0.14.0 via HIVE-5961. VECTORIZATION is supported from Hive 2.3.0 via HIVE-11394. LOCKS is supported from Hive 3.2.0 via HIVE-17683.

AST was removed from EXPLAIN EXTENDED in HIVE-13533 and reinstated as a separate command in HIVE-15932.

Using EXTENDED in the EXPLAIN statement produces extra information about the operators in the plan. This is typically physical information such as file names.
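As a rough sketch of how the optional clause is placed, the keyword goes between EXPLAIN and the query. The statements below reuse the example query from this page (tables src and dest_g1) and are illustrative only. The output is omitted because it depends on the Hive version and configuration.

```sql
-- Sketch only: the same query under different EXPLAIN clauses.
-- src and dest_g1 are the tables used in this page's example query.

-- Plan with extra operator detail (typically physical information such as file names)
EXPLAIN EXTENDED
FROM src INSERT OVERWRITE TABLE dest_g1 SELECT src.key, sum(substr(src.value,4)) GROUP BY src.key;

-- Requires Hive 0.14.0 or later (HIVE-5961)
EXPLAIN AUTHORIZATION
FROM src INSERT OVERWRITE TABLE dest_g1 SELECT src.key, sum(substr(src.value,4)) GROUP BY src.key;

-- Requires Hive 2.3.0 or later (HIVE-11394)
EXPLAIN VECTORIZATION
FROM src INSERT OVERWRITE TABLE dest_g1 SELECT src.key, sum(substr(src.value,4)) GROUP BY src.key;

-- Requires Hive 3.2.0 or later (HIVE-17683)
EXPLAIN LOCKS
FROM src INSERT OVERWRITE TABLE dest_g1 SELECT src.key, sum(substr(src.value,4)) GROUP BY src.key;
```

Only one clause is given per statement here, matching the syntax shown above.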

...

Code Block
EXPLAIN
FROM src INSERT OVERWRITE TABLE dest_g1 SELECT src.key, sum(substr(src.value,4)) GROUP BY src.key;

The output of this statement contains the following parts:

  • The Dependency Graph

    Code Block
    STAGE DEPENDENCIES:
      Stage-1 is a root stage
      Stage-2 depends on stages: Stage-1
      Stage-0 depends on stages: Stage-2

    This shows that Stage-1 is the root stage, Stage-2 is executed after Stage-1 is done, and Stage-0 is executed after Stage-2 is done.

...

  • A mapping from table alias to Map Operator Tree – This mapping tells the mappers which operator tree to call in order to process the rows from a particular table or the result of a previous map/reduce stage. In Stage-1 in the above example, the rows from the src table are processed by the operator tree rooted at a Reduce Output Operator. Similarly, in Stage-2 the rows of the results of Stage-1 are processed by another operator tree rooted at another Reduce Output Operator. Each of these Reduce Output Operators partitions the data to the reducers according to the criteria shown in the metadata.
  • A Reduce Operator Tree – This is the operator tree which processes all the rows on the reducer of the map/reduce job. In Stage-1, for example, the Reduce Operator Tree carries out a partial aggregation, whereas the Reduce Operator Tree in Stage-2 computes the final aggregation from the partial aggregates computed in Stage-1.

The AST Clause

Outputs the query's Abstract Syntax Tree.

Example:

Code Block
EXPLAIN AST
FROM src INSERT OVERWRITE TABLE dest_g1 SELECT src.key, sum(substr(src.value,4)) GROUP BY src.key;


Outputs:

Code Block
ABSTRACT SYNTAX TREE:
  (TOK_QUERY (TOK_FROM (TOK_TABREF src)) (TOK_INSERT (TOK_DESTINATION (TOK_TAB dest_g1)) (TOK_SELECT (TOK_SELEXPR (TOK_COLREF src key)) (TOK_SELEXPR (TOK_FUNCTION sum (TOK_FUNCTION substr (TOK_COLREF src value) 4)))) (TOK_GROUPBY (TOK_COLREF src key))))


The DEPENDENCY Clause

The use of DEPENDENCY in the EXPLAIN statement produces extra information about the inputs in the plan. It shows various attributes for the inputs. For example, for a query like:

...