Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Added more detail to Merge description

...

  • Vectorization will be turned off for delete operations.  This is automatic and requires no action on the part of the user.  Non-delete operations are not affected.  Tables with deleted data can still be queried using vectorization.
  • In version 0.14 it is recommended that you set hive.optimize.sort.dynamic.partition=false when doing deletes, as this produces more efficient execution plans.

 

Merge

Note
titleVersion Information

MERGE is available starting in Hive 2.2.

Merge can only be performed on tables that support ACID. See Hive Transactions for details.

...

Code Block
Standard Syntax:
MERGE INTO <target table> AS T USING <source expression/table> AS S
ON <boolean expression1>
WHEN MATCHED [AND <boolean expression2>] THEN UPDATE SET <set clause list>
WHEN MATCHED [AND <boolean expression3>] THEN DELETE
WHEN NOT MATCHED [AND <boolean expression4>] THEN INSERT VALUES<value list>
Synopsis
  • Merge allows actions to be performed on a target table based on the results of a join with a source table.
  • In Hive 2.2, upon successful completion of this operation the changes will be auto-committed.

...

SQL Standard requires that an error is raised if the ON clause is such that more than 1 row in source matches a row in target.  This check is computationally expensive and may affect the overall runtime of a MERGE statement significantly.  hive.merge.cardinality.check=false may be used to disable the check at your own risk.  If the check is disabled, but the statement has such a cross join effect, it may lead to data corruption.

Notes
  • 1, 2, or 3 WHEN clauses may be present; at most 1 of each type: UPDATE UPDATE/DELETE/INSERT.
  • WHEN NOT MATCHED must be the last WHEN clause.
  • If both UPDATE and DELETE clauses are present, the 1st first one in the statement must include [AND <boolean expression>].
  • Vectorization will be turned off for merge operations.  This is automatic and requires no action on the part of the user.  Non-delete operations are not affected.  Tables with deleted data can still be queried using vectorization.

...