Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: explain immutable tables (HIVE-6406)

...

  • INSERT OVERWRITE will overwrite any existing data in the table or partition
    • unless IF NOT EXISTS is provided for a partition (as of Hive 0.9.0).
  • INSERT INTO will append to the table or partition, keeping the existing data in tactintact. (Note: INSERT INTO syntax is only available starting in version 0.8.)
    • As of Hive 0.13.0, a table can be made immutable by creating it with TBLPROPERTIES ("immutable"="true"). The default is "immutable"="false".
      INSERT INTO behavior into an immutable table is disallowed if any data is already present, although INSERT INTO still works if the immutable table is empty. The behavior of INSERT OVERWRITE is not affected by the "immutable" table property.
      An immutable table is protected against accidental updates due to a script loading data into it being run multiple times by mistake. The first insert into an immutable table succeeds and successive inserts fail, resulting in only one set of data in the table, instead of silently succeeding with multiple copies of the data in the table. 
  • Inserts can be done to a table or a partition. If the table is partitioned, then one must specify a specific partition of the table by specifying values for all of the partitioning columns.
  • Multiple insert clauses (also known as Multi Table Insert) can be specified in the same query.
  • The output of each of the select statements is written to the chosen table (or partition). Currently the OVERWRITE keyword is mandatory and implies that the contents of the chosen table or partition are replaced with the output of corresponding select statement.
  • The output format and serialization class is determined by the table's metadata (as specified via DDL commands on the table).

...