...

Column level top K statistics are still pending; see HIVE-3421.

Quick overview

| Description | Stored in | Collected by | Since |
|---|---|---|---|
| Number of partitions the dataset consists of | Fictional metastore property: numPartitions | Computed while displaying the properties of a partitioned table | Hive 2.3 |
| Number of files the dataset consists of | Metastore table property: numFiles | Automatically during Metastore operations | |
| Total size of the dataset as seen at the filesystem level | Metastore table property: totalSize | Automatically during Metastore operations | |
| Uncompressed size of the dataset | Metastore table property: rawDataSize | Computed; these are the basic statistics. Calculated automatically when hive.stats.autogather (see Configuration Properties) is enabled. Can be collected manually by: ANALYZE TABLE ... COMPUTE STATISTICS | Hive 0.8 |
| Number of rows the dataset consists of | Metastore table property: numRows | Computed; these are the basic statistics. Calculated automatically when hive.stats.autogather (see Configuration Properties) is enabled. Can be collected manually by: ANALYZE TABLE ... COMPUTE STATISTICS | Hive 0.8 |
| Column level statistics | Metastore; TAB_COL_STATS table | Computed. Calculated automatically when hive.stats.column.autogather (see Configuration Properties) is enabled. Can be collected manually by: ANALYZE TABLE ... COMPUTE STATISTICS FOR COLUMNS | |
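
As a rough illustration of the collection paths summarized above, the following HiveQL sketch enables automatic gathering and then collects basic and column level statistics by hand. The table name page_views and the partition column dt are hypothetical and used only for illustration; the property names and ANALYZE statements are the ones named in the overview.

```sql
-- Assumption: page_views is a hypothetical table partitioned by dt.

-- Let Hive gather basic and column level statistics automatically on writes.
SET hive.stats.autogather=true;
SET hive.stats.column.autogather=true;

-- Manually compute the basic statistics (numRows, rawDataSize, ...) for one partition.
ANALYZE TABLE page_views PARTITION (dt='2024-01-01') COMPUTE STATISTICS;

-- Manually compute column level statistics for the same partition.
ANALYZE TABLE page_views PARTITION (dt='2024-01-01') COMPUTE STATISTICS FOR COLUMNS;

-- Inspect the stored properties (numFiles, totalSize, numRows, rawDataSize).
DESCRIBE FORMATTED page_views PARTITION (dt='2024-01-01');
```

The values reported by DESCRIBE FORMATTED depend on the data and on which statistics have actually been gathered, so no expected output is shown here.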
 

 

Implementation

The way the statistics are calculated is similar for both newly created and existing tables.

...