RCFileCat

$HIVE_HOME/bin/hive --rcfilecat is a shell utility that prints the data or metadata of an RCFile.

Data

Prints the rows stored in an RCFile. Columns are tab-separated and rows are newline-separated.

Usage:

hive --rcfilecat [--start=start_offset] [--length=len] [--verbose] fileName

--start=start_offset           Start offset to begin reading in the file
--length=len                   Length of data to read from the file
--verbose                      Prints periodic stats about the data read:
                               record count, byte count, and scan rate
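
Because the data output is tab-separated, standard text tools can post-process it. The following is a minimal sketch, not part of Hive: rc_column is a hypothetical helper name, the file path in the comment is a placeholder, and the sketch assumes that field values do not themselves contain tab characters.

```shell
# rc_column N: print the Nth (1-based) tab-separated field of each row on stdin.
# Hypothetical helper for illustration; it is not shipped with Hive.
rc_column() {
  awk -F'\t' -v col="$1" '{ print $col }'
}

# Example use (placeholder path):
#   hive --rcfilecat /path/to/file.rc | rc_column 2
```

Reading from standard input keeps the helper independent of how the rows were produced, so it can be tried on any tab-separated text before pointing it at a real file.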

Metadata

New in 0.11.0

Usage:

hive --rcfilecat [--column-sizes | --column-sizes-pretty] fileName

With the --column-sizes option set, the utility prints one row per column instead of the data in the RCFile. Each row has three fields:
<column number> <uncompressed size> <compressed size>
Each size is the aggregate size of that column over the entire file, taken from the RCFile headers.
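
The three-field rows lend themselves to a quick compression check. The sketch below is illustrative only: column_size_ratios is a hypothetical helper name, and it assumes the fields are separated by whitespace and that both sizes are plain numbers, which the page does not state explicitly.

```shell
# column_size_ratios: read "<column number> <uncompressed size> <compressed size>"
# rows on stdin and print each column number with its compressed/uncompressed ratio.
# Hypothetical helper; assumes whitespace-separated numeric fields.
# Rows without exactly three fields, or with a zero uncompressed size, are skipped.
column_size_ratios() {
  awk 'NF == 3 && $2 > 0 { printf "%s\t%.2f\n", $1, $3 / $2 }'
}

# Example use (placeholder path):
#   hive --rcfilecat --column-sizes /path/to/file.rc | column_size_ratios
```

A ratio close to 1.00 suggests the column compresses poorly, while a small ratio suggests it compresses well.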

With the --column-sizes-pretty option set, the utility prints the same data as with --column-sizes, but in a more human-friendly format.
