Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  • TINYINT (1-byte signed integer, from -128 to 127)
  • SMALLINT (2-byte signed integer, from -32,768 to 32,767)
  • INT/INTEGER (4-byte signed integer, from -2,147,483,648 to 2,147,483,647)
  • BIGINT (8-byte signed integer, from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807)
  • FLOAT (4-byte single precision floating point number)
  • DOUBLE (8-byte double precision floating point number)
  • DOUBLE PRECISION (alias for DOUBLE, only available starting with Hive 2.2.0)
  • DECIMAL
    • Introduced in Hive 0.11.0 with a precision of 38 digits
    • Hive 0.13.0 introduced user-definable precision and scale
  • NUMERIC (same as DECIMAL, starting with Hive 3.0.0)

Date/Time Types

...

  • arrays: ARRAY<data_type> (Note: negative values and non-constant expressions are allowed as of Hive 0.14.)
  • maps: MAP<primitive_type, data_type> (Note: negative values and non-constant expressions are allowed as of Hive 0.14.)
  • structs: STRUCT<col_name : data_type [COMMENT col_comment], ...>
  • union: UNIONTYPE<data_type, data_type, ...> (Note: Only available starting with Hive 0.7.0.)

Column Types

Integral Types (TINYINTSMALLINTINT/INTEGERBIGINT)

Anchor
Integral Types (TINYINT, SMALLINT, INT/INTEGER, BIGINT)
Integral Types (TINYINT, SMALLINT, INT/INTEGER, BIGINT)
Anchor
tinyint
tinyint
Anchor
smallint
smallint
Anchor
int
int
Anchor
bigint
bigint

...

Integral literals

Column Types

Integral Types (TINYINTSMALLINTINT/INTEGERBIGINT)

Integral literals are assumed to be INT by default, unless the number exceeds the range of INT in which case it is interpreted as a BIGINT, or if one of the following postfixes is present on the number.

...

Info
titleVersion

INTEGER is introduced as a synonym for INT in Hive 2.2.0 (HIVE-14950).

Strings
Anchor
string
string
Anchor
Strings
Strings

String literals can be expressed with either single quotes (') or double quotes ("). Hive uses C-style escaping within the strings.

Varchar
Anchor
varchar
varchar
Anchor
Varchar
Varchar

Varchar types are created with a length specifier (between 1 and 6535565535), which defines the maximum number of characters allowed in the character string. If a string value being converted/assigned to a varchar value exceeds the length specifier, the string is silently truncated. Character length is determined by the number of code points contained by the character string.

...

Info
titleVersion

Varchar datatype was introduced in Hive 0.12.0 (HIVE-4844).

Char
Anchor
char
char
Anchor
Char
Char

Char types are similar to Varchar but they are fixed-length meaning that values shorter than the specified length value are padded with spaces but trailing spaces are not important during comparisons. The maximum length is fixed at 255.

...

Info
titleVersion

Char datatype was introduced in Hive 0.13.0 (HIVE-5191).

Timestamps
Anchor
timestamp
timestamp
Anchor
Timestamps
Timestamps

Supports traditional UNIX timestamp with optional nanosecond precision.

...

Info
titleVersion

Timestamps were introduced in Hive 0.8.0 (HIVE-2272).

Dates

Anchor
date
date
Anchor
Dates
Dates

DATE values describe a particular year/month/day, in the form YYYY-­MM-­DD. For example, DATE '2013-­01-­01'. Date types do not have a time of day component. The range of values supported for the Date type is 0000-­01-­01 to 9999-­12-­31, dependent on support by the primitive Java Date type.

...

Date types can only be converted to/from Date, Timestamp, or String types. Casting with user-specified formats is documented here.

Valid casts to/from Date type

Result

cast(date as date)

Same date value

cast(timestamp as date)

The year/month/day of the timestamp is determined, based on the local timezone, and returned as a date value.

cast(string as date)

If the string is in the form 'YYYY-MM-DD', then a date value corresponding to that year/month/day is returned. If the string value does not match this formate, then NULL is returned.

cast(date as timestamp)

A timestamp value is generated corresponding to midnight of the year/month/day of the date value, based on the local timezone.

cast(date as string)

The year/month/day represented by the Date is formatted as a string in the form 'YYYY-MM-DD'.

...

Supported Interval Description

Example

MeaningSince

Intervals of time units:

SECOND / MINUTE / DAY / MONTH / YEAR

INTERVAL '1' DAY

an interval of 1 day(s)

Hive 1.2.0 (HIVE-9792).

Year to month intervals, format: SY-M

S: optional sign (+/-)
Y: year count
M: month count

INTERVAL '1-2' YEAR TO MONTH

shorthand for:

INTERVAL '1' YEAR +
INTERVAL '2' MONTH

Hive 1.2.0 (HIVE-9792).

Day to second intervals, format: SD H:M:S.nnnnnn

S: optional sign (+/-)

D: day countH: hours 
M: minutes
S: seconds
nnnnnn: optional nanotime
INTERVAL '1 2:3:4.000005' DAY

shorthand for:

INTERVAL '1' DAY+
INTERVAL '2' HOUR +
INTERVAL '3' MINUTE +
INTERVAL '4' SECOND +
INTERVAL '5' NANO

Hive 1.2.0 (HIVE-9792).

Support for intervals with constant numbersINTERVAL 1 DAY

aids query readability / portability 

Hive 2.2.0 (HIVE-13557).

Support for intervals with expressions:
this may involve other functions/columns.
The expression must return with a number (which is not floating-point) or with a string.

INTERVAL (1+dt) DAYenables dynamic intervalsHive 2.2.0 (HIVE-13557).

Optional usage of interval keyword1 DAY
'1-2' YEAR TO MONTH
(

Note

the usage of the INTERVAL keyword is mandatory
for intervals with expressions (ex: INTERVAL (1+dt) SECOND)


1 DAY
'1-2' YEAR TO MONTH

INTERVAL 1 DAY
INTERVAL '1-2' YEARS TO MONTHINTERVAL (1+dt) SECOND

Hive 2.2.0 (HIVE-13557).

Add timeunit aliases to aid portability / readability:

 SECONDS / MINUTES / HOURS / DAYS / WEEKS / MONTHS / YEARS

2 SECONDS2 SECONDHive 2.2.0 (HIVE-13557).

Decimals 
Anchor
decimal
decimal

...

Anchor
Decimals
Decimals

Info
titleVersion

Decimal datatype was introduced in Hive 0.11.0 (HIVE-2693) and revised in Hive 0.13.0 (HIVE-3976).

NUMERIC is the same as DECIMAL as of Hive 3.0.0 (HIVE-16764).

The DECIMAL type in Hive is based on Java's BigDecimal which is used for representing immutable arbitrary precision decimal numbers in Java. All regular number operations (e.g. +, -, *, /) and relevant UDFs (e.g. Floor, Ceil, Round, and many more) handle decimal types. You can cast to/from decimal types like you would do with other numeric types. The persistence format of the decimal type supports both scientific and non-scientific notation. Therefore, regardless of whether your dataset contains data like 1E+44 like 4.004E+3 (scientific notation) or 4004 (non-scientific notation) or a combination of both, DECIMAL can be used for it.

...

For usage, see Floating Point Types in the Literals section below. Anchor

Decimal Literals
Anchor
Decimal Literals
Decimal Literals

Integral literals larger than BIGINT must be handled with Decimal(38,0). The Postfix BD is required. Example:

...

  1. Determine what precision/scale you would like to set for the decimal column in the table.
  2. For each decimal column in the table, update the column definition to the desired precision/scale using the ALTER TABLE command:

    Code Block
    sql
    sql
    ALTER TABLE foo CHANGE COLUMN dec_column_name dec_column_name DECIMAL(38,18);

    If the table is not a partitioned table, then you are done.  If the table has partitions, then go on to step 3.

  3. If the table is a partitioned table, then find the list of partitions for the table:

    Code Block
    sql
    sql
    SHOW PARTITIONS foo;
     
    ds=2008-04-08/hr=11
    ds=2008-04-08/hr=12
    ...


  4. Each existing partition in the table must also have its DECIMAL column changed to add the desired precision/scale.

    This can be done with a single ALTER TABLE CHANGE COLUMN by using dynamic partitioning (available for ALTER TABLE CHANGE COLUMN in Hive 0.14 or later, with HIVE-8411):

    Code Block
    sql
    sql
    SET hive.exec.dynamic.partition = true;
     
    -- hive.exec.dynamic.partition needs to be set to true to enable dynamic partitioning with ALTER PARTITION
    -- This will alter all existing partitions of the table - be sure you know what you are doing!
    ALTER TABLE foo PARTITION (ds, hr) CHANGE COLUMN dec_column_name dec_column_name DECIMAL(38,18);
     


    Alternatively, this can be done one partition at a time using ALTER TABLE CHANGE COLUMN, by specifying one partition per statement (This is available in Hive 0.14 or later, with HIVE-7971.):

    Code Block
    sql
    sql
    ALTER TABLE foo PARTITION (ds='2008-04-08', hr=11) CHANGE COLUMN dec_column_name dec_column_name DECIMAL(38,18);
    ALTER TABLE foo PARTITION (ds='2008-04-08', hr=12) CHANGE COLUMN dec_column_name dec_column_name DECIMAL(38,18);
    ...


The Decimal datatype is discussed further in Floating Point Types below.

Union Types
Anchor
union
union
Anchor
Union Types
Union Types

Warning
titleUNIONTYPE support is incomplete

The UNIONTYPE datatype was introduced in Hive 0.7.0 (HIVE-537), but full support for this type in Hive remains incomplete. Queries that reference UNIONTYPE fields in JOIN (HIVE-2508), WHERE, and GROUP BY clauses will fail, and Hive does not define syntax to extract the tag or value fields of a UNIONTYPE. This means that UNIONTYPEs are effectively lookpass-atthrough-only.

Union types can at any one point hold exactly one of their specified data types. You can create an instance of this type using the create_union UDF:

...

Info
titleVersion

Decimal datatype was introduced in Hive 0.11.0 (HIVE-2693). See Decimal Datatype above.

NUMERIC is the same as DECIMAL as of Hive 3.0.0 (HIVE-16764).

Decimal literals provide precise values and greater range for floating point numbers than the DOUBLE type. Decimal data types store exact representations of numeric values, while DOUBLE data types store very close approximations of numeric values.

...

For general information about running Hive tests, see How to Contribute to Apache Hive and Hive Developer FAQ.

Handling of NULL Values

Missing values are represented by the special value NULL. To import data with NULL fields, check documentation of the SerDe used by the table. (The default Text Format uses LazySimpleSerDe which interprets the string \N as NULL when importing.)

Allowed Implicit Conversions

...

FAQ.

Handling of NULL Values

Missing values are represented by the special value NULL. To import data with NULL fields, check documentation of the SerDe used by the table. (The default Text Format uses LazySimpleSerDe which interprets the string \N as NULL when importing.)

Change Types

When hive.metastore.disallow.incompatible.col.type.changes is set to false, the types of columns in Metastore can be changed from any type to any other type. After such a type change, if the data can be shown correctly with the new type, the data will be displayed. Otherwise, the data will be displayed as NULL.

Allowed Implicit Conversions


void

boolean

tinyint

smallint

int

bigint

float

double

decimal

string

varchar

timestamp

date

binary

void to

true

true

true

true

true

true

true

true

true

true

true

true

true

true

boolean to

false

true

false

false

false

false

false

false

false

false

false

false

false

false

tinyint to

false

false

true

true

true

true

true

true

true

true

true

false

false

false

smallint to

false

false

false

true

true

true

true

true

true

true

true

false

false

false

int to

false

false

false

false

true

true

true

true

true

true

true

false

false

false

bigint to

false

false

false

false

false

true

true

true

true

true

true

false

false

false

float to

false

false

false

false

false

false

true

true

true

true

true

false

false

false

double to

false

false

false

false

false

false

false

true

true

true

true

false

false

false

decimal to

false

false

false

false

false

false

false

false

true

true

true

false

false

false

string to

false

false

false

false

false

false

false

true

true

true

true

false

false

false

varchar to

false

false

false

false

false

false

false

true

true

true

true

false

false

false

timestamp to

false

false

false

false

false

false

false

false

false

true

true

true

false

false

date to

false

false

false

false

false

false

false

false

false

true

true

false

true

false

binary to

false

false

false

false

false

false

false

false

false

false

false

false

false

true

...


Save