...
- MetadataTypedColumnsetSerDe: This SerDe is used to read/write delimited records such as CSV, tab-separated, or control-A-separated records (quoted fields are not supported yet).
- LazySimpleSerDe: This SerDe can be used to read the same data format as MetadataTypedColumnsetSerDe and TCTLSeparatedProtocol; however, it creates objects lazily, which provides better performance. Starting in Hive 0.14.0 it also supports reading and writing data with a specified character encoding, for example:
  ALTER TABLE person SET SERDEPROPERTIES ('serialization.encoding'='GBK');
  LazySimpleSerDe can treat 'T', 't', 'F', 'f', '1', and '0' as extended, legal boolean literals if the configuration property hive.lazysimple.extended_boolean_literal is set to true (Hive 0.14.0 and later). The default is false, which means only 'TRUE' and 'FALSE' are treated as legal boolean literals.
- ThriftSerDe: This SerDe is used to read/write Thrift serialized objects. The class file for the Thrift object must be loaded first.
- DynamicSerDe: This SerDe also reads and writes Thrift serialized objects, but it understands Thrift DDL, so the schema of the object can be provided at runtime. It also supports many different protocols, including TBinaryProtocol, TJSONProtocol, and TCTLSeparatedProtocol (which writes data in delimited records).
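The extended boolean literals handled by LazySimpleSerDe can be illustrated with a small sketch. The table name `flags_demo` and its columns are hypothetical, and the sketch assumes that the property can be set at session level with `SET` and that a `ROW FORMAT DELIMITED` text table is backed by LazySimpleSerDe (the default for delimited text tables).

```sql
-- Hypothetical table, for illustration only.
-- A delimited text table is read with LazySimpleSerDe by default.
CREATE TABLE flags_demo (id INT, active BOOLEAN)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Hive 0.14.0 and later: also accept 'T', 't', 'F', 'f', '1', and '0'
-- as boolean literals. The default is false.
SET hive.lazysimple.extended_boolean_literal=true;

-- With the property left at false, rows whose second field is e.g. 't' or '1'
-- would not be parsed as legal booleans.
SELECT id, active FROM flags_demo;
```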
Also:
- For JSON files, JsonSerDe was added in Hive 0.12.0. An Amazon SerDe is available at s3://elasticmapreduce/samples/hive-ads/libs/jsonserde.jar for releases prior to 0.12.0.
- An Avro SerDe was added in Hive 0.9.1. Starting in Hive 0.14.0, its specification is implicit with the STORED AS AVRO clause.
- A SerDe for the ORC file format was added in Hive 0.11.0.
- A SerDe for Parquet was added via plug-in in Hive 0.10 and natively in Hive 0.13.0.
- A SerDe for CSV was added in Hive 0.14.
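As a rough sketch of how these SerDes might be selected in DDL, the statements below create one table per format. All table and column names are hypothetical. The SerDe class names used for CSV and JSON (`org.apache.hadoop.hive.serde2.OpenCSVSerde` and `org.apache.hive.hcatalog.data.JsonSerDe`) are assumptions not stated in the list above, and the JSON example further assumes the HCatalog jar is on the classpath.

```sql
-- Hypothetical table and column names, for illustration only.

-- Avro (Hive 0.14.0 and later): the SerDe is implied by STORED AS AVRO.
CREATE TABLE events_avro (id INT, payload STRING)
STORED AS AVRO;

-- ORC (Hive 0.11.0 and later).
CREATE TABLE events_orc (id INT, payload STRING)
STORED AS ORC;

-- Parquet (native support in Hive 0.13.0 and later).
CREATE TABLE events_parquet (id INT, payload STRING)
STORED AS PARQUET;

-- CSV (Hive 0.14 and later). Assumed class name; this SerDe reads
-- every column as STRING.
CREATE TABLE events_csv (id STRING, payload STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE;

-- JSON (Hive 0.12.0 and later). Assumed class name from HCatalog.
CREATE TABLE events_json (id INT, payload STRING)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE;
```

The `STORED AS` shorthand keeps the DDL independent of the concrete SerDe and input/output format class names, whereas `ROW FORMAT SERDE` names the class explicitly.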
...