Rob Weltman, Bilung Lee, Arvind Prabhakar, Cheolsoo Park, Jarek Jarcec Cecho
1. Datatype support: we discussed the datatype support for Sqoop2. Among the points raised were the following:
- Sqoop 1 datatypes are different for Java, Hive and Avro. Most of the time the connector does the datatype mapping.
- For Sqoop 2, we would like to standardize datatype support so that all connectors conform to one consistent set of types.
- Some datatypes discussed today were:
  - String, char: UTF-8 encoded
  - Timestamp: Unix time representation with millisecond precision
  - Binary: byte array
  - Numeric: byte, short, int, long, float, double
  - Null type
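The datatype list above could be captured in code roughly as follows. This is only an illustrative sketch: the names `DatatypeSketch`, `CommonType`, `typeOf`, `encodeString` and `encodeTimestamp` are hypothetical and do not come from the Sqoop code base, and the mapping from Java classes to types is an assumption, not something agreed in the meeting.

```java
import java.nio.charset.StandardCharsets;
import java.util.Date;

// Hypothetical sketch; none of these names come from the Sqoop code base.
class DatatypeSketch {

    // The datatypes listed in the discussion, as one shared enumeration.
    enum CommonType {
        STRING, CHAR, TIMESTAMP, BINARY,
        BYTE, SHORT, INT, LONG, FLOAT, DOUBLE,
        NULL
    }

    // Classify a Java value into the shared type set (assumed mapping).
    static CommonType typeOf(Object value) {
        if (value == null) return CommonType.NULL;
        if (value instanceof String) return CommonType.STRING;
        if (value instanceof Character) return CommonType.CHAR;
        // java.sql.Timestamp extends java.util.Date, so it is covered here too.
        if (value instanceof Date) return CommonType.TIMESTAMP;
        if (value instanceof byte[]) return CommonType.BINARY;
        if (value instanceof Byte) return CommonType.BYTE;
        if (value instanceof Short) return CommonType.SHORT;
        if (value instanceof Integer) return CommonType.INT;
        if (value instanceof Long) return CommonType.LONG;
        if (value instanceof Float) return CommonType.FLOAT;
        if (value instanceof Double) return CommonType.DOUBLE;
        throw new IllegalArgumentException(
            "Unsupported type: " + value.getClass().getName());
    }

    // String and char values are carried as UTF-8 bytes.
    static byte[] encodeString(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Timestamps are carried as milliseconds since the Unix epoch.
    static long encodeTimestamp(Date d) {
        return d.getTime();
    }
}
```

With a single enumeration like this, each connector would only need to map its source types onto `CommonType` once, instead of maintaining separate Java, Hive and Avro mappings.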
2. Intermediate data format
Jarcec has started the intermediate format specification page. It was discussed at a high level in the context of the various datatypes to be used.
- Arvind to create a wiki page for datatypes and send it to the dev list for discussion
- Jarcec to further refine the intermediate representation specification on the wiki
Agenda for next meeting: follow-up discussion on datatype support and the intermediate representation
- No call on July 4th, as it is a holiday in the US. The next call will be on July 11th. All are urged to continue these discussions on the mailing list.