...

The REPL DUMP command has an optional WITH clause that sets command-specific configurations for the dump. These configurations apply only to that REPL DUMP command and are not used by other queries running in the same session. For example, the configurations can be set to exclude external tables and to dump only metadata, without any data.
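
A minimal sketch of such a command is shown here. The database name `sales` is hypothetical, and the two configuration keys are assumed to be the Hive settings that control external-table inclusion and metadata-only dumps; check them against the Hive version in use.

```sql
-- Dump the (hypothetical) database "sales".
-- Assumed keys: exclude external tables, and dump metadata only (no data).
REPL DUMP sales WITH (
  'hive.repl.include.external.tables'='false',
  'hive.repl.dump.metadata.only'='true'
);
```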


Return values:

  1. Error codes are returned as regular return codes (and over JDBC when running through HS2).
  2. The ResultSet contains 2 columns:
    1. <dir-name> - the directory into which the dump was written.
    2. <last-evid> - the last event-id associated with this dump, which may be the end-evid or the curr-evid, as the case may be.

Note:

The generated dump is similar to the dumps produced by EXPORT: it contains a _metadata file but not the actual data files, using a _files file as an indirection to them instead. Also, REPL DUMP does not take a target directory as an argument. Instead, it creates its own dump directory inside a root directory configured by the HiveConf parameter hive.repl.rootdir, and returns the path of that dump directory as part of its return value. A replication dump-directory cleaner that periodically cleans up this root directory is also intended to be introduced.
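
As a rough illustration of how the root directory relates to a dump, the sketch below sets hive.repl.rootdir before dumping. The path and the database name `sales` are hypothetical, and it is an assumption that the server allows this parameter to be changed at session level; in many deployments it is set in the Hive configuration files instead.

```sql
-- Hypothetical root directory for all replication dumps.
SET hive.repl.rootdir=/user/hive/repl_dumps;

-- REPL DUMP picks its own subdirectory under that root and reports it
-- in the <dir-name> column of the result.
REPL DUMP sales;
```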

This call is intended to be synchronous, and expects the caller to wait for the result.

If the HiveConf parameter hive.in.test is false, REPL DUMP does not use a new dump location, so it will garble an existing dump. Hence, before taking an incremental dump, clear the bootstrap dump location if hive.in.test is false.

Bootstrap note: The FROM clause means that the event log is read to determine what to dump. A bootstrap dump does not use FROM.
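
The difference between the two kinds of dump can be sketched as follows. The database name `sales` and the event id `1234` are hypothetical; in practice the event id would be the <last-evid> value returned by the previous dump. The FROM form is the one described on this page and may not be available in every Hive release.

```sql
-- Bootstrap dump: no FROM clause, so the current state of the database is dumped.
REPL DUMP sales;

-- Incremental dump: read the event log starting from a previously returned event id.
REPL DUMP sales FROM 1234;
```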

...