RESQUE uses spatialindex to catch exceptions in each spatial operation.

Modification for Hive Core Source Code

We mainly modified the semantic analyzer (SemanticAnalyzer.java) in the Hive core source code so that Hive can read spatial functions, call RESQUE for spatial computation, and read the output through our query pipelines. The path of SemanticAnalyzer.java is:

hivesp/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java

We mainly modified the analyzeInternal function and other related functions.

In function analyzeInternal, we call function analyzeSpatial to detect whether the HQL query contains a spatial request. In function analyzeSpatial, we modify the semantic tree to prepare for the spatial operations, for example by extracting the requested predicate (st_intersects, st_contains, or another one). We also optimize the semantic tree in function analyzeSpatial.
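
The detection step can be pictured as a walk over the parse tree that looks for a known spatial predicate name. The following is only a minimal sketch under stated assumptions: the Node class, the SPATIAL_PREDICATES set, and findSpatialPredicate are hypothetical names invented for illustration, and the node labels are illustrative. It does not reproduce the actual analyzeSpatial code or Hive's own tree classes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

final class SpatialPredicateSketch {
    // Hypothetical stand-in for a parse-tree node (not Hive's own tree class).
    static final class Node {
        final String text;
        final List<Node> children;

        Node(String text, Node... children) {
            this.text = text;
            this.children = Arrays.asList(children);
        }
    }

    // Only the two predicates named in the text; a real list would be longer.
    static final Set<String> SPATIAL_PREDICATES =
            new HashSet<>(Arrays.asList("st_intersects", "st_contains"));

    // Depth-first search: returns the first spatial predicate name found, if any.
    static Optional<String> findSpatialPredicate(Node node) {
        String name = node.text.toLowerCase(Locale.ROOT);
        if (SPATIAL_PREDICATES.contains(name)) {
            return Optional.of(name);
        }
        for (Node child : node.children) {
            Optional<String> found = findSpatialPredicate(child);
            if (found.isPresent()) {
                return found;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Illustrative tree for a WHERE clause that calls ST_Intersects(a.geom, b.geom).
        Node where = new Node("TOK_WHERE",
                new Node("TOK_FUNCTION",
                        new Node("ST_Intersects"),
                        new Node("a.geom"),
                        new Node("b.geom")));
        System.out.println(findSpatialPredicate(where).orElse("none"));
    }
}
```

A query with no spatial predicate yields an empty result, which is the case in which the analyzer would presumably fall through to the unmodified Hive code path.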

Changes in Hive

We have tried to keep changes to Hive to a minimum so as not to compromise compatibility.
Changes are mostly at the language and query optimization layers.

Language layer: Hive.g is changed to add data types and other spatial language support.
Parsing/Analyzing: Mostly the SemanticAnalyzer is changed (by adding functions) to generate an executable query plan.
Optimization: The generated query plan is optimized by a function that produces an optimal query plan according to the spatial predicate and table information.

The RESQUE library will be deployed as a shared library, and a path to this library will be provided to Hive so that it can invoke functions in the library via the TRANSFORM mechanism.

In function genSpatialJoinOperator, we use the extracted spatial operation to generate a spatial query command and call our engine RESQUE via our query pipelines.
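
To make the TRANSFORM mechanism concrete, here is a minimal sketch that assembles a Hive SELECT TRANSFORM ... USING ... AS ... query string around an external command. The helper buildTransformQuery, the table and column names, and the command string "resque_wrapper st_intersects" are all hypothetical placeholders chosen for illustration. The real RESQUE command line and the operator that genSpatialJoinOperator builds are not shown in this document.

```java
import java.util.Arrays;
import java.util.List;

final class TransformQuerySketch {
    // Builds a Hive TRANSFORM query that streams the input columns through an
    // external command and names the columns the command writes back.
    static String buildTransformQuery(String table,
                                      List<String> inputColumns,
                                      String command,
                                      List<String> outputColumns) {
        // Keep the sketch simple: refuse commands that would break the quoting.
        if (command.contains("'")) {
            throw new IllegalArgumentException("command must not contain a single quote");
        }
        return "SELECT TRANSFORM (" + String.join(", ", inputColumns) + ")"
                + " USING '" + command + "'"
                + " AS (" + String.join(", ", outputColumns) + ")"
                + " FROM " + table;
    }

    public static void main(String[] args) {
        // Hypothetical table, columns, and command; none come from the RESQUE sources.
        String query = buildTransformQuery(
                "parcels",
                Arrays.asList("id", "geom"),
                "resque_wrapper st_intersects",
                Arrays.asList("id", "matched"));
        System.out.println(query);
    }
}
```

Passing the command in as a single caller-supplied string keeps the sketch independent of whatever arguments the deployed engine actually expects.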