Limited testing also showed that VIEWs do not, as is often predicted, suffer from performance issues when compared with either a single large query or a chain of intermediate tables; the execution plans and run times for VIEWs and monolithic queries were comparable.
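
As a rough sketch of how such a comparison could be made, the following HiveQL isolates one step of a query in a VIEW and then asks Hive for the execution plan of both the VIEW-based and the monolithic form. The table `orders`, its columns, and the view name `completed_orders` are hypothetical and serve only as illustration; they are not taken from the testing described above.

```sql
-- Hypothetical source table: orders(customer_id, amount, status).

-- Modular form: isolate the filtering step in a VIEW.
CREATE VIEW IF NOT EXISTS completed_orders AS
SELECT customer_id, amount
FROM orders
WHERE status = 'COMPLETE';

-- Execution plan for the query built on the VIEW.
EXPLAIN
SELECT customer_id, SUM(amount) AS total_amount
FROM completed_orders
GROUP BY customer_id;

-- Execution plan for the equivalent monolithic query.
EXPLAIN
SELECT customer_id, SUM(amount) AS total_amount
FROM (
  SELECT customer_id, amount
  FROM orders
  WHERE status = 'COMPLETE'
) filtered
GROUP BY customer_id;
```

Comparing the two EXPLAIN outputs side by side, along with the run times of the underlying queries, is one way to check whether the VIEW introduces any extra stages for a given workload.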

Variable substitution was suggested as an approach for modularizing large queries, but on inspection it was found to be unsuitable, as an additional bash file is required, which would make testing more complex. HPL/SQL was also considered; however, it lacks the pipelined functions required for query modularization.

Tools and frameworks

When constructing tests it is helpful to have a framework that simplifies the declaration and execution of tests. Typically these tools allow the specification of many of the following: