This Confluence has been LDAP enabled, if you are an ASF Committer, please use your LDAP Credentials to login. Any problems file an INFRA jira ticket please.

Child pages
  • Unit Testing Hive SQL

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

To effectively unit test the process that this query represents, VIEWs views can be used to separate and encapsulate each of the subqueries so that they can be tested independently. This approach seems to be more effective than using sequential execution of components with intermediate tables (both TEMPORARY and not), due to the performance implications of using intermediate tables; not only do they take longer to run, but more I/O is generated, and in using this approach the query optimization opportunities available to the query planner become limited.

Through limited testing it was also shown that VIEWs do not in fact suffer from performance issues when compared with both a single large query and chaining tables together, as often prophesied; in fact the execution plans and times for VIEWs and views and monolithic queries were comparable. 

...