Subqueries in the FROM Clause
Through Hive 0.12, Hive supports subqueries only in the FROM clause. The subquery must be given a name, because every table in a FROM clause must have one. Columns in the subquery's select list must have unique names, and they are available in the outer query just like the columns of a table. The subquery can also be a query expression with UNION. Subqueries can be nested to arbitrary depth.
Example with simple subquery:
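A minimal sketch of such a query, assuming a hypothetical table t1 with numeric columns a and b (these names are illustrative only). The subquery is named t2, and its aliased column col is referenced in the outer query like an ordinary table column.

```sql
-- Hypothetical table: t1(a, b)
SELECT t2.col
FROM (
  SELECT a + b AS col   -- every select-list column needs a unique name
  FROM t1
) t2;                   -- the subquery must be given a name
```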
Example with subquery containing a UNION ALL:
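A minimal sketch, assuming hypothetical tables t1(a, b) and t2(c, d). Both branches of the UNION ALL expose the same column name, col, so the outer query can refer to it through the subquery name t3.

```sql
-- Hypothetical tables: t1(a, b) and t2(c, d)
SELECT t3.col
FROM (
  SELECT a + b AS col
  FROM t1
  UNION ALL
  SELECT c + d AS col
  FROM t2
) t3;
```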
Subqueries in the WHERE Clause
As of Hive 0.13, some types of subqueries are supported in the WHERE clause. The first type is a subquery whose result can be treated as a constant for an IN or NOT IN predicate. Such subqueries are called uncorrelated, because they do not reference columns from the parent query.
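A minimal sketch of an uncorrelated IN subquery, assuming hypothetical tables A (with a column a) and B (with a column foo). The subquery does not mention A at all, so its result can be evaluated once and used as a constant set.

```sql
-- Hypothetical tables: A(a, ...) and B(foo, ...)
SELECT *
FROM A
WHERE A.a IN (SELECT foo FROM B);   -- subquery selects exactly one column
```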
The other supported types are EXISTS and NOT EXISTS subqueries, which reference columns from the parent query (correlated subqueries).
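A minimal sketch of a correlated EXISTS subquery, assuming hypothetical tables t1(a, x) and t2(b, y). The predicate t1.x = t2.y ties each row of the parent query to the rows examined by the subquery.

```sql
-- Hypothetical tables: t1(a, x) and t2(b, y)
SELECT t1.a
FROM t1
WHERE EXISTS (
  SELECT t2.b
  FROM t2
  WHERE t1.x = t2.y   -- correlated predicate referencing the parent query
);
```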
There are a few limitations:
- These subqueries are only supported on the right-hand side of an expression.
- IN/NOT IN subqueries may only select a single column.
- EXISTS/NOT EXISTS must have one or more correlated predicates.
- References to the parent query are only supported in the WHERE clause of the subquery.
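The following sketch shows two queries written to stay within these limitations, assuming the same hypothetical tables t1(a, x) and t2(b, y). It illustrates the constraints and is not an exhaustive list of what Hive accepts.

```sql
-- NOT IN: the subquery is on the right-hand side and selects a single column.
SELECT t1.a
FROM t1
WHERE t1.x NOT IN (SELECT t2.y FROM t2);

-- NOT EXISTS: at least one correlated predicate, and the reference to the
-- parent query (t1.x) appears only in the subquery's WHERE clause.
SELECT t1.a
FROM t1
WHERE NOT EXISTS (
  SELECT t2.b
  FROM t2
  WHERE t2.y = t1.x
);
```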