...
Estimated Development Effort: medium
Staged replicated join
Currently, a replicated join requires the right table to fit in memory. We can borrow the idea of Hive's staged map join: spill the right table to disk if it does not fit, and process the overflow in the map cleanup.
Category: Performance
Dependency:
References:
Estimated Development Effort: medium
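The staged idea can be sketched outside Pig as follows. This is a minimal illustration under stated assumptions, not Pig's or Hive's actual implementation: the function name `staged_replicated_join`, the row-count budget `max_rows_in_memory` (a stand-in for a real memory limit), and the choice to re-scan the left input once per spilled chunk during cleanup are all hypothetical.

```python
import pickle
import tempfile
from collections import defaultdict


def staged_replicated_join(left_rows, right_rows, max_rows_in_memory,
                           left_key=0, right_key=0):
    """Inner-join tuples on a key field while holding at most
    max_rows_in_memory right-side rows in memory at any time."""
    if max_rows_in_memory < 1:
        raise ValueError("max_rows_in_memory must be at least 1")

    # Load phase: keep what fits in memory, spill the rest to a temp file.
    in_memory = defaultdict(list)
    spill = tempfile.TemporaryFile()
    spilled = 0
    for n, row in enumerate(right_rows):
        if n < max_rows_in_memory:
            in_memory[row[right_key]].append(row)
        else:
            pickle.dump(row, spill)
            spilled += 1

    results = []
    # Assumption: this list stands in for re-reading the map input split;
    # a real job would re-scan the split rather than buffer it.
    left_seen = []

    # Map phase: probe the in-memory part of the right table.
    for left in left_rows:
        left_seen.append(left)
        for right in in_memory.get(left[left_key], []):
            results.append((left, right))

    # Cleanup phase: load the overflow one chunk at a time and
    # re-scan the left input against each chunk.
    spill.seek(0)
    remaining = spilled
    while remaining > 0:
        chunk = defaultdict(list)
        for _ in range(min(max_rows_in_memory, remaining)):
            row = pickle.load(spill)
            chunk[row[right_key]].append(row)
            remaining -= 1
        for left in left_seen:
            for right in chunk.get(left[left_key], []):
                results.append((left, right))

    spill.close()
    return results
```

The cost of this design is one extra pass over the left input per spilled chunk, so it only pays off when the right table overflows memory by a modest factor.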
Agreed Work, Unknown Approach
...
Estimated Development Effort: large
Sparse tuple support
Implement a sparse tuple and expose it in Pig syntax to keep the memory footprint small for sparse data.
Category: New Functionality
Dependency:
References:
Estimated Development Effort: small
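As a rough illustration of the sparse layout, the sketch below stores only the non-null fields of a fixed-width tuple. The class name `SparseTuple` and its methods are hypothetical and do not mirror Pig's tuple interface or any proposed Pig syntax; the point is only that storage grows with the number of populated fields rather than with the declared width.

```python
class SparseTuple:
    """Fixed-width tuple that stores only its non-null fields."""

    def __init__(self, size):
        self._size = size
        self._fields = {}  # field index -> value, non-null fields only

    def _check(self, index):
        if not 0 <= index < self._size:
            raise IndexError("field index out of range: %d" % index)

    def set(self, index, value):
        self._check(index)
        if value is None:
            # Setting a field to null frees its slot.
            self._fields.pop(index, None)
        else:
            self._fields[index] = value

    def get(self, index):
        self._check(index)
        # Fields that were never set read back as null.
        return self._fields.get(index)

    def __len__(self):
        # Logical width, independent of how many fields are stored.
        return self._size

    def stored_fields(self):
        return len(self._fields)
```

For example, a tuple declared with 1000 fields of which 3 are set reports a length of 1000 while holding only 3 entries.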
Experimental
Add List Datatype
...