...
A Spark map join can either take advantage of faster Spark functionality such as broadcast variables, or use something similar to the MapReduce distributed cache. The rationale for choosing the MR-style distributed cache is given in the “small-table broadcasting” document in HIVE-7613, though broadcast-variable support might be added in the future. Here is the plan that we want.
...