- Preliminaries
  - Overview
  - Specific Use Cases
    - Geo-Location
    - Side-Table Similarity
- Requirements
  - Goals
  - Specific Non-Goals
- Literature Review
- Design
  - Map-side
  - Reduce-side
- References

HIVE-556 requests that Hive support non-equality joins, commonly called theta joins. Theta joins cover all kinds of non-equality join conditions such as <, <=, >, >=, <>, LIKE, RLIKE, and generic UDFs.

Range joins, a specific kind of theta join, are extremely useful in temporal and spatial applications. This work will focus first on range join as it's arguably the most important theta join. After range join is implemented, the remaining theta join operators can be implemented with relative ease. Supporting theta join, and specifically range join, is important as it will make Hive competitive in additional use cases and remove a deficiency when compared against traditional systems.

This use case is very common today for many users of Hive: websites, mobile telephone services, and online ad marketers. The user wants to join website access logs with geo-location information for IP addresses. This can be implemented as an interval join. Currently this use case is implemented by creating a custom UDF which performs the lookup. Compiling and deploying UDFs is an error-prone process and not a common task for most users of Hive.

As an example of this use case, we can utilize the Maxmind GeoIP by-country dataset, which is 79,980 records. Consider joining that dataset with a very small access log of 5,000,000 entries using a cartesian product and then filtering the result. This approach results in 399,900,000,000 tuples that need to be filtered in the mapper. Hive trunk was modified to implement map-side interval join, and the previously discussed join completed in 21 seconds on a single host, while the cartesian-product-then-filter approach ran for many hours before it was killed.

This use case was originally proposed in HIVE-556 by its creator, Min Zhou. The user has a side table with some entries and wants to join the side table with a main table based on similarity. The example provided used the RLIKE operator, but LIKE or a generic boolean UDF could be used as well.

A notable non-goal of this work is n-way theta joins. The algorithm to implement n-way theta joins was inspired by the algorithm to implement 2-way theta joins. Therefore n-way theta joins can be implemented in future work.

=20- =20
- Map-side 2-way theta joins =20
- Reduce-side 2-way theta join =20

- =20
- Map-side n-way theta joins =20
- Reduce-side n-way theta joins =20
- Additional statistic collection =20

This work adds a Merge step to Map-Reduce which allows for easy expression of relational algebra operators. This is interesting but not immediately useful, as it requires modification of the Map-Reduce framework.

This work studies a special type of set similarity, specifically similar strings or bit vectors. It could be useful in implementing some operators such as LIKE. However, this method requires statistics to be calculated at run time, resulting in multiple Map-Reduce jobs.

This work proposes an algorithm, 1-Bucket-Theta, to perform theta joins on two relations in a single Map-Reduce job given some basic statistics, namely the cardinality of the two relations. This approach allows parallel implementation of cartesian product as well. The work also details how additional input statistics can be exploited to improve efficiency. The approach is to partition a join matrix of the two relations.

This work is inspired by [3] and expands the method to n-way joins. The approach used is to partition a hypercube of the relations. An approach to merge the resulting many Map-Reduce jobs into a single job is also discussed, with results similar to Y-Smart [5].

A large number of theta join use cases have the nice property that only one of the relations is "large". Therefore many theta joins can be converted to map-joins. Presently these use cases utilize a map-side cartesian product with post-join filters. As noted in the geo-location use case above, some of these use cases, specifically range joins, can see several orders of magnitude speedup utilizing theta join.

Currently map-side join utilizes a hashmap, and a join is performed when the incoming key matches a key in the hash map. To support range join, this will be abstracted into a pluggable interface. The plugin can decide how two keys are joined. The equality join implementation will continue to utilize a hashmap, while range join can use a data structure such as an interval tree. Other such optimizations can be made; for example, the not-equals join condition <> can use a view on top of a map.
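The pluggable-matcher idea can be sketched as follows. This is an illustrative sketch only, not Hive code: the class names `EqualityMatcher` and `IntervalMatcher` are hypothetical, and a sorted list with binary search stands in for a real interval tree (sufficient here because the intervals do not overlap, as with IP ranges).

```python
import bisect

class EqualityMatcher:
    """Equality join: keys from the small table go into a hash map."""
    def __init__(self, rows):  # rows: list of (key, value)
        self.table = {}
        for k, v in rows:
            self.table.setdefault(k, []).append(v)

    def match(self, key):
        return self.table.get(key, [])

class IntervalMatcher:
    """Range join: keys from the small table are (lo, hi) intervals.
    A sorted list plus bisect stands in for an interval tree; this
    works because the intervals are assumed non-overlapping."""
    def __init__(self, rows):  # rows: list of ((lo, hi), value)
        self.entries = sorted(rows)
        self.los = [lo for (lo, _hi), _v in self.entries]

    def match(self, key):
        # Find the rightmost interval starting at or before the key.
        i = bisect.bisect_right(self.los, key) - 1
        if i >= 0:
            (lo, hi), v = self.entries[i]
            if lo <= key <= hi:
                return [v]
        return []

# Geo-location style lookup: numeric IP ranges mapped to country codes.
geo = IntervalMatcher([((0, 99), "AA"), ((100, 199), "BB")])
eq = EqualityMatcher([(1, "x"), (1, "y")])
```

The map-side join driver would probe whichever matcher the plan selected, so the equality and range cases share the same code path.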

Reduce-side joins will be implemented via 1-Bucket-Theta as described in [3]. This requires the cardinality of the two relations, and therefore to perform a reduce-side theta join, statistics must be turned on. Initially, if the required statistics do not exist, an exception will be thrown indicating the problem. After the initial implementation we can use a method to estimate the cardinality.

As previously mentioned, a detailed description of 1-Bucket-Theta is located in [3]. As such, the discussion of the internals of the algorithm will be brief. Joining two relations S and T can be viewed as a matrix, with S, the smaller relation, on the left and T on the right.

The matrix is partitioned by r, the number of reducers. An example join matrix follows, with four reducers numbered 1-4:

| 1 | 1 | 2 | 2 |
| 1 | 1 | 2 | 2 |
| 3 | 3 | 4 | 4 |
| 3 | 3 | 4 | 4 |

In the map phase, each tuple in S is sent to all reducers whose region intersects the tuple's row id (a row of the matrix). For example, the S-tuple with the row id of 2 is sent to reducers 1 and 2. Similarly, each tuple in T is sent to all reducers whose region intersects the tuple's row id (a column of the matrix). For example, the T-tuple with row id 4 is sent to reducers 2 and 4.
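The routing for the 4x4 example matrix can be sketched as follows. This is an illustrative sketch, not Hive code; the `REGIONS` table hard-codes the four 2x2 blocks from the matrix above, whereas the real algorithm computes regions from the cardinalities and reducer count.

```python
# region id -> (row range, column range) for the example matrix:
# reducers 1-4 each own a 2x2 block.
REGIONS = {
    1: (range(1, 3), range(1, 3)),  # rows 1-2, cols 1-2
    2: (range(1, 3), range(3, 5)),  # rows 1-2, cols 3-4
    3: (range(3, 5), range(1, 3)),  # rows 3-4, cols 1-2
    4: (range(3, 5), range(3, 5)),  # rows 3-4, cols 3-4
}

def reducers_for_s(row_id):
    """An S tuple covers an entire matrix row: send it to every
    reducer whose row range contains the id."""
    return sorted(r for r, (rows, _cols) in REGIONS.items()
                  if row_id in rows)

def reducers_for_t(row_id):
    """A T tuple's row id selects a matrix column: send it to every
    reducer whose column range contains the id."""
    return sorted(r for r, (_rows, cols) in REGIONS.items()
                  if row_id in cols)
```

With this grid, the S-tuple with row id 2 routes to reducers 1 and 2, and the T-tuple with row id 4 routes to reducers 2 and 4, matching the worked example above.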

In Hive and MapReduce, row ids aren't commonly available. Therefore we choose a random row id between 1 and |S| (the cardinality of S) for S, and between 1 and |T| (the cardinality of T) for T. Thus a reduce-side theta join must know the estimated cardinality of each relation, and statistics must be enabled. Random row ids will result in well-balanced reducer input when processing larger relations. As noted in [3], the partitioning scheme works such that if a relation is much smaller than its pair, the smaller relation will be broadcast to all reducers. Therefore the random skew which would occur for small relations does not impact the algorithm in practice. Additionally, in Hive, if a relation is small, the join is converted to a map-side join and 1-Bucket-Theta is not utilized.

In the mapper, the join matrix will be initialized, a random row id chosen, and then the tuple will be emitted once for each reducer whose region intersects the row id. Hive already has a mechanism to set the hash code for a specific tuple, which can be re-used here. Additionally, the tuples will need to be sorted in such a way that tuples for S arrive in the reducer first. Luckily, Hive already implements this via the JoinReorder class.

The reducer is fairly simple: it buffers up the S relation and then performs the requested join on each T tuple that is received.
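The buffer-then-probe behavior of the reducer can be sketched as follows. This is an illustrative sketch, not Hive code: tuples are assumed to arrive tagged with their relation, with all S tuples first (as JoinReorder guarantees), and the join predicate is passed in as a function.

```python
def theta_join_reducer(tuples, predicate):
    """tuples: iterable of ("S" | "T", value), with all S tuples first.
    Yields (s, t) pairs satisfying predicate(s, t)."""
    s_buffer = []
    for tag, value in tuples:
        if tag == "S":
            # S arrives first; buffer the whole relation partition.
            s_buffer.append(value)
        else:
            # A T tuple: probe the buffered S tuples with the predicate.
            for s in s_buffer:
                if predicate(s, value):
                    yield (s, value)

# Example: a range join with predicate S.x < T.y.
pairs = list(theta_join_reducer(
    [("S", 1), ("S", 5), ("T", 3), ("T", 6)],
    lambda s, t: s < t))
```

Because the predicate is opaque, every buffered S tuple must be checked against every T tuple within the region, which is what makes balanced region sizes important.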
