Comment: Migrated to Confluence 4.0

...

  • Indexed Transactional HBase 'ITH' uses secondary tables but handles that transparently for the user.
  • Indexed HBase 'IH' is new in 0.20.3. https://issues.apache.org/jira/browse/HBASE-2037 compares both approaches. Neither approach supports multi-valued attributes, so custom secondary tables are used for partition indices.

...

  • For equality matches the scan starts at "=value\00" and ends at "=value\01". The trailing null byte and one byte bound the scan to the exact value.
  • For greater-than matches the scan starts at "=value" without an upper bound.
  • For less-than matches the scan stops at "=value" incremented by one bit, without a lower bound.
  • For substring matches a server-side filter is used: "^=<value pattern>\00[A-Fa-f0-9]{8}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{12}$"
    If the filter contains an initial pattern, the lower bound "=value" and the upper bound "=value" incremented by one bit can be set.
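
The scan bounds listed above can be sketched in a few lines. This is only an illustration under stated assumptions, not the actual implementation: it does not talk to HBase, the index table is simulated by a sorted Python list, and the names `increment`, `scan_bounds`, `scan` and `substring_filter`, the sample rows and the UUIDs are all hypothetical. "Incremented by one bit" is read here as adding one to the last byte of the key.

```python
import bisect
import re

# Regex fragment for the UUID that follows the null byte in each row key.
UUID = rb"[A-Fa-f0-9]{8}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{12}"


def increment(key: bytes) -> bytes:
    """Return key with its last byte incremented by one.

    Trailing 0xFF bytes cannot be incremented and are dropped first.
    """
    trimmed = key.rstrip(b"\xff")
    if not trimmed:
        raise ValueError("key has no byte that can be incremented")
    return trimmed[:-1] + bytes([trimmed[-1] + 1])


def scan_bounds(match: str, value: bytes):
    """Return (start_row, stop_row); None means no bound on that side."""
    key = b"=" + value
    if match == "equality":
        return key + b"\x00", key + b"\x01"
    if match == "greater-than":
        return key, None
    if match == "less-than":
        return None, increment(key)
    if match == "initial":  # substring filter with an initial pattern
        return key, increment(key)
    raise ValueError("unknown match type: " + match)


def scan(rows, start, stop):
    """Simulate a range scan over sorted keys (start inclusive, stop exclusive)."""
    lo = 0 if start is None else bisect.bisect_left(rows, start)
    hi = len(rows) if stop is None else bisect.bisect_left(rows, stop)
    return rows[lo:hi]


def substring_filter(value_pattern: bytes):
    """Build the row-key regex for a substring match on the value part."""
    return re.compile(rb"^=" + value_pattern + rb"\x00" + UUID + rb"$")


# Hypothetical index rows: "=<value>\00<uuid>", kept sorted like an HBase table.
rows = sorted([
    b"=bar\x0000000000-0000-0000-0000-000000000001",
    b"=baz\x0000000000-0000-0000-0000-000000000002",
    b"=foo\x0000000000-0000-0000-0000-000000000003",
    b"=foobar\x0000000000-0000-0000-0000-000000000004",
])

equal_foo = scan(rows, *scan_bounds("equality", b"foo"))   # only the "foo" row
starts_foo = scan(rows, *scan_bounds("initial", b"foo"))   # "foo" and "foobar"
has_oba = [r for r in rows if substring_filter(rb".*oba.*").match(r)]
```

Because the equality bounds include the trailing null byte, the "foobar" row falls outside the equality scan for "foo" but inside the scan for the initial pattern "foo".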

It is not possible to obtain a candidate count from that type of index table in constant time. Instead the table must be scanned.

...

HBase sorts rows lexicographically by row key. To use the indices for greater-than and less-than filters it is important that the byte representation of the normalized values follows that order. http://brunodumon.wordpress.com/2010/02/17/building-indexes-using-hbase-mapping-strings-numbers-and-dates-onto-bytes/ provides a good overview.
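
As an illustration of why the byte representation matters, the sketch below shows one common way to encode signed 64-bit integers so that byte order matches numeric order: a fixed-width big-endian encoding with the sign bit flipped. It is an assumed example, not necessarily the encoding used here or in the linked article, and the function name `encode_int64` is hypothetical.

```python
import struct


def encode_int64(n: int) -> bytes:
    """Encode a signed 64-bit integer as 8 bytes whose lexicographic
    order equals the numeric order of the input.

    Adding 2**63 maps the signed range onto 0 .. 2**64 - 1, which is the
    same as flipping the sign bit of the two's complement representation.
    """
    if not -(1 << 63) <= n < (1 << 63):
        raise ValueError("value does not fit into a signed 64-bit integer")
    return struct.pack(">Q", n + (1 << 63))


values = [10, -5, 9, 0, -1, 2 ** 40]

# Decimal strings do not sort numerically: "10" comes before "9".
as_text = sorted(str(v).encode("ascii") for v in [9, 10])

# The fixed-width encoding sorts in the same order as the numbers.
by_bytes = sorted(values, key=encode_int64)
```

With such an encoding, the start and stop rows of a range scan can be derived directly from the numeric bounds of the filter.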