Filters are an AsterixDB feature that stores extra information in an index about a chosen attribute in order to speed up certain types of queries. Typically the attribute holds a value that keeps increasing, such as a timestamp, and that is frequently used as a predicate in queries. A filter is stored by writing the minimum and maximum value of the filter attribute to each LSM component of every index. Because only these two bounds are kept, and an incoming record does not necessarily change them, a filter update cannot be "undone". A filter that is wider than necessary is only less effective, however; it does not cause wrong results.
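The following is a minimal sketch of the idea, not AsterixDB code. The `Component` class, its fields, and the `search` function are hypothetical names chosen for illustration, and the filter attribute is assumed to be an integer such as a timestamp. It shows how per-component minimum and maximum values let a range query skip components, and why bounds that are never narrowed cost efficiency but not correctness.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Component:
    """Hypothetical stand-in for one LSM component and its filter."""
    records: List[Tuple[int, int]] = field(default_factory=list)  # (key, filter value)
    filter_min: Optional[int] = None  # None means the filter is "open"
    filter_max: Optional[int] = None

    def insert(self, key: int, filter_value: int) -> None:
        self.records.append((key, filter_value))
        # The bounds only ever widen.
        if self.filter_min is None or filter_value < self.filter_min:
            self.filter_min = filter_value
        if self.filter_max is None or filter_value > self.filter_max:
            self.filter_max = filter_value

    def delete(self, key: int) -> None:
        # Removing a record leaves the bounds untouched: with only min and
        # max stored, there is no way to tell what the narrower bounds are.
        self.records = [(k, v) for k, v in self.records if k != key]

    def may_contain(self, lo: int, hi: int) -> bool:
        if self.filter_min is None or self.filter_max is None:
            return True  # an open filter excludes nothing
        return self.filter_min <= hi and lo <= self.filter_max


def search(components: List[Component], lo: int, hi: int):
    """Return matching keys and the number of components actually scanned."""
    hits, scanned = [], 0
    for component in components:
        if not component.may_contain(lo, hi):
            continue  # pruned by the filter
        scanned += 1
        hits.extend(k for k, v in component.records if lo <= v <= hi)
    return hits, scanned


old, new = Component(), Component()
old.insert(1, 100)
old.insert(2, 200)
new.insert(3, 300)
new.insert(4, 400)

# Only the newer component overlaps [250, 350], so the older one is skipped.
print(search([old, new], 250, 350))

# After a delete the filter is wider than needed: the component is still
# scanned for [380, 420], but the result is simply empty, not wrong.
new.delete(4)
print(search([old, new], 380, 420))
```

In this sketch the second query does unnecessary work on the newer component, which is the reduced effectiveness described above.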

Despite having softer consistency guarantees than the index values themselves, filter values must still be logged for crash recovery. This is because the filter is used during search, so after recovery it must either be "open" (i.e. it excludes nothing from the search) or have bounds at least as wide as they were before the crash.
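The recovery condition can be written down as a small check. This is a sketch under stated assumptions rather than the actual recovery code: a filter is modelled as an optional `(min, max)` pair where `None` means "open", and `replay_filter` and `is_safe_after_recovery` are hypothetical helper names.

```python
from typing import Iterable, Optional, Tuple

Filter = Optional[Tuple[int, int]]  # None models an "open" filter


def replay_filter(logged_values: Iterable[int]) -> Filter:
    """Rebuild filter bounds from logged filter values (hypothetical helper).

    If nothing was logged, the bounds are unknown, so the filter stays open.
    """
    values = list(logged_values)
    if not values:
        return None
    return (min(values), max(values))


def is_safe_after_recovery(pre_crash: Filter, recovered: Filter) -> bool:
    """True if the recovered filter cannot exclude anything the old one admitted."""
    if recovered is None:
        return True   # open: excludes nothing from search
    if pre_crash is None:
        return False  # old filter was open, so any bounds are narrower
    return recovered[0] <= pre_crash[0] and recovered[1] >= pre_crash[1]


pre_crash = (100, 400)
print(is_safe_after_recovery(pre_crash, replay_filter([100, 250, 400])))  # same width
print(is_safe_after_recovery(pre_crash, replay_filter([])))               # open
print(is_safe_after_recovery(pre_crash, (150, 400)))                      # too narrow
```

The last case is the one that logging is meant to prevent: a recovered filter narrower than the pre-crash one could cause a search to skip a component that holds matching records.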

Currently, filters are updated at runtime in one of two ways. First, they can be updated as a direct result of an insert or upsert. In this case the filter value is at the end of the incoming tuple. The LSM index hides the filter from the wrapped BTree so that the filter is not inserted into it. Then the full (index + filter) tuple is logged. In the case of an upsert, the filter must then also be updated separately with the previous tuple's filter value. This update must also be logged if it changes the filter width.
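The insert and upsert path described above can be sketched as follows. This is an illustration only: `FilteredLsmIndex`, the dictionary standing in for the wrapped BTree, the `"UPDATE"` and `"FILTER"` log record names, and the `prev_filter_value` parameter are all assumptions made for the example, not AsterixDB identifiers. The caller is assumed to have already looked up the previous tuple's filter value.

```python
class FilteredLsmIndex:
    """Hypothetical LSM index wrapper that keeps a min/max filter."""

    def __init__(self):
        self.btree = {}          # stands in for the wrapped BTree: key -> payload
        self.filter_min = None
        self.filter_max = None
        self.log = []            # stands in for the transaction log

    def _widen(self, value):
        """Widen the filter to cover value; return True if the width changed."""
        changed = False
        if self.filter_min is None or value < self.filter_min:
            self.filter_min = value
            changed = True
        if self.filter_max is None or value > self.filter_max:
            self.filter_max = value
            changed = True
        return changed

    def insert(self, tuple_with_filter):
        # The filter value is the last field of the incoming tuple.
        key, payload, filter_value = tuple_with_filter
        # The BTree only sees the index fields; the filter is hidden from it.
        self.btree[key] = payload
        self._widen(filter_value)
        # The full (index + filter) tuple is logged.
        self.log.append(("UPDATE", tuple(tuple_with_filter)))

    def upsert(self, tuple_with_filter, prev_filter_value=None):
        self.insert(tuple_with_filter)
        # The filter must also cover the previous tuple's filter value.
        # That extra update is logged only if it changes the filter width.
        if prev_filter_value is not None and self._widen(prev_filter_value):
            self.log.append(("FILTER", prev_filter_value))


index = FilteredLsmIndex()
index.insert((1, "a", 100))
index.upsert((1, "b", 150), prev_filter_value=100)  # 100 is already covered
index.upsert((2, "c", 160), prev_filter_value=50)   # widens the minimum to 50
print(index.filter_min, index.filter_max)
print(index.log)
```

In this sketch the second upsert produces a separate filter log record because the previous value lies outside the current bounds, while the first upsert does not.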


 
