Controls how def~tables are exposed to queries.

Given such a flexible and comprehensive layout of data and a rich def~timeline, Hudi is able to support three different ways of querying a def~table, depending on its def~table-type:

| Query Type | def~copy-on-write (COW) | def~merge-on-read (MOR) |
|---|---|---|
| Snapshot Query | Query is performed on the latest def~base-files across all def~file-slices in a given def~table or def~table-partition and will see records written up to the latest def~commit action. | Query is performed by merging the latest def~base-file and its def~log-files across all def~file-slices in a given def~table or def~table-partition and will see records written up to the latest def~delta-commit action. |
| Incremental Query | Query is performed on the latest def~base-file, within a given range of start and end def~instant-times (called the incremental query window), while fetching only records that were written during this window by use of the def~hoodie-special-columns. | Query is performed on the latest def~file-slice within the incremental query window, using a combination of reading records out of base or log blocks, depending on the window itself. |
| Read Optimized Query | Same as snapshot query. | Only accesses the def~base-file, providing data as of the last def~compaction action performed on a given def~file-slice. In general, the guarantee of how fresh (up to date) the queried data is depends on the def~compaction-policy. |
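The differences between the three query types can be illustrated with a small toy model. The sketch below is not Hudi code: `FileSlice`, `snapshot_query`, `read_optimized_query` and `incremental_query` are hypothetical names introduced only for illustration, records are plain dictionaries, and the incremental window is assumed to exclude the start instant and include the end instant. A COW slice is modelled as a slice with no log files, which is why its snapshot and read optimized results coincide.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical toy model, not Hudi's implementation.
# A stored record maps key -> (instant time of the write, value).
Records = Dict[str, Tuple[str, str]]


@dataclass
class FileSlice:
    base_file: Records                                      # written by a commit or compaction
    log_files: List[Records] = field(default_factory=list)  # delta commits, oldest first (MOR only)


def _merged(file_slice: FileSlice) -> Records:
    """Apply log files on top of the base file, oldest first."""
    merged = dict(file_slice.base_file)
    for log in file_slice.log_files:
        merged.update(log)
    return merged


def snapshot_query(slices: List[FileSlice]) -> Dict[str, str]:
    """Latest view: base file merged with its log files."""
    return {k: v for s in slices for k, (_, v) in _merged(s).items()}


def read_optimized_query(slices: List[FileSlice]) -> Dict[str, str]:
    """Base files only: data as of the last commit or compaction."""
    return {k: v for s in slices for k, (_, v) in s.base_file.items()}


def incremental_query(slices: List[FileSlice], start: str, end: str) -> Dict[str, str]:
    """Only records written in the window (start, end]; the bounds are an assumption."""
    return {
        k: v
        for s in slices
        for k, (instant, v) in _merged(s).items()
        if start < instant <= end
    }


# A MOR file slice: base file written at instant 001, then two delta commits.
mor_slice = FileSlice(
    base_file={"a": ("001", "a1"), "b": ("001", "b1")},
    log_files=[{"b": ("002", "b2")}, {"c": ("003", "c3")}],
)
# A COW file slice: no log files, every write produces a new base file.
cow_slice = FileSlice(base_file={"x": ("002", "x2")})

print(snapshot_query([mor_slice]))                   # {'a': 'a1', 'b': 'b2', 'c': 'c3'}
print(read_optimized_query([mor_slice]))             # {'a': 'a1', 'b': 'b1'}
print(incremental_query([mor_slice], "001", "003"))  # {'b': 'b2', 'c': 'c3'}
print(snapshot_query([cow_slice]) == read_optimized_query([cow_slice]))  # True
```

In this model the read optimized result on the MOR slice lags behind the snapshot result until the log files are folded into a new base file, which mirrors the freshness trade-off described in the table.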

Related concepts

  1. def~read-optimized-query
  2. def~incremental-query
  3. def~snapshot-query
  4. def~timeline
  5. def~table
  6. def~commit
  7. def~table-type
