Definition
Hudi supports the following views of stored data:
- Read Optimized View : Queries on this view see the latest snapshot of the dataset as of a given commit or compaction action. This view exposes only the base/columnar files in the latest file slices to queries and offers the same columnar query performance as a non-Hudi columnar dataset.
- Incremental View : Queries on this view see only the new data written to the dataset since a given commit/compaction. This view effectively provides change streams to enable incremental data pipelines.
- Realtime View : Queries on this view see the latest snapshot of the dataset as of a given delta commit action. This view provides near-real-time datasets (a few minutes of latency) by merging the base and delta files of the latest file slice on the fly.
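The difference between the three views can be illustrated with a toy model. The sketch below is not Hudi code: `Record`, `FileSlice`, and the three view functions are hypothetical names, instants are simplified to sortable strings, and the incremental view is approximated by filtering the merged data on commit time.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Record:
    key: str
    value: int
    commit_time: str  # instant at which this version was written (toy format)


@dataclass
class FileSlice:
    # Hypothetical stand-in for a file slice: one base file plus delta files.
    base: List[Record]                                   # columnar base file contents
    deltas: List[Record] = field(default_factory=list)   # later delta-commit writes


def read_optimized_view(fs: FileSlice) -> Dict[str, Record]:
    """Expose only the base file; delta files are ignored."""
    return {r.key: r for r in fs.base}


def realtime_view(fs: FileSlice) -> Dict[str, Record]:
    """Merge base and delta records on the fly; the newest write per key wins."""
    merged = {r.key: r for r in fs.base}
    for r in sorted(fs.deltas, key=lambda rec: rec.commit_time):
        merged[r.key] = r
    return merged


def incremental_view(fs: FileSlice, since: str) -> Dict[str, Record]:
    """Return only records written strictly after the instant `since`."""
    return {k: r for k, r in realtime_view(fs).items() if r.commit_time > since}


if __name__ == "__main__":
    fs = FileSlice(
        base=[Record("a", 1, "001"), Record("b", 2, "001")],
        deltas=[Record("b", 20, "002"), Record("c", 3, "003")],
    )
    print(read_optimized_view(fs))      # base file only
    print(realtime_view(fs))            # base merged with deltas
    print(incremental_view(fs, "001"))  # changes after instant "001"
```

In this model the read optimized view is cheapest because it never touches delta files, while the realtime view pays a merge cost in exchange for fresher data.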
The following table summarizes the trade-offs between the different views.
...
Controls how def~tables are exposed to queries
Excerpt
Given such a flexible and comprehensive layout of data and a rich def~timeline, Hudi is able to support three different ways of querying a def~table, depending on its def~table-type.
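As a rough illustration of how a reader might choose among the three query types, the sketch below builds the read options for a Spark datasource read. The helper `hudi_read_options` is hypothetical. The option keys and values are assumed from Hudi's Spark datasource configuration and should be checked against the Hudi version in use.

```python
from typing import Dict, Optional

# Assumed Hudi Spark datasource option keys; verify against your Hudi release.
QUERY_TYPE_KEY = "hoodie.datasource.query.type"
BEGIN_INSTANT_KEY = "hoodie.datasource.read.begin.instanttime"

ALLOWED_QUERY_TYPES = {"snapshot", "read_optimized", "incremental"}


def hudi_read_options(query_type: str,
                      begin_instant: Optional[str] = None) -> Dict[str, str]:
    """Hypothetical helper: build read options for one of the three query types."""
    if query_type not in ALLOWED_QUERY_TYPES:
        raise ValueError(f"unknown query type: {query_type!r}")
    opts = {QUERY_TYPE_KEY: query_type}
    if query_type == "incremental":
        # Incremental queries need a starting instant on the timeline.
        if begin_instant is None:
            raise ValueError("incremental queries require begin_instant")
        opts[BEGIN_INSTANT_KEY] = begin_instant
    return opts


# Possible usage, assuming an existing SparkSession `spark` and a table path
# `base_path` (both placeholders):
#   df = (spark.read.format("hudi")
#         .options(**hudi_read_options("incremental", "20200101000000"))
#         .load(base_path))
```

Keeping the options in one helper makes the chosen query type explicit at each read site instead of scattering string literals through the pipeline.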
Related concepts
- def~read-optimized-query
- def~incremental-query
- def~snapshot-query
- def~timeline
- def~table
- def~commit
- def~table-
...