DUE TO SPAM, SIGN-UP IS DISABLED. Goto Selfserve wiki signup and request an account.
Definition
Type that determines how data will be laid out as file and stored, inside a def~table.
Following table summarizes the trade-offs between these two table types
| Trade-off | def~copy-on-write (COW) | def~merge-on-read (MOR) |
|---|---|---|
| Data Latency | Higher | Lower |
| Update cost (I/O) | Higher (rewrite entire def~table parquet) | Lower (append to `delta log`) |
| Write Amplification | Higher | Lower (depending on compaction strategy to the def~table parquet) |
| Query/Read Amplification | Lower/Zero | Higher (merging base and deltas on the fly) |

5 Comments
SemanticBeeng
Vinoth Chandar : would this not be named `commit type` instead
I'd call `HDFS` and `S3` "storage type`s instead.
fyi: Balaji Varadarajan, Nishith Agarwal
Vinoth Chandar
Commit is just one type of action done on a dataset. Not sure if thats a good way describe it..
SemanticBeeng
I do not insist but both COW and MOR def~table-types are about the `commit def~instant-action` (only).
Vinoth Chandar
MOR does delta commits. Not commits actually.. Only cow and compaction do commits..
Vinoth Chandar
SemanticBeeng you are right.. better to call this dataset-type instead of storage type.. Changing this