INTRODUCTION

Apache CarbonData stores data in a columnar format, with each data block sorted independently of the others to allow faster filtering and better compression.

DESCRIPTION

Though CarbonData stores data in a columnar format, it differs from traditional columnar formats in that the columns in each row group (data block) are sorted independently of one another. This arrangement requires CarbonData to store a row-number mapping against each column value, but it makes binary search feasible for faster filtering. Because the values are sorted, identical or similar values sit together, which yields better compression and helps offset the storage overhead of the row-number mapping.

BRIEF INTRO ABOUT COLUMNAR STORAGE

In a columnar database, all the column 1 values are physically together, followed by all the column 2 values, etc. The data is stored in record order, so the 100th entry for column 1 and the 100th entry for column 2 belong to the same input record. This allows individual data elements, for instance customer name, to be accessed in columns as a group, rather than individually row-by-row.         

Here is an example of a simple database table with 4 columns and 3 rows.

Table 1: Database Table with 4 columns and 3 rows

ID   Last    First   Bonus
1    Doe     John    8000
2    Smith   Jane    4000
3    Beck    Sam     1000


Row-oriented storage: 1,Doe,John,8000;2,Smith,Jane,4000;3,Beck,Sam,1000;

Column-oriented storage: 1,2,3;Doe,Smith,Beck;John,Jane,Sam;8000,4000,1000;
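The two layouts above can be reproduced with a short Python sketch. It is only an illustration of the ordering of values; the function names are made up for this example and the comma/semicolon serialization simply mirrors the strings shown above, not any real on-disk format.

```python
# Table 1 as a list of records (rows).
rows = [
    (1, "Doe", "John", 8000),
    (2, "Smith", "Jane", 4000),
    (3, "Beck", "Sam", 1000),
]


def row_oriented(records):
    """Serialize record by record: all fields of row 1, then row 2, and so on."""
    return "".join(",".join(str(v) for v in rec) + ";" for rec in records)


def column_oriented(records):
    """Serialize column by column: all IDs, then all last names, and so on."""
    columns = zip(*records)  # transpose rows into columns
    return "".join(",".join(str(v) for v in col) + ";" for col in columns)


print(row_oriented(rows))
print(column_oriented(rows))
```

Because the column-oriented output keeps record order within each column, the n-th entry of every column still belongs to the same input record.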

One of the main benefits of a columnar database is that data can be highly compressed. The compression permits columnar operations, such as MIN, MAX, SUM, COUNT and AVG, to be performed very rapidly. Another benefit is that because column-based storage is self-indexing, it uses less disk space than a relational database management system (RDBMS) containing the same data.

CARBONDATA FILE FORMAT

An Apache CarbonData file contains groups of data called blocklets, along with all required information such as the schema, offsets and indices, in a file footer.

The file footer can be read once to build the indices in memory, which then can be utilised for optimising the scans and processing of all the subsequent queries.
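As a rough sketch of this idea, the Python below builds an in-memory index once and reuses it to decide which blocklets a filter needs to touch. Everything here is an assumption made for illustration: the BlockletIndexEntry class, its fields, and the use of per-blocklet min/max values as the statistics are hypothetical and do not reflect CarbonData's actual footer layout or API.

```python
from dataclasses import dataclass


@dataclass
class BlockletIndexEntry:
    # Hypothetical in-memory index entry built from the footer.
    offset: int     # byte offset of the blocklet in the file
    min_value: int  # smallest value of the filter column in the blocklet
    max_value: int  # largest value of the filter column in the blocklet


def build_index(footer_entries):
    """Build the in-memory index once from already-parsed footer entries."""
    return [BlockletIndexEntry(*entry) for entry in footer_entries]


def blocklets_to_scan(index, value):
    """Return offsets of blocklets whose [min, max] range may contain value."""
    return [e.offset for e in index if e.min_value <= value <= e.max_value]


# Assumed footer content: (offset, min, max) per blocklet.
footer = [(0, 1, 100), (4096, 101, 250), (8192, 240, 900)]
index = build_index(footer)  # done once, reused by every later query

print(blocklets_to_scan(index, 245))  # [4096, 8192]
print(blocklets_to_scan(index, 50))   # [0]
```

Blocklets whose range cannot contain the filter value are skipped without reading their data.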

Each blocklet in the file is further divided into chunks of data called Data Chunks. Each data chunk is organised in either a columnar or a row format, and stores the data of either a single column or a set of columns. All blocklets in one file contain the same number and type of Data Chunks.

Figure 1 : CarbonData File 

carbon_data_format_new.png

Figure 2 : Detailed Description of CarbonData File Format

I) File Header : Contains information about

  • CarbonData file version number

  • List of column schema

  • Schema update timestamp

II) Blocklet : A set of rows in columnar format

  • Balance between efficient scan and compression

  • Data are sorted along MDK (multi-dimensional keys)

  • Default blocklet size: 64MB (but the size is configurable)

  • Minimum size for predicate filtering

  • Large size for efficient reading and compression

data_carbondata 

sorted_mdk_cd sorted_data_cd

final_cd

Figure 3 : Pictorial representation of Columnar encoding 
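A minimal sketch of sorting along a multi-dimensional key, assuming the MDK is simply the tuple of dimension values compared in order. The sample rows, the DIMENSIONS tuple and the mdk helper are hypothetical; the real key encoding used by CarbonData is not shown here.

```python
# Sample rows: (country, product, sales). The first two are dimensions.
rows = [
    ("US", "phone", 30),
    ("CN", "tablet", 10),
    ("US", "laptop", 20),
    ("CN", "phone", 40),
]

DIMENSIONS = (0, 1)  # column positions that make up the multi-dimensional key


def mdk(row):
    """Build the sort key from the dimension columns, in order."""
    return tuple(row[i] for i in DIMENSIONS)


# Sort the rows of the blocklet along the key, then lay them out by column.
sorted_rows = sorted(rows, key=mdk)
columns = [list(col) for col in zip(*sorted_rows)]

for col in columns:
    print(col)
```

After the sort, equal values of the leading dimension sit next to each other, which is what makes the later encoding steps effective.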

Further, each Blocklet contains Column Page groups, one for each column, also known as Column Chunks.

A Column Chunk holds the data for one column in a Blocklet.

  • Column data can be stored as a sorted index

  • It is guaranteed to be contiguous in the file

  • Multiple columns can form a column group, which is:

      • stored as a single column chunk in row-based format

      • suitable for a set of columns frequently fetched together

      • cheaper to read, as it saves the stitching cost of reconstructing rows

Each Data Chunk contains multiple groups of data called Pages.

A Page holds the data of one column, and the number of rows per page is fixed at 32000. There are three types of pages.

  • Data Page: Contains the encoded data of a column/group of columns.

  • Row ID Page (optional): Contains the row id mappings used when the Data Page is stored as an inverted index.

      • suitable for low cardinality columns

      • better compression & fast predicate filtering

inverted_blocklet_cd.jpg

Figure 4: Representation of Sort Columns within Column Chunks 

The inverted index gives the actual position of each column value in the column (i.e., the row number).

Example: the value ‘1’ in “column 2” is present in rows 1-8, so the rest of the rows need not be considered, which allows fast filtering.

The inverted index also stores the values in sorted order, so binary search can be used to reduce the search time for the filter value.

It also helps to reconstruct the row: because the data is stored column-wise, the values may get jumbled up when each column is sorted and stored.
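The three uses of the inverted index described above (locating row numbers, binary search on sorted values, and row reconstruction) can be sketched in Python as follows. This is a simplified model, not CarbonData's implementation: the function names are hypothetical, row numbers are 0-based, and plain Python lists stand in for the Data Page and Row ID Page.

```python
from bisect import bisect_left, bisect_right


def build_inverted_index(column):
    """Sort a column's values and remember each value's original row number."""
    pairs = sorted((value, row_id) for row_id, value in enumerate(column))
    sorted_values = [value for value, _ in pairs]  # stands in for the Data Page
    row_ids = [row_id for _, row_id in pairs]      # stands in for the Row ID Page
    return sorted_values, row_ids


def filter_equals(sorted_values, row_ids, target):
    """Binary-search the sorted values and return the matching row numbers."""
    lo = bisect_left(sorted_values, target)
    hi = bisect_right(sorted_values, target)
    return sorted(row_ids[lo:hi])


def reconstruct(sorted_values, row_ids):
    """Put the values back in original row order using the row-id mapping."""
    column = [None] * len(row_ids)
    for value, row_id in zip(sorted_values, row_ids):
        column[row_id] = value
    return column


column = [3, 1, 2, 1, 3, 1]
values, ids = build_inverted_index(column)
print(values)                         # [1, 1, 1, 2, 3, 3]
print(ids)                            # [1, 3, 5, 2, 0, 4]
print(filter_equals(values, ids, 1))  # [1, 3, 5]
print(reconstruct(values, ids))       # [3, 1, 2, 1, 3, 1]
```

Only the slice of sorted values that matches the filter is touched; the remaining rows are never examined.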

  • RLE Page (optional): Contains additional metadata used when the Data Page is RLE coded.

    encoding_cd

Figure 5: Run Length Encoding
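Run-length encoding itself can be illustrated with a few lines of Python. This is a generic sketch of the technique on a sorted column; the (value, run length) pair layout is an assumption for the example and is not the actual content of CarbonData's RLE Page.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse each run of equal adjacent values into (value, run_length)."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, count in runs for _ in range(count)]


sorted_column = [1, 1, 1, 1, 2, 2, 3]
runs = rle_encode(sorted_column)
print(runs)              # [(1, 4), (2, 2), (3, 1)]
print(rle_decode(runs))  # [1, 1, 1, 1, 2, 2, 3]
```

The longer the runs of equal values, which sorting tends to produce, the fewer pairs need to be stored.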

III) Footer : Metadata information

  • File level metadata (number of rows, segment info, list of blocklet info and index) & statistics

  • Schema

  • Blocklet Index & Metadata

blocklet_image.jpg

Figure 6 : CarbonData File Footer 