...

Reuse the current index layout and treat DeleteMap as a new index file type.

...

2.2. DeleteMap index file encoding

Like the hash index, there is one DeleteMap index per bucket. A DeleteMap index file therefore needs to contain the bitmaps of multiple files in the same bucket, so its structure is effectively a map<fileName, bitmap>. To support high-performance reads, we have designed the following file encoding to store this map:

In IndexFileMeta:

{
  "org.apache.paimon.avro.generated.record": {
    "_VERSION": 1,
    "_KIND": 0,
    "_PARTITION": "\u0000\u0000\u0000\u0001\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000p\u0000\u0000\u0000\u0000\u0000\u0000",
    "_BUCKET": 0,
    "_TYPE": "DELETE_MAP",
    "_FILE_NAME": "index-32f16270-5a81-4e5e-9f93-0e096b8b58d3-0",
    "_FILE_SIZE": x,
    "_ROW_COUNT": count of the map,
    "_DELETE_INDEX_RANGES": "binary Map<String, Pair<Integer, Integer>>"
  }
}

In IndexFile:

  • First, record the version as one byte.
  • Then, for each bitmap, record <size of the serialized bitmap, serialized bitmap, checksum of the serialized bitmap> in sequence.

For each serialized bitmap:

  • First, record a constant magic number as an int.
  • Then, record the serialized bitmap.
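
The IndexFile layout above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: the class name DeleteMapEncodingSketch, the magic value, the use of CRC32 as the checksum, and the choice that each range is {start position of the size field, size of magic number plus bitmap} are all hypothetical. Bitmaps are treated as opaque, already serialized byte arrays.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.zip.CRC32;

final class DeleteMapEncodingSketch {
    static final byte VERSION = 1;              // assumed version value
    static final int MAGIC_NUMBER = 0x444D4150; // hypothetical magic constant

    /** Encodes all bitmaps and fills `ranges` with {start, size} per data file name. */
    static byte[] write(Map<String, byte[]> bitmaps, Map<String, int[]> ranges) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeByte(VERSION);
        for (Map.Entry<String, byte[]> e : bitmaps.entrySet()) {
            byte[] bitmap = e.getValue();
            // Serialized bitmap = magic number (int) followed by the bitmap bytes.
            byte[] body = ByteBuffer.allocate(Integer.BYTES + bitmap.length)
                    .putInt(MAGIC_NUMBER).put(bitmap).array();
            ranges.put(e.getKey(), new int[] {out.size(), body.length});
            out.writeInt(body.length);    // size
            out.write(body);              // serialized bitmap
            out.writeInt(checksum(body)); // checksum
        }
        out.flush();
        return bytes.toByteArray();
    }

    /** Reads one bitmap back using only its {start, size} range. */
    static byte[] read(byte[] file, int[] range) throws IOException {
        DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(file, range[0], file.length - range[0]));
        int size = in.readInt();
        if (size != range[1]) throw new IOException("size mismatch");
        byte[] body = new byte[size];
        in.readFully(body);
        if (in.readInt() != checksum(body)) throw new IOException("checksum mismatch");
        ByteBuffer buf = ByteBuffer.wrap(body);
        if (buf.getInt() != MAGIC_NUMBER) throw new IOException("bad magic number");
        byte[] bitmap = new byte[buf.remaining()];
        buf.get(bitmap);
        return bitmap;
    }

    static int checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return (int) crc.getValue();
    }
}
```

Because the ranges live outside the file (in this design, in IndexFileMeta), a reader can seek directly to a single bitmap without scanning any in-file directory.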

2.2. DeleteMap index file encoding

Like the hash index, there is one DeleteMap index per bucket. A DeleteMap index file therefore needs to contain the bitmaps of multiple files in the same bucket, so its structure is effectively a map<fileName, bitmap>. To support high-performance reads, we have designed the following file encoding to store this map:

  • First, record the size of this map as an int.
  • Then, record the maximum fileName length as an int (because file name lengths may differ).
  • Next, for each entry, record <fileName (padded to the maximum length), the starting position of the serialized bitmap, the size of the bitmap> in sequence.
  • Finally, record each <serialized bitmap> in sequence.
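
The map encoding listed above, with a fixed-width directory in front of the bitmaps, can be sketched as follows. This is only an illustration under assumptions: the class name DeleteMapHeaderLayoutSketch is hypothetical, file names are assumed to be UTF-8 and padded with zero bytes, positions are assumed to be absolute offsets from the start of the file, and bitmaps are opaque serialized byte arrays.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Map;

final class DeleteMapHeaderLayoutSketch {

    /** Encodes: map size, max name length, fixed-width directory entries, then the bitmaps. */
    static byte[] write(Map<String, byte[]> bitmaps) {
        int maxLen = 0;
        int payload = 0;
        for (Map.Entry<String, byte[]> e : bitmaps.entrySet()) {
            maxLen = Math.max(maxLen, e.getKey().getBytes(StandardCharsets.UTF_8).length);
            payload += e.getValue().length;
        }
        int entrySize = maxLen + 2 * Integer.BYTES; // padded name + start + size
        int headerSize = 2 * Integer.BYTES + bitmaps.size() * entrySize;
        ByteBuffer buf = ByteBuffer.allocate(headerSize + payload);
        buf.putInt(bitmaps.size()).putInt(maxLen);
        int pos = headerSize; // first bitmap starts right after the directory
        for (Map.Entry<String, byte[]> e : bitmaps.entrySet()) {
            // Arrays.copyOf pads the name with zero bytes up to maxLen.
            byte[] name = Arrays.copyOf(e.getKey().getBytes(StandardCharsets.UTF_8), maxLen);
            buf.put(name).putInt(pos).putInt(e.getValue().length);
            pos += e.getValue().length;
        }
        for (byte[] bitmap : bitmaps.values()) {
            buf.put(bitmap);
        }
        return buf.array();
    }

    /** Scans the directory for fileName and returns its bitmap bytes, or null if absent. */
    static byte[] read(byte[] file, String fileName) {
        ByteBuffer buf = ByteBuffer.wrap(file);
        int count = buf.getInt();
        int maxLen = buf.getInt();
        byte[] raw = fileName.getBytes(StandardCharsets.UTF_8);
        if (raw.length > maxLen) {
            return null; // longer than any stored name
        }
        byte[] wanted = Arrays.copyOf(raw, maxLen);
        for (int i = 0; i < count; i++) {
            byte[] name = new byte[maxLen];
            buf.get(name);
            int start = buf.getInt();
            int size = buf.getInt();
            if (Arrays.equals(name, wanted)) {
                return Arrays.copyOfRange(file, start, start + size);
            }
        }
        return null;
    }
}
```

Padding every name to the same width makes each directory entry a fixed size, so the i-th entry can be located by arithmetic alone; the cost is wasted space when name lengths vary widely.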

e.g.: [Image]

2.3 Classes

BitmapDeleteIndex: implements DeleteIndex based on RoaringBitmap.

...

DeleteMapIndexFile.java

public class DeleteMapIndexFile {

    public long fileSize(String fileName);

    public Map<String, long[]> readDeleteIndexBytesOffsets(String fileName);

    // Reads every delete index in the given index file, using the per-file ranges.
    public Map<String, DeleteIndex> readAllDeleteIndex(String fileName, Map<String, Pair<Integer, Integer>> deleteIndexRanges);

    // Reads a single delete index from the given index file, using its range.
    public DeleteIndex readDeleteIndex(String fileName, Pair<Integer, Integer> deleteIndexRange);

    // Writes the delete indexes and returns the index file name together with the ranges.
    public Pair<String, Map<String, Pair<Integer, Integer>>> write(Map<String, DeleteIndex> input);

    public void delete(String fileName);
}

...