ID | IEP-62 |
Author | Alex Plehanov |
Sponsor | Alex Plehanov |
Created |
|
Status | DRAFT |
Currently, for page replacement (page rotation between page-memory and disk) we use Random-LRU algorithm. It has a low maintenance cost and relatively simple implementation, but it has many disadvantages and affects performance very much when replacement is started. We even have warnings in the log when page replacement started and a special event for this. On some deployments, administrators even force to restart cluster nodes periodically to avoid page replacement.
Proposals to improve page replacement in Ignite:
Main idea: in some cases start background task to evict cold pages from page-memory (for example, pages, last touched more than 12 hours ago).
The task can be started:
Batch page replacement will be helpful in some workloads (when some data much colder than another). For example, when OLAP request over historical data raised pages to page-memory, and after such request, this data is not needed for a long time. Or when OLTP transactions mostly add new data and process recent data but rarely touch historical data. In these cases with the current approach, we will enter "page replacement mode" after some period of time and never leave it. Batch page replacement here can prevent the starting of Random-LRU page replacement, or if Random-LRU already started it can provide conditions to stop it.
Good page replacement algorithm should satisfy the requirements:
Our Random-LRU has low maintenance cost and sequential scan resistant, but to find the next page for replacement in the best case we scan 5 pages, in the worst case we can scan all data region segment. Also, due to random nature, it's not very effective in predicting the right page for replacement to minimize the page-fault rate. And it's much time required to totally evict old cold data.
Usually, database management systems and operating systems use modifications of LRU algorithms. These algorithms have higher maintenance costs (pages list should be modified on each page access), but often they are effective from a "page-fault rate" point of view and have O(1) complexity for a searching page to replace. Simple LRU is not sequential scan resistant, but modifications that utilize page access frequency are resistant to sequential scan.
We can try one of the modifications of LRU as well (for example, "segmented LRU" seems suitable for Ignite).
Ignite is a memory-centric product, so "low maintenance cost" is very critical. And there is a risk that page replacement algorithm can affect workloads, where page replacement is not used (enough RAM to store all data). Any page replacement solution should be carefully benchmarked with different workloads.
https://en.wikipedia.org/wiki/Page_replacement_algorithm
https://en.wikipedia.org/wiki/Cache_replacement_policies