...

  • The percentage of dirty pages is a trigger for checkpointing (e.g. 75%).
  • A timeout is also a trigger: a checkpoint is performed every N seconds.
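
The two triggers above can be sketched as a single decision function. This is a minimal illustration, not Ignite code: the class name `CheckpointTrigger`, its parameters, and the choice of `>=` for both comparisons are assumptions made for the example, and the 180-second interval is only a sample value.

```java
/** Hypothetical helper for illustration; not part of the Ignite API. */
final class CheckpointTrigger {
    private final double dirtyRatioThreshold; // e.g. 0.75 for 75% of pages dirty
    private final long intervalMillis;        // the "every N seconds" timeout, in milliseconds

    CheckpointTrigger(double dirtyRatioThreshold, long intervalMillis) {
        this.dirtyRatioThreshold = dirtyRatioThreshold;
        this.intervalMillis = intervalMillis;
    }

    /** Returns true when either the dirty-page trigger or the timeout trigger fires. */
    boolean shouldCheckpoint(long dirtyPages, long totalPages, long millisSinceLastCheckpoint) {
        // Assumption: the threshold is inclusive; the source does not say.
        boolean tooManyDirtyPages = totalPages > 0
            && (double) dirtyPages / totalPages >= dirtyRatioThreshold;
        boolean timedOut = millisSinceLastCheckpoint >= intervalMillis;
        return tooManyDirtyPages || timedOut;
    }

    public static void main(String[] args) {
        CheckpointTrigger trigger = new CheckpointTrigger(0.75, 180_000L);
        System.out.println(trigger.shouldCheckpoint(800, 1000, 10_000L));  // dirty-page trigger fires
        System.out.println(trigger.shouldCheckpoint(100, 1000, 200_000L)); // timeout trigger fires
        System.out.println(trigger.shouldCheckpoint(100, 1000, 10_000L));  // neither fires
    }
}
```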

Pages Write Throttling

 

Sharp Checkpointing has side effects when the throughput of data updates is greater than the throughput of the physical storage device. Under a heavy write load, the operations-per-second rate periodically drops to zero.

When offheap memory accumulates too many dirty pages (pages with data not yet written to disk), an Ignite node initiates a checkpoint: the process of writing a consistent snapshot of all pages to disk storage. If a dirty page is changed during an ongoing checkpoint before it has been written to disk, its previous state is copied to a special data region, the checkpoint buffer.

...
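
The copy-on-write role of the checkpoint buffer described above can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not Ignite's internal implementation: the class `CheckpointBufferSketch`, its maps, and its method names are hypothetical, and pages are modelled as plain strings keyed by an integer page ID.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory model; names are illustrative, not Ignite internals. */
final class CheckpointBufferSketch {
    private final Map<Integer, String> memory = new HashMap<>();           // current page contents
    private final Map<Integer, String> checkpointBuffer = new HashMap<>(); // preserved pre-update states
    private final Set<Integer> pendingPages = new HashSet<>();             // in the snapshot, not yet on disk
    private final Set<Integer> dirtyPages = new HashSet<>();               // changed since last checkpoint began
    private final Map<Integer, String> disk = new HashMap<>();             // stands in for disk storage

    /** Applies an update; preserves the snapshot state if the page is still awaiting its write. */
    void update(int pageId, String data) {
        if (pendingPages.contains(pageId) && !checkpointBuffer.containsKey(pageId)) {
            // First change during the checkpoint: copy the previous state to the buffer.
            checkpointBuffer.put(pageId, memory.get(pageId));
        }
        memory.put(pageId, data);
        dirtyPages.add(pageId);
    }

    /** Starts a checkpoint: all currently dirty pages form the snapshot to be written. */
    void beginCheckpoint() {
        pendingPages.addAll(dirtyPages);
        dirtyPages.clear();
    }

    /** Writes one snapshot page to disk, preferring the preserved copy if one exists. */
    void flushPage(int pageId) {
        if (!pendingPages.remove(pageId)) {
            return; // page is not part of the running checkpoint
        }
        String preserved = checkpointBuffer.remove(pageId);
        disk.put(pageId, preserved != null ? preserved : memory.get(pageId));
    }

    String onDisk(int pageId) { return disk.get(pageId); }

    int bufferedPages() { return checkpointBuffer.size(); }
}
```

In this model, a page updated to `"v2"` while a checkpoint that captured `"v1"` is still running is written to disk as `"v1"`; the `"v2"` state stays dirty and goes to disk with the next checkpoint.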

Slow storage devices cause long-running checkpoints. If the load is high while a checkpoint is slow, two bad things can happen:

  • The checkpoint buffer can overflow.
  • The dirty pages threshold for the next checkpoint can be reached during the current checkpoint.

 

Either of the two events above causes the Ignite node to freeze all updates until the end of the current checkpoint. That is why the operations-per-second graph falls to zero.
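
The freeze condition can be written as a predicate over the two events. This is a minimal sketch assuming simple counters; the class `CheckpointStallCheck`, its parameter names, and the inclusive comparisons are hypothetical and do not reflect how Ignite tracks these values internally.

```java
/** Hypothetical predicate for illustration; not part of the Ignite API. */
final class CheckpointStallCheck {
    private CheckpointStallCheck() { }

    /**
     * Returns true when updates would have to wait for the current checkpoint to finish:
     * either the checkpoint buffer is full, or the dirty-page threshold for the next
     * checkpoint has already been reached while the current one is still running.
     */
    static boolean mustFreezeUpdates(boolean checkpointInProgress,
                                     long checkpointBufferUsed,
                                     long checkpointBufferCapacity,
                                     long dirtyPages,
                                     long dirtyPagesThreshold) {
        if (!checkpointInProgress) {
            return false; // both events are defined only relative to a running checkpoint
        }
        boolean bufferOverflow = checkpointBufferUsed >= checkpointBufferCapacity;
        boolean nextThresholdReached = dirtyPages >= dirtyPagesThreshold;
        return bufferOverflow || nextThresholdReached;
    }

    public static void main(String[] args) {
        // Buffer full during a checkpoint: updates freeze.
        System.out.println(mustFreezeUpdates(true, 256, 256, 10, 750));
        // Plenty of headroom: updates proceed.
        System.out.println(mustFreezeUpdates(true, 16, 256, 10, 750));
    }
}
```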

Since Ignite 2.3, the data storage configuration has a writeThrottlingEnabled property. If it is enabled, two situations can trigger throttling:

...