...

Table of Contents:

Ignite

...

Persistent Store

File types

The following file types are used for persisting data: cache pages (the page store), checkpoint markers, and WAL segments.

...

With persistence enabled, Ignite uses the following folder structure:

2.3+ | Older versions (2.1 & 2.2)

The Pst-subfolder name is the same for all storage folders.

The name is selected on start and may be based on the node consistentId.


The consistent ID may be configured using IgniteConfiguration; by default it is generated from the set of local IP addresses.


Subfolders Generation

The subfolder name is generated on start. By default, new-style naming is used, for example node00-e819f611-3fb9-4dbe-a3aa-1f6de4af5d02.

...

1) A starting node binds to a port and generates an old-style compatible consistent ID (e.g. 127.0.0.1:47500) using DiscoverySpi.consistentId(). This method still returns an ip:port-based identifier.

2) The node scans the work directory and checks whether there is a folder matching the consistent ID (e.g. work\db\127_0_0_1_49999). If such a folder exists, the node starts up with this ID (compatibility mode) and acquires a file lock on this folder. See PdsConsistentIdProcessor.prepareNewSettings.

3) If there are no matching folders but the directory is not empty, the node scans it for old-style consistent IDs. If there are old-style db folders, it prints a warning (see warning text above) and then switches to new-style folder generation (step 4).

4) If there are existing new-style folders (e.g. work\db\node00-uuid, node01-uuid, etc.), the node picks the one with the smallest sequence number and tries to lock the directory. This is repeated until a lock succeeds or the list of new-style consistent IDs is exhausted.

5) If there are no more available new-style folders, the node generates a new one with the next sequence number and a random UUID as the consistent ID (e.g. work\db\node00-uuid; this uuid overrides the uuid in GridDiscoveryManager).

6) The node uses this consistent ID for startup (using the value from GridKernalContext.pdsFolderResolver() and from PdsFolderSettings.consistentId()).

There is a system property that disables new-style generation and forces the old-style consistent ID (IgniteSystemProperties.IGNITE_DATA_STORAGE_FOLDER_BY_CONSISTENT_ID).
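The folder selection steps above can be sketched roughly as follows. This is an illustration only, not the code of PdsConsistentIdProcessor: the class PdsFolderSketch, its methods, and the lockedByOthers set (standing in for real file locks) are hypothetical, and the old-style name mapping (every non-alphanumeric character becomes an underscore) is an assumption inferred from the examples. The old-style folder warning from step 3 is omitted.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.UUID;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of the subfolder selection steps; not Ignite code. */
class PdsFolderSketch {
    /** Matches new-style names such as node00-e819f611-3fb9-4dbe-a3aa-1f6de4af5d02. */
    private static final Pattern NEW_STYLE = Pattern.compile(
        "node(\\d+)-[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}");

    /** Assumed mapping: each character that is not a letter or digit becomes '_'. */
    static String oldStyleFolderName(String consistentId) {
        return consistentId.replaceAll("[^A-Za-z0-9]", "_");
    }

    /** Returns the sequence number of a new-style folder name, or -1 if it is not new-style. */
    static int sequence(String folderName) {
        Matcher m = NEW_STYLE.matcher(folderName);
        return m.matches() ? Integer.parseInt(m.group(1)) : -1;
    }

    /**
     * @param oldStyleId     ip:port-based consistent ID (step 1).
     * @param existing       folder names found in the work directory.
     * @param lockedByOthers folders whose lock is held by another node (stand-in for file locks).
     */
    static String resolve(String oldStyleId, List<String> existing, Set<String> lockedByOthers) {
        // Step 2: compatibility mode if a folder matches the old-style ID.
        String compat = oldStyleFolderName(oldStyleId);
        if (existing.contains(compat))
            return compat;

        // Step 4: try new-style folders in order of increasing sequence number.
        List<String> newStyle = new ArrayList<>();
        for (String name : existing)
            if (sequence(name) >= 0)
                newStyle.add(name);
        newStyle.sort(Comparator.comparingInt(PdsFolderSketch::sequence));

        for (String name : newStyle)
            if (!lockedByOthers.contains(name))
                return name;

        // Step 5: nothing available, so generate the next sequence number with a random UUID.
        int next = newStyle.isEmpty() ? 0 : sequence(newStyle.get(newStyle.size() - 1)) + 1;
        return String.format("node%02d-%s", next, UUID.randomUUID());
    }
}
```

With two existing new-style folders and node00 locked by another process, resolve returns the node01 folder; if both are locked, it returns a freshly generated name starting with node02-.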


Page store

Ignite Durable Memory is the basis for all data structures. There is no cache state saved on heap now. 

...

Collection of pages (GridCacheDatabaseSharedManager.Checkpoint#cpPages) is a snapshot of dirty pages at checkpoint start. This collection allows writing pages which were changed since the last checkpoint.

Info

When the checkpoint process starts, pages marked for checkpoint are no longer marked as dirty ones in metrics.
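The snapshot behaviour described above can be illustrated with a minimal sketch. It assumes dirty pages are tracked as a set of page IDs that is swapped out when a checkpoint begins; the class CheckpointPagesSketch and its methods are hypothetical and do not mirror GridCacheDatabaseSharedManager.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: the dirty page set is snapshotted when a checkpoint starts. */
class CheckpointPagesSketch {
    /** IDs of pages changed since the last checkpoint began. */
    private Set<Long> dirtyPages = new HashSet<>();

    /** Called when a page is modified. */
    synchronized void markDirty(long pageId) {
        dirtyPages.add(pageId);
    }

    /**
     * Starts a checkpoint: the current dirty set becomes the checkpoint page
     * collection, and a fresh empty set collects subsequent modifications.
     */
    synchronized Set<Long> beginCheckpoint() {
        Set<Long> cpPages = dirtyPages;
        dirtyPages = new HashSet<>();
        return Collections.unmodifiableSet(cpPages);
    }

    /** Dirty page count as a metric would report it. */
    synchronized int dirtyPagesCount() {
        return dirtyPages.size();
    }
}
```

In this sketch, pages returned by beginCheckpoint no longer count as dirty, which matches the note above, and a page modified after the checkpoint starts lands in the new dirty set rather than in the snapshot.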

Checkpoint Pool

In parallel with the process of writing pages to disk, some thread may need to update data in a page that is being written (or is scheduled to be written).

...

  1. exponential backoff (start with an ultra-short park; every subsequent park is <factor> times longer)
  2. speed-based (collect the history of disk write speed measurements, extrapolate it to calculate the "ideal" speed, and bound the threads that generate dirty pages to that "ideal" speed)

There are three main approaches:

    - Exponential backoff is used if over 2/3 of the checkpoint buffer is used up. If it is enabled, other throttling strategies are not used.
    - Clean pages protection is used if there are 0 checkpoint pages. It is used to protect pages at the start of the end of the checkpointing process.
    - Speed-based throttling compares the checkpointing speed with the speed of dirty page creation. It uses the checkpointing speed plus 10% as the throttling limit.

An Ignite node chooses one of them adaptively.
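The adaptive choice between the three approaches can be sketched as below. This is a hedged illustration: the class ThrottleSketch, its method names, and its parameters are hypothetical, and the real throttling code uses richer measurements. Only the 2/3 buffer threshold, the zero-checkpoint-pages condition, and the +10% margin come from the text above.

```java
/** Hypothetical sketch of the adaptive throttling choice; not Ignite code. */
class ThrottleSketch {
    enum Strategy { EXPONENTIAL_BACKOFF, CLEAN_PAGES_PROTECTION, SPEED_BASED }

    /**
     * @param cpBufUsed  pages currently used in the checkpoint buffer.
     * @param cpBufTotal total capacity of the checkpoint buffer, in pages.
     * @param cpPages    number of checkpoint pages.
     */
    static Strategy choose(int cpBufUsed, int cpBufTotal, int cpPages) {
        // Over 2/3 of the checkpoint buffer is used: backoff takes priority over the others.
        if (3L * cpBufUsed > 2L * cpBufTotal)
            return Strategy.EXPONENTIAL_BACKOFF;

        // No checkpoint pages: clean pages protection.
        if (cpPages == 0)
            return Strategy.CLEAN_PAGES_PROTECTION;

        return Strategy.SPEED_BASED;
    }

    /** Park time for the given attempt: each park is 'factor' times longer than the previous one. */
    static long parkNanos(long startNanos, double factor, int attempt) {
        return (long)(startNanos * Math.pow(factor, attempt));
    }

    /** Throttling limit for dirty page creation: checkpointing speed plus 10%. */
    static double speedLimit(double cpPagesPerSec) {
        return cpPagesPerSec * 1.10;
    }
}
```

For example, with a starting park of 4000 ns and a factor of 2.0 (both values chosen only for illustration), the fourth park (attempt 3) lasts 32000 ns.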

...

The structure of WAL file segments and their rotation is shown in the picture below:

 

 


Some segments may no longer be needed (depending on the History Size setting). The old-style WAL History Size setting is expressed as a number of checkpoints (see also the WAL history size section below); the new one is set in bytes. The History Size setting is described at https://apacheignite.readme.io/docs/write-ahead-log#section-wal-archive

Local Recovery Process

Let’s assume the node start process is running with existing files.

...

The partition update counter is saved with update records in the WAL.

Node Join (with data from

...

persistence)

Consider a partition that was in the owning state on the joining node, with update counter = 50. Existing nodes have update counter = 150.

A node join causes a partition map exchange; the update counter is sent with other partition data. (The joining node will have a new ID, and from the point of view of discovery this node is a new node.)

The coordinator observes the older partition state and forces the partition into the moving state. Forcing the moving state is required to set up uploading of newer data.
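The coordinator's decision in this example can be sketched as a simple counter comparison. This is an assumption-laden illustration: the class PartitionJoinSketch, its State enum, and the rule "move if the joining counter is below the highest counter seen on existing nodes" are a simplified, hypothetical reading of the scenario, not the actual exchange logic.

```java
import java.util.Collection;
import java.util.Collections;

/** Hypothetical sketch of the update counter comparison on node join; not Ignite code. */
class PartitionJoinSketch {
    enum State { OWNING, MOVING }

    /**
     * @param joiningState  state the joining node reports for the partition.
     * @param joiningCntr   update counter on the joining node.
     * @param existingCntrs update counters reported by existing nodes for the same partition.
     * @return state assigned to the partition on the joining node after the exchange.
     */
    static State stateAfterExchange(State joiningState, long joiningCntr, Collection<Long> existingCntrs) {
        long maxCntr = existingCntrs.isEmpty() ? joiningCntr : Collections.max(existingCntrs);

        // An older counter means the joining node misses updates and must receive newer data.
        return joiningCntr < maxCntr ? State.MOVING : joiningState;
    }
}
```

With the numbers from the example (joining counter 50, existing counters 150), the partition is forced into the moving state; with equal counters it keeps its reported state.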

...