IDIEP-95
Author
Sponsor
Created

  

Status
DRAFT


Motivation

Ignite distributes table data across cluster nodes. Every row belongs to a specific node. Currently, clients are not aware of this distribution, which may result in additional network calls. For example, client calls server A to read a row, server A has to call server B where the row is stored.

In an optimal scenario, the client knows that the row is stored on the server B and makes a direct call there.

Description

Currently, clients can already establish connections to multiple server nodes. Handhsake response includes node id and name.

Update client implementation:

  1. Retrieve and maintain up-to-date partition assignment - an array of node ids, where Nth element is the leader node ID for partition N.
  2. Use HashCalculator to compute colocation key hash (see Row#colocationHash).
  3. Calculate partition number as rowKeyHash % partitionCount.
  4. Get node id from partition assignment (p1).
  5. If a connection to the resulting node exists, perform a direct call. Otherwise, use default connection.

Exceptions:

  • Transaction belongs to a specific node and client connection. When a non-null transaction is provided by the user, partition awareness logic is skipped.

Protocol Changes

1. Add PARTITION_ASSIGNMENT_GET operation

Request
UUIDtable ID
Response
arrayArray of node ids, where array index is the partition number

.

2. Update standard response message to include flags

Response
intType = 0
intRequest id
intFlags
intError code (0 for success)
stringError message (when error code is not 0)
...Operation-specific data


3. Include colocation flag in SCHEMAS_GET response.

Tracking Assignment Changes

There are three potential ways to keep partition assignment up-to-date on the client:

  1. Response flag. All server responses include flags field, and server sets a flag when the assignment has changed since the last response. It is up to the client to retrieve updated assignment when needed. This mechanism is used in Ignite 2.x. 
    Pros: Low overhead, no extra network traffic.
    Cons: Unlike Ignite 2.x, there is no concept of TopologyVersion in Ignite 3. So when there are multiple server connections, a notification for the same assignment update will come from all of them, and it is not possible to tell whether it was the same update or a new one. This may cause unnecessary assignment update requests. A workaround is to use only one connection to track assignment changes. Even if there is no activity on this connection, heartbeats (IEP-83) will trigger the update.

  2. Server → client notification. As soon as assignment changes, server sends a message to all clients.
    Pros: Immediate update for all clients.
    Cons: Increased network traffic and server load. Some clients may not need the update at all (not all APIs require this).

  3. PrimaryReplicaMissException (suggested in IGNITE-17394 - Getting issue details... STATUS comments).
    Pros: No protocol changes.
    Cons: Retry is required on replica miss (complicated & inefficient). Using exceptions for control flow.

The first approach (response flag) is battle tested and seems to be the most optimal. 

Discussion Links

Tickets

Key Summary T Created Updated Due Assignee Reporter P Status Resolution
Loading...
Refresh

  • No labels