ID IEP-139
Author
Sponsor
Created 17 Jul 2025
Status DRAFT


Motivation

Every row stored in the grid belongs to a concrete partition. Partitioning improves performance by distributing both read and write operations, but the SQL and JDBC APIs are currently unaware of this distribution, which leads to additional network hops. If the same query is executed repeatedly with only the parameter values changing, client code can lazily (after some delay) calculate the concrete node that hosts the requested or transmitted row(s) and send the request directly to it.

Description

Partition-aware operations should be routed from the client directly to the server that hosts the required data.

For queries that contain dynamic parameters, the client needs a way to map the parameters bound to colocation columns to the target node. Let's use a piggybacking (lazy) approach to deliver this mapping: the first request(s) are sent to any available node (as they were before partition awareness was implemented), and the mapping is returned with the first response. When a query response contains partition awareness metadata, it is cached in a size-bounded Map<PaCacheKey, PartitionAwarenessMetadata> structure, where PaCacheKey holds the schema and the query text together with a pre-calculated hash that speeds up later comparisons. Note that because the key is built from the query text, this approach resolves equivalent queries into separate cache entries, for example:

SELECT * FROM T WHERE p1=? AND p2=?
and
SELECT * FROM T WHERE p2=? AND p1=?
but this seems acceptable.

class PaCacheKey {
    // Schema the query is executed against
    private final String schema;

    // Query text, compared as-is
    private final String query;

    // Pre-calculated hash of schema and query, used to speed up key comparison
    private final int hash;
}

class PartitionAwarenessMetadata {
    // Needed for partition distribution calculation
    private final int tableId;

    // Mapping between dynamic params and colocation key columns
    private final int[] indexes;

    // If literals are present, hash items will be pre-calculated according to colocation ordering
    private final int[] hash;
}
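
To make the size-bounded cache idea concrete, here is a minimal sketch of a key with a pre-calculated hash and an LRU-bounded map built on LinkedHashMap. The constructor, the hash formula, the BoundedPaCache name and the LRU eviction policy are assumptions made for illustration; the proposal only requires that the map is size bounded.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Cache key: schema + query text, with the hash computed once at construction. */
final class PaCacheKey {
    private final String schema;
    private final String query;
    private final int hash;

    PaCacheKey(String schema, String query) {
        this.schema = Objects.requireNonNull(schema);
        this.query = Objects.requireNonNull(query);
        // Illustrative hash formula; any stable combination of both fields works.
        this.hash = 31 * schema.hashCode() + query.hashCode();
    }

    @Override public int hashCode() {
        return hash;
    }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PaCacheKey)) return false;
        PaCacheKey other = (PaCacheKey) o;
        // Cheap hash comparison first, full string comparison only when hashes match.
        return hash == other.hash && schema.equals(other.schema) && query.equals(other.query);
    }
}

/** Size-bounded LRU map (hypothetical helper). Not thread-safe on its own. */
final class BoundedPaCache<V> extends LinkedHashMap<PaCacheKey, V> {
    private final int maxSize;

    BoundedPaCache(int maxSize) {
        super(16, 0.75f, true); // accessOrder = true: iteration order is least recently used first
        this.maxSize = maxSize;
    }

    @Override protected boolean removeEldestEntry(Map.Entry<PaCacheKey, V> eldest) {
        return size() > maxSize; // evict the least recently used entry once the bound is exceeded
    }
}
```

Because the key compares the query text verbatim, the two equivalent SELECT statements shown earlier produce two distinct keys and therefore two cache entries. A real client would also need to guard the map against concurrent access.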

Let's use the same cache invalidation approach as the one already implemented for the KV API [1], i.e. update the PA cache each time changes are observed.
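
One possible shape for such invalidation is sketched below. What exactly counts as an observed change is not specified here, so the sketch assumes a hypothetical, monotonically increasing version marker delivered with server responses; the class and method names are illustrative only and do not describe the existing KV implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Cache that drops all entries when a newer change marker is observed (illustrative). */
final class InvalidatingPaCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final AtomicLong lastSeenVersion = new AtomicLong(Long.MIN_VALUE);

    V get(K key) {
        return entries.get(key);
    }

    void put(K key, V value) {
        entries.put(key, value);
    }

    /**
     * Called for every server response. observedVersion is an assumed marker that grows
     * whenever the server-side state relevant to partition awareness changes.
     * Returns true if the cache was invalidated.
     */
    boolean onResponse(long observedVersion) {
        long previous = lastSeenVersion.getAndAccumulate(observedVersion, Math::max);
        if (observedVersion > previous) {
            entries.clear(); // stale metadata is re-fetched lazily with later responses
            return true;
        }
        return false;
    }
}
```

Dropped entries are then repopulated through the same piggybacking mechanism as on first use. The sketch does not handle a put racing with a concurrent invalidation.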

JdbcStatement and ClientSql need to maintain the mapping from a query to the dynamic parameter mapping structure for its colocation key(s).
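
The sketch below illustrates how a client could turn cached metadata plus the bound parameters into a partition number. It rests on two assumptions that are not stated in this proposal: a non-negative entry in indexes points at a dynamic parameter, while a negative entry -(i + 1) points at the i-th pre-calculated literal hash; and hashes are combined with a simple placeholder formula. A real client must use exactly the same colocation hash as the server.

```java
import java.util.Objects;

final class PartitionResolverSketch {
    /**
     * Resolves a partition for one statement execution.
     *
     * indexes[i] >= 0: colocation column i takes its value from params[indexes[i]].
     * indexes[i] <  0: colocation column i is a literal whose pre-calculated hash is
     *                  literalHashes[-(indexes[i] + 1)].
     * This encoding is an assumption of the sketch.
     */
    static int resolvePartition(int[] indexes, int[] literalHashes, Object[] params, int partitions) {
        int combined = 0;
        for (int idx : indexes) {
            int h = idx >= 0 ? Objects.hashCode(params[idx]) : literalHashes[-(idx + 1)];
            combined = 31 * combined + h; // placeholder combiner, not the real colocation hash
        }
        return Math.floorMod(combined, partitions); // always in [0, partitions)
    }
}
```

Under these assumptions, a statement with both colocation columns passed as parameters would use indexes = {0, 1} and an empty literal hash array, while a statement with the second column given as a literal would use indexes = {0, -1} and a single pre-calculated hash.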

The size of the mapping structure needs to be configurable through the thin client and JDBC settings:
IgniteClient.Builder#sqlPartitionAwarenessMetadataCacheSize
jdbc:ignite:thin://target.host?colocationMetadataSize=

Additionally, lightweight client tx coordination [2] needs to work together with partition-aware request mapping. Let's start with a simple approach: if the affinity metadata mapping is found in the cache, direct mode is used; otherwise, proxy mode is used. In other words, direct mode is used when all modified rows belong to a single node.
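
The direct-versus-proxy rule can be sketched as a small decision function. The Route and Mode types, the node names and the partition-to-node map are hypothetical and exist only to illustrate the rule; they are not part of any existing client API.

```java
import java.util.Map;
import java.util.Optional;

final class RoutingSketch {
    enum Mode { DIRECT, PROXY }

    static final class Route {
        final Mode mode;
        final String node; // null in PROXY mode: any connected node may be used

        Route(Mode mode, String node) {
            this.mode = mode;
            this.node = node;
        }
    }

    /**
     * @param cachedPartition partition resolved from cached metadata, empty on a cache miss
     * @param primaryNodes    assumed partition -> node name assignment known to the client
     */
    static Route choose(Optional<Integer> cachedPartition, Map<Integer, String> primaryNodes) {
        // Optional.map yields an empty Optional when the partition has no known node.
        String node = cachedPartition.map(primaryNodes::get).orElse(null);
        return node != null ? new Route(Mode.DIRECT, node) : new Route(Mode.PROXY, null);
    }
}
```

A cache miss or an unknown assignment both fall back to proxy mode, which matches the behavior before partition awareness.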

Protocol changes: if the server-side client request handler detects that the client supports the PA feature, it needs to prepare the PA structure (if possible) and send it back to the client.

Client protocol changes:

  • New feature flag SQL_PARTITION_AWARENESS
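
As an illustration of the server-side check, the sketch below attaches metadata only when the client advertised the flag and metadata could be prepared. The bit index and the method name are hypothetical; the actual protocol defines its own flag value and response layout.

```java
import java.util.BitSet;

final class SqlPaFeatureSketch {
    // Hypothetical bit index for the flag; the real protocol assigns the actual value.
    static final int SQL_PARTITION_AWARENESS = 0;

    /**
     * Server side: metadata is sent only if the client advertised the feature
     * and the metadata could be prepared for the query (non-null).
     */
    static boolean shouldSendMetadata(BitSet clientFeatures, Object metadataOrNull) {
        return clientFeatures.get(SQL_PARTITION_AWARENESS) && metadataOrNull != null;
    }
}
```

Clients that do not advertise the flag keep receiving responses in the old format, so the change stays backward compatible.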

Let's consider the use cases this improvement applies to.

Consider a table like:

CREATE TABLE T (
    id int,
    col1 int DEFAULT -1,
    col2 int,
    PRIMARY KEY(id, col1, col2)
) COLOCATE BY (col1, col2);
Insert statements
// Colocation column values passed via params
INSERT INTO T (id, col1, col2) VALUES(1, ?, ?);

// Colocation column values partially passed via literals
INSERT INTO T (id, col1, col2) VALUES(1, ?, 100);
INSERT INTO T (id, col2) VALUES(?, 100), (?, 200);
Select statements
// Pure dynamic parameters
SELECT * FROM T WHERE col1 = ? AND col2 = ?;

// Colocation column values partially passed via literals
SELECT * FROM T WHERE col1 = ? AND col2 = 100;
SELECT * FROM T WHERE col1 = ? AND col2 = (SELECT 1);

// Colocation column values completely passed via literals
SELECT * FROM T WHERE col1 = 0 AND col2 = 100;

Applicability for DELETE, UPDATE and MERGE statements needs to be the same as for SELECT above.

Not applicable cases:

  1. All cases where calculating the colocation key requires additional data:
    SELECT * FROM T WHERE col1 = ? AND col2 = (SELECT MIN(a) FROM T2);
  2. Insertion with explicit casts:
    INSERT INTO T (id, col2) VALUES(0, '100'::INTEGER);
  3. Forms of INSERT INTO SELECT where the condition does not cover all colocation columns:
    INSERT INTO T SELECT * FROM T2 WHERE condition;
  4. All forms of SELECT/DELETE/UPDATE/MERGE statements with IN, ANY, ALL, OR, inner sub-queries, function calls, or functional defaults related to colocation keys.
  5. Multi-statement and explicit transaction statements also need to bypass partition awareness and fall back to the default implementation.
  6. There is probably no need to calculate partition awareness information for INSERT queries where all colocation columns are defined with pure literals.



Reference Links

[1] IEP-95: Client Partition Awareness

[2] Lightweight client tx coordination
