This page describes the various strategies for executing queries in MetaModel.
Native vs greedy execution
Of particular interest is when MetaModel can delegate (or "push down") query execution to a native query engine, and when it has to execute the query in memory instead (often with a greedy approach, using Java code supplied by MetaModel).
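As a rough illustration of this distinction, the sketch below uses plain Java rather than MetaModel's actual API. The names `Source`, `nativeCount`, and `CountResult` are hypothetical and exist only for this example. It pushes a count down to the backend when the backend offers one and otherwise falls back to scanning every record.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.OptionalLong;

public class CountStrategies {

    /** Hypothetical stand-in for a data source; not a MetaModel interface. */
    interface Source {
        Iterator<String[]> rows();

        /** Present only when the backend can count without a full scan. */
        default OptionalLong nativeCount() {
            return OptionalLong.empty();
        }
    }

    /** A COUNT(*) result together with the strategy that produced it. */
    static final class CountResult {
        final long count;
        final String strategy;

        CountResult(long count, String strategy) {
            this.count = count;
            this.strategy = strategy;
        }
    }

    static CountResult count(Source source) {
        // Native: delegate ("push down") the count to the backend if it can do it.
        OptionalLong pushedDown = source.nativeCount();
        if (pushedDown.isPresent()) {
            return new CountResult(pushedDown.getAsLong(), "native");
        }
        // Greedy: run through every record. Slow, but only a counter is kept in memory.
        long n = 0;
        Iterator<String[]> it = source.rows();
        while (it.hasNext()) {
            it.next();
            n++;
        }
        return new CountResult(n, "greedy");
    }

    public static void main(String[] args) {
        List<String[]> records = Arrays.asList(
                new String[] {"1", "a"}, new String[] {"2", "b"}, new String[] {"3", "c"});

        // A file-like source can only hand out rows.
        Source fileLike = records::iterator;

        // A database-like source can also answer the count itself.
        Source databaseLike = new Source() {
            @Override public Iterator<String[]> rows() { return records.iterator(); }
            @Override public OptionalLong nativeCount() { return OptionalLong.of(records.size()); }
        };

        System.out.println(count(fileLike).strategy);     // greedy
        System.out.println(count(databaseLike).strategy); // native
    }
}
```

Both calls return the same count of 3; only the amount of work done on the client differs.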
The table below documents the execution capability in specific modules of MetaModel. Each column represents a query type. The query types are:
- Plain FROM:
  - Simple queries of the form 'SELECT y FROM x'. Possible values:
    - streaming: The dataset is implemented in a truly streaming fashion.
    - paged: The dataset fetches pages/bulks of records.
    - in-memory: The dataset has to consume ALL records into memory. This is inefficient and may cause out-of-memory issues.
- Simple COUNT:
  - Queries of the form 'SELECT COUNT(*) FROM x'. Possible values:
    - native: The module supports an efficient native method of getting the count. Some modules also support additional criteria on COUNT queries, e.g. 'SELECT COUNT(*) FROM x WHERE z', which is marked as 'native (incl. WHERE)'.
    - greedy: The module has to run through the dataset to do the counting. This is inefficient but usually has little memory impact.
- Simple WHERE:
  - Are simple WHERE items delegated natively, or are they evaluated client-side for each record?
- Primary key lookup:
  - Queries that look up records by their primary keys, e.g. 'SELECT y FROM x WHERE x.id = 42'.
- Groups and aggregates:
  - Are GROUP BY and aggregate functions delegated natively, or are they calculated in memory?
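To make the client-side cases more concrete, here is a minimal sketch in plain Java of what evaluating a WHERE item per record and computing a GROUP BY count in memory can look like. It is not MetaModel's implementation; the class and method names (`ClientSideEvaluation`, `where`, `countByColumn`) are hypothetical, and rows are modelled as simple `String[]` arrays.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.TreeMap;
import java.util.function.Predicate;

public class ClientSideEvaluation {

    /** Client-side WHERE: lazily tests each record as the stream is consumed. */
    static <T> Iterator<T> where(Iterator<T> rows, Predicate<T> condition) {
        return new Iterator<T>() {
            private T pending;
            private boolean ready;

            @Override public boolean hasNext() {
                // Advance the underlying stream until a matching record is found.
                while (!ready && rows.hasNext()) {
                    T candidate = rows.next();
                    if (condition.test(candidate)) {
                        pending = candidate;
                        ready = true;
                    }
                }
                return ready;
            }

            @Override public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                ready = false;
                return pending;
            }
        };
    }

    /** Greedy GROUP BY with COUNT(*): consumes the whole stream and keeps one counter per group. */
    static Map<String, Long> countByColumn(Iterator<String[]> rows, int column) {
        Map<String, Long> counts = new TreeMap<>();
        while (rows.hasNext()) {
            counts.merge(rows.next()[column], 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String[]> records = Arrays.asList(
                new String[] {"1", "Oslo"},
                new String[] {"2", "Bergen"},
                new String[] {"3", "Oslo"});

        // Roughly: SELECT id FROM x WHERE city = 'Oslo', evaluated per record.
        Iterator<String[]> oslo = where(records.iterator(), r -> r[1].equals("Oslo"));
        while (oslo.hasNext()) {
            System.out.println(oslo.next()[0]); // prints 1, then 3
        }

        // Roughly: SELECT city, COUNT(*) FROM x GROUP BY city, calculated in memory.
        System.out.println(countByColumn(records.iterator(), 1)); // {Bergen=1, Oslo=2}
    }
}
```

The filter keeps the streaming property of the underlying dataset, whereas the grouping has to read every record before it can return anything.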
Module | Plain FROM | Simple COUNT | Simple WHERE | Primary key lookup | Groups and aggregates |
---|---|---|---|---|---|
CSV | Streaming | Greedy when exact. Native when approximated. | Client-side | No PK | Greedy |
JDBC | Streaming | Native (all variants) | Native | Native | Native |
Excel | Streaming (XLSX), greedy (XLS) | Native | Client-side | No PK | Greedy |
POJO | In memory | Native | Client-side | No PK | Greedy |
Apache CouchDB | Streaming | Native | Native | Native | Greedy |
MongoDB | Streaming | Native | Native | Native | Greedy |
ElasticSearch | Paged | Native | Native | Native | Greedy |
Apache HBase | Streaming | Native | Client-side | Native | Greedy |
Apache Cassandra | Paged | Native | Client-side | Native | Greedy |
Apache Kafka | Streaming | Greedy | Native for certain fields, client-side for others | Greedy | Greedy |
Amazon DynamoDB | Paged | Native | Client-side | Native | Greedy |
JSON | Streaming | Greedy | Client-side | No PK | Greedy |
XML | Streaming (SAX), in-memory (DOM) | Greedy | Client-side | Greedy | Greedy |
Salesforce.com | Paged | Native | Native | Native | Greedy |
SugarCRM | Paged | Native | Native | Greedy | Greedy |