Link: Unresolved issues in storm-sql
Current milestone
Storm SQL Phase II
JIRA link
- STORM-1433
Remaining works
- STORM-1443
- STORM-1446
- STORM-2073
Not prioritized yet
Expand supporting external components
JIRA link
- STORM-2075
Done
- Kafka as Input / Output
- Redis as Output
Remaining works
- STORM-2082
- STORM-2102
- STORM-2103
- etc.
- Any external modules which support Trident state can be candidates.
Consideration
- They will need to be rewritten if we replace the backend of Storm SQL with a higher-level core API
- We need to determine which data sources are widely used and provide only those for now
- <EPIC> STORM-2147: Automatic parallelism for input data source with metadata
-- Automatic parallelism support for Kafka data source
- <EPIC> STORM-2149: [Storm SQL] Schema support on input format and output format
-- CSV?
-- TSV?
-- Avro?
-- Schema Registry?
-- etc?
- <EPIC?> Support more functions (scalar and aggregate)
-- DATE / TIMESTAMP related functions
-- etc?
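As a rough illustration of what schema support on an input format (STORM-2149) could mean for CSV or TSV, the sketch below applies a declared list of field names and types to raw delimited lines. It is written in Python purely for brevity; the `SCHEMA` layout and the `parse_line` helper are hypothetical and are not part of Storm SQL.

```python
import csv
import io

# Hypothetical schema: ordered (field name, converter) pairs.
SCHEMA = [("id", int), ("unit_price", float), ("product", str)]


def parse_line(line, schema=SCHEMA, delimiter=","):
    """Parse one delimited line into a dict typed according to the schema."""
    values = next(csv.reader(io.StringIO(line), delimiter=delimiter))
    if len(values) != len(schema):
        raise ValueError(f"expected {len(schema)} fields, got {len(values)}")
    # Pair each raw string with its declared converter, in column order.
    return {name: convert(value) for (name, convert), value in zip(schema, values)}
```

Passing `delimiter="\t"` to the same helper covers the TSV case.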
Future work:
- Change the backend of Storm SQL to a higher-level core API (get rid of Trident)
-- without supporting join, aggregation, sort, and so on
-- the higher-level core API should support exactly-once processing in order to replace Trident
- Support Streaming SQL
-- group by window
-- join between stream and table (without temporal support)
-- join between stream and stream
-- join between stream and table (with temporal support)
-- Project / Filter pushdown to table
--- I'm not sure which stream data sources can support this
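To make the "group by window" item a little more concrete, here is a minimal sketch of one possible reading: counting events per key over fixed, non-overlapping (tumbling) windows. The page does not say which window types are intended, so the tumbling window, the `tumbling_window_counts` name, and the `(timestamp_ms, key)` event shape are all assumptions made for illustration, not Storm SQL behavior.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_ms):
    """Count events per (window start, key) for fixed, non-overlapping windows.

    `events` is an iterable of (timestamp_ms, key) pairs.
    """
    counts = defaultdict(int)
    for timestamp_ms, key in events:
        # Align the timestamp down to the start of its window.
        window_start = timestamp_ms - (timestamp_ms % window_ms)
        counts[(window_start, key)] += 1
    return dict(counts)
```

With a one-second window, events at 0 ms and 500 ms for the same key fall into the window starting at 0, while an event at 1000 ms opens the next window.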