
Link: Unresolved issues in storm-sql

Current milestone

Storm SQL Phase II

STORM-1433

Remaining work

 

Not prioritized yet

Expand supporting external components

JIRA link

STORM-2075

Done

  • Kafka as Input / Output
  • Redis as Output

Remaining work

  • STORM-2082
  • STORM-2102
  • STORM-2103
  • etc.
    • Any external module that supports Trident state is a candidate.

Consideration

  • These integrations will need to be rewritten if we replace the Storm SQL backend with a higher-level core API.
    • We need to determine which data sources are widely used and provide only those for now.

 

- <EPIC> STORM-2147: Automatic parallelism for input data source with metadata
-- Automatic parallelism support for Kafka data source

- <EPIC> STORM-2149: [Storm SQL] Schema support on input format and output format
-- CSV?
-- TSV?
-- Avro?
-- Schema Registry?
-- etc?

- <EPIC?> Support more functions (scalar and aggregate)
-- DATE / TIMESTAMP related functions
-- etc?

Future work:

- Change the SQL backend to a higher-level core API (get rid of Trident)
-- Without support for join, aggregation, sort, and so on
-- The higher-level core API should support exactly-once processing in order to replace Trident

- Support streaming SQL
-- GROUP BY window
-- Join between stream and table (without temporal support)
-- Join between stream and stream
-- Join between stream and table (with temporal support)
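
As a rough illustration of the "group by window" item, the sketch below assigns events to fixed-size, non-overlapping (tumbling) windows by timestamp and counts them per key. It is plain Python, not Storm SQL or Trident code; the function name, the event tuple layout, and the window size are assumptions made only for this example.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_ms):
    """Count events per (window start, key) for tumbling windows.

    `events` is an iterable of (timestamp_ms, key) pairs. The layout and
    the function name are hypothetical, chosen only for this sketch.
    """
    counts = defaultdict(int)
    for timestamp_ms, key in events:
        # Align the timestamp down to the start of its window.
        window_start = (timestamp_ms // window_ms) * window_ms
        counts[(window_start, key)] += 1
    return dict(counts)


# Sample data (made up): four events, 2-second windows.
events = [(1000, "a"), (1500, "b"), (2999, "a"), (3000, "a")]
result = tumbling_window_counts(events, window_ms=2000)
```

A real streaming implementation would also have to decide when a window is complete (for example, how to treat late events), which this sketch ignores.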

- Project / filter pushdown to table
-- I'm not sure which stream data sources can support this
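
To make the pushdown idea concrete: the projection and filter are handed to the data source so that only the needed rows and columns leave it, instead of being applied after a full scan. The sketch below contrasts the two approaches in plain Python; `scan`, its parameters, and the sample rows are hypothetical and do not correspond to any Storm SQL API.

```python
def scan(rows, columns=None, predicate=None):
    """Hypothetical table scan that accepts a pushed-down filter and projection."""
    for row in rows:
        # Pushed-down filter: skip rows the query does not need.
        if predicate is not None and not predicate(row):
            continue
        # Pushed-down projection: emit only the requested columns.
        if columns is None:
            yield dict(row)
        else:
            yield {name: row[name] for name in columns}


# Sample table (made up for illustration).
TABLE = [
    {"id": 1, "price": 5, "name": "x"},
    {"id": 2, "price": 50, "name": "y"},
]

# Without pushdown: read everything, then filter and project downstream.
full = list(scan(TABLE))
downstream = [{"id": r["id"]} for r in full if r["price"] > 10]

# With pushdown: the source applies the filter and projection itself.
pushed = list(scan(TABLE, columns=["id"], predicate=lambda r: r["price"] > 10))
```

Both paths produce the same rows; the difference is how much data the source has to emit, which is why support depends on what each data source can evaluate on its own.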
