Welcome to the Apache CarbonData wiki. If you are interested in contributing to CarbonData, visit the contributing to CarbonData page to learn more.

Release plan (around 3 months per release)

Date            Version number
Aug 2016        Apache CarbonData 0.1.0-incubating
Sep 2016        Apache CarbonData 0.1.1-incubating
Nov 2016        Apache CarbonData 0.2.0-incubating
Jan 2017        Apache CarbonData 1.0.0-incubating
May 2017        Apache CarbonData 1.1.0
Aug-Sep 2017    Apache CarbonData 1.2.0
Jan-Feb 2018    Apache CarbonData 1.3.0
Mar 2018        Apache CarbonData 1.3.1
May 2018        Apache CarbonData 1.4.0
Aug 2018        Apache CarbonData 1.4.1
Oct 2018        Apache CarbonData 1.5.0
Dec 2018        Apache CarbonData 1.5.1
Jan 2019        Apache CarbonData 1.5.2
Mar 2019        Apache CarbonData 1.5.3
May 2019        Apache CarbonData 1.6.0
Aug 2019        Apache CarbonData 1.7.0

Roadmap:

1.0.x:

  • Support Spark 2.1 integration in CarbonData
  • Remove kettle, support new data load solution
  • Support data update and delete SQL in Spark 1.6

1.1.x:

  • Add page in blocklet for improving scan cases' performance.
  • Support V3 format for improving TPC-H performance.
  • Support vector features by default
  • Support data update and delete SQL in Spark 2.1

1.2.x:

  • Support specifying sort columns for MDK (Multi-Dimension Key index)
  • Support partition
  • Support Presto integration
  • Support Hive integration
  • Optimize data loading by using ColumnPage in the write step and making it off-heap

1.3.x:

  • Support streaming data ingestion into CarbonData
  • Provide an index framework that lets users add more indexes
  • Support local dictionary
  • Ecosystem integration (e.g. latest Apache Spark 2.x versions)

1.4.x:

  • Support creating CarbonData on cloud storage (AWS S3, Huawei OBS)
  • Provide an index framework that lets users add more indexes, such as a text index using Lucene
  • Ecosystem integration

1.5.x:

  • Support MV (Materialised View) and Bloom Filter as production features
  • Support a CarbonData engine to improve concurrent access and point queries
  • Ecosystem integration
  • Support ALTER ADD COLUMN in the carbon file format
  • Support multi-character separators in CSV files during data loading
  • Compaction support for segments created with range_sort and global_sort
  • Support DDLs to operate on Driver Cache (Get cache size, clear cache)
  • Support building datamaps and loading data in parallel to reduce the overall time taken
  • Provide a summary of loaded and bad records after data loading

1.6.x:

  • Support storing carbon Min-Max indexes in an external system
  • MV DataMap Enhancements and Stabilisation
  • Query performance enhancements
  • Deeper Presto integration and stabilisation
  • UDF and UDAF support in Pre-aggregate tables
  • Support read/write from Hive

1.7.x:

  • Load performance improvements
  • TPC-DS (query, load) performance improvements
  • Carbon Advisor for automatically suggesting an ideal table schema, including MV, index, sort column, range column, compression ...
  • Delete and update support in CarbonData SDK
  • Support C engine reader for CarbonData SDK
  • ES based datamap management
  • Support Spark DataSource API V2

1.8.x:

  • Support CarbonData metadata management using DB or other external OLTP system
  • Support MV on Streaming tables, partition tables, Time Series
  • Support MV creation from another MV

Pages Link

Committers
Apache CarbonData Performance Benchmark(0.1.0)
Events(Summit and Meetup materials)
Use cases and shared articles
