Apache Tajo is a robust relational and distributed big data warehouse system for Apache Hadoop. Tajo is designed for low-latency, scalable ad-hoc queries, online aggregation, and ETL (extract, transform, load) on large data sets stored on HDFS (Hadoop Distributed File System) and other data sources. By supporting SQL standards and leveraging advanced database techniques, Tajo allows direct control of distributed execution and data flow across a variety of query evaluation strategies and optimization opportunities.
- Official Apache Tajo Website: source code, issue tracking, mailing lists, etc.
- Overview of Tajo
- Powered By
- Architecture of Tajo
- Logos of Tajo
- Tajo Internal
- How To Contribute
- How To Set Up Your Development Environment
- TPC-H Benchmark
- TPC-DS Benchmark
- How to update the Apache Tajo website
- Coding Style
- Major Release Announcement Template
- How to write user documentation