This page is the Apache Arrow developer wiki. If you are involved in building or maintaining the project, this is a good page to have bookmarked. If you are a prospective user of the project, see the user-facing library and API documentation linked from http://arrow.apache.org/.
Developer Resources
- Guide for Committers and Project Maintainers
- Release Management Guide
- How to Verify Release Candidates
- Open Patches (Pull Requests)
- JIRA Health Dashboard
Roadmap and Initiatives
Columnar Format
Columnar Format JIRA Dashboard
The "Arrow columnar format" is an open standard, language-independent binary in-memory format for columnar datasets. It can be used to create data frame libraries, build analytical query engines, and address many other use cases.
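As a rough illustration of what "columnar" means here, the following standard-library Python sketch pivots a few rows into per-column arrays and encodes one nullable 64-bit integer column as a validity bitmap plus a contiguous values buffer. It is a simplified sketch, not the Arrow libraries' implementation: the helper names (`to_columnar`, `encode_int64_column`, `decode_int64_column`) are hypothetical, buffer padding and alignment are omitted, and native byte order is assumed.

```python
from array import array


def to_columnar(rows, names):
    """Pivot a list of row tuples into one Python list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def encode_int64_column(values):
    """Encode optional ints as (validity bitmap, values buffer, null count).

    Assumption: bit i of the bitmap (least-significant bit first within
    each byte) is 1 when slot i holds a value and 0 when it is null.
    """
    bitmap = bytearray((len(values) + 7) // 8)
    data = array("q")  # signed 64-bit integers, stored contiguously
    for i, value in enumerate(values):
        if value is None:
            data.append(0)  # placeholder; a null slot's contents are ignored
        else:
            bitmap[i // 8] |= 1 << (i % 8)
            data.append(value)
    return bytes(bitmap), data.tobytes(), values.count(None)


def decode_int64_column(bitmap, data, length):
    """Rebuild the list of optional ints from the two buffers."""
    values = array("q")
    values.frombytes(data)
    return [
        values[i] if (bitmap[i // 8] >> (i % 8)) & 1 else None
        for i in range(length)
    ]


rows = [(1, 10), (2, None), (3, 30)]
columns = to_columnar(rows, ["id", "score"])
bitmap, data, null_count = encode_int64_column(columns["score"])
roundtrip = decode_int64_column(bitmap, data, len(rows))
```

Because each column lives in its own contiguous buffer, a scan over `score` touches only that column's bytes, which is the property analytical engines and data frame libraries rely on.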
Columnar Computational Libraries
C++ Libraries
Feather File Format
GPU Support
Go Libraries
Java Libraries
JavaScript Libraries
Julia Libraries
We have been discussing how to involve the Julia community in Apache Arrow:
- Discussion on ExpandingMan/Arrow.jl: https://github.com/ExpandingMan/Arrow.jl/issues/28
- Feather implementation in pure Julia: https://github.com/JuliaData/Feather.jl
Machine Learning Framework Integrations
Modern machine learning frameworks can leverage technologies we are developing in Apache Arrow, and vice versa.
Plasma Shared Memory Store
Python Libraries
Projects:
RPC System (Arrow Flight)
- Jacques's initial proposal as pull request
- GitHub issue for gRPC Protobuf performance issues in Java