This page tracks external software projects that supplement Apache Spark and add to its ecosystem.
spark-packages.org is an external, community-managed list of third-party libraries, add-ons, and applications that work with Apache Spark. You can add a package as long as you have a GitHub repository.
- Spark Job Server - REST interface for managing and submitting Spark jobs on the same cluster (see blog post for details)
- SparkR - R frontend for Spark
- MLbase - Machine Learning research project on top of Spark
- Apache Mesos - Cluster management system that supports running Spark
- Alluxio (née Tachyon) - Memory-speed virtual distributed storage system that supports running Spark
- Spark Cassandra Connector - Easily load your Cassandra data into Spark and Spark SQL; from DataStax
- FiloDB - a Spark-integrated analytical/columnar database with an in-memory option capable of sub-second concurrent queries
- Elasticsearch - Spark SQL Integration
- Spark-Scalding - Easily transition Cascading/Scalding code to Spark
- Zeppelin - an IPython-like notebook for Spark. There are also ISpark and the Spark Notebook.
- IBM Spectrum Conductor with Spark - cluster management software that integrates with Spark
- SnappyData - an open source OLTP + OLAP database integrated with Spark on the same JVMs.
- GeoSpark - Geospatial RDDs and joins
- Spark Cluster Deploy Tools for OpenStack
Applications Using Spark
Moved permanently to http://spark.apache.org/third-party-projects.html