


MADlib® is an open-source library for scalable in-database analytics.

It provides data-parallel implementations of mathematical, statistical, graph and machine learning methods for structured and unstructured data.

Quick Start Guides

Get going with a minimum of fuss. 

General Information

Learn about MADlib.

Developer Documentation

Contribute to the project.


See how the pieces fit together. 

Release Notes

See what has been released.

Third Party Components

MADlib incorporates material from the following third-party components:

  1. argparse 1.2.1 provides an easy, declarative interface for creating command line tools
  2. Boost 1.47.0 (or newer) provides peer-reviewed portable C++ source libraries
  3. Eigen 3.2.2 is a C++ template library for linear algebra
  4. PyYAML 3.10 is a YAML parser and emitter for Python
  5. PyXB 1.2.4 is a Python library for XML Schema Bindings
  6. Porter2 stemmer reduces words to their common roots for comparison and further processing
  7. UseLATEX.cmake contains CMake commands to use the LaTeX compiler


License information for MADlib and the included third-party libraries can be found inside the license directory. ASF licensing guidance pertaining to MADlib's pre-Apache history as an open-source project under BSD licensing is described here.


Related Software

  • PivotalR - lets the user run the functions of the open-source big-data machine learning package MADlib directly from R.

  • PyMADlib - a nascent Python wrapper for MADlib that combines the flexibility of Python with the number-crunching power of MADlib.

