Apache MADlib® is an open-source library for scalable in-database analytics.

It provides data-parallel implementations of mathematical, statistical, graph, and machine learning methods for structured and unstructured data.

License information for MADlib and its included third-party libraries can be found in the license directory. ASF licensing guidance covering MADlib's pre-Apache history as a BSD-licensed open-source project is described here.


  • PivotalR - lets the user run the functions of the open-source big-data machine learning package MADlib directly from R.

  • PyMADlib - a nascent Python wrapper for MADlib that combines the flexibility of Python with the number-crunching power of MADlib.

