This page explains all of the elements needed to develop and plug in a new MADlib® module.
Say you want to write a new MADlib module called NewModule (code name: newmod). Use the following directory tree structure as a reference for your module:
    ./src/
        modules/
            newmod/                     # (optional) new directory for the module code
                newmod.cpp              # (optional) C/C++ code for this module
                newmod.hpp              # (optional) C/C++ header for this module
                ...
        ports/
            postgres/
                modules/
                    newmod/
                        newmod.sql_in       # (REQUIRED) SQL file to create DB objects
                        newmod.py_in        # (optional) Python code (helpful for iterative algorithms)
                        test/               # (optional) directory for SQL test scripts
                            newmod.sql_in   # (optional) test scripts that will be run during install-check
                            ...
newmod.sql_in - SQL file that creates the database objects for this method. This is the only required code file, because a module/method could be written entirely in SQL, in which case no Python or C/C++ code is needed. This file is pre-processed with m4 during the installation phase and currently uses the following meta variables:
newmod.py_in - Python code for the newmod module. A Python layer helps with pre-processing and post-processing data around the core module logic and is also useful for running iterative algorithms. This logic can be implemented in the PostgreSQL procedural language, but is usually simpler to write in Python.
newmod.c/cpp - C/C++ code for the newmod module. This is the logic that is executed in each iteration. Implementing it in C++ generally yields faster code than implementing it in SQL or PL/pgSQL.
test/newmod.sql_in - SQL test script that is executed during install-check.
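To make the role of the Python layer in iterative algorithms more concrete, here is a minimal sketch of a driver loop. It is an illustration under stated assumptions, not MADlib code: `run_query` is a hypothetical stand-in for the database call (inside a PL/Python function that role would typically be played by `plpy.execute`), and `newmod_train`, `newmod_step`, the column `x`, `max_iter`, and `tolerance` are all invented names.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a database call: takes SQL text, returns rows.
# Inside a PL/Python function this would typically be plpy.execute.
QueryRunner = Callable[[str], List[Dict[str, float]]]


def newmod_train(run_query: QueryRunner, source_table: str,
                 max_iter: int = 20,
                 tolerance: float = 1e-6) -> Tuple[float, int]:
    """Repeat one SQL-side step until the state stops changing.

    Returns (final_state, iterations_used). Illustrative only.
    """
    state = 0.0
    for iteration in range(1, max_iter + 1):
        # The per-iteration work is delegated to the database. newmod_step
        # is a hypothetical aggregate (for example, backed by C++ code).
        sql = "SELECT newmod_step(x, {state!r}) AS new_state FROM {table}".format(
            state=state, table=source_table)
        new_state = run_query(sql)[0]["new_state"]

        # Python only decides whether to stop or go around again.
        if abs(new_state - state) < tolerance:
            return new_state, iteration
        state = new_state
    return state, max_iter
```

The split mirrors the description above: Python owns the control flow and convergence check, while each iteration's computation stays in the database.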
In order to include the new module in the generic (not database dependent) installation, only the following config file must be edited:
A new name element must be added, with an optional depends list:

    - name: newmod
      depends: ['othermod1', 'othermod2']
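A depends list implies an ordering between modules: presumably a module can only be set up once the modules it depends on are in place. The sketch below shows one way such an ordering could be computed. It assumes the config has already been parsed into a list of dictionaries shaped like the entry above; `install_order` is a hypothetical helper, not a MADlib function.

```python
from typing import Dict, List


def install_order(modules: List[Dict]) -> List[str]:
    """Return module names so that every module follows its dependencies.

    Depth-first topological sort. Raises ValueError on a circular
    dependency and KeyError on a dependency that is not declared.
    """
    deps = {m["name"]: list(m.get("depends", [])) for m in modules}
    order: List[str] = []
    state: Dict[str, str] = {}  # name -> "visiting" or "done"

    def visit(name: str) -> None:
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            raise ValueError("circular dependency involving " + repr(name))
        if name not in deps:
            raise KeyError("unknown module " + repr(name))
        state[name] = "visiting"
        for dep in deps[name]:
            visit(dep)
        state[name] = "done"
        order.append(name)

    for name in deps:
        visit(name)
    return order
```

With entries for newmod (depending on othermod1 and othermod2), othermod1, and othermod2, the helper lists both dependencies before newmod.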
If you must adjust any of the code for a particular database platform, the files that require changes must be replicated under a dedicated ./ports/<portid>/modules directory, as shown below. The database <portid> you refer to must already be defined in the ./config/Ports.yml config file.
    ./ports/
        greenplum/                  # Example port id: greenplum
            modules/
                newmod/             # (REQUIRED) new directory for the module code
                    newmod.sql_in   # (optional) SQL file to create DB objects
                    newmod.py_in    # (optional) Python code
                    ...
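One way to picture the effect of a port directory: for each module file, a port-specific copy, if present, takes precedence over the generic one. The sketch below assumes exactly that lookup rule, and assumes the generic files live under ports/postgres as in the first tree. `resolve_module_file` is a hypothetical helper for illustration and does not describe MADlib's actual build logic.

```python
from pathlib import Path
from typing import Optional


def resolve_module_file(src_root: Path, port_id: str, module: str,
                        filename: str,
                        generic_port: str = "postgres") -> Optional[Path]:
    """Pick the port-specific copy of a module file, else the generic one.

    src_root is the directory that contains "ports". Returns None when
    neither copy exists. The precedence rule itself is an assumption.
    """
    for port in (port_id, generic_port):
        candidate = src_root / "ports" / port / "modules" / module / filename
        if candidate.is_file():
            return candidate
    return None
```

Under this assumption, a Greenplum build would pick up an adjusted newmod.sql_in from the greenplum directory while still using the generic newmod.py_in, so only the files that actually differ need to be replicated.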