- To use the MADlib 1.16 Deep Learning module features, users must install Keras, TensorFlow, and their dependencies separately.
- The MADlib 1.16 Deep Learning module has been tested on the following configurations:
- Greenplum v5.19, CentOS 7
- Greenplum v5.19, Ubuntu 16
- Postgres 10 and 11, CentOS 7
- Postgres 10 and 11, Ubuntu 16
- Greenplum 6Beta, CentOS 7
- Deep Learning library versions
- Keras: 2.2.4
- TensorFlow: 1.13.1
- SciPy: 1.2.1
- cuDNN: 7.1.4
- CUDA: Cuda compilation tools, release 9.0, V9.0.176
- GPU configuration: NVIDIA Tesla P100-PCIE-16GB
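A quick way to compare a host's installed library versions against the tested ones above is a short check like the following sketch (run it with the same Python interpreter the Greenplum segments use; package names and versions are taken from the list above):

```python
# Sketch: report installed versions of the deep learning dependencies
# and compare them with the versions MADlib 1.16 was tested against.
import importlib

TESTED_VERSIONS = {"keras": "2.2.4", "tensorflow": "1.13.1", "scipy": "1.2.1"}

def check_versions(expected):
    """Return {package: installed __version__ or None} for each package."""
    installed = {}
    for name in expected:
        try:
            module = importlib.import_module(name)
            installed[name] = getattr(module, "__version__", None)
        except ImportError:
            installed[name] = None
    return installed

if __name__ == "__main__":
    for name, version in check_versions(TESTED_VERSIONS).items():
        status = version if version is not None else "missing"
        print("%s: %s (tested with %s)" % (name, status, TESTED_VERSIONS[name]))
```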
- The MADlib 1.16 Deep Learning module is supported on the following Greenplum and Python versions:
- Greenplum >= 5.0
- Python >= 2.7
Known issues with the MADlib 1.16 Deep Learning module on Greenplum/Postgres:
When compiling Greenplum from source, there are known issues related to changing the PYTHONPATH environment variable after compilation. For the deep learning features of MADlib to function properly, Keras, TensorFlow, and all of their dependencies must be installed in the same Python directory that was on the PYTHONPATH environment variable before Greenplum was compiled. If a new directory is added to PYTHONPATH later, the change will not be reflected on the segments unless Greenplum is recompiled and restarted.
NOTE: If Greenplum is installed using gppkg or another binary package and PYTHONPATH is left at its default, users should be able to `pip install` Keras, TensorFlow, and all other dependencies into the appropriate location, which the MADlib deep learning functions will then use.
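One way to confirm where `pip install` will place packages, and whether that matches the interpreter the segments use, is a short inspection like this sketch (the printed paths are environment-specific, not prescribed values):

```python
# Sketch: print where this Python interpreter loads packages from, so
# you can verify that `pip install` targets the same tree that was on
# PYTHONPATH when Greenplum was compiled. Run this with the Python
# interpreter that the Greenplum segments use.
import site
import sys

print("interpreter :", sys.executable)
print("search path :")
for entry in sys.path:
    print("  ", entry)
# Default target of `pip install --user` for this interpreter:
print("user site   :", site.getusersitepackages())
```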
- Support for Keras/TensorFlow on Greenplum/Postgres on CentOS 6:
CentOS 6 ships with glibc 2.12 by default, while installing TensorFlow requires at least glibc 2.17. Using the MADlib Deep Learning module on CentOS 6 therefore requires installing Keras and TensorFlow, which may require compiling glibc from source. Running Greenplum 5 with a newer glibc may impact database behavior.
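To check whether a host's glibc meets TensorFlow's minimum, one quick test is the standard-library `platform.libc_ver()` (a sketch; it reports the C library the interpreter was linked against and returns empty strings on non-glibc systems):

```python
# Sketch: report the glibc version to compare against TensorFlow's
# minimum requirement of 2.17 noted above.
import platform

lib, version = platform.libc_ver()
print("C library:", lib or "unknown", version or "unknown")
if lib == "glibc" and version:
    meets = tuple(int(p) for p in version.split(".")[:2]) >= (2, 17)
    print("meets TensorFlow's glibc >= 2.17 requirement:", meets)
```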
- GPU memory management:
- Canceling deep learning operations running on GPU may intermittently fail to release GPU memory. Logging out of the psql session will release all of the memory.
- GPU memory is not released within a session even after a query finishes. For example, if madlib_keras_fit() is run with the argument gpus_per_host>=1 in a psql session (S1), it uses the underlying GPU memory, but the memory is not released once the query finishes successfully (see https://github.com/keras-team/keras/issues/9379 for more info).
- It is advisable to log out of the current psql session when switching between CPU and GPU computation. Internally, the CUDA environment variable `CUDA_VISIBLE_DEVICES` is set based on the gpus_per_host flag. Once this variable is set to -1 (GPU disabled), there is no way to reset it within the session, so that session will always use only the CPU.
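The session-stickiness described above comes down to a per-process environment variable; this minimal sketch shows the mechanism (the value -1 is the GPU-disabling setting mentioned in the note above):

```python
# Sketch: CUDA_VISIBLE_DEVICES is a per-process environment variable.
# Setting it to "-1" hides all GPUs from CUDA libraries loaded
# afterwards in the same process; libraries that have already
# initialized CUDA do not pick up later changes, which is why a fresh
# psql session is needed to switch back to GPU execution.
import os

os.environ["CUDA_VISIBLE_DEVICES"] = "-1"  # disable GPUs for this process
print(os.environ["CUDA_VISIBLE_DEVICES"])
```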
- Recommended GPU configuration: one GPU available per segment. If the number of GPUs per segment host is less than the number of segments per segment host, segments share GPUs, which may fail in some scenarios.
- Recommended format when specifying metric, optimizer, and loss values in the compile_params argument: loss=mean_squared_error
MADlib currently does not support importing individual loss functions, e.g., loss=losses.mean_squared_error.
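As an illustration of the two forms (the surrounding madlib_keras_fit() SQL call is omitted, and these strings are assumptions based only on the format described above, not a complete reference):

```python
# Illustrative compile_params loss values for the format described above.

# Supported: refer to the loss function by name.
supported = "loss=mean_squared_error"

# Not supported: qualifying the loss via an imported module.
unsupported = "loss=losses.mean_squared_error"

assert "losses." not in supported
```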