Problem

We would like to provide a simple way to deploy and run Apache BookKeeper with an official Docker image.

We decided not to submit the image to the Docker official images library, in order to keep implementation freedom and to avoid requiring third-party approval for every change to our Dockerfiles. Instead, we will create builds under the Apache Docker user, so the images will be named "apache/bookkeeper:4.4.0" instead of "bookkeeper:4.4.0".

Proposal

The current Docker image proposal is hosted at https://github.com/caiok/bookkeeper-docker

The main points are listed below.

Docker Image

The proposed image:

  • Is very similar to the ZooKeeper image (see its GitHub sources)
  • Is based on Alpine Linux, a popular distribution for Docker images (valued for being lightweight and security-oriented)
    • (warning) Problem: Alpine uses a lightweight libc implementation that could lead to compatibility issues with some system-specific Java libraries. The only problem I have encountered so far is with the Linux native epoll transport in Netty 4 (an issue for BK 4.5.0).
  • Follows nearly all Docker official image guidelines
    • (warning) Problem: it currently lacks only the key import in the GPG signature verification. The best practice is:

      # gpg: key F73C700D: public key "Larry Hastings <larry@hastings.org>" imported
      RUN curl -fSL "https://www.python.org/ftp/python/$PYTHON_VERSION/Python-$PYTHON_VERSION.tar.xz" -o python.tar.xz \
          && curl -fSL "https://www.python.org/ftp/python/$PYTHON_VERSION/Python-$PYTHON_VERSION.tar.xz.asc" -o python.tar.xz.asc \
          && export GNUPGHOME="$(mktemp -d)" \
          && gpg --keyserver ha.pool.sks-keyservers.net --recv-keys 97FC712E4C024BBEA48A61ED3A5CA953F73C700D \
          && gpg --batch --verify python.tar.xz.asc python.tar.xz \
          && rm -r "$GNUPGHOME" python.tar.xz.asc \
          # install

      In the current Docker build, the "gpg --keyserver" and "gpg --batch" lines are commented out.
      Providing a way to verify the downloaded release is not mandatory, but it is nice to have.

Directory Structure and Repository

  • We will include the Docker build files for BookKeeper in the main repository, in a directory named "bookkeeper-docker".
  • We will provide a Docker build file for each released version (starting from 4.4.0). Each build will be contained in a subdirectory named after the label we want to assign to it (e.g. "4.4.0/" for image "apache/bookkeeper:4.4.0").
    • The Docker build file and all related files contained in these subdirectories will be used by Docker Hub to create automated builds for every version.
    • (question) Question: would it be useful to provide an "onbuild" Docker build for the latest snapshot of the current development version?
  • The main directory will contain the README to be shown on the Docker Hub image page.
  • (question) Currently the main directory contains a Makefile that shows how to use the BookKeeper image to bring up a fully functional ensemble of 1 ZooKeeper container, 3 bookie containers and 2 BookKeeper tutorial containers (a simple dice-rolling application). This Makefile has been very useful to me for testing builds, and it could help newcomers to BookKeeper by giving them a working example to play with, but it can easily be removed if desired.
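
As an illustration of the naming convention described above, here is a minimal shell sketch that derives the image name from a version subdirectory and prints the corresponding "docker build" command without running it. The helper names image_for_dir and build_command_for_dir are hypothetical and are not part of the proposed repository; the layout "bookkeeper-docker/<version>/" is the one assumed in this section.

```shell
#!/bin/sh
# Hypothetical helper: map a version subdirectory to its image name.
# Assumes the layout described above: bookkeeper-docker/<version>/
image_for_dir() {
    dir=${1%/}            # drop a trailing slash, if any
    version=${dir##*/}    # keep only the last path component
    printf 'apache/bookkeeper:%s\n' "$version"
}

# Hypothetical helper: print (not run) the build command for that directory.
build_command_for_dir() {
    printf 'docker build -t %s %s\n' "$(image_for_dir "$1")" "${1%/}"
}

build_command_for_dir "bookkeeper-docker/4.4.0/"
```

With automated builds the actual build would be triggered by Docker Hub, so a helper like this is only useful for local testing.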

 


4 Comments

  1. Hi Francesco Caliumi, FYI: one of the use cases is running the BookKeeper Docker image on DCOS or K8S. From my point of view, the normal way to start a BookKeeper service there is something like this:

    1. Provide a configuration for BookKeeper through a JSON file (DCOS/Marathon) or a .yaml file (K8S); all bookie instances will use this same configuration.

    2. When the BookKeeper service starts, it automatically starts the desired number of bookie instances (e.g. 3, or 1000 bookies) at the same time.

    Under this condition, a race on the metadata format in ZooKeeper should be avoided. E.g., when 1000 bookies start at the same time, only the first bookie should format the metadata in ZooKeeper for the BK service, while the other instances wait for it to finish. Do we need to add this race handling? Here is a reference for it (https://github.com/apache/bookkeeper/pull/169/files#diff-1fd0da033753906a29729c3b71ee9e6f), at lines 51-86.
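
The "format once, everyone else waits" idea can be sketched in shell as follows. This is not the code from the referenced pull request: as a loudly stated assumption, a local directory stands in for ZooKeeper, the atomic mkdir call stands in for creating a lock node, and the names STATE_DIR and format_once are hypothetical.

```shell
#!/bin/sh
# Sketch of a "format once" guard. A local directory is a stand-in for
# ZooKeeper (assumption); mkdir is atomic, so exactly one caller wins.
STATE_DIR=${STATE_DIR:-$(mktemp -d)}

format_once() {
    id=$1
    if mkdir "$STATE_DIR/format.lock" 2>/dev/null; then
        # Winner: do the (placeholder) metadata format, then signal completion.
        echo "bookie $id: formatting metadata" >> "$STATE_DIR/log"
        touch "$STATE_DIR/format.done"
    else
        # Everyone else: wait until the winner reports that it is done.
        while [ ! -e "$STATE_DIR/format.done" ]; do sleep 1; done
        echo "bookie $id: metadata already formatted" >> "$STATE_DIR/log"
    fi
}

# Start five simulated bookies at the same time and wait for all of them.
for i in 1 2 3 4 5; do format_once "$i" & done
wait

# Count how many bookies actually performed the format.
grep -c "formatting metadata" "$STATE_DIR/log"   # prints 1
```

A real deployment would need the same two properties from the metadata store: an atomic "I am the formatter" step and a completion signal the other bookies can poll or watch.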

  2. Hi Jia Zhai, I agree with the use case you have described. For 1), do you find the current configuration options sufficient? I will add zkLedgersRootPath ASAP.

    I struggled with the metadata format too, and I resolved it by providing a very naive way to use the image for initializing it (as you can see in the Makefile https://github.com/caiok/bookkeeper-docker/blob/master/Makefile, target "run-format"), but it is not well suited to automatic deployment. Your solution is very interesting; I will certainly take a closer look and integrate it into the current image.

     

    1. Hi Francesco, thanks for this. For 1), it seems good. If other config is needed, we can add it in the future.
      And yes, for K8S/DCOS, automatic deployment is the requirement.

  3. I updated the image in https://github.com/caiok/bookkeeper-docker. Now it handles configuration of zkLedgersRootPath via an environment variable and seems to resolve the metadata problem in a trivial way (I've tried it and it works in my environment).
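
For readers wondering what configuring zkLedgersRootPath via an environment variable could look like inside a container entrypoint, here is a minimal sketch. It is not taken from the image above: the variable name BK_ZK_LEDGERS_ROOT_PATH, the helper set_property, the CONF_FILE location and the sample file contents are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: apply an environment variable to a properties-style config file.
# BK_ZK_LEDGERS_ROOT_PATH, CONF_FILE and set_property are hypothetical names.
set_property() {
    key=$1 value=$2 file=$3
    if grep -q "^$key=" "$file"; then
        # Key exists: rewrite its line ('|' as delimiter, since values contain '/').
        sed "s|^$key=.*|$key=$value|" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
    else
        # Key missing: append it.
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Sample config written to a temporary file, for demonstration only.
CONF_FILE=${CONF_FILE:-$(mktemp)}
printf 'zkServers=localhost:2181\nzkLedgersRootPath=/ledgers\n' > "$CONF_FILE"

# Use the environment value if set, otherwise keep /ledgers.
set_property zkLedgersRootPath "${BK_ZK_LEDGERS_ROOT_PATH:-/ledgers}" "$CONF_FILE"
cat "$CONF_FILE"
```

Under these assumptions, running the script with BK_ZK_LEDGERS_ROOT_PATH=/bk/cluster1 would rewrite the zkLedgersRootPath line and leave the other settings untouched.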