...

Security Service is a REST-based service for user authentication against LDAP, LDAP + AWS, or Azure OAuth2, depending on the module configuration and cloud provider. LDAP-only mode provides an authentication endpoint that verifies users against an LDAP instance. If you use AWS as the cloud provider, LDAP + AWS authentication can be useful because it combines LDAP authentication with a check that the user has at least one role in the AWS account.

DataLab provides an OAuth2 (client credentials and authorization code flow) authorization mechanism for Azure users. This kind of authentication is required when you use Data Lake. If Data Lake is not enabled, you have two options: LDAP or OAuth2. If OAuth2 is in use, security-service validates the user's permissions against the configured permission scope (a resource in Azure). If Data Lake is enabled, the default permission scope (it can be changed manually after DataLab is deployed) is the Data Lake Store account, so a user is allowed to log in only if he/she has any role in the scope of the Data Lake Store account resource. If Data Lake is disabled but Azure OAuth2 is in use, the default permission scope is the Resource Group where DataLab is created, and only users who have any role in that resource group are allowed to log in.
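The choice of default permission scope described above can be summarized in a short Python sketch. The function name and the returned labels are hypothetical and only illustrate the rule; they are not taken from the DataLab code base.

```python
from typing import Optional


def default_permission_scope(oauth2_enabled: bool, data_lake_enabled: bool) -> Optional[str]:
    """Return a label for the default Azure permission scope (hypothetical labels)."""
    if data_lake_enabled:
        # Data Lake requires OAuth2; the scope is the Data Lake Store account.
        return "data_lake_store_account"
    if oauth2_enabled:
        # OAuth2 without Data Lake: the scope is the DataLab resource group.
        return "resource_group"
    # LDAP authentication: no Azure permission scope is involved.
    return None


print(default_permission_scope(oauth2_enabled=True, data_lake_enabled=False))
```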

...

keytool -genkeypair -alias ssn -keyalg RSA -storepass KEYSTORE_PASSWORD -keypass KEYSTORE_PASSWORD -keystore ~/keys/ssn.keystore.jks -keysize 2048 -dname "CN=localhost"
keytool -exportcert -alias ssn -storepass KEYSTORE_PASSWORD -file ~/keys/ssn.crt -keystore ~/keys/ssn.keystore.jks
sudo keytool -importcert -trustcacerts -alias ssn -file ~/keys/ssn.crt -noprompt -storepass changeit -keystore ${JRE_HOME}/lib/security/cacerts

Create Windows server certificate

Note that the last command must be executed with administrative permissions. To do this, run the command line (cmd) as an administrator.

"%JRE_HOME%\bin\keytool" -genkeypair -alias dlab -keyalg RSA -storepass KEYSTORE_PASSWORD -keypass KEYSTORE_PASSWORD -keystore <DRIVE_LETTER>:\home\%USERNAME%\keys\dlab.keystore.jks -keysize 2048 -dname "CN=localhost"
"%JRE_HOME%\bin\keytool" -exportcert -alias dlab -storepass KEYSTORE_PASSWORD -file <DRIVE_LETTER>:\home\%USERNAME%\keys\dlab.crt -keystore <DRIVE_LETTER>:\home\%USERNAME%\keys\dlab.keystore.jks
"%JRE_HOME%\bin\keytool" -importcert -trustcacerts -alias dlab -file <DRIVE_LETTER>:\home\%USERNAME%\keys\dlab.crt -noprompt -storepass changeit -keystore "%JRE_HOME%\lib\security\cacerts"

Useful commands
"%JRE_HOME%\bin\keytool" -list -alias dlab -storepass changeit -keystore "%JRE_HOME%\lib\security\cacerts"
"%JRE_HOME%\bin\keytool" -delete -alias dlab -storepass changeit -keystore "%JRE_HOME%\lib\security\cacerts"

Here <DRIVE_LETTER> is the letter of the drive on which you run DataLab.


How to run locally

...

Self-Service and Provisioning Service can be run locally. All requests from Provisioning Service to Docker are mocked, and the instance creation status is persisted to Mongo only, without any real impact on Docker or AWS. Security Service can't be run on a local machine because of the complexity of mocking LDAP locally.

Both services, Self-Service and Provisioning Service, depend on the datalab/provisioning-infrastructure/ssn/templates/ssn.yml configuration file. Each service has a main function as its entry point: SelfServiceApplication for Self-Service and ProvisioningServiceApplication for Provisioning Service. The services can be started by running the main methods of these classes. Both main functions require two arguments:

  • Run mode (“server”)
  • Configuration file name (“self-service.yml” or “provisioning.yml” depending on the service). Both files are located in the root directory of the corresponding service. These configuration files contain service settings and are ready to use.

The start-up order of the services matters. Since Self-Service depends on Provisioning Service, the latter should be started first and Self-Service afterwards. The services can be started from a local IDE (Eclipse or IntelliJ IDEA) using its “Run” functionality.

The application run flow is as follows:

  • Create and run provisioning-service configuration:
    • Create Application with name provisioning-service-application
      • Main class: com.epam.datalab.backendapi.ProvisioningServiceApplication
      • VM options: -Ddocker.dir=[PATH_TO_PROJECT_DIR]\infrastructure-provisioning\src\general\files\gcp
      • Program arguments: server [PATH_TO_PROJECT_DIR]\services\provisioning-service\provisioning.yml
      • Working directory: [PATH_TO_PROJECT_DIR]
      • Use classpath of module: provisioning-service
      • PAY ATTENTION: the JRE must be the same one for which the server certificate was created
  • Create and run self-service configuration:
    • Create Application with name self-service-application
      • Main class: com.epam.datalab.backendapi.SelfServiceApplication
      • Program arguments: server [PATH_TO_PROJECT_DIR]/services/self-service/self-service.yml
      • Working directory: [PATH_TO_PROJECT_DIR]
      • Use classpath of module: self-service
      • PAY ATTENTION: the JRE must be the same one for which the server certificate was created
  • Try to access the Self-Service Web UI at https://localhost:8443
    User: test
    Password: <any>
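
As a compact summary of the two run configurations, the following Python sketch lists the main classes and program arguments in the required start order. The helper name launch_plan and the project path in the usage line are hypothetical; the class names and configuration file locations come from the steps above.

```python
from typing import List, Tuple


def launch_plan(project_dir: str) -> List[Tuple[str, List[str]]]:
    """Return (main class, program arguments) pairs in start order."""
    return [
        # Provisioning Service must be started first.
        ("com.epam.datalab.backendapi.ProvisioningServiceApplication",
         ["server", project_dir + "/services/provisioning-service/provisioning.yml"]),
        # Self-Service depends on Provisioning Service and starts second.
        ("com.epam.datalab.backendapi.SelfServiceApplication",
         ["server", project_dir + "/services/self-service/self-service.yml"]),
    ]


for main_class, arguments in launch_plan("/path/to/project"):
    print(main_class, " ".join(arguments))
```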


Infrastructure provisioning

...

DevOps components overview

The following tree shows the common structure of the scripts for deploying DataLab.

Folder structure

datalab
└───infrastructure-provisioning
    └───src
        ├───base
        ├───dataengine
        ├───dataengine-service
        ├───deeplearning            
        ├───edge
        ├───general
        ├───jupyter
        ├───jupyterlab
        ├───project
        ├───rstudio
        ├───ssn
        ├───superset
        ├───tensor
        ├───tensor-rstudio
        └───zeppelin


Each directory except general contains the Python scripts, Docker files, templates, and other files for the corresponding Docker image.

  • base – Main Docker image. It is a common/base image for other ones.
  • edge – Docker image for Edge node.
  • dataengine – Docker image for dataengine cluster.
  • dataengine-service – Docker image for dataengine-service cluster.
  • general – OS and CLOUD dependent common source.
  • ssn – Docker image for Self-Service node (SSN).
  • jupyter/rstudio/zeppelin/tensor/deeplearning – Docker images for Notebook nodes.

All Python scripts, Docker files and other files, which are located in these directories, are OS and CLOUD independent.

Scripts, functions, and files that are OS or CLOUD dependent, or that are common to several templates, are located in the general directory.

general
├───api – all available API
├───conf – DataLab configuration
├───files – OS/Cloud dependent files
├───lib – OS/Cloud dependent functions
├───scripts – OS/Cloud dependent Python scripts
└───templates – OS/Cloud dependent templates

These directories may contain differentiation by operating system (Debian/RedHat) or cloud provider (AWS).

Directories of templates (SSN, Edge etc.) contain only scripts that are OS and CLOUD independent.

If a script or function is OS or CLOUD dependent, it should be located in the appropriate directory/library in the general folder.

The following table describes the most commonly used scripts:

Script name/Path | Description
Dockerfile | Used for building Docker images and represents which Python scripts, templates and other files are needed. Required for each template.
base/entrypoint.py | This file is executed by Docker. It is responsible for setting environment variables, which are passed from Docker, and for executing appropriate actions (script in general/api/).
base/scripts/*.py | Scripts, which are OS independent and are used in each template.
general/api/*.py | API scripts, which execute appropriate function from fabfile.py.
template_name/fabfile.py | The main file for a template; contains all functions, which can be used as template actions.
template_name/scripts/*.py | Python scripts, which are used for template. They are OS and CLOUD independent.
general/lib/aws/*.py | Contains all functions related to AWS.
general/lib/os/ | This directory is divided by type of OS. All OS dependent functions are located here.
general/lib/os/fab.py | Contains OS independent functions used for multiple templates.
general/scripts/ | Directory is divided by type of Cloud provider and OS.
general/scripts/aws/*.py | Scripts, which are executed from fabfiles and are AWS-specific. The first part of the file name defines the template to which the script is related. For example: common_*.py – can be executed from more than one template; ssn_*.py – are used for SSN template; edge_*.py – are used for Edge template.
general/scripts/os/*.py | Scripts, which are OS independent and can be executed from more than one template.
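
The file name prefix convention for general/scripts/aws/*.py can be illustrated with a small Python sketch. The function template_for_script and the "shared" label are hypothetical names used only for this illustration.

```python
def template_for_script(file_name: str) -> str:
    """Map a script file name to the template named by its prefix."""
    prefix = file_name.split("_", 1)[0]
    # common_*.py scripts can be executed from more than one template.
    return "shared" if prefix == "common" else prefix


for name in ("ssn_prepare.py", "edge_configure.py", "common_prepare_notebook.py"):
    print(name, "->", template_for_script(name))
```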

Docker actions overview

Available Docker images and their actions:

Docker image | Actions
ssn | create, terminate
edge | create, terminate, status, start, stop, recreate
jupyter/rstudio/zeppelin/tensor/deeplearning | create, terminate, start, stop, configure, list_libs, install_libs, git_creds
dataengine/dataengine-service | create, terminate
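
The table above can also be expressed as a lookup structure, for example to check an action before passing it to a container. This is only a sketch; the names ACTIONS and is_supported are hypothetical and do not come from the DataLab sources.

```python
NOTEBOOK_IMAGES = ("jupyter", "rstudio", "zeppelin", "tensor", "deeplearning")
NOTEBOOK_ACTIONS = frozenset({"create", "terminate", "start", "stop", "configure",
                              "list_libs", "install_libs", "git_creds"})

# Actions per Docker image, transcribed from the table above.
ACTIONS = {
    "ssn": frozenset({"create", "terminate"}),
    "edge": frozenset({"create", "terminate", "status", "start", "stop", "recreate"}),
    "dataengine": frozenset({"create", "terminate"}),
    "dataengine-service": frozenset({"create", "terminate"}),
}
ACTIONS.update({image: NOTEBOOK_ACTIONS for image in NOTEBOOK_IMAGES})


def is_supported(image: str, action: str) -> bool:
    """Return True if the table lists the action for the image."""
    return action in ACTIONS.get(image, frozenset())


print(is_supported("edge", "recreate"))
print(is_supported("ssn", "stop"))
```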

Docker and Python execution workflow on the example of the SSN node

  • Docker command for building images docker.datalab-base and docker.datalab-ssn:
sudo docker build --build-arg OS=debian  --file general/files/aws/base_Dockerfile -t docker.datalab-base . ;
sudo docker build --build-arg OS=debian  --file general/files/aws/ssn_Dockerfile -t docker.datalab-ssn . ;


Example of SSN Docker file:

FROM docker.datalab-base:latest

ARG OS

COPY ssn/ /root/
COPY general/scripts/aws/ssn_* /root/scripts/
COPY general/lib/os/${OS}/ssn_lib.py /usr/lib/python2.7/datalab/ssn_lib.py
COPY general/files/aws/ssn_policy.json /root/files/
COPY general/templates/aws/jenkins_jobs /root/templates/jenkins_jobs

RUN chmod a+x /root/fabfile.py; \
    chmod a+x /root/scripts/*

RUN mkdir /project_tree
COPY . /project_tree

Using this Docker file, all required scripts and files are copied to the Docker image.

  • Docker command for building SSN:
docker run -i -v /root/KEYNAME.pem:/root/keys/KEYNAME.pem -v /web_app:/root/web_app -e "conf_os_family=debian" -e "conf_cloud_provider=aws" -e "conf_resource=ssn" -e "aws_ssn_instance_size=t2.medium" -e "aws_region=us-west-2" -e "aws_vpc_id=vpc-111111" -e "aws_subnet_id=subnet-111111" -e "aws_security_groups_ids=sg-11111,sg-22222,sg-33333" -e "conf_key_name=KEYNAME" -e "conf_service_base_name=datalab_test" -e "aws_access_key=Access_Key_ID" -e "aws_secret_access_key=Secret_Access_Key" -e "conf_tag_resource_id=datalab" docker.datalab-ssn --action create ;
  • Docker executes the entrypoint.py script with action create. entrypoint.py sets the environment variables that were passed in from Docker and executes the general/api/create.py script:
 elif args.action == 'create':
        with hide('running'):
            local("/bin/create.py")
  • general/api/create.py will execute Fabric command with run action:
  try:
        local('cd /root; fab run')
  • Function run() in file ssn/fabfile.py will be executed. It will run two scripts general/scripts/aws/ssn_prepare.py and general/scripts/aws/ssn_configure.py:
    try:
        local("~/scripts/{}.py".format('ssn_prepare'))
    except Exception as err:
        traceback.print_exc()
        append_result("Failed preparing SSN node. Exception: " + str(err))
        sys.exit(1)

    try:
        local("~/scripts/{}.py".format('ssn_configure'))
    except Exception as err:
        traceback.print_exc()
        append_result("Failed configuring SSN node. Exception: " + str(err))
        sys.exit(1)


  • The scripts general/scripts/<cloud_provider>/ssn_prepare.py and general/scripts/<cloud_provider>/ssn_configure.py will execute other Python scripts/functions for:
  1. ssn_prepare.py:
     1. Creating configuration file (for AWS)
     2. Creating Cloud resources
  2. ssn_configure.py:
     1. Installing prerequisites
     2. Installing required packages
     3. Configuring Docker
     4. Configuring DataLab Web UI
  • If all scripts/functions are executed successfully, the Docker container will stop and the SSN node will be created.
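
The "run each script in order and stop at the first failure" pattern used by run() can be sketched independently of Fabric. In the sketch below, run_steps and the execute callback are hypothetical stand-ins: the real fabfile calls Fabric's local() and reports errors through append_result, which are not reproduced here.

```python
from typing import Callable, List, Tuple

# Script names and failure messages taken from the SSN fabfile excerpt above.
SSN_STEPS = [
    ("ssn_prepare", "Failed preparing SSN node."),
    ("ssn_configure", "Failed configuring SSN node."),
]


def run_steps(steps: List[Tuple[str, str]],
              execute: Callable[[str], None]) -> Tuple[bool, str]:
    """Run scripts in order; return (False, message) at the first failure."""
    for script, failure_message in steps:
        try:
            execute("~/scripts/{}.py".format(script))
        except Exception as err:
            return False, "{} Exception: {}".format(failure_message, err)
    return True, ""


# Usage: record the commands instead of executing them.
recorded: List[str] = []
print(run_steps(SSN_STEPS, recorded.append), recorded)
```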

Example of Docker commands

SSN:

docker run -i -v <key_path><key_name>.pem:/root/keys/<key_name>.pem -e "region=<region>" -e "conf_service_base_name=<Infrastructure_Tag>" -e "conf_resource=ssn" -e "aws_access_key=<Access_Key_ID>" -e "aws_secret_access_key=<Secret_Access_Key>" docker.datalab-ssn --action <action>

All parameters are listed in the "Self-Service Node" section.

Other images:

docker run -i -v /home/<user>/keys:/root/keys -v /opt/datalab/tmp/result:/response -v /var/opt/datalab/log/<image>:/logs/<image> -e <variable1> -e <variable2> docker.datalab-<image> --action <action>
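
When such commands are assembled programmatically, building them as an argument list avoids quoting mistakes such as a typographic dash in place of -v or -e. The helper docker_run_command below is a hypothetical sketch; the volume paths and variable values in the usage lines are placeholders.

```python
import shlex
from typing import Dict, List


def docker_run_command(image: str, action: str,
                       volumes: Dict[str, str], env: Dict[str, str]) -> List[str]:
    """Build a `docker run` argument list in the form shown above."""
    command = ["docker", "run", "-i"]
    for host_path, container_path in volumes.items():
        command += ["-v", "{}:{}".format(host_path, container_path)]
    for name, value in env.items():
        command += ["-e", "{}={}".format(name, value)]
    command += ["docker.datalab-{}".format(image), "--action", action]
    return command


# Usage with placeholder values.
example = docker_run_command(
    "jupyter", "create",
    volumes={"/home/user/keys": "/root/keys"},
    env={"conf_resource": "notebook"},
)
print(" ".join(shlex.quote(part) for part in example))
```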

How to add a new template

First of all, a new directory should be created in infrastructure-provisioning/src/.

For example: infrastructure-provisioning/src/my-tool/

The following scripts/directories are required to be created in the template directory:

my-tool
├───scripts
└───fabfile.py

fabfile.py – the main script, which contains main functions for this template such as run, stop, terminate, etc.

Here is an example of the run() function for a Jupyter Notebook node:

Path: infrastructure-provisioning/src/jupyter/fabfile.py

def run():
    local_log_filename = "{}_{}_{}.log".format(os.environ['conf_resource'], os.environ['edge_user_name'], os.environ['request_id'])
    local_log_filepath = "/logs/" + os.environ['conf_resource'] + "/" + local_log_filename
    logging.basicConfig(format='%(levelname)-8s [%(asctime)s]  %(message)s',
                        level=logging.DEBUG,
                        filename=local_log_filepath)

    notebook_config = dict()
    notebook_config['uuid'] = str(uuid.uuid4())[:5]

    try:
        params = "--uuid {}".format(notebook_config['uuid'])
        local("~/scripts/{}.py {}".format('common_prepare_notebook', params))
    except Exception as err:
        traceback.print_exc()
        append_result("Failed preparing Notebook node.", str(err))
        sys.exit(1)

    try:
        params = "--uuid {}".format(notebook_config['uuid'])
        local("~/scripts/{}.py {}".format('jupyter_configure', params))
    except Exception as err:
        traceback.print_exc()
        append_result("Failed configuring Notebook node.", str(err))
        sys.exit(1)

This function describes the process of creating a Jupyter node. It is divided into two parts: prepare and configure. The prepare part is common to all notebook templates and is responsible for creating the necessary cloud resources, such as EC2 instances. The configure part describes how the appropriate services are installed.

To configure Jupyter node, the script jupyter_configure.py is executed. This script describes steps for configuring Jupyter node. In each step, the appropriate Python script is executed.

For example:

Path: infrastructure-provisioning/src/general/scripts/aws/jupyter_configure.py

    try:
        logging.info('[CONFIGURE JUPYTER NOTEBOOK INSTANCE]')
        print('[CONFIGURE JUPYTER NOTEBOOK INSTANCE]')
        params = "--hostname {} --keyfile {} --region {} --spark_version {} --hadoop_version {} --os_user {} --scala_version {}".\
            format(instance_hostname, keyfile_name, os.environ['aws_region'], os.environ['notebook_spark_version'],
                   os.environ['notebook_hadoop_version'], os.environ['conf_os_user'],
                   os.environ['notebook_scala_version'])
        try:
            local("~/scripts/{}.py {}".format('configure_jupyter_node', params))

In this step, the script infrastructure-provisioning/src/jupyter/scripts/configure_jupyter_node.py will be executed.

Example of script infrastructure-provisioning/src/jupyter/scripts/configure_jupyter_node.py:

if __name__ == "__main__":
    print("Configure connections")
    env['connection_attempts'] = 100
    env.key_filename = [args.keyfile]
    env.host_string = args.os_user + '@' + args.hostname

    print("Configuring notebook server.")
    try:
        if not exists('/home/' + args.os_user + '/.ensure_dir'):
            sudo('mkdir /home/' + args.os_user + '/.ensure_dir')
    except:
        sys.exit(1)

    print("Mount additional volume")
    prepare_disk(args.os_user)

    print("Install Java")
    ensure_jre_jdk(args.os_user)

This script calls functions for configuring the Jupyter node. If a function is OS dependent, it is placed in infrastructure-provisioning/src/general/lib/os/<OS_family>/notebook_lib.py (for example, <OS_family> is debian).

All functions in template directory (e.g. infrastructure-provisioning/src/my-tool/) should be OS and cloud independent.

All OS or cloud dependent functions should be placed in infrastructure-provisioning/src/general/lib/ directory.

The following steps are required for each Notebook node:

  • Configure proxy on Notebook instance – the script infrastructure-provisioning/src/general/scripts/os/notebook_configure_proxy.py
  • Installing user’s key – the script infrastructure-provisioning/src/base/scripts/install_user_key.py

Other scripts responsible for configuring the Jupyter node are placed in infrastructure-provisioning/src/jupyter/scripts/

  • scripts directory – contains all required configuration scripts.

  • infrastructure-provisioning/src/general/files/<cloud_provider>/my-tool_Dockerfile – used for building template Docker image and describes which files, scripts, templates are required and will be copied to template Docker image.

  • infrastructure-provisioning/src/general/files/<cloud_provider>/my-tool_description.json – JSON file for DataLab Web UI. In this file you can specify:

  • exploratory_environment_shapes – list of EC2 shapes
  • exploratory_environment_versions – description of template

Example of this file for a Jupyter node (the values shown are for the Azure cloud):

{
  "exploratory_environment_shapes" :
  {
    "For testing" : [
      {"Size": "S", "Description": "Standard_DS1_v2", "Type": "Standard_DS1_v2","Ram": "3.5 GB","Cpu": "1", "Spot": "true", "SpotPctPrice": "70"}
    ],
    "Memory optimized" : [
      {"Size": "S", "Description": "Standard_E4s_v3", "Type": "Standard_E4s_v3","Ram": "32 GB","Cpu": "4"},
      {"Size": "M", "Description": "Standard_E16s_v3", "Type": "Standard_E16s_v3","Ram": "128 GB","Cpu": "16"},
      {"Size": "L", "Description": "Standard_E32s_v3", "Type": "Standard_E32s_v3","Ram": "256 GB","Cpu": "32"}
    ],
    "Compute optimized": [
      {"Size": "S", "Description": "Standard_F2s", "Type": "Standard_F2s","Ram": "4 GB","Cpu": "2"},
      {"Size": "M", "Description": "Standard_F8s", "Type": "Standard_F8s","Ram": "16.0 GB","Cpu": "8"},
      {"Size": "L", "Description": "Standard_F16s", "Type": "Standard_F16s","Ram": "32.0 GB","Cpu": "16"}
    ]
  },
  "exploratory_environment_versions" :
  [
    {
      "template_name": "Jupyter notebook 5.7.4",
      "description": "Base image with jupyter node creation routines",
      "environment_type": "exploratory",
      "version": "jupyter_notebook-5.7.4",
      "vendor": "Azure"
    }
  ]
}
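
To check a description file before using it, it can be loaded with Python's standard json module. The sketch below uses a shortened copy of the example above; the helper name shape_types is hypothetical.

```python
import json

# Shortened copy of the example description file shown above.
SAMPLE = json.loads("""
{
  "exploratory_environment_shapes": {
    "For testing": [
      {"Size": "S", "Description": "Standard_DS1_v2", "Type": "Standard_DS1_v2",
       "Ram": "3.5 GB", "Cpu": "1", "Spot": "true", "SpotPctPrice": "70"}
    ],
    "Compute optimized": [
      {"Size": "S", "Description": "Standard_F2s", "Type": "Standard_F2s",
       "Ram": "4 GB", "Cpu": "2"},
      {"Size": "M", "Description": "Standard_F8s", "Type": "Standard_F8s",
       "Ram": "16.0 GB", "Cpu": "8"}
    ]
  },
  "exploratory_environment_versions": [
    {"template_name": "Jupyter notebook 5.7.4", "version": "jupyter_notebook-5.7.4"}
  ]
}
""")


def shape_types(description: dict) -> dict:
    """Return the instance types listed under each shape category."""
    shapes = description["exploratory_environment_shapes"]
    return {category: [shape["Type"] for shape in items]
            for category, items in shapes.items()}


print(shape_types(SAMPLE))
```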

Additionally, following directories could be created:

  • templates – directory for new templates;

  • files – directory for files used by newly added templates only;

All Docker images are built while the SSN node is being created. To add a newly created template, add it to the list of images in the following script:

Path: infrastructure-provisioning/src/general/scripts/aws/ssn_configure.py

    try:
        logging.info('[CONFIGURING DOCKER AT SSN INSTANCE]')
        print('[CONFIGURING DOCKER AT SSN INSTANCE]')
        additional_config = [{"name": "base", "tag": "latest"},
                             {"name": "edge", "tag": "latest"},
                             {"name": "jupyter", "tag": "latest"},
                             {"name": "rstudio", "tag": "latest"},
                             {"name": "zeppelin", "tag": "latest"},
                             {"name": "tensor", "tag": "latest"},
                             {"name": "emr", "tag": "latest"}]

For example:

...

   {"name": "my-tool", "tag": "latest"},

...
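
The effect of this change on the image list can be sketched as follows. The helper with_template is hypothetical and only illustrates the resulting data; in ssn_configure.py the entry is added by editing the list literal directly.

```python
from typing import Dict, List


def with_template(images: List[Dict[str, str]], name: str,
                  tag: str = "latest") -> List[Dict[str, str]]:
    """Return a copy of the image list that includes the given template once."""
    if any(image["name"] == name for image in images):
        return list(images)
    return list(images) + [{"name": name, "tag": tag}]


additional_config = [{"name": "base", "tag": "latest"},
                     {"name": "edge", "tag": "latest"},
                     {"name": "jupyter", "tag": "latest"}]
print(with_template(additional_config, "my-tool"))
```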

Azure OAuth2 Authentication

...


DataLab supports OAuth2 authentication, which is configured automatically in Security Service and Self Service after DataLab deployment. The configuration parameters for Self Service and Security Service are explained below. DataLab supports client credentials (username + password) and authorization code flow for authentication.

Azure OAuth2 Self Service configuration

azureLoginConfiguration:    
    tenant: xxxx-xxxx-xxxx-xxxx
    authority: https://login.microsoftonline.com/
    clientId: xxxx-xxxx-xxxx-xxxx
    redirectUrl: https://datalab.azure.cloudapp.azure.com/
    responseMode: query
    prompt: consent
    silent: true
    loginPage: https://datalab.azure.cloudapp.azure.com/
    maxSessionDurabilityMilliseconds: 288000000

where:

  • tenant - tenant id of your company
  • authority - Microsoft login endpoint
  • clientId - id of the application that users log in through
  • redirectUrl - URL to which the user is redirected back to the DataLab application after attempting to log in to Azure using OAuth2
  • responseMode - defines how Azure sends the authorization code or error information to DataLab during the login procedure
  • prompt - defines the kind of prompt shown during OAuth2 login
  • silent - defines whether DataLab tries to log in the user without interaction (true/false); if false, DataLab tries to log in the user with the configured prompt
  • loginPage - start page of the DataLab application
  • maxSessionDurabilityMilliseconds - maximum user session duration. The user will be asked to log in again after this period of time and when he/she creates or starts a notebook/cluster. This is needed to update the refresh_token that notebooks use to access Data Lake Store

For more information about the responseMode and prompt parameters, see "Authorize access to web applications using OAuth 2.0 and Azure Active Directory".
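
To show how these parameters relate to an authorization code request, here is a Python sketch that builds an authorize URL in the format of the Azure Active Directory v1 endpoint. How Self Service actually builds the request is not shown in this document, so the function authorize_url and the URL layout are assumptions for illustration only; session_hours simply converts the configured session duration to hours.

```python
from urllib.parse import urlencode

# Values copied from the example configuration above.
CONFIG = {
    "tenant": "xxxx-xxxx-xxxx-xxxx",
    "authority": "https://login.microsoftonline.com/",
    "clientId": "xxxx-xxxx-xxxx-xxxx",
    "redirectUrl": "https://datalab.azure.cloudapp.azure.com/",
    "responseMode": "query",
    "prompt": "consent",
    "maxSessionDurabilityMilliseconds": 288000000,
}


def authorize_url(config: dict) -> str:
    """Build an authorization code request URL (assumed Azure AD v1 layout)."""
    base = "{}{}/oauth2/authorize".format(config["authority"], config["tenant"])
    query = urlencode({
        "client_id": config["clientId"],
        "response_type": "code",
        "redirect_uri": config["redirectUrl"],
        "response_mode": config["responseMode"],
        "prompt": config["prompt"],
    })
    return base + "?" + query


def session_hours(config: dict) -> float:
    """Convert the maximum session duration from milliseconds to hours."""
    return config["maxSessionDurabilityMilliseconds"] / 3_600_000


print(authorize_url(CONFIG))
print(session_hours(CONFIG))
```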

Azure OAuth2 Security Service configuration

azureLoginConfiguration:    
    tenant: xxxx-xxxx-xxxx-xxxx
    authority: https://login.microsoftonline.com/
    clientId: xxxx-xxxx-xxxx-xxxx
    redirectUrl: https://datalab.azure.cloudapp.azure.com/
    validatePermissionScope: true
    permissionScope: subscriptions/xxxx-xxxx-xxxx-xxxx/resourceGroups/xxxx-xxxx/providers/Microsoft.DataLakeStore/accounts/xxxx/providers/Microsoft.Authorization/
    managementApiAuthFile: /datalab/keys/azure_authentication.json

where:

  • tenant - tenant id of your company
  • authority - Microsoft login endpoint
  • clientId - id of the application that users log in through
  • redirectUrl - URL to which the user is redirected back to the DataLab application after attempting to log in to Azure using OAuth2
  • validatePermissionScope - defines (true/false) whether the user's permissions should be validated against the resource provided in the permissionScope parameter. The user will be logged in only if he/she has any role in the IAM of the resource described by the permissionScope parameter
  • permissionScope - describes the Azure resource in which the user should have any role to pass authentication. If the user has no role in the resource IAM, he/she will not be logged in
  • managementApiAuthFile - authentication file that is used to query Microsoft Graph API to check user roles in the resource described in permissionScope
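
The login decision controlled by validatePermissionScope can be summarized in a short Python sketch. The function is_login_allowed and its arguments are hypothetical; the sketch assumes the user has already authenticated successfully with Azure and that the user's roles in the permissionScope resource have already been retrieved.

```python
from typing import Sequence


def is_login_allowed(validate_permission_scope: bool,
                     roles_in_scope: Sequence[str]) -> bool:
    """Decide whether an already authenticated user may log in."""
    if not validate_permission_scope:
        # Validation is switched off: no role check is performed.
        return True
    # Validation is on: any role in the permissionScope resource is enough.
    return len(roles_in_scope) > 0


print(is_login_allowed(True, ["Reader"]))
print(is_login_allowed(True, []))
```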