DataLab services can be run in development mode. This mode emulates real work and does not create any resources in the cloud provider environment.

Folder structure

  • infrastructure-provisioning – code of the infrastructure-provisioning module;
  • services – back-end services source code;
    • billing – billing module (AWS cloud provider only);
    • common – reusable code shared by all services;
    • provisioning-service – Provisioning Service;
    • security-service – Security Service;
    • self-service – Self-Service and UI;
    • settings – global settings that are stored in the Mongo database in development mode.


To start developing the front-end Web UI of DataLab, clone the Git repository and install the following packages:

  • Git 1.7 or higher
  • Python 2.7 with the Fabric v1.14.0 library
  • Docker 1.12 (for infrastructure provisioning)

Java back-end services

Java components description


Common is a module that provides a set of reusable code shared across the services. The commonly reused functionality is as follows:

  1. Models
  2. REST client
  3. Mongo persistence DAO
  4. Security models and DAO


Self-Service provides REST-based APIs. It interacts tightly with the Provisioning Service and the Security Service and delegates most of the user's requests to them for execution.

The main API classes, their supported actions, and their purposes:

  • BillingResource (Get billing invoice; Export billing invoice in CSV file) – provides billing information.
  • ComputationalResource (Configuration limits) – used for computational resources management.
  • EdgeResource – used to manage the EDGE node.
  • ExploratoryResource – used for exploratory environment management.
  • GitCredsResource (Update credentials; Get credentials) – used for managing Git credentials of exploratory environments.
  • InfrastructureInfoResource (Get info of environment; Get status of environment) – used for obtaining statuses and additional information about provisioned resources.
  • InfrastructureTemplatesResource (Get computational resources templates; Get exploratory environment templates) – used for getting exploratory/computational templates.
  • KeyUploaderResource (Check key; Upload key) – used for Gateway/EDGE node public key upload and further storing of this information in Mongo DB.
  • LibExploratoryResource (Lib groups; Lib list; Lib search; Lib install) – used for managing libraries of exploratory environments.
  • SecurityResource – user's authentication.
  • UserSettingsResource (Get settings; Save settings) – user's preferences.

Some class names have suffixes such as Aws or Azure (e.g. ComputationalResourceAws, ComputationalResourceAzure). Such a class is cloud-specific and provides the corresponding API for that cloud provider.
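Because these resources are plain REST endpoints, they can be exercised with any HTTP client once a user token is obtained. Below is a minimal illustrative sketch in Python; the endpoint path and the bearer-token header format are assumptions for illustration, not the confirmed API contract:

import requests

BASE_URL = 'https://localhost:8443'  # local Self-Service instance

def get_environment_status(access_token):
    # Hypothetical status endpoint; consult the actual resource classes
    # for the real paths.
    response = requests.get(
        BASE_URL + '/api/infrastructure/status',
        headers={'Authorization': 'Bearer ' + access_token},
        verify=False)  # self-signed certificate in local development
    response.raise_for_status()
    return response.json()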

Provisioning Service

The Provisioning Service is a key REST-based service for managing cloud-specific or Docker-based environment resources such as computational, exploratory, and edge resources.

The main API classes, their supported actions, and their purposes:

  • ComputationalResource – Docker actions for computational resources management.
  • DockerResource (Get Docker image; Run Docker image) – requests and describes Docker images and templates.
  • EdgeResource – provides Docker actions for EDGE node management.
  • ExploratoryResource – provides Docker actions for exploratory environment management.
  • GitExploratoryResource (Update git creds) – Docker actions to provision Git credentials to running notebooks.
  • InfrastructureResource (Status) – Docker action for obtaining the status of DataLab infrastructure instances.
  • LibExploratoryResource (Lib list; Install lib) – Docker actions to install libraries on notebooks.

Some class names have suffixes such as Aws or Azure (e.g. ComputationalResourceAws, ComputationalResourceAzure). Such a class is cloud-specific and provides the corresponding API for that cloud provider.

Security service

The Security Service is a REST-based service for user authentication against AWS or Azure OAuth2, depending on the module configuration and cloud provider.

DataLab provides an OAuth2 (client credentials and authorization code flow) authorization mechanism for Azure users. This kind of authentication is required when you are going to use Data Lake. If Data Lake is enabled, the default permission scope (which can be configured manually after DataLab is deployed) is the Data Lake Store account, so a user is allowed to log in only if he/she has a role within the Data Lake Store account resource. If Data Lake is disabled but Azure OAuth2 is in use, the default permission scope is the resource group where DataLab is created, and only users who have a role in that resource group are allowed to log in.


Front-end components description

Web UI sources are part of Self-Service.

Sources are located in datalab/services/self-service/src/main/resources/webapp

Main pages and their components and services:

  • Login page – LoginComponent:
    • applicationSecurityService handles HTTP calls, stores authentication tokens on the client, and attaches the token to authenticated calls;
    • healthStatusService and appRoutingService check instance states and redirect to the appropriate page.
  • Home page (list of resources) – HomeComponent:
    • nests several main components, such as ResourcesGrid for notebook data rendering and filtering, using the custom MultiSelectDropdown component;
    • multiple modal dialog components are used for creating new instances, displaying detailed info, and confirming actions.
  • Health Status page – HealthStatusComponent:
    • HealthStatusGridComponent displays the list of instances, their types, statuses and IDs, and uses healthStatusService for handling the main actions.
  • Help pages – static pages that contain information and instructions on how to access the Notebook Server and generate an SSH key pair; include only NavbarComponent.
  • Error page – a simple static page letting users know that the opened page does not exist; includes only NavbarComponent.
  • Reporting page – ReportingComponent:
    • ReportingGridComponent displays detailed billing info with built-in filtering and a DateRangePicker component for custom range filtering;
    • uses BillingReportService for handling the main actions and exports report data to a .csv file.

How to set up a local development environment

The development environment setup description assumes that the user has already installed Java 8 (JDK) and Maven 3 and has set the environment variables (JAVA_HOME, M2_HOME). The description covers Mongo installation, Mongo user creation, loading initial data into Mongo, and Node.js installation.

Install Mongo database

  • Create users in the admin database and in the DataLab database:

use admin
db.createUser(
  {
    user: "admin",
    pwd: "<password>",
    roles: [ { role: "dbAdminAnyDatabase", db: "admin" },
             { role: "userAdminAnyDatabase", db: "admin" },
             { role: "readWriteAnyDatabase", db: "admin" } ]
  }
)

use <database_name>
db.createUser(
  {
    user: "admin",
    pwd: "<password>",
    roles: [ "dbAdmin", "userAdmin", "readWrite" ]
  }
)
  • Load the collections from the file datalab/services/settings/(aws|azure)/mongo_settings.json:
mongoimport -u admin -p <password> -d <database_name> -c settings mongo_settings.json
  • Load the collections from the file datalab/infrastructure-provisioning/src/ssn/files/mongo_roles.json:
mongoimport -u admin -p <password> -d <database_name> --jsonArray -c roles mongo_roles.json

If these commands don't work for you, use a UI client instead (e.g. MongoDB Compass).
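To double-check the import, the collections can also be inspected from Python with pymongo (a minimal sketch; substitute your password and database name):

from pymongo import MongoClient

# Connect with the admin user created above.
client = MongoClient('mongodb://admin:<password>@localhost:27017/<database_name>')
db = client['<database_name>']

# Both collections should be non-empty after the mongoimport commands.
print(db['settings'].count_documents({}))
print(db['roles'].count_documents({}))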

Setting up environment options

  • Set the option CLOUD_TYPE to aws/azure, set DEV_MODE to true, and set the Mongo database name and password in the configuration file datalab/infrastructure-provisioning/src/ssn/templates/ssn.yml:
<#assign CLOUD_TYPE="aws">
<#assign DEV_MODE="true">
  database: <database_name>
  password: <password>
  • Add the system environment variable DATALAB_CONF_DIR=<datalab_root_folder>/datalab/infrastructure-provisioning/src/ssn/templates, or create two symlinks to the file datalab/infrastructure-provisioning/src/ssn/templates/ssn.yml in the datalab/services/provisioning-service and datalab/services/self-service folders.


For Unix:

ln -s ../../infrastructure-provisioning/src/ssn/templates/ssn.yml ssn.yml

For Windows:

mklink ssn.yml ..\..\infrastructure-provisioning\src\ssn\templates\ssn.yml
  • For Unix systems, create two working folders for the services and grant write permissions on them, as sketched below.
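A minimal sketch of that step in Python, assuming the folders are the ones mounted into the Docker containers later in this document (run with sufficient privileges):

import os

# Assumed service folders; adjust if your layout differs.
for path in ('/var/opt/datalab/log/ssn', '/opt/datalab/tmp/result'):
    if not os.path.isdir(path):
        os.makedirs(path)
    os.chmod(path, 0o777)  # grant write permission to the services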

Install Node.js

  • Download Node.js from the official website
  • Install Node.js
  • Make sure that the installation folder of Node.js has been added to the system environment variable PATH
  • Update npm to the latest version:
npm install npm@latest -g

Build Web UI components

  • Change to the folder datalab/services/self-service/src/main/resources/webapp and install the dependencies from the package.json manifest:
npm install
  • Replace the CLOUD_PROVIDER option with aws or azure in the dictionary file datalab/services/self-service/src/main/resources/webapp/src/dictionary/global.dictionary.ts:
import { NAMING_CONVENTION } from './(aws|azure).dictionary';

export * from './(aws|azure).dictionary';
  • Build the web application:
npm run

Prepare HTTPS prerequisites

To enable an SSL connection, the web server should have a digital certificate. To create a server certificate, follow these steps:

  • Create the keystore.

  • Export the certificate from the keystore.

  • Sign the certificate.

  • Import the certificate into a truststore: a repository of certificates used for verifying the certificates. A truststore typically contains more than one certificate.

Please find below the sets of commands for creating a certificate, depending on the OS.

Create Unix/Ubuntu server certificate

Pay attention that the last command has to be executed with administrative permissions.

keytool -genkeypair -alias ssn -keyalg RSA -storepass KEYSTORE_PASSWORD -keypass KEYSTORE_PASSWORD -keystore ~/keys/ssn.keystore.jks -keysize 2048 -dname "CN=localhost"
keytool -exportcert -alias ssn -storepass KEYSTORE_PASSWORD -file ~/keys/ssn.crt -keystore ~/keys/ssn.keystore.jks
sudo keytool -importcert -trustcacerts -alias ssn -file ~/keys/ssn.crt -noprompt -storepass changeit -keystore ${JRE_HOME}/lib/security/cacerts

Create Windows server certificate

Pay attention that the last command has to be executed with administrative permissions. To achieve this the command line (cmd) should be run with administrative permissions.

"%JRE_HOME%\bin\keytool" -genkeypair -alias dlab -keyalg RSA -storepass KEYSTORE_PASSWORD -keypass KEYSTORE_PASSWORD -keystore <DRIVE_LETTER>:\home\%USERNAME%\keys\dlab.keystore.jks -keysize 2048 -dname "CN=localhost"
"%JRE_HOME%\bin\keytool" -exportcert -alias dlab -storepass KEYSTORE_PASSWORD -file <DRIVE_LETTER>:\home\%USERNAME%\keys\dlab.crt -keystore <DRIVE_LETTER>:\home\%USERNAME%\keys\dlab.keystore.jks
"%JRE_HOME%\bin\keytool" -importcert -trustcacerts -alias dlab -file <DRIVE_LETTER>:\home\%USERNAME%\keys\dlab.crt -noprompt -storepass changeit -keystore "%JRE_HOME%\lib\security\cacerts"

Useful commands:
"%JRE_HOME%\bin\keytool" -list -alias dlab -storepass changeit -keystore "%JRE_HOME%\lib\security\cacerts"
"%JRE_HOME%\bin\keytool" -delete -alias dlab -storepass changeit -keystore "%JRE_HOME%\lib\security\cacerts"

Here <DRIVE_LETTER> must be the letter of the drive from which you run DataLab.

How to run locally

Self-Service and Provisioning Service can be run locally. All requests from the Provisioning Service to Docker are mocked, and instance creation statuses are persisted to Mongo (without any real impact on Docker or AWS).

Both services, Self-Service and Provisioning Service, depend on the datalab/provisioning-infrastructure/ssn/templates/ssn.yml configuration file. Each service has a main function as its entry point: SelfServiceApplication for Self-Service and ProvisioningServiceApplication for Provisioning Service. The services can be started by running the main methods of these classes. Both main functions require two arguments:

  • Run mode (“server”)
  • Configuration file name (“self-service.yml” or “provisioning.yml”, depending on the service). Both files are located in the root directory of the corresponding service. These configuration files contain the service settings and are ready to use.

The service start-up order matters: since Self-Service depends on Provisioning Service, the latter should be started first and Self-Service afterwards. The services can be started from a local IDE (Eclipse or IntelliJ IDEA) using its “Run” functionality.

The application run flow is as follows:

  • Create and run provisioning-service configuration:
    • Create Application with name provisioning-service-application

      • Main class: com.epam.datalab.backendapi.ProvisioningServiceApplication
      • VM options: -Ddocker.dir=[PATH_TO_PROJECT_DIR]\infrastructure-provisioning\src\general\files\gcp
      • Program arguments : server [PATH_TO_PROJECT_DIR]\services\provisioning-service\provisioning.yml
      • Working directory:[PATH_TO_PROJECT_DIR]
      • Use classpath of module: provisioning-service
      • PAY ATTENTION: the JRE should be the same one into which the server certificate was imported
    • Create and run self-service configuration:

      • Create Application with name self-service-application
        • Main class: com.epam.datalab.backendapi.SelfServiceApplication
        • Program arguments : server [PATH_TO_PROJECT_DIR]/services/self-service/self-service.yml
        • Working directory:[PATH_TO_PROJECT_DIR]
        • Use classpath of module: self-service
        • PAY ATTENTION: the JRE should be the same one into which the server certificate was imported
  • Try to access the Self-Service Web UI at https://localhost:8443
User: test
Password: <any>
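Once both services are running, the HTTPS endpoint can also be smoke-tested from Python. The sketch below assumes the certificate created earlier in ~/keys/ssn.crt; since keytool exports DER by default, the certificate is converted to PEM before requests can use it:

import os
import ssl
import requests

# keytool -exportcert writes a DER-encoded certificate; requests expects PEM.
der = open(os.path.expanduser('~/keys/ssn.crt'), 'rb').read()
pem_path = os.path.expanduser('~/keys/ssn.pem')
with open(pem_path, 'w') as pem_file:
    pem_file.write(ssl.DER_cert_to_PEM_cert(der))

# The Web UI should answer over HTTPS with the locally trusted certificate.
response = requests.get('https://localhost:8443', verify=pem_path)
print(response.status_code)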

Infrastructure provisioning

DevOps components overview

The following list shows the common structure of the scripts for deploying DataLab.

Folder structure


Each directory except general contains Python scripts, Docker files, templates, and files for the appropriate Docker image.

  • base – Main Docker image. It is a common/base image for other ones.
  • edge – Docker image for Edge node.
  • dataengine – Docker image for dataengine cluster.
  • dataengine-service – Docker image for dataengine-service cluster.
  • general – OS and CLOUD dependent common source.
  • ssn – Docker image for Self-Service node (SSN).
  • jupyter/rstudio/zeppelin/tensor/deeplearning – Docker images for Notebook nodes.

All Python scripts, Docker files and other files, which are located in these directories, are OS and CLOUD independent.

Scripts, functions, and files that are OS or CLOUD dependent, or that are shared by several templates, are located in the general directory.

├───api – all available API
├───conf – DataLab configuration
├───files – OS/Cloud dependent files
├───lib – OS/Cloud dependent functions
├───scripts – OS/Cloud dependent Python scripts
└───templates – OS/Cloud dependent templates

These directories may be further divided by operating system (Debian/RedHat) or cloud provider (AWS).

Directories of templates (SSN, Edge etc.) contain only scripts, which are OS and CLOUD independent.

If script/function is OS or CLOUD dependent, it should be located in appropriate directory/library in general folder.

The following list describes the most frequently used scripts:

  • Dockerfile – used for building Docker images; defines which Python scripts, templates and other files are needed. Required for each template.
  • base/entrypoint.py – executed by Docker; responsible for setting the environment variables passed from Docker and for executing the appropriate action (a script in general/api/).
  • base/scripts/*.py – scripts that are OS independent and used in each template.
  • general/api/*.py – API scripts, which execute the appropriate function from the template's fabfile.py.
  • template_name/fabfile.py – the main file of a template; contains all functions that can be used as template actions.
  • template_name/scripts/*.py – Python scripts used by the template; they are OS and CLOUD independent.
  • general/lib/aws/*.py – contains all functions related to AWS.
  • general/lib/os/ – divided by OS type; all OS dependent functions are located here.
  • general/lib/os/fab.py – contains OS independent functions used by multiple templates.
  • general/scripts/ – divided by cloud provider and OS type.
  • general/scripts/aws/*.py – AWS-specific scripts executed from fabfiles. The first part of the file name defines which template the script relates to. For example:
    • common_*.py – can be executed from more than one template;
    • ssn_*.py – used by the SSN template;
    • edge_*.py – used by the Edge template.
  • general/scripts/os/*.py – scripts that are OS independent and can be executed from more than one template.

Docker actions overview

Available Docker images and their actions:

  • ssn – create, terminate
  • edge – create, terminate, status, start, stop, recreate
  • jupyter/rstudio/zeppelin/tensor/deeplearning – create, terminate, start, stop, configure, list_libs, install_libs, git_creds
  • dataengine/dataengine-service – create, terminate
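For validation purposes this matrix can be captured in a small lookup table, e.g. to reject unsupported actions before invoking Docker (a sketch mirroring the list above):

# Allowed actions per Docker image; rstudio/zeppelin/tensor/deeplearning
# share the jupyter action set.
DOCKER_ACTIONS = {
    'ssn': {'create', 'terminate'},
    'edge': {'create', 'terminate', 'status', 'start', 'stop', 'recreate'},
    'jupyter': {'create', 'terminate', 'start', 'stop', 'configure',
                'list_libs', 'install_libs', 'git_creds'},
    'dataengine': {'create', 'terminate'},
    'dataengine-service': {'create', 'terminate'},
}

def is_action_supported(image, action):
    return action in DOCKER_ACTIONS.get(image, set())

assert is_action_supported('edge', 'recreate')
assert not is_action_supported('ssn', 'stop')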
Docker and Python execution workflow, using the SSN node as an example
  • Docker commands for building the docker.datalab-base and docker.datalab-ssn images:
sudo docker build --build-arg OS=debian  --file general/files/aws/base_Dockerfile -t docker.datalab-base . ;
sudo docker build --build-arg OS=debian  --file general/files/aws/ssn_Dockerfile -t docker.datalab-ssn . ;

Example of SSN Docker file:

FROM docker.datalab-base:latest


COPY ssn/ /root/
COPY general/scripts/aws/ssn_* /root/scripts/
COPY general/lib/os/${OS}/ /usr/lib/python2.7/datalab/
COPY general/files/aws/ssn_policy.json /root/files/
COPY general/templates/aws/jenkins_jobs /root/templates/jenkins_jobs

RUN chmod a+x /root/; \
    chmod a+x /root/scripts/*

RUN mkdir /project_tree
COPY . /project_tree

Using this Docker file, all required scripts and files are copied into the Docker image.

  • Docker command for creating the SSN node:
docker run -i -v /root/KEYNAME.pem:/root/keys/KEYNAME.pem -v /web_app:/root/web_app -e "conf_os_family=debian" -e "conf_cloud_provider=aws" -e "conf_resource=ssn" -e "aws_ssn_instance_size=t2.medium" -e "aws_region=us-west-2" -e "aws_vpc_id=vpc-111111" -e "aws_subnet_id=subnet-111111" -e "aws_security_groups_ids=sg-11111,sg-22222,sg-33333" -e "conf_key_name=KEYNAME" -e "conf_service_base_name=datalab_test" -e "aws_access_key=Access_Key_ID" -e "aws_secret_access_key=Secret_Access_Key" -e "conf_tag_resource_id=datalab" docker.datalab-ssn --action create ;
  • Docker executes the entrypoint.py script with the action create. entrypoint.py sets the environment variables that were provided by Docker and executes the appropriate general/api/ script:
    elif args.action == 'create':
        with hide('running'):
            ...
  • The general/api/ script executes the Fabric command with the run action:
        local('cd /root; fab run')
  • The function run() in the file ssn/fabfile.py is then executed. It runs two scripts, general/scripts/aws/ssn_prepare.py and general/scripts/aws/ssn_configure.py:
    try:
        ...
    except Exception as err:
        append_result("Failed preparing SSN node. Exception: " + str(err))

    try:
        ...
    except Exception as err:
        append_result("Failed configuring SSN node. Exception: " + str(err))

  • The scripts general/scripts/<cloud_provider>/ssn_prepare.py and general/scripts/<cloud_provider>/ssn_configure.py execute other Python scripts/functions for:
  1. ssn_prepare.py: creating the configuration file (for AWS) and creating the cloud resources.
  2. ssn_configure.py: installing prerequisites, installing the required packages, configuring Docker, and configuring the DataLab Web UI.
  • If all scripts/functions are executed successfully, the Docker container stops and the SSN node is created.

Example of Docker commands


docker run -i -v <key_path><key_name>.pem:/root/keys/<key_name>.pem -e "region=<region>" -e "conf_service_base_name=<Infrastructure_Tag>" -e "conf_resource=ssn" -e "aws_access_key=<Access_Key_ID>" -e "aws_secret_access_key=<Secret_Access_Key>" docker.datalab-ssn --action <action>

All parameters are listed in the "Self-ServiceNode" section.

Other images:

docker run -i -v /home/<user>/keys:/root/keys -v /opt/datalab/tmp/result:/response -v /var/opt/datalab/log/<image>:/logs/<image> -e <variable1> -e <variable2> docker.datalab-<image> --action <action>
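Such invocations are easy to script, for example for smoke tests. The sketch below assembles the generic docker run command shown above; image, action, user, and the environment variables are placeholders supplied by the caller:

import subprocess

def run_docker_action(image, action, user, env_vars):
    # Assemble the generic 'docker run' command shown above.
    cmd = ['docker', 'run', '-i',
           '-v', '/home/{}/keys:/root/keys'.format(user),
           '-v', '/opt/datalab/tmp/result:/response',
           '-v', '/var/opt/datalab/log/{0}:/logs/{0}'.format(image)]
    for name, value in env_vars.items():
        cmd += ['-e', '{}={}'.format(name, value)]
    cmd += ['docker.datalab-{}'.format(image), '--action', action]
    return subprocess.call(cmd)

# Example: request the EDGE node status.
# run_docker_action('edge', 'status', 'ubuntu', {'conf_resource': 'edge'})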

How to add a new template

First of all, a new directory should be created in infrastructure-provisioning/src/.

For example: infrastructure-provisioning/src/my-tool/

The following scripts/directories are required to be created in the template directory:

└─── fabfile.py – the main script, which contains the main functions for this template, such as run, stop, terminate, etc.

Here is an example of the run() function for the Jupyter Notebook node:

Path: infrastructure-provisioning/src/jupyter/fabfile.py

def run():
    local_log_filename = "{}_{}_{}.log".format(os.environ['conf_resource'], os.environ['edge_user_name'], os.environ['request_id'])
    local_log_filepath = "/logs/" + os.environ['conf_resource'] + "/" + local_log_filename
    logging.basicConfig(format='%(levelname)-8s [%(asctime)s]  %(message)s',
                        level=logging.DEBUG,  # log level assumed
                        filename=local_log_filepath)

    notebook_config = dict()
    notebook_config['uuid'] = str(uuid.uuid4())[:5]

    try:
        params = "--uuid {}".format(notebook_config['uuid'])
        local("~/scripts/{}.py {}".format('common_prepare_notebook', params))
    except Exception as err:
        append_result("Failed preparing Notebook node.", str(err))

    try:
        params = "--uuid {}".format(notebook_config['uuid'])
        local("~/scripts/{}.py {}".format('jupyter_configure', params))
    except Exception as err:
        append_result("Failed configuring Notebook node.", str(err))

This function describes the process of creating a Jupyter node. It is divided into two parts – prepare and configure. The prepare part is common for all notebook templates and is responsible for creating the necessary cloud resources, such as EC2 instances. The configure part describes how the appropriate services are installed.

To configure the Jupyter node, the jupyter_configure.py script is executed. This script describes the steps for configuring the Jupyter node; in each step, the appropriate Python script is executed.

For example:

Path: infrastructure-provisioning/src/general/scripts/aws/jupyter_configure.py

        params = "--hostname {} --keyfile {} --region {} --spark_version {} --hadoop_version {} --os_user {} --scala_version {}".\
            format(instance_hostname, keyfile_name, os.environ['aws_region'], os.environ['notebook_spark_version'],
                   os.environ['notebook_hadoop_version'], os.environ['conf_os_user'],
            local("~/scripts/{}.py {}".format('configure_jupyter_node', params))

In this step, the script infrastructure-provisioning/src/jupyter/scripts/configure_jupyter_node.py will be executed.

Example of the script infrastructure-provisioning/src/jupyter/scripts/configure_jupyter_node.py:

if __name__ == "__main__":
    print("Configure connections")
    env['connection_attempts'] = 100
    env.key_filename = [args.keyfile]
    env.host_string = args.os_user + '@' + args.hostname

    print("Configuring notebook server.")
    if not exists('/home/' + args.os_user + '/.ensure_dir'):
        sudo('mkdir /home/' + args.os_user + '/.ensure_dir')

    print("Mount additional volume")

    print("Install Java")
This script calls functions for configuring the Jupyter node. If a function is OS dependent, it is placed in infrastructure-provisioning/src/general/lib/<OS_family>/ (e.g. debian/).

All functions in the template directory (e.g. infrastructure-provisioning/src/my-tool/) should be OS and cloud independent.

All OS or cloud dependent functions should be placed in the infrastructure-provisioning/src/general/lib/ directory.
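For example, a package-installation helper would live under general/lib/os/ with one implementation per OS family. The sketch below is hypothetical (the helper name is illustrative) and uses the Fabric 1.x style the templates are written in:

# Hypothetical general/lib/os/debian/ helper (Fabric 1.x style).
from fabric.api import sudo

def ensure_pkg(name):
    # Debian-family implementation of an OS dependent helper.
    sudo('apt-get -y update')
    sudo('apt-get -y install {}'.format(name))

# The RedHat-family twin under general/lib/os/redhat/ would run
# sudo('yum -y install {}'.format(name)) instead.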

The following steps are required for each Notebook node:

  • Configuring the proxy on the Notebook instance – handled by a script in infrastructure-provisioning/src/general/scripts/os/
  • Installing the user's key – handled by a script in infrastructure-provisioning/src/base/scripts/

Other scripts responsible for configuring the Jupyter node are placed in infrastructure-provisioning/src/jupyter/scripts/.

  • scripts directory – contains all required configuration scripts.

  • infrastructure-provisioning/src/general/files/<cloud_provider>/my-tool_Dockerfile – used for building template Docker image and describes which files, scripts, templates are required and will be copied to template Docker image.

  • infrastructure-provisioning/src/general/files/<cloud_provider>/my-tool_description.json – a JSON file for the DataLab Web UI. In this file you can specify:

  • exploratory_environment_shapes – list of EC2 shapes
  • exploratory_environment_versions – description of template

Example of this file for the Jupyter node (the shapes shown are Azure instance types):

  "exploratory_environment_shapes" :
    "For testing" : [
      {"Size": "S", "Description": "Standard_DS1_v2", "Type": "Standard_DS1_v2","Ram": "3.5 GB","Cpu": "1", "Spot": "true", "SpotPctPrice": "70"}
    "Memory optimized" : [
      {"Size": "S", "Description": "Standard_E4s_v3", "Type": "Standard_E4s_v3","Ram": "32 GB","Cpu": "4"},
      {"Size": "M", "Description": "Standard_E16s_v3", "Type": "Standard_E16s_v3","Ram": "128 GB","Cpu": "16"},
      {"Size": "L", "Description": "Standard_E32s_v3", "Type": "Standard_E32s_v3","Ram": "256 GB","Cpu": "32"}
    "Compute optimized": [
      {"Size": "S", "Description": "Standard_F2s", "Type": "Standard_F2s","Ram": "4 GB","Cpu": "2"},
      {"Size": "M", "Description": "Standard_F8s", "Type": "Standard_F8s","Ram": "16.0 GB","Cpu": "8"},
      {"Size": "L", "Description": "Standard_F16s", "Type": "Standard_F16s","Ram": "32.0 GB","Cpu": "16"}
  "exploratory_environment_versions" :
      "template_name": "Jupyter notebook 5.7.4",
      "description": "Base image with jupyter node creation routines",
      "environment_type": "exploratory",
      "version": "jupyter_notebook-5.7.4",
      "vendor": "Azure"

Additionally, the following directories can be created:

  • templates – directory for new templates;

  • files – directory for files used by newly added templates only;

All Docker images are built while creating the SSN node. To add a newly created template, add it to the list of images in the corresponding script:

Path: infrastructure-provisioning/src/general/scripts/aws/

        additional_config = [{"name": "base", "tag": "latest"},
                             {"name": "edge", "tag": "latest"},
                             {"name": "jupyter", "tag": "latest"},
                             {"name": "rstudio", "tag": "latest"},
                             {"name": "zeppelin", "tag": "latest"},
                             {"name": "tensor", "tag": "latest"},
                             {"name": "emr", "tag": "latest"}]

For example:


   {"name": "my-tool", "tag": "latest"},


Azure OAuth2 Authentication

DataLab supports OAuth2 authentication, which is configured automatically in the Security Service and Self-Service after DataLab deployment. Please see the details about the configuration parameters for Self-Service and Security Service below. DataLab supports the client credentials (username + password) and authorization code flows for authentication.

Azure OAuth2 Self Service configuration

    tenant: xxxx-xxxx-xxxx-xxxx
    clientId: xxxx-xxxx-xxxx-xxxx
    responseMode: query
    prompt: consent
    silent: true
    maxSessionDurabilityMilliseconds: 288000000


  • tenant - the tenant ID of your company
  • authority - the Microsoft login endpoint
  • clientId - the ID of the application that users log in through
  • redirectUrl - the URL to which the user is redirected back in the DataLab application after a login attempt to Azure using OAuth2
  • responseMode - defines how Azure sends the authorization code or error information to DataLab during the login procedure
  • prompt - defines the kind of prompt used during the OAuth2 login
  • silent - defines whether DataLab tries to log the user in without interaction (true/false); if false, DataLab tries to log the user in with the configured prompt
  • loginPage - the start page of the DataLab application
  • maxSessionDurabilityMilliseconds - the maximum user session duration; after this period the user is asked to log in again when he/she creates or starts a notebook/cluster. This is needed to update the refresh_token that notebooks use to access the Data Lake Store

For more information about the responseMode and prompt parameters, please see "Authorize access to web applications using OAuth 2.0 and Azure Active Directory".
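For illustration, the authorization code flow starts by redirecting the user to the standard Azure AD OAuth2 authorize endpoint built from these parameters (a sketch, not DataLab's actual implementation):

try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2

def build_authorize_url(tenant, client_id, redirect_url,
                        response_mode='query', prompt='consent'):
    params = {
        'client_id': client_id,
        'response_type': 'code',
        'redirect_uri': redirect_url,
        'response_mode': response_mode,
        'prompt': prompt,
    }
    return 'https://login.microsoftonline.com/{}/oauth2/authorize?{}'.format(
        tenant, urlencode(params))

print(build_authorize_url('xxxx-xxxx-xxxx-xxxx', 'xxxx-xxxx-xxxx-xxxx',
                          'https://<datalab_host>/'))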

Azure OAuth2 Security Service configuration

    tenant: xxxx-xxxx-xxxx-xxxx
    clientId: xxxx-xxxx-xxxx-xxxx
    validatePermissionScope: true
    permissionScope: subscriptions/xxxx-xxxx-xxxx-xxxx/resourceGroups/xxxx-xxxx/providers/Microsoft.DataLakeStore/accounts/xxxx/providers/Microsoft.Authorization/
    managementApiAuthFile: /datalab/keys/azure_authentication.json


  • tenant - the tenant ID of your company
  • authority - the Microsoft login endpoint
  • clientId - the ID of the application that users log in through
  • redirectUrl - the URL to which the user is redirected back in the DataLab application after a login attempt to Azure using OAuth2
  • validatePermissionScope - defines (true/false) whether the user's permissions should be validated against the resource provided in the permissionScope parameter; the user is logged in only if he/she has a role in the resource IAM described by the permissionScope parameter
  • permissionScope - describes the Azure resource where the user must have a role in order to pass authentication; if the user has no role in the resource IAM, he/she will not be logged in
  • managementApiAuthFile - the authentication file that is used to query the Microsoft Graph API to check user roles in the resource described in permissionScope
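Conceptually, the scope validation boils down to listing the user's role assignments under permissionScope through the Azure Management API (authenticated with the credentials from managementApiAuthFile) and rejecting the login when none are found. A rough sketch, assuming permissionScope ends with providers/Microsoft.Authorization/ as in the example above:

import requests

def has_any_role(management_token, permission_scope, user_object_id):
    # List role assignments for the user within the configured scope.
    url = 'https://management.azure.com/{}roleAssignments'.format(permission_scope)
    params = {
        'api-version': '2015-07-01',
        '$filter': "assignedTo('{}')".format(user_object_id),
    }
    headers = {'Authorization': 'Bearer ' + management_token}
    assignments = requests.get(url, params=params, headers=headers).json()
    return bool(assignments.get('value'))  # any role => the user may log in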
