
Motivation and principal Use case

Ranger’s standard plugins authorize access based on static resources and users/groups; for example, the Hive plugin lets you configure authorization by database, table, column, or UDF. However, customer scenarios often warrant access control decisions that are supplemented by other criteria, such as IP address, time of day, or the geographic location from which a user is connecting.

Use cases such as access decisions based on the time of day are often called dynamic policies. Such policies are called dynamic because the answer to the same question changes depending on some dynamic aspect of the access context, e.g. time.

The extensibility hooks added in Ranger 0.5 exist precisely to allow such extensions of the “standard” plugins by more savvy users or service providers. These hook points are nonetheless generic: they should be useful for any type of extension to authorization, be it static or dynamic.

How does Apache Ranger enforce policies?

To understand the design of the extensibility hooks framework, it is essential to understand Ranger’s processing pipeline and the related terms. This section provides a quick review of the most important points relevant to the discussion.

Ranger plugin and Ranger Admin

Ranger plugins are loaded in the process of the server that they are authorizing.
1. For example, when Ranger is authorizing HBase, the Ranger plugin code runs in the HBase master and region server processes.
2. For that to happen, the Ranger plugin code must be packaged as a jar and made available on the classpath of the HBase master and region server processes.
Policies are stored by the Ranger Admin in a database. At periodic intervals, a plugin polls the Ranger Admin to retrieve the updated version of the policies. The policies are cached locally by the plugin and used for access control.
Policy evaluation and policy enforcement happen within the service process.
1. The heart of this processing is the “Policy Engine”.
2. It uses a memory-resident cached set of policies.

Steps of Ranger Policy Evaluation

Policy evaluation within a Ranger plugin can be thought of as consisting of 3 distinct phases:

Request Creation Phase:

This phase builds the Authorization request by gathering the context of the access which is to be authorized.

For example, for HDFS it may be the path of the file being accessed and the access type requested (read, write, execute, etc.), along with other contextual information such as user, group, time, and IP address.

Policy Evaluation Phase:

This is where the authorization request is evaluated by the policy engine to decide whether access should be allowed. It produces an authorization result.

The policy engine compares the information in the authorization request against the set of active policies to make its decision.
It makes decisions about both the authorization and the auditability of an access.

Post Evaluation Phase:

This is the phase where post-evaluation tasks are handled, e.g. generating the audit message and logging it to the right audit store, if required.
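
The three phases can be pictured with a minimal Java sketch. The class `PipelineSketch`, its method names, and the use of a plain `Map` as the request are assumptions made purely for illustration; they are not Ranger's actual classes, and a real plugin builds a much richer request.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-in for the plugin pipeline; NOT Ranger's real API.
final class PipelineSketch {

    // Phase 1 (request creation): gather the context of the access.
    static Map<String, Object> createRequest(String user, String resource, String accessType) {
        Map<String, Object> request = new HashMap<>();
        request.put("user", user);
        request.put("resource", resource);
        request.put("accessType", accessType);
        return request;
    }

    // Phase 2 (policy evaluation): allow if any cached policy matches the request.
    // Each policy is modelled as a simple predicate over the request (an assumption).
    static boolean evaluate(Map<String, Object> request,
                            List<Predicate<Map<String, Object>>> policies) {
        return policies.stream().anyMatch(policy -> policy.test(request));
    }

    // Phase 3 (post evaluation): build the audit message for the decision.
    static String audit(Map<String, Object> request, boolean allowed) {
        return request.get("user") + " " + request.get("accessType") + " "
                + request.get("resource") + " -> " + (allowed ? "ALLOWED" : "DENIED");
    }
}
```

In this simplified picture, context enrichers would run between phase 1 and phase 2, and condition evaluators would run inside phase 2.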

Overview of extensibility

Customers are looking to add dynamic rules to policy evaluation. The Ranger team has built an extensibility framework for evaluating dynamic rules using Context Enricher and Condition Evaluator hooks. Anyone looking to add dynamic rules needs to implement or use the appropriate condition evaluators and, if necessary, context enrichers.

Steps to extend a plugin

To help understand these steps better, let’s assume that we want to implement the following two use cases.

Time-of-day: Allow access to a resource during a certain time of day. For example, if the policy specifies “8am-5pm”, the policy should allow access only during this time period (let's defer the issue of timezone for now to keep the use case simple).
Project-assignment-restriction: Allow access to a resource on a need-to-know basis as defined by assignment to a project. Let’s say users move in and out of projects. For compliance reasons, access to data needs to be restricted to a user based on the project the user is currently assigned to. If a policy specifies “risk-modeling” or “debt-recovery” then only users who are currently assigned to one of these projects should be able to access the data.

Now let us look at the steps needed to add these abilities to the HDFS service.

Identify the additional information to be added to the request context for your use case. Write a context enricher, if necessary, to add the necessary information to the request context. Refer to the Context Enricher section below for details.
1. Time-of-day use case: The information required to enforce this dynamic rule is the time when the resource is accessed. Since this information is already present in the access request, there is no need to write a context enricher.
2. Project-assignment-restriction use case: The information required to enforce this dynamic rule is the set of projects the current user is assigned to. Write a context enricher that adds the assigned-projects information to the access request, for example as follows:
  1. Let’s assume that the user/group-to-project assignment is kept in a file on disk.
  2. During initialization, the context enricher reads the file into an in-memory map of users or user groups to projects.
  3. During the enrichment phase, the context enricher extracts the requesting user’s name and group membership from the request and uses them to retrieve the currently assigned projects from the in-memory map.
  4. The context enricher pushes the currently assigned projects into the request context.
Implement/use appropriate condition evaluators to enforce conditions specified on the policy. Refer to the Condition Evaluators section below for details.
1. Time-of-day use case: Ranger ships with a standard condition evaluator which does that (RangerTimeOfDayMatcher). Please review its code.
  1. It extracts the access time from the access request and compares it against the valid times specified in the policy.
  2. If the access time lies within any of the time windows, it returns true.
2. Project-assignment-restriction use case: Implement a condition evaluator that compares the user’s project membership (added to the request context by the context enricher) against the project names specified in the policy and returns true if the user is in one of those projects.
Update the service type definition with the context enricher and condition evaluator details so that the policy engine knows to wire them into the authorization pipeline and so that policy authors can enter the condition values in policies. Refer to the Appendix section below for example steps and commands that achieve this.
Shut down the HDFS service and Ranger Admin.
Deploy the context enricher and condition evaluator code to the HDFS namenode and Ranger Admin.
1. Typically for the Ranger server this would be: /usr/hdp/<version-#>/ranger-admin/ews/webapp/WEB-INF/lib
2. For a service like HDFS this would typically be /usr/hdp/<version-#>/hadoop/lib.
3. Note that for the service, the jars need to be copied only to the nodes that take part in authorization. This differs from service to service. For example, in the case of HDFS only the namenode participates in authorization, whereas for HBase both the master and the region servers participate.
Restart the Ranger Admin and then the HDFS namenode.
Verify that Ranger UI shows the new policy condition.
Create a policy with values for the custom condition and validate that authorization is as expected.
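
The project-assignment-restriction steps above can be sketched in plain Java as follows. This is only an illustration under stated assumptions: `ProjectAccessSketch`, its methods, and the `Map`-based request context are hypothetical stand-ins and do not reproduce the signatures of `RangerAbstractContextEnricher` or `RangerAbstractConditionEvaluator`. The context key `PROJECT` mirrors the `contextName` option used in the appendix.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in types; Ranger's real enricher and evaluator classes differ.
final class ProjectAccessSketch {

    // In-memory map of user -> currently assigned projects, loaded once at initialization.
    private final Map<String, Set<String>> projectsByUser;

    ProjectAccessSketch(Map<String, Set<String>> projectsByUser) {
        this.projectsByUser = projectsByUser;
    }

    // Enrichment step: push the requesting user's projects into the request context.
    void enrich(String user, Map<String, Object> context) {
        context.put("PROJECT", projectsByUser.getOrDefault(user, Set.of()));
    }

    // Condition step: true if the user is in at least one project named in the policy.
    static boolean isMatched(Map<String, Object> context, Set<String> policyProjects) {
        Object value = context.get("PROJECT");
        if (!(value instanceof Set)) {
            return false; // nothing was enriched, so the condition cannot hold
        }
        for (Object project : (Set<?>) value) {
            if (policyProjects.contains(project)) {
                return true;
            }
        }
        return false;
    }
}
```

With a policy that names “risk-modeling” and “debt-recovery”, a user currently assigned to “risk-modeling” matches, while a user assigned only to some other project, or to none, does not.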

Steps and commands to update the service definition

Get HDFS service definition from Ranger Admin:
1. ```
curl --user admin:admin --get "http://node-1:6080/service/public/v2/api/servicedef/name/hdfs"
```
2. This returns a JSON response. Note down the id of the service type.
3. Look at sample output #1 below.
Now add the context enricher to the HDFS service definition using the following command.
1. ```
curl --user admin:admin --request PUT "http://node-1:6080/service/public/v2/api/servicedef/name/hdfs"
```
2. Include the edited JSON in the body of the PUT request (for example via curl's --data option together with a Content-Type: application/json header). The response should echo back the request body.
3. Look at sample output #2 below.
Next add the condition evaluator to the HDFS service definition using the following command:
1. ```
curl --user admin:admin --request PUT "http://node-1:6080/service/public/v2/api/servicedef/name/hdfs"
```
2. Include the edited JSON in the body of the PUT request, as in the previous step. The response should echo back the request body.
3. Look at sample output #3 below.
Copy the condition evaluator jar to the Ranger server’s classpath. Copy both the condition evaluator and the context enricher to the classpath of the service where the Ranger plugin runs.
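
For readers who prefer to script these calls from Java rather than curl, the GET and PUT requests can be built with the JDK's `java.net.http` API (Java 11 or later). This is a sketch, not part of Ranger: the class `ServiceDefRequests` and its method names are made up for illustration, and the host, port, and `admin:admin` credentials are simply the ones used in the curl commands above.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper; only the endpoint URL comes from the commands above.
final class ServiceDefRequests {

    static final String ENDPOINT =
            "http://node-1:6080/service/public/v2/api/servicedef/name/hdfs";

    // Builds the value of an HTTP Basic Authorization header.
    static String basicAuth(String user, String password) {
        String token = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(token.getBytes(StandardCharsets.UTF_8));
    }

    // GET request that fetches the current service definition as JSON.
    static HttpRequest get(String user, String password) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Authorization", basicAuth(user, password))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    // PUT request that carries the edited service definition in its body.
    static HttpRequest put(String user, String password, String editedJson) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Authorization", basicAuth(user, password))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(editedJson))
                .build();
    }
}
```

Either request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`; the sketch stops at building the requests so that it does not depend on a running Ranger Admin.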

Context Enricher

What is a context enricher?

A Context Enricher is any Java class that extends the abstract class RangerAbstractContextEnricher.
A series of context enrichers can be configured in a service type definition.
Before the access request is evaluated by the policy engine, context enrichers are invoked to update the request context with additional information.

Writing a context enricher

A context enricher is not expected to maintain any state and in general should be written to be reentrant.
The context enricher object is garbage collected and a new one is created whenever a new set of policies becomes available to the plugin. Since policies change infrequently, this is expected to have minimal performance impact.

Example Context enrichers

The Ranger source code has a ranger-examples sub-project which maintains demo examples of these. The project is also meant to serve as a template Maven project that can be cloned and used to build your custom components. The examples project has two context enrichers, RangerSampleProjectProvider and RangerSampleCountryProvider. Both provide a way to read a key/value map from a disk file (in the form of a standard Java properties list) that can be used to enrich the context.

Note about the sample enrichers

These context enrichers are provided just to illustrate the semantics of context enrichers and what they have access to during the enrichment phase. They are not meant to be deployed in production as is. For example, a production-grade enricher would keep its key-value data in a database or some other durable data store.
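
The properties-file idea behind the sample enrichers can be sketched as follows. This is not the code of `RangerSampleProjectProvider`; the class `PropertiesEnricherSketch`, its constructor, and the `Map`-based context are assumptions for illustration. Only `java.util.Properties` and its `load` and `getProperty` methods are real JDK API.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch of a properties-backed enricher; NOT Ranger's sample class.
final class PropertiesEnricherSketch {

    private final Properties valuesByUser = new Properties();
    private final String contextName;

    // Initialization: read "user=value" pairs once, from a file or any other Reader.
    PropertiesEnricherSketch(String contextName, Reader data) {
        this.contextName = contextName;
        try {
            valuesByUser.load(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Enrichment: copy the user's value, if any, into the context under contextName.
    void enrich(String user, Map<String, Object> context) {
        String value = valuesByUser.getProperty(user);
        if (value != null) {
            context.put(contextName, value);
        }
    }
}
```

In a deployment-like setting the `Reader` would typically be a `java.io.FileReader` over the configured data file; a `StringReader` is enough to try the sketch out.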

Condition Evaluators

What is a condition evaluator?

A Condition Evaluator is a java class that extends the abstract class RangerAbstractConditionEvaluator.

A condition evaluator serves two related but distinct roles. The first is the role it plays during policy evaluation:

It is invoked by the policy engine during the Policy Evaluation phase.
A series of condition evaluators can be configured for a service type; they are evaluated in series. To allow access, all evaluators must succeed.
During evaluation the condition evaluator has access to the entire access request.
1. boolean isMatched(RangerAccessRequest request);
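
The “all evaluators must succeed” rule, together with a time-of-day style check, can be illustrated with a small Java sketch. The interface `ConditionSketch` is only shaped after the `isMatched` signature above; it takes a plain `Map` instead of a `RangerAccessRequest`, and `TimeWindowCondition` and `ConditionChain` are hypothetical names, not Ranger classes. The window is treated as start-inclusive and end-exclusive, and timezones are ignored, which is an assumption of this sketch.

```java
import java.time.LocalTime;
import java.util.List;
import java.util.Map;

// Hypothetical interface; a Map stands in for RangerAccessRequest.
interface ConditionSketch {
    boolean isMatched(Map<String, Object> request);
}

// Matches when the request's access time falls inside [start, end).
final class TimeWindowCondition implements ConditionSketch {
    private final LocalTime start;
    private final LocalTime end;

    TimeWindowCondition(LocalTime start, LocalTime end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public boolean isMatched(Map<String, Object> request) {
        Object value = request.get("accessTime");
        if (!(value instanceof LocalTime)) {
            return false; // no usable access time in the request
        }
        LocalTime time = (LocalTime) value;
        return !time.isBefore(start) && time.isBefore(end);
    }
}

final class ConditionChain {
    // Evaluates the conditions in order; access requires every one to match.
    static boolean allMatch(List<ConditionSketch> conditions, Map<String, Object> request) {
        for (ConditionSketch condition : conditions) {
            if (!condition.isMatched(request)) {
                return false;
            }
        }
        return true;
    }
}
```

For example, a window of 08:00 to 17:00 chained with a second condition on the user name allows a matching user at 09:30 but rejects the same user at 18:00.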

Second role of the condition evaluator is during the policy authoring phase:

Ranger Admin uses the information in policy condition definition to prompt the policy author to provide values.

Writing a condition evaluator

A condition evaluator should be reentrant.
The condition evaluator object is garbage collected and a new one is created whenever a new set of policies becomes available to the plugin. Since policies change infrequently, this is expected to have minimal performance impact.

Example condition evaluators

The Ranger source code has a ranger-examples sub-project which maintains demo examples of these. The project is also meant to serve as a template Maven project that can be cloned and used to build your custom components. The examples project has a sample condition evaluator, RangerSampleSimpleMatcher. In addition, Ranger production code uses the following two condition evaluators: RangerIpMatcher and RangerTimeOfDayMatcher. The former is used by the standard Knox plugin.

Appendix

Updating HDFS Service Definition

Sample output #1

Purpose	To find out the id of the HDFS service so that it can be used in get and put calls
REST Call	curl --user admin:admin --get "http://node-1:6080/service/public/v2/api/servicedef/name/hdfs"
JSON response	See below
Points to Note	The id of the service-def is 1. Both the contextEnrichers and policyConditions collections are empty.

JSON response:

{
  "id": 1,
  "guid": "0d047247-bafe-4cf8-8e9b-d5d377284b2d",
  "isEnabled": true,
  "createTime": 1434003778000,
  "updateTime": 1434003778000,
  "version": 1,
  "name": "hdfs",
  "implClass": "org.apache.ranger.services.hdfs.RangerServiceHdfs",
  "label": "HDFS Repository",
  "description": "HDFS Repository",
  "configs": [ ... details elided for ease of readability ... ],
  "resources": [ ... details elided for ease of readability ... ],
  "accessTypes": [ ... details elided for ease of readability ... ],
  "policyConditions": [],
  "contextEnrichers": [],
  "enums": [ ... details elided for ease of readability ... ]
}

Sample output #2

Purpose	Update the contextEnrichers collection on the HDFS service-definition via PUT.
REST Call	curl --user admin:admin --request PUT "http://node-1:6080/service/public/v2/api/servicedef/name/hdfs"
JSON payload	Edit the response received from the previous step and modify the contextEnrichers section as indicated below.
JSON response	Per the semantics of PUT, the response should be semantically similar to the JSON payload. The update times might differ.
Points to Note	The contextEnrichers attribute is an array, so multiple context enrichers can be specified if needed; they are invoked in order.

JSON payload:

{
  "id": 1,
  "guid": "0d047247-bafe-4cf8-8e9b-d5d377284b2d",
  "isEnabled": true,
  "createTime": 1434003778000,
  "updateTime": 1434003778000,
  "version": 1,
  "name": "hdfs",
  "implClass": "org.apache.ranger.services.hdfs.RangerServiceHdfs",
  "label": "HDFS Repository",
  "description": "HDFS Repository",
  "configs": [ ... details elided for ease of readability ... ],
  "resources": [ ... details elided for ease of readability ... ],
  "accessTypes": [ ... details elided for ease of readability ... ],
  "contextEnrichers": [
    {
      "itemId": 1,
      "name": "project-provider",
      "enricher": "org.apache.ranger.plugin.contextenricher.RangerSampleProjectProvider",
      "enricherOptions": { "contextName": "PROJECT", "dataFile": "/etc/ranger/data/userProject.txt" }
    }
  ],
  "policyConditions": [],
  "enums": [ ... details elided for ease of readability ... ]
}

Sample output #3

Purpose	Update the policyConditions collection on the HDFS service-definition via PUT.
REST Call	curl --user admin:admin --request PUT "http://node-1:6080/service/public/v2/api/servicedef/name/hdfs"
JSON payload	Edit the response received from the previous step and modify the policyConditions section as indicated below.
JSON response	Per the semantics of PUT, the response should be semantically similar to the JSON payload. The update times might differ.
Points to Note	The policyConditions attribute of the service-def is an array, so multiple policy conditions can be specified; they are evaluated in order. Both of these PUT calls could be combined into a single call. They are listed separately here only for clarity.

JSON Response:

{
  "id": 1,
  "guid": "0d047247-bafe-4cf8-8e9b-d5d377284b2d",
  "isEnabled": true,
  "createTime": 1434003778000,
  "updateTime": 1434003778000,
  "version": 1,
  "name": "hdfs",
  "implClass": "org.apache.ranger.services.hdfs.RangerServiceHdfs",
  "label": "HDFS Repository",
  "description": "HDFS Repository",
  "configs": [ ... details elided for ease of readability ... ],
  "resources": [ ... details elided for ease of readability ... ],
  "accessTypes": [ ... details elided for ease of readability ... ],
  "contextEnrichers": [
    {
      "itemId": 1,
      "name": "project-provider",
      "enricher": "org.apache.ranger.plugin.contextenricher.RangerSampleProjectProvider",
      "enricherOptions": { "contextName": "PROJECT", "dataFile": "/etc/ranger/data/userProject.txt" }
    }
  ],
  "policyConditions": [
    {
      "itemId": 1,
      "name": "ip-range",
      "evaluator": "org.apache.ranger.plugin.conditionevaluator.RangerSampleSimpleMatcher",
      "evaluatorOptions": { "CONTEXT_NAME": "PROJECT" },
      "validationRegEx": "",
      "validationMessage": "",
      "uiHint": "",
      "label": "Project Matcher",
      "description": "Projects"
    }
  ],
  "enums": [ ... details elided for ease of readability ... ]
}

Source code for org.apache.ranger.plugin.conditionevaluator.RangerSimpleMatcher

https://github.com/apache/incubator-ranger/blob/master/agents-common/src/main/java/org/apache/ranger/plugin/conditionevaluator/RangerSimpleMatcher.java