Title: High Level Design of Role Based Access Controller in SQOOP 2
JIRA : SQOOP-1834 and its sub tickets, SQOOP-2048 and its sub tickets
Problem
Sqoop 2 needs a pluggable role based access controller (RBAC) that is responsible for authorizing access to Sqoop 2 resources, such as server, connector, link, job, etc.
Basic Idea
- The access controller is pluggable
- Set controller class in sqoop.properties
org.apache.sqoop.accessController.class=org.apache.sqoop.accessController.DefaultSqoopAuthorizerImpl
- The default implementation in Sqoop 2 could be a FAKE controller (always returns true)
- The access controller class could be implemented by another access control framework, such as Sentry
- Connector
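As a minimal sketch of the pluggable design, the snippet below reads the controller class name from a Properties object and instantiates it by reflection. The interface AccessControllerSketch, the class AllowAllController and the helper ControllerLoader are hypothetical names used only for illustration; only the property key comes from the text above, and the real Sqoop 2 interfaces may look different.

```java
import java.util.Properties;

/** Hypothetical minimal contract; the real Sqoop 2 interface may differ. */
interface AccessControllerSketch {
    boolean isAllowed(String user, String resourceType, String action);
}

/** Stand-in for the "fake" default controller that permits everything. */
class AllowAllController implements AccessControllerSketch {
    @Override
    public boolean isAllowed(String user, String resourceType, String action) {
        return true;
    }
}

final class ControllerLoader {
    // Property key taken from the design text.
    static final String KEY = "org.apache.sqoop.accessController.class";

    /** Instantiates the configured class by name, falling back to the allow-all default. */
    static AccessControllerSketch load(Properties props) {
        String className = props.getProperty(KEY, AllowAllController.class.getName());
        try {
            Class<?> clazz = Class.forName(className);
            return (AccessControllerSketch) clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load access controller " + className, e);
        }
    }
}
```

With this shape, swapping in another framework would only require changing the class name in the properties file, provided the named class implements the same contract.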
Resource, actions and rules
Server has three children: Connector, Link, Job.
- The model is hierarchical. If a user has the privilege {server, all}, then he/she has all of the privileges {connector, all}, {link, all} and {job, all}.
- If a user has the privilege {job, all}, then he/she has both of the privileges {job, read} and {job, write}.
- If a user wants to create a link, then he/she needs to have the privilege {server, create}.
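The three rules above can be sketched as a single implication check. This is an illustration under assumptions: the type names ResourceKind, ActionKind and PrivilegeSketch are hypothetical, and the sketch deliberately encodes only the rules stated in the list. Whether a non-ALL action on the server (for example {server, read}) should also propagate to children is not specified there, so it is left out.

```java
enum ResourceKind { SERVER, CONNECTOR, LINK, JOB }

enum ActionKind { ALL, READ, WRITE, CREATE }

/** Hypothetical value class pairing a resource with an action. */
final class PrivilegeSketch {
    final ResourceKind resource;
    final ActionKind action;

    PrivilegeSketch(ResourceKind resource, ActionKind action) {
        this.resource = resource;
        this.action = action;
    }

    /** True if holding this privilege is enough to satisfy the requested one. */
    boolean implies(PrivilegeSketch requested) {
        // {server, all} covers every action on the server and on its children.
        if (resource == ResourceKind.SERVER && action == ActionKind.ALL) {
            return true;
        }
        // Otherwise the resource must match exactly.
        if (resource != requested.resource) {
            return false;
        }
        // On the same resource, ALL covers any action; anything else must match.
        return action == ActionKind.ALL || action == requested.action;
    }
}
```

For example, a holder of {job, all} satisfies a request for {job, write} but not one for {link, read}.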
Resource | Global Namespace |
---|---|
Server | |
Connector | |
Link | |
Job | |
Action | Privilege needed |
---|---|
show connector | |
show link | |
create link | |
update link | |
delete link | |
enable link | |
disable link | |
show job | |
create job | |
update job | |
delete job | |
enable job | |
disable job | |
start job | |
stop job | |
show submission | |
Authorization framework
- Config in sqoop.properties
#org.apache.sqoop.authorization.handler=org.apache.sqoop.security.DefaultAuthorizationHandler
#org.apache.sqoop.authorization.controller=org.apache.sqoop.security.DefaultAccessController
#org.apache.sqoop.authorization.validator=org.apache.sqoop.security.DefaultAuthorizationValidator
- Four metadata classes.
- Role
- Principal
- This class defines a user, group, or role.
- Type: user, group, role.
- A principal can be granted a role. For example, to grant the admin role to user hadoop, call grantRole(principal(name=hadoop, type=user), role(name=admin)).
- Resource
- This class defines four resources in Sqoop 2.
- Type: server, connector, link, job.
- Privilege
- Action: all, read, write.
- with_grant_option: boolean, defines whether the role can grant this privilege to other roles.
- Five classes will be added to Sqoop-core in the org.apache.sqoop.security package.
- AuthorizationManager
- Similar to other Sqoop managers, e.g. ConnectorManager and RepositoryManager, the AuthorizationManager handles two singleton instances: AuthorizationManager and AuthorizationHandler.
- The initialize function runs when the Sqoop server starts.
- The initialize function initializes the AuthorizationHandler according to the handler name (DefaultAuthorizationHandler or SentryAuthorizationHandler) from the configuration file (sqoop.properties).
- AuthorizationHandlerFactory
- It follows the factory design pattern.
- Its getAuthorizationHandler function uses ClassUtils.loadClass to load the real AuthorizationHandler by reflection.
- AuthorizationHandler
- It is an abstract class.
- There is a default implementation (DefaultAuthorizationHandler) in the Sqoop-security component.
- It handles two singleton instances, AccessController and AuthorizationValidator.
- All functions are delegated to these two instances. AccessController handles grantRole, revokeRole, grantPrivilege and revokePrivilege. AuthorizationValidator handles checkPrivilege.
- AccessController
- It is an abstract class.
- There is a default implementation (DefaultAccessController) in the Sqoop-security component.
- This class is responsible for managing roles and privileges.
- AuthorizationValidator
- It is an abstract class.
- There is a default implementation (DefaultAuthorizationValidator) in the Sqoop-security component.
- This class is responsible for checking privileges.
- Three classes will be added to Sqoop-security in the org.apache.sqoop.security package.
- DefaultAuthorizationHandler
- This class extends abstract AuthorizationHandler.
- It handles two singleton instances, DefaultAccessController and DefaultAuthorizationValidator.
- DefaultAccessController
- This class extends abstract AccessController.
- DefaultAuthorizationValidator
- This class extends abstract AuthorizationValidator.
- As the default (simple) implementation, it always returns true and does not actually check the privilege.
- A privilege validation check will be added to every function in RequestHandler, which handles all requests (e.g. create link).
/**
 * Create or Update link in repository.
 *
 * @param ctx Context object
 * @return Validation bean object
 */
private JsonBean createUpdateLink(RequestContext ctx, boolean create) {
  AuthorizationEngine.createLinkPrivilige();
  ......
}
- The privilege check request is analyzed by AuthorizationEngine.
@Override
public void createLinkPrivilige() throws SqoopAccessControlException {
  // Creating a link needs Create on the link and Read on its connector.
  List<Privilege> privileges = new ArrayList<Privilege>();
  privileges.add(new Privilege(new Resource("Link", "1"), "Create", null));
  privileges.add(new Privilege(new Resource("Connector", "1"), "Read", null));
  AuthorizationManager.getAuthorizationHandler().checkPrivileges(privileges);
}
- The privilege check is passed from AuthorizationHandler to the real AuthorizationValidator.
@Override
public void checkPrivileges(List<Privilege> privileges) throws SqoopAccessControlException {
  authValidator.checkPrivileges(privileges);
}
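The delegation described in this section can be sketched in a self-contained form as follows. All class names ending in Sketch are hypothetical simplifications of the classes named above, privileges are reduced to plain strings, and the in-memory grant list is an assumption made only so that the delegation is observable. The design text does not say how DefaultAccessController stores grants.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for the abstract AuthorizationValidator. */
abstract class ValidatorSketch {
    abstract void checkPrivileges(List<String> privileges);
}

/** Simplified stand-in for the abstract AccessController. */
abstract class ControllerSketch {
    abstract void grantRole(String principalName, String principalType, String roleName);
}

/** Mirrors the described default validator: it never rejects a request. */
class DefaultValidatorSketch extends ValidatorSketch {
    @Override
    void checkPrivileges(List<String> privileges) {
        // Intentionally empty: the default implementation permits everything.
    }
}

/** Records grants in memory (an assumption for this sketch). */
class DefaultControllerSketch extends ControllerSketch {
    final List<String> grants = new ArrayList<>();

    @Override
    void grantRole(String principalName, String principalType, String roleName) {
        grants.add(principalType + ":" + principalName + "->" + roleName);
    }
}

/** The handler owns both collaborators and forwards each call to one of them. */
class HandlerSketch {
    private final ControllerSketch controller;
    private final ValidatorSketch validator;

    HandlerSketch(ControllerSketch controller, ValidatorSketch validator) {
        this.controller = controller;
        this.validator = validator;
    }

    void grantRole(String principalName, String principalType, String roleName) {
        controller.grantRole(principalName, principalType, roleName);
    }

    void checkPrivileges(List<String> privileges) {
        validator.checkPrivileges(privileges);
    }
}
```

Keeping role management and privilege checking behind separate abstractions means a deployment could, in principle, replace one without touching the other.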
Command line tool
- Privileges are granted and revoked from the command line in the Sqoop client
- The commands are shown below
Create/Drop Role
CREATE ROLE role_name
DROP ROLE role_name
SHOW ROLE
Grant/Revoke Roles
GRANT ROLE role_name [, role_name] ...
  TO principal_specification [, principal_specification] ...
REVOKE ROLE role_name [, role_name] ...
  FROM principal_specification [, principal_specification] ...
principal_specification:
  USER user_name | GROUP group_name | ROLE role_name
Viewing Granted Roles
SHOW ROLE GRANT principal_specification
SHOW PRINCIPAL ON ROLE role_name
principal_specification:
  USER user_name | GROUP group_name | ROLE role_name
Grant/Revoke Privileges
GRANT privilege_action_type [, privilege_action_type] ...
  ON resource [, resource] ...
  TO principal_specification [, principal_specification] ...
  [WITH GRANT OPTION]
REVOKE [GRANT OPTION FOR] privilege_action_type [, privilege_action_type] ...
  ON resource [, resource] ...
  FROM principal_specification [, principal_specification] ...
REVOKE ALL PRIVILEGES
  FROM principal_specification [, principal_specification] ...
privilege_action_type:
  ALL | CREATE | READ | WRITE
resource:
  SERVER server_name | CONNECTOR connector_name | LINK link_name | JOB job_name
principal_specification:
  USER user_name | GROUP group_name | ROLE role_name
Viewing Granted Privileges
SHOW GRANT principal_specification [ON resource]
principal_specification:
  USER user_name | GROUP group_name | ROLE role_name
resource:
  SERVER server_name | CONNECTOR connector_name | LINK link_name | JOB job_name
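To make the principal_specification grammar concrete, here is a small sketch of how a client might parse one such clause. The class PrincipalSpecSketch is hypothetical, and treating the keyword as case-insensitive is an assumption; the design text does not say how the Sqoop client actually parses these commands.

```java
import java.util.Locale;

/** Hypothetical parsed form of "USER user_name | GROUP group_name | ROLE role_name". */
final class PrincipalSpecSketch {
    enum Type { USER, GROUP, ROLE }

    final Type type;
    final String name;

    private PrincipalSpecSketch(Type type, String name) {
        this.type = type;
        this.name = name;
    }

    /** Parses a single clause such as "USER hadoop"; rejects anything else. */
    static PrincipalSpecSketch parse(String spec) {
        String[] parts = spec.trim().split("\\s+");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected '<type> <name>' but got: " + spec);
        }
        // Type.valueOf throws IllegalArgumentException for an unknown keyword.
        Type type = Type.valueOf(parts[0].toUpperCase(Locale.ROOT));
        return new PrincipalSpecSketch(type, parts[1]);
    }
}
```

For instance, parse("USER hadoop") yields a principal of type USER named hadoop, while an unknown keyword such as HOST is rejected with an IllegalArgumentException.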
- The RESTful API is handled by org.apache.sqoop.handler.AuthorizationEngine.java in sqoop-server
- POST /authorization/roles/create
- Create new role with {name}
- DELETE /authorization/role/{role-name}
- GET /authorization/roles
- Show all roles
- GET /authorization/principals?role_name={name}
- Show all principals in role with {name}
- GET /authorization/roles?principal_type={type}&principal_name={name}
- Show all roles granted to the principal with {name, type}
- PUT /authorization/roles/grant
- Grant a role to a user/group/role
- PUT data of JsonObject role(name) and principal (name, type)
- PUT /authorization/roles/revoke
- Revoke a role from a user/group/role
- PUT data of JsonObject role(name) and principal (name, type)
- PUT /authorization/privileges/grant
- Grant a privilege to a principal
- PUT data of JsonObject principal(name, type) and privilege (resource-name, resource-type, action, with-grant-option)
- PUT /authorization/privileges/revoke
- Revoke a privilege from a principal
- PUT data of JsonObject principal(name, type) and privilege (resource-name, resource-type, action, with-grant-option)
- If privilege is null, then revoke all privileges for principal(name, type)
- GET /authorization/privileges?principal_type={type}&principal_name={name}&resource_type={type}&resource_name={name}
- Show all privileges in principal with {name, type} and resource with {resource-name, resource-type}
- If resource is null, then show all privileges in principal with {name, type}
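As an illustration of the GET /authorization/privileges call, the sketch below builds the request path and query string. The helper PrivilegeQuerySketch is hypothetical, and expressing a null resource by omitting both resource parameters is an assumption; the design text only says that a null resource returns all privileges of the principal.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper that builds the path for the privileges query. */
final class PrivilegeQuerySketch {

    /** Pass null for both resource arguments to list every privilege of the principal. */
    static String path(String principalType, String principalName,
                       String resourceType, String resourceName) {
        StringBuilder sb = new StringBuilder("/authorization/privileges");
        sb.append("?principal_type=").append(encode(principalType));
        sb.append("&principal_name=").append(encode(principalName));
        if (resourceType != null && resourceName != null) {
            sb.append("&resource_type=").append(encode(resourceType));
            sb.append("&resource_name=").append(encode(resourceName));
        }
        return sb.toString();
    }

    // Form-encodes a query value (spaces become '+').
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }
}
```

Encoding each value keeps names that contain spaces or reserved characters from corrupting the query string.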
Sentry implementation
- Sentry could be used as an alternative access controller
- Config in sqoop.properties
#org.apache.sqoop.authorization.handler=org.apache.sqoop.security.SentryAuthorizationHandler
#org.apache.sqoop.authorization.controller=org.apache.sqoop.security.SentryAccessController
#org.apache.sqoop.authorization.validator=org.apache.sqoop.security.SentryAuthorizationValidator
- Use Sentry to check access privileges
- Set access privileges using Hue (optional)
Database design
- Role table
- Id
- Name
- Comment
- Role name could be admin, developer, user, etc.
- Role_User_Group table
- Id
- Role_id
- User_name
- Group_name
- Comment
- The information of user and group comes from Linux or LDAP etc.
- Only one of user name and group name is set. If user name is set and group name is left empty, this user has this role. If group name is set and user name is left empty, all users in this group have this role.
- One user/group could have one or multiple roles.
- Privilege table
- Id
- Role_id
- Resource_id
- Resource_type
- Action_type
- Comment
- Resource type can be any existing resource table, such as connector, link, job, etc.
- More resource types could be added in the future, for example config.
- If resource_id is 0, it means all resources of this type, e.g. resource_id=0 and resource_type=link means all links.
- Resource id and resource type together identify the resource, e.g. resource_id=1 and resource_type=link means the resource returned by “select * from link where id = 1”.
- Action type could be read, create, update, delete, use etc.
- Accordingly, MRole, MRoleUserGroup and MPrivilege classes are added into package org.apache.sqoop.model.
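The resource_id=0 convention of the Privilege table can be sketched as a simple matching rule. The class PrivilegeRowSketch below is hypothetical and only mirrors the listed columns; it is not the MPrivilege class itself, whose fields are not given in this design.

```java
/** Hypothetical in-memory view of one row of the Privilege table. */
final class PrivilegeRowSketch {
    final long roleId;
    final long resourceId;      // 0 means "every resource of this type"
    final String resourceType;  // e.g. "connector", "link", "job"
    final String actionType;    // e.g. "read", "create", "update"

    PrivilegeRowSketch(long roleId, long resourceId, String resourceType, String actionType) {
        this.roleId = roleId;
        this.resourceId = resourceId;
        this.resourceType = resourceType;
        this.actionType = actionType;
    }

    /** True if this row grants the requested action on the requested resource. */
    boolean covers(long requestedId, String requestedType, String requestedAction) {
        return resourceType.equals(requestedType)
                && actionType.equals(requestedAction)
                && (resourceId == 0 || resourceId == requestedId);
    }
}
```

A row with resource_id=0 and resource_type=link therefore covers link 1 and link 42 alike, while a row with resource_id=1 covers only that one link.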
14 Comments
Veena Basavaraj
Not sure every design feature yet, but overall like the structure of this doc!
Qian Xu
About the "Resource and actions" section. In Link level, Create and Update are separated. In Job level, Create/Start are grouped. Update/Stop are grouped. Can you revise this part?
Jarek Jarcec Cecho
Very nicely written proposal Richard! I do have a few questions for the privilege model:
richard
Jarek Jarcec Cecho, thanks for your comment. Here are my answers:
Abraham Elmahrek
Some more notes:
richard
Abraham Elmahrek, the admin role, which is granted all privileges, is set in sqoop.properties and will be initially created when starting the Sqoop server. And there is a with_grant_option which indicates whether this user could grant his/her privilege to other users. All users have access to RBAC APIs, but they may not have the privilege to grant a role/privilege. The RBAC will check the principal before running the command.
Jarek Jarcec Cecho
Thank you for answering all the questions richard! I have a few more notes in the discussion:
I wasn't opposed to the global privileges originally, but as I'm following the discussion I'm starting to think that having those will unnecessarily complicate the system. It seems that we want to have an admin user in the configuration (a similar concept is already in Sentry and other systems so it makes sense to me) - this user can change any privileges and already has the "global privileges". Which I think further diminishes the value of global privileges (e.g. their "usability").
An additional, smaller concern is that the global privileges will also complicate the admin's life a bit, as there are now two privileges for the same thing - e.g. I can grant a user access to global links and specifically to one link object. I would expect that a revoke on the link object should prevent the user from accessing it, but in this case, as the user also has the global privilege, it would still be accessible.
Abraham Elmahrek
I have a few more thoughts on the privilege model:
Jarek Jarcec Cecho
Great points Abraham Elmahrek!
richard
Thanks Abraham Elmahrek and Jarek Jarcec Cecho.
I have modified the page: remove "submission" type and add action "start_stop", "status" in "job" type.
For action hierarchy, I have added a hierarchy map. Please help to review. Thanks.
Abraham Elmahrek
richard
Abraham Elmahrek, thanks for your comment. Here is my thought.
So, I suggest that
Jarek Jarcec Cecho
Thank you for the nice summary richard. I think that I see and I would agree with all your points with the small exception of the CREATE privilege.
Hive does have several layers of objects - Server -> Database -> Table, where each of the objects can have the CREATE privilege. In practice this is used to divide and conquer - global administrators give the CREATE privilege on a given database to a certain subset of users, so that those users can do anything they need (they are full admins, but only within this database). Also, in order to create an object, one doesn't need the READ privilege on any other object - the semantics of CREATE is that you can create any children for the current node.
In our case, the situation is different - CREATE makes sense only on the Server instance, as we have only two levels of hierarchy. And as the server is a singleton (there is only one server), you can't use this privilege to divide and conquer. You either have it for everything or for nothing - that pretty much overlaps with the admin role that can do everything. Also, we are effectively using the READ privilege to distinguish whether a user can create certain objects, so it seems that CREATE is not adding much value.
richard
Thanks Jarek Jarcec Cecho for clarification. I want to confirm one thing. Maybe my misunderstanding.
Does server only have ALL privilege?
From my perspective, there are four types of action privileges on SERVER level: READ, WRITE, CREATE and ALL, which means that if user has READ privilege on SERVER, then he/she has the READ privilege of CONNECTOR, LINK and JOB. In this case, the SERVER privilege could be divided and conquered.
If SERVER has only the ALL privilege, why is that? Why not let SERVER have READ, WRITE and CREATE?