Current state: Accepted
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
As more enterprises have started using Kafka, there is an increasing demand for authorization of who can publish to or consume from topics. Authorization can be based on different session attributes or context, such as user, IP, or the common name in a certificate. An extensible authorization interface will let us implement the core requirements in the initial phase and make Kafka enterprise ready. A pluggable interface will also enable other security-focused products to provide more advanced, enterprise-grade implementations.
Binary log format
The network protocol and api behavior
The APIs will now perform authorization, so clients will see a new exception if they are not authorized for an operation.
Any class in the public packages under clients
Configuration, especially client configuration
Command line tools and arguments
Describe topic will display acl info.
- Anything else that will likely break existing users in some way when they upgrade
Public Interfaces and classes
The Session class is taken from https://reviews.apache.org/r/27204/.
In order to produce to a topic, the principal will require WRITE on the TOPIC.
In order to consume from a topic using the new consumer API, the principal will need: READ on TOPIC and READ on CONSUMER-GROUP.
In order to edit topic config or add acls, the principal will require ALTER on the TOPIC.
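The three requirements above can be read as a lookup from an action to the set of (operation, resource type) pairs a principal must hold. The following is a minimal sketch of that reading, not the proposed implementation: the names `REQUIRED_PERMISSIONS` and `is_permitted` and the action keys are hypothetical, and resource names (which topic, which group) are deliberately left out.

```python
# Hypothetical illustration: action keys and function names are not part of the proposal.
# Each action maps to the (operation, resource type) pairs listed in the text.
REQUIRED_PERMISSIONS = {
    "produce": {("WRITE", "TOPIC")},
    "consume": {("READ", "TOPIC"), ("READ", "CONSUMER-GROUP")},
    "alter_topic": {("ALTER", "TOPIC")},
}


def is_permitted(action, granted):
    """Return True only if every pair required by `action` is present in `granted`."""
    return REQUIRED_PERMISSIONS[action] <= set(granted)
```

Under this sketch, a consumer holding only READ on TOPIC would be rejected, because READ on CONSUMER-GROUP is also required.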
Out of the box implementation of the Authorizer.
Self-contained, with no dependencies on any other vendor or provider.
Will contain an ACLCache that will cache the broker acls and topic-specific acls with a TTL of 1 hour.
Deny will take precedence over Allow in competing acls. For example, if 2 acls are defined, one that allows an operation from all hosts and one that denies the operation from host1, the operation from host1 will be denied.
When no acl is attached to a resource, this implementation will always fail closed (deny all requests).
When any acl is attached to a resource, only users that are in the allowed list will have access. All users with no explicit allow acls will be denied access by default.
Principals that have READ or WRITE permission will also be allowed the DESCRIBE operation, without having to specify explicit acls.
It will use zookeeper as the storage layer for acls. Acls will be stored in json format described below under /kafka-acls/resource-type/<resource-name>.
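The decision rules listed above (fail closed, deny over allow, and READ or WRITE implying DESCRIBE) can be sketched as a single function over the acls attached to one resource. This is an illustrative model only: the `Acl` fields, the `"*"` wildcard for "all hosts", and the function name `authorize` are assumptions made for the sketch, and it does not reflect the stored json format or the cache.

```python
from dataclasses import dataclass

ANY_HOST = "*"  # assumption: wildcard standing for "all hosts"


@dataclass(frozen=True)
class Acl:
    """Hypothetical in-memory acl; field names are illustrative only."""
    principal: str
    permission: str  # "ALLOW" or "DENY"
    operation: str   # e.g. "READ", "WRITE", "DESCRIBE"
    host: str = ANY_HOST


def authorize(acls, principal, operation, host):
    """Decide access to one resource from the acls attached to it."""
    if not acls:
        return False  # no acl attached: fail closed

    def applies(acl, operations):
        return (acl.principal == principal
                and acl.operation in operations
                and acl.host in (ANY_HOST, host))

    # Deny takes precedence over any competing allow.
    if any(a.permission == "DENY" and applies(a, {operation}) for a in acls):
        return False

    # An allow for READ or WRITE also grants DESCRIBE.
    allow_operations = {operation}
    if operation == "DESCRIBE":
        allow_operations |= {"READ", "WRITE"}
    return any(a.permission == "ALLOW" and applies(a, allow_operations) for a in acls)
```

With one acl allowing READ from all hosts and one denying READ from host1, this sketch denies the request from host1 and allows it from any other host, matching the example in the text.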
Example Acl Json That will be stored in zookeeper
Changes to existing classes
KafkaServer will initialize the authorizer based on the value of the authorizer.class.name config.
KafkaApis will have an additional field, authorizer, which will be passed in by KafkaServer at server initialization. KafkaApis will call authorizer.authorize for every request that needs to be authorized. If the function returns false, KafkaApis will throw an AuthorizationException.
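The check-then-throw flow described above might look roughly like the following. This is a sketch under stated assumptions: the `authorize(session, operation, resource)` argument list, the `handle_request` wrapper, and the `handler` callback are hypothetical, and `AuthorizationException` here is a local stand-in rather than the real class.

```python
class AuthorizationException(Exception):
    """Local stand-in for the exception raised when a check fails."""


def handle_request(authorizer, session, operation, resource, handler):
    """Run `handler` only if the authorizer approves the request.

    Assumption: the authorizer exposes authorize(session, operation, resource)
    and returns a boolean.
    """
    if not authorizer.authorize(session, operation, resource):
        raise AuthorizationException(
            f"{session} is not authorized to perform {operation} on {resource}"
        )
    return handler()
```

Keeping the check in one wrapper means every request path fails in the same way, which is what lets clients rely on a single new exception type.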
I have considered ZooKeeper node ACLs out of scope for this document. If we decide to make them part of this KIP, we will have to change ZKUtils so it can set acls on all zkNodes. I already have an implementation for this (not tested yet); however, we will have to wait for KIP-4 to be merged.
KafkaConfig will have the following additional configurations.
- authorizer.class.name: FQCN of the authorizer class to be used. The provided class must implement the Authorizer interface.
- kafka.superusers: list of users that will be given superuser access. These users will have access to everything. Users should set this to the user the kafka broker processes are running as, to avoid duplicate configuration for every single topic, like ALLOW REPLICATION to BROKER_USER for TOPIC from ALL hosts.
- authorizer.config.path: path to a properties file that will contain authorizer-specific configuration. In the case of the DefaultAuthorizer implementation, this file can contain the following 2 configs:
  - zookeeper.url: comma-separated list of zookeeper host:port that the default authorizer should use to store all the acls. Useful when the acl zookeeper store needs to be different from the kafka zookeeper.
  - allow.everyone: The default authorizer implementation denies access to everyone and expects specific allow acls to be defined to grant access. If this flag is set to true, it will allow access from everyone unless an explicit deny is found to revoke access.
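One way to read how kafka.superusers and allow.everyone change the default decision is sketched below. It is an interpretation, not the proposed code: the function name and parameters are hypothetical, and the ordering (superuser check first, then deny, then allow or allow.everyone) is an assumption drawn from "access to everything" and "unless an explicit deny is found".

```python
def effective_decision(principal, has_matching_deny, has_matching_allow,
                       superusers=frozenset(), allow_everyone=False):
    """Combine acl lookup results with the two config flags.

    Assumptions: superusers bypass acl checks entirely, and allow_everyone
    only replaces the need for an explicit allow; it never overrides a deny.
    """
    if principal in superusers:
        return True
    if has_matching_deny:
        return False
    return has_matching_allow or allow_everyone
```

With allow_everyone left at False and no superusers, this reduces to the default behavior: access requires an explicit allow and no matching deny.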
Authentication and session initialization details are out of scope for this document. We will assume that authentication is done before any authorization happens and that the session instance is properly initialized. As mentioned above, we assume that on a secure connection the session has its principal set to the authenticated user, and on non-secure connections it is set to a special principal whose name() function returns "Anonymous".
Since this is a pluggable architecture, users can easily replace the default provider implementation by writing their own custom provider and supplying that class's FQCN as the value of the authorizer.class.name config. On the server side, during initialization, KafkaServer will read the value of authorizer.class.name, create an instance of the specified class, and call its init method with a KafkaConfig parameter. This instance will be passed as a constructor argument to KafkaApis.
If the value of authorizer.class.name is null, in secure mode the cluster will fail with a ConfigException. In non-secure mode, in the absence of a value for authorizer.class.name, the server will allow all requests to all topics, even if a topic has acls configured. This is done purely for backwards compatibility, and it will be a security hole. To avoid this, we can always default to SimpleAclAuthorizer, which will only allow access to topics that have acls configured to allow access for Anonymous users.
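The startup behavior described above (load a class by name, call init, and handle a missing value differently in secure and non-secure mode) can be sketched as follows. This is a Python analogue for illustration only: `create_authorizer`, `AllowAllAuthorizer`, the dict-based config, and the local `ConfigException` are all hypothetical stand-ins, and a real broker would resolve the FQCN on the JVM classpath rather than through `importlib`.

```python
import importlib


class ConfigException(Exception):
    """Local stand-in for the configuration error raised in secure mode."""


class AllowAllAuthorizer:
    """Hypothetical stand-in for 'no authorizer configured' in non-secure mode."""

    def init(self, config):
        pass

    def authorize(self, session, operation, resource):
        return True  # backwards-compatible, but a security hole


def create_authorizer(config, secure):
    """Build the authorizer named by authorizer.class.name, or fall back."""
    class_name = config.get("authorizer.class.name")
    if class_name is None:
        if secure:
            raise ConfigException("authorizer.class.name must be set in secure mode")
        return AllowAllAuthorizer()
    # Split "package.module.ClassName" into module path and class name.
    module_name, _, attribute = class_name.rpartition(".")
    authorizer_class = getattr(importlib.import_module(module_name), attribute)
    authorizer = authorizer_class()
    authorizer.init(config)  # mirrors "call its init method" with the config
    return authorizer
```

The returned instance is what would then be handed to the request-handling layer at construction time.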
Acl Management (CLI)
Please see Kafka Authorization Command Line Interface
Out of scope
* Admin APIs (e.g. Create/Alter/Delete) will not invoke the authorizer until KIP-4 is done.
* Setting correct acls on zookeeper nodes.
Compatibility, Deprecation, and Migration Plan
What impact (if any) will there be on existing users?
This shouldn't affect any existing users.
If we are changing behavior how will we phase out the older behavior?
No phase-out is needed. The default implementation would maintain all existing behavior.
If we need special migration tools, describe them here.
Mirror maker will have to start using the new acl management tool.
When will we remove the existing behavior?
We originally proposed to store the acls as part of TopicConfig, with no ACL management APIs exposed. This had several advantages: a simpler implementation; fewer public APIs and classes (ACL, KafkaPrincipal, and Resource were all private); out-of-the-box support for mirror maker; cleanup of acls on topic deletion; and reuse of some of the existing infrastructure for propagating topic config changes. However, this approach had the drawback of mixing acls with topic config, which breaks separation of concerns. It could also have confused users of a custom authorizer, since a custom authorizer could completely ignore the acls set through topic config. To overcome this, we moved to exposing the ACL management APIs as public APIs that every authorizer must implement, and to requiring every authorizer to maintain its own ACL storage outside of topic config.