Status
Current state: Under Discussion
Discussion thread: here
JIRA: here [TBD]
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
This KIP only applies to StandardAuthorizer
Motivation
Kafka authorizes access to resources like topics, consumer groups etc. by way of ACLs.
Currently, when adding new ACLs in Kafka, we have two types of resource patterns for topics and groups: (Documentation https://kafka.apache.org/documentation/#security_authz_cli look for --resource-pattern-type [pattern-type])
- LITERAL
- PREFIXED
If we create a GLOB pattern type that matches the glob wildcard characters ("*" and "?"), it would help organizations streamline their ACL management by reducing the number of ACLs they need.
Example scenarios :
Let's say we need to create ACLs for the following 6 topics: nl-accounts-localtopic, nl-accounts-remotetopic, de-accounts-localtopic, de-accounts-remotetopic, cz-accounts-localtopic, cz-accounts-remotetopic
Currently, we achieve this using existing functionality by creating three prefixed ACLs as shown below:
kafka-acls --bootstrap-server localhost:9092 \
  --add \
  --allow-principal User:CN=serviceaccount,OU=ServiceAccountUsers,O=Unknown,L=Unknown,ST=Unknown,C=Unknown \
  --producer \
  --topic nl-accounts- \
  --resource-pattern-type prefixed

kafka-acls --bootstrap-server localhost:9092 \
  --add \
  --allow-principal User:CN=serviceaccount,OU=ServiceAccountUsers,O=Unknown,L=Unknown,ST=Unknown,C=Unknown \
  --producer \
  --topic de-accounts- \
  --resource-pattern-type prefixed

kafka-acls --bootstrap-server localhost:9092 \
  --add \
  --allow-principal User:CN=serviceaccount,OU=ServiceAccountUsers,O=Unknown,L=Unknown,ST=Unknown,C=Unknown \
  --producer \
  --topic cz-accounts- \
  --resource-pattern-type prefixed
However, if we supported a GLOB pattern where "?" matches a single character and "*" matches any number of characters, we could accomplish this with a single ACL, as illustrated here (the pattern is quoted so that the shell does not expand it):
kafka-acls --bootstrap-server localhost:9092 \
  --add \
  --allow-principal User:CN=serviceaccount,OU=ServiceAccountUsers,O=Unknown,L=Unknown,ST=Unknown,C=Unknown \
  --producer \
  --topic '??-accounts-*' \
  --resource-pattern-type glob
The same applies to consumer groups as well as other ResourceTypes:
kafka-acls --bootstrap-server localhost:9092 \
  --add \
  --allow-principal User:CN=serviceaccount,OU=ServiceAccountUsers,O=Unknown,L=Unknown,ST=Unknown,C=Unknown \
  --producer \
  --group '*-testgroup-*' \
  --resource-pattern-type glob
Notes:
- PREFIXED patterns can be evaluated by the GLOB processor by appending "*" to the prefix.
- LITERAL patterns can be evaluated by the GLOB processor as-is.
- Deprecation of PREFIXED and LITERAL may be undertaken later, but it is not part of this KIP.
ACL precedence :
The ACL precedence does not change.
- Kafka evaluates both the PREFIXED and LITERAL ACLs for the topic.
- If either ACL is a deny ACL, it will take precedence and block access.
- In the absence of a DENY ACL the most specific matching ACL will take precedence.
When GLOB patterns are in use, the most specific matching pattern (the one whose wildcards skip the fewest letters) is selected.
| Glob ACL → Search ↓ | Hello | HelloWo*d | HelloW* | He*m*m |
|---|---|---|---|---|
| Hello | X | |||
| HelloWorld | X | |||
| HelloWonderful | X | |||
| HelloWorldWide | X | |||
| HelloDolly | X | |||
| Hermesmoon | X | |||
| Helmsman | X | |||
| HelloWiked | X | |||
| Helpme |
Support for adding ACLs to such 'match resource patterns' will greatly simplify ACL operations.
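The precedence rules above can be sketched as follows. This is only an illustration under stated assumptions, not the proposed implementation: the class `GlobPrecedenceSketch`, the holder `GlobAcl`, and the methods `toRegex`, `skipped`, and `select` are hypothetical names, glob matching is delegated to `java.util.regex` for brevity, and "skipped letters" is interpreted as the number of target characters consumed by "*" or "?".

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

final class GlobPrecedenceSketch {
    enum Permission { ALLOW, DENY }

    /** Hypothetical holder pairing a glob pattern with its permission. */
    static final class GlobAcl {
        final String pattern;
        final Permission permission;

        GlobAcl(String pattern, Permission permission) {
            this.pattern = pattern;
            this.permission = permission;
        }
    }

    /** Translates a glob into a regex: '*' becomes ".*", '?' becomes ".", the rest is quoted. */
    static Pattern toRegex(String glob) {
        StringBuilder sb = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') sb.append(".*");
            else if (c == '?') sb.append('.');
            else sb.append(Pattern.quote(String.valueOf(c)));
        }
        return Pattern.compile(sb.toString(), Pattern.DOTALL);
    }

    /** Target characters covered by wildcards, assuming the pattern matches the target. */
    static int skipped(String glob, String target) {
        int literals = 0;
        for (char c : glob.toCharArray()) {
            if (c != '*' && c != '?') literals++;
        }
        return target.length() - literals;
    }

    /** Any matching DENY wins; otherwise the matching ACL with the fewest skipped letters. */
    static Optional<GlobAcl> select(List<GlobAcl> acls, String target) {
        GlobAcl best = null;
        for (GlobAcl acl : acls) {
            if (!toRegex(acl.pattern).matcher(target).matches()) continue;
            if (acl.permission == Permission.DENY) return Optional.of(acl);
            if (best == null || skipped(acl.pattern, target) < skipped(best.pattern, target)) {
                best = acl;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

With ALLOW ACLs for "HelloW*" and "HelloWo*d", a lookup of "HelloWorld" selects "HelloWo*d" because it skips two letters rather than four.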
Usability :
With the ACL system becoming a complex web of patterns, it is incumbent upon the development team to provide tools to assist in permissions problem determination.
- There should be a tool that will provide a list of all ACLs that impact the decision to allow or deny access for a principal to a topic based on principal ID, host, and operation. This will assist operators in rapidly determining the reason for access denied errors.
- There should be a tool to show the effects of adding an ACL. Using the example from above, adding "*-accounts-*" should list nl-accounts-localtopic, nl-accounts-remotetopic, de-accounts-localtopic, de-accounts-remotetopic, cz-accounts-localtopic, and cz-accounts-remotetopic as affected.
- There should be a tool to show the effects of adding a topic. Using the example from above, adding "us-accounts-privatetopic" should report that "*-accounts-*" will influence the permissions calculations for the new topic.
I would like to propose this kind of tooling in a separate KIP.
Public Interfaces
- Modification of the org.apache.kafka.common.resource.PatternType enum to add "GLOB" as a pattern type.
- Modification of the org.apache.kafka.common.resource.ResourcePatternFilter to properly filter for GLOB patterns.
- Modification of the org.apache.kafka.common.acl.AccessControlEntryData to handle GLOB patterns.
- Addition of GLOB matching utilities or a GLOB class that will match strings to GLOB patterns. Potential source candidates include Plexus matching utility code.
- Modification of the metadata/authorizer packages to account for GLOB patterns in resource name as well as principal and host strings
- Change to the Kafka principal prohibiting the colon ":" as part of the type string. This is currently implied but not enforced.
- Addition of a static "parse(String)" method to the KafkaPrincipal to parse a principal string into a KafkaPrincipal object. Everything up to the first colon ':' will be assigned as the principal type, everything after will be assigned to the principal name. This will centralize the parsing and reduce the possibility of parsing errors in the codebase.
- Modification of the org.apache.kafka.server.authorizer.Authorizer to reimplement or remove the authorizeByResourceType implementation to account for GLOB types.
- Modification of the kafka.admin.AclCommand class to update multiple methods, such as getResourceFilter, and the AclCommandOptions object used for parsing arguments
- Modification of the org.apache.kafka.jmh.acl.AuthorizerBenchmark class to update multiple methods like setup and prepareAclCache
- Modification of org.apache.kafka.jmh.acl.StandardAuthorizerUpdateBenchmark class to update prepareAcls method
- Modification of org.apache.kafka.metadata.authorizer.StandardAuthorizerData class to update authorize method
- Modification of org.apache.kafka.controller.AclControlManager class to update validateNewAcl method
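The parsing rule described in the list above for the proposed static "parse(String)" method (everything up to the first colon is the type, everything after it is the name) might look roughly like the sketch below. The `Parts` class is a hypothetical stand-in for KafkaPrincipal so that the example is self-contained, and rejecting input without a colon is an assumption, since the KIP does not say how such input should be handled.

```java
final class PrincipalParseSketch {
    /** Hypothetical stand-in for KafkaPrincipal holding the two parsed parts. */
    static final class Parts {
        final String type;
        final String name;

        Parts(String type, String name) {
            this.type = type;
            this.name = name;
        }
    }

    /** Splits at the FIRST colon only, so the name may itself contain colons. */
    static Parts parse(String text) {
        if (text == null) {
            throw new IllegalArgumentException("principal string must not be null");
        }
        int colon = text.indexOf(':');
        if (colon < 0) {
            // Assumption: input without a colon is rejected rather than defaulted.
            throw new IllegalArgumentException("expected <type>:<name> but got: " + text);
        }
        return new Parts(text.substring(0, colon), text.substring(colon + 1));
    }
}
```

Splitting at the first colon is what makes prohibiting ":" in the type string sufficient: a string such as "User:a:b" parses unambiguously as type "User" and name "a:b".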
Proposal for Locating matches in a GLOB based Authorizer
Search requirements
- Matching DENY ACLs always override matching ALLOW ACLs.
- More specific string matching overrides less specific string matching.
Notes and Nomenclature
- This proposal is based on utilizing the Trie implementation found in KAFKA-17423.
- The Trie is based on the resource name, each node contains a set of ACLs that are associated with the name.
- The string we are searching for is called the TARGET
- The string we are comparing against is called the CANDIDATE.
- GLOB characters are "?" and "*".
- "?" matches a single character
- "*" matches zero or more characters.
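As an illustration of the wildcard semantics listed above, the following is a minimal sketch of matching a CANDIDATE glob against a TARGET string with a greedy scan that backtracks to the most recent "*". The class and method names are hypothetical; the KIP itself proposes reusing existing matching utilities rather than this code.

```java
final class GlobMatchSketch {
    /** Returns true when the whole target is matched by the glob pattern. */
    static boolean matches(String pattern, String target) {
        int p = 0;          // position in the pattern (CANDIDATE)
        int t = 0;          // position in the target (TARGET)
        int starP = -1;     // index of the most recent '*' in the pattern
        int starT = 0;      // target position where that '*' started matching
        while (t < target.length()) {
            if (p < pattern.length() && pattern.charAt(p) == '*') {
                // Remember the star and first try to let it match zero characters.
                starP = p++;
                starT = t;
            } else if (p < pattern.length()
                    && (pattern.charAt(p) == '?' || pattern.charAt(p) == target.charAt(t))) {
                p++;
                t++;
            } else if (starP >= 0) {
                // Mismatch: let the last star absorb one more target character.
                p = starP + 1;
                t = ++starT;
            } else {
                return false;
            }
        }
        // Trailing stars can match the empty remainder.
        while (p < pattern.length() && pattern.charAt(p) == '*') {
            p++;
        }
        return p == pattern.length();
    }
}
```

For example, `matches("??-accounts-*", "nl-accounts-localtopic")` is true, while `matches("??-accounts-*", "usa-accounts-x")` is false because "?" matches exactly one character.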
Matching
Matching the resource name
It is important to remember that the patterns with GLOBS are the CANDIDATE stored in the Trie and are not the TARGET being searched for.
The Trie naturally splits resource names where there is a distinction between two names (e.g. foobar and foocar will result in a "foo" node with two children "bar" and "car"). When inserting a pattern with GLOB characters, the insert algorithm will create child nodes that contain only the GLOB character.
The Trie implementation starts at the root node which contains the empty string. It then begins a recursive descent of the trie by executing the Descent Process on the root node.
Descent process
- Is there a matching DENY ACL on this node? If so, return DENY.
- Is there a matching LITERAL ACL on this node? If so, return the result.
- Is there a child node that continues the TARGET pattern? If so, recurse into the Descent process.
- If this point is reached, stop the Descent process and begin the Ascent process.
Ascent process
- Is this the root node? If so, return NO MATCH.
- Is there a matching GLOB ACL on this node? If so, return the result.
- Is there a "?" character child of this node? If so, execute the Descent process on the "?" node.
- Is there a "*" character child of this node? If so, execute the Descent process for each potential block of characters. (This handles the multiple-character nature of the "*" wildcard.)
- If this point is reached, move to the parent node and execute the Ascent process again.
The above process will match the resource names and will distribute them through a Trie so that the search will be much faster.
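The idea of storing glob patterns in the Trie and trying the literal continuation before the wildcard children can be illustrated with the simplified sketch below. It is not the KAFKA-17423 Trie: it assumes one character per node instead of compressed fragments, it collects every matching pattern rather than returning an ALLOW or DENY decision, and it assumes TARGET strings contain no "?" or "*" characters. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class GlobTrieSketch {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        String pattern; // non-null when a stored pattern ends at this node
    }

    private final Node root = new Node();

    /** Stores a pattern; '?' and '*' become ordinary child nodes holding only that character. */
    void insert(String pattern) {
        Node node = root;
        for (char c : pattern.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.pattern = pattern;
    }

    /** Returns the stored patterns that match the target, literal paths first. */
    List<String> findMatches(String target) {
        Set<String> out = new LinkedHashSet<>(); // de-duplicates, keeps discovery order
        search(root, target, 0, out);
        return new ArrayList<>(out);
    }

    private void search(Node node, String target, int pos, Set<String> out) {
        if (pos == target.length() && node.pattern != null) {
            out.add(node.pattern);
        }
        if (pos < target.length()) {
            // 1. Follow the child that continues the target literally.
            Node exact = node.children.get(target.charAt(pos));
            if (exact != null) search(exact, target, pos + 1, out);
            // 2. A '?' child consumes exactly one target character.
            Node single = node.children.get('?');
            if (single != null) search(single, target, pos + 1, out);
        }
        // 3. A '*' child may consume any block of zero or more characters.
        Node star = node.children.get('*');
        if (star != null) {
            for (int next = pos; next <= target.length(); next++) {
                search(star, target, next, out);
            }
        }
    }
}
```

Because the literal branch is explored first, a stored "HelloWorld" is found before "HelloWo*d" and "HelloW*" when searching for "HelloWorld", which mirrors the intent of running the Descent process before falling back to wildcard nodes.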
Matching Kafka Principal
We define a GlobPrincipal as follows:
import java.util.function.Predicate;
import org.apache.kafka.common.security.auth.KafkaPrincipal;

// MatchPattern stands for the GLOB matching utility proposed under Public Interfaces;
// KafkaPrincipal.parse(String) is the static method proposed by this KIP.
public class GlobPrincipal {
    private final KafkaPrincipal principal;
    private final Predicate<KafkaPrincipal> matcher;

    public GlobPrincipal(String pattern) {
        this(KafkaPrincipal.parse(pattern));
    }

    public GlobPrincipal(KafkaPrincipal principal) {
        this.principal = principal;
        this.matcher = globMatcher(principal);
    }

    // A null or empty string is treated as a glob, as in the original listing.
    private static boolean hasGlob(String s) {
        return s == null || s.isEmpty() || s.contains("*") || s.contains("?");
    }

    // Builds the matcher once so that each lookup only evaluates predicates.
    private static Predicate<KafkaPrincipal> globMatcher(KafkaPrincipal principal) {
        if (hasGlob(principal.toString())) {
            Predicate<String> typePredicate;
            if (hasGlob(principal.getPrincipalType())) {
                MatchPattern typePattern = new MatchPattern(principal.getPrincipalType());
                typePredicate = typePattern::matches;
            } else {
                typePredicate = s -> principal.getPrincipalType().equals(s);
            }
            Predicate<String> namePredicate;
            if (hasGlob(principal.getName())) {
                MatchPattern namePattern = new MatchPattern(principal.getName());
                namePredicate = namePattern::matches;
            } else {
                namePredicate = s -> principal.getName().equals(s);
            }
            return other -> typePredicate.test(other.getPrincipalType())
                    && namePredicate.test(other.getName());
        } else {
            // No wildcards: compare the full "type:name" strings.
            return other -> principal.toString().equals(other.toString());
        }
    }

    @Override
    public final int hashCode() {
        return principal.hashCode();
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null) return false;
        if (getClass() != o.getClass()) return false;
        GlobPrincipal that = (GlobPrincipal) o;
        return this.principal.equals(that.principal);
    }

    public String getName() {
        return principal.getName();
    }

    public KafkaPrincipal getPrincipal() {
        return principal;
    }

    public boolean matches(String name) {
        return matches(KafkaPrincipal.parse(name));
    }

    public boolean matches(KafkaPrincipal other) {
        return matcher.test(other);
    }
}
When stored in the ACL list associated with a Trie node, the principal will be stored as a GlobPrincipal. This allows the ACLs on the nodes to be kept in sorted order for faster traversal, and the matches will include wildcard matching.
For other uses it performs GLOB character detection and will perform proper matching. This class will need to be used within the Client and Server code to perform matching. An additional "GlobPrincipals" class may be created to store a collection of glob principals and determine if any of the contained GlobPrincipal instances match a principal or string.
Matching Host
We define a GlobHost as follows:
import java.util.function.Predicate;

// MatchPattern stands for the GLOB matching utility proposed under Public Interfaces.
public class GlobHost {
    private final String pattern;
    private final Predicate<String> matcher;

    public GlobHost(String pattern) {
        this.pattern = pattern;
        this.matcher = globMatcher(pattern);
    }

    // A null or empty string is treated as a glob, as in the original listing.
    private static boolean hasGlob(String s) {
        return s == null || s.isEmpty() || s.contains("*") || s.contains("?");
    }

    private static Predicate<String> globMatcher(String pattern) {
        if (hasGlob(pattern)) {
            MatchPattern matchPattern = new MatchPattern(pattern);
            return matchPattern::matches;
        } else {
            // No wildcards: plain string equality is sufficient.
            return pattern::equals;
        }
    }

    @Override
    public final int hashCode() {
        return pattern.hashCode();
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null) return false;
        if (getClass() != o.getClass()) return false;
        GlobHost that = (GlobHost) o;
        return this.pattern.equals(that.pattern);
    }

    @Override
    public String toString() {
        return pattern;
    }

    public boolean matches(String other) {
        return matcher.test(other);
    }
}
When stored in the ACL list associated with a Trie node, the host will be stored as a GlobHost. This allows the ACLs on the nodes to be kept in sorted order for faster traversal, and the matches will include wildcard matching.
For other uses it performs GLOB character detection and will perform proper matching. This class will need to be used within the Client and Server code to perform matching. An additional "GlobHosts" class may be created to store a collection of glob hosts and determine if any of the contained GlobHost instances match a host string.
Pattern Types
A LITERAL match matches all the characters without wildcard expansion.
| Trie stored → Search ↓ | Hello | HelloWorld | H*World |
|---|---|---|---|
| Hello | T | F | F |
| HelloWorld | F | T | F |
| HappyWorld | F | F | F |
A PREFIXED match matches the stored characters as a prefix of the search string, as if a trailing wildcard were appended to the stored name.
| Trie stored → Search ↓ | Hello | HelloWorld | H*World |
|---|---|---|---|
| Hello | T | F | F |
| HelloWorld | T | T | F |
| HappyWorld | F | F | F |
A GLOB match only matches GLOB and PREFIXED labeled ACLs in the Trie; all LITERAL matches are ignored.
| Trie stored → Search ↓ | Hello | HelloWorld | H*World |
|---|---|---|---|
| Hello | T | F | F |
| HelloWorld | F | T | T* |
| HappyWorld | F | F | T |
* Search algorithm would return the LITERAL "HelloWorld" match before the wildcard match was found.
The GLOB pattern can replace both the LITERAL and the PREFIXED types, as the following rewrite table demonstrates:
| Pattern Type | Pattern | Equivalent GLOB pattern |
|---|---|---|
| Literal | SomeName | SomeName |
| Prefix | SomeName | SomeName* |
| Literal | * | (empty string) |
The current Trie implementation of the Literal "*" is to place the StandardACLs on the root node so they are located before any node that has characters.
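The rewrite table above can be expressed as a small helper. This is a sketch only: `LegacyPatternType` and `toGlob` are hypothetical names rather than part of the proposed API, and the sketch assumes that, apart from the LITERAL "*" wildcard, legacy resource names do not themselves contain "*" or "?".

```java
final class GlobRewriteSketch {
    /** Hypothetical enum covering the two existing pattern types in the rewrite table. */
    enum LegacyPatternType { LITERAL, PREFIXED }

    /** Returns the GLOB pattern equivalent to a LITERAL or PREFIXED resource name. */
    static String toGlob(LegacyPatternType type, String name) {
        if (type == LegacyPatternType.LITERAL) {
            // The LITERAL "*" wildcard maps to the empty string, i.e. the Trie root node.
            return "*".equals(name) ? "" : name;
        }
        // PREFIXED: append a trailing wildcard to the prefix.
        return name + "*";
    }
}
```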
Proposed Changes
Main changes include :
- Updating Authorizer
- AdminClient changes
- Updating cli
Detailed changes also include
- Modification of the org.apache.kafka.server.authorizer.Authorizer to push the authorizeByResourceType down into the AuthorizerData interface. This allows for authorizers that accept or do not accept wildcards.
- Modification of the kafka.security.authorizer.AclAuthorizer class to
- update authorizeByResourceType method and other methods
- update matchingAcls (this is performance sensitive, as it impacts latency of every producer and consumer client to get authorization. Verify AuthorizerBenchmark)
- Modification of the kafka.admin.AclCommand class to update multiple methods, such as getResourceFilter, and the AclCommandOptions object used for parsing arguments
- Modification of kafka.server.AuthHelper class to update authorize method
- Modification of the org.apache.kafka.jmh.acl.AuthorizerBenchmark class to update multiple methods like setup and prepareAclCache
- Modification of org.apache.kafka.jmh.acl.StandardAuthorizerUpdateBenchmark class to update prepareAcls method
- Modification of org.apache.kafka.metadata.authorizer.StandardAuthorizerData class to update authorize method
- Modification of org.apache.kafka.controller.AclControlManager class to update validateNewAcl method
- Updating tests
Compatibility, Deprecation, and Migration Plan
- What impact (if any) will there be on existing users?
- Existing ACLs will continue to work as expected. The addition of GLOB allows new formats for resource names, host names, and Kafka principals.
- If we are changing behavior how will we phase out the older behavior?
- This question is not applicable, because the KIP introduces new authorization functionality. The old behavior will continue to exist. Deprecation may occur at a later time.
- If we need special migration tools, describe them here.
- No, not required.
- When will we remove the existing behavior?
- It is not required to remove any existing behavior
Test Plan
- Create a set of topics with similar prefixes and suffixes, plus additional topics that match the GLOB patterns under test.
- Create an ACL with the LITERAL pattern type on one topic, and verify authorization on all other topics. Access to the other topics should be denied.
- Create an ACL with the PREFIXED pattern type, and verify that the created ACLs (both LITERAL and PREFIXED) work as expected.
- As performance is key, it is very important to test with a large set of ACLs in multiple combinations concurrently and validate against the defined benchmarks.
- Existing system tests will continue to function as they do today.
- We will add additional system tests to show that the GLOB implementation handles the wildcards correctly.
Trie vs KRaft Standard Search times
The data is available in KAFKA-17423.
However, the testing indicates that the Trie search times are at least an order of magnitude faster than the existing system.
Rejected Alternatives
- Rejected using the technique from bioinformatics called sequence characterization as it was too slow for the hot path. A poster describing this technique is available.
- Modifying the existing PREFIXED pattern to include internal GLOB characters. This breaks the Client code with no easy way forward.