Current state: DISCARDED
Discussion thread: https://firstname.lastname@example.org/msg99126.html
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
The discussion on this KIP led us to create KIP-519: Make SSL context/engine configuration extensible, which is now in ACCEPTED state, so this particular KIP is no longer needed.
Current Kafka versions support a file-based KeyStore and TrustStore via the ssl.keystore.location and ssl.truststore.location configurations, along with the required password configurations.
This configuration requirement creates challenges for larger Kafka deployments with the following setup:
- Having a company internal CA authority issuing the certificates
- Thousands of brokers
- 10x/100x number of Kafka Client boxes
- Requirement to use client auth
- Need to avoid storing key/trust store files on file system for stronger security
- Polyglot client base
The challenges are:
- Deploying the key/trust store files to thousands of brokers
- Deploying the key/trust store files to 10x/100x as many Kafka Client boxes
- Keeping all those key/trust store files and their passwords secure
- Operationally managing key rotations
- The lack of a unified way to distribute KeyStores and TrustStores across different languages
Primarily, the requirement to have the KeyStore and TrustStore locations on the file system manifests itself in many of the challenges listed above.
Assuming we have a custom Key Manager that provides a secure API to provision/access keys, we must find an alternative to the file-based KeyStore and TrustStore.
The primary motivation here is to provide an optional, custom way to load the KeyStore and TrustStore instead of relying on the file system. Of course, that needs to be accompanied by the ability to load the required passwords (for the KeyStore, TrustStore, and keys) accordingly.
We will introduce an optional way to load the KeyStore and TrustStore along with their required passwords, as applicable.
This will be done via:
- Introducing two new configurations: ssl.keystore.loader and ssl.truststore.loader.
- For each of the new configurations, a public interface as described below.
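A minimal sketch of what such loader interfaces could look like follows; the names and signatures here are illustrative, not the final KIP definitions (the two interfaces are shown package-private in one file for brevity, but would each be a separate public interface in Kafka).

```java
import java.security.KeyStore;
import java.util.Map;

// Illustrative sketch only -- not the final KIP interface definitions.
// A loader returns a fully initialized java.security.KeyStore so that Kafka
// never has to read a store from ssl.keystore.location / ssl.truststore.location.
interface KeyStoreLoader {
    // Receives the SSL configs so an implementation can read any custom
    // settings it needs (e.g. a key-manager endpoint or an auth token).
    void configure(Map<String, ?> configs);

    // Loads the KeyStore (private keys + certificates) from a custom source.
    KeyStore load() throws Exception;
}

interface TrustStoreLoader {
    void configure(Map<String, ?> configs);

    // Loads the TrustStore (trusted certificates) from a custom source.
    KeyStore load() throws Exception;
}
```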
Why do we not specify the key/trust store password as an input method argument in the interfaces?
We are not specifying the key/trust store passwords in the KeyStoreLoader/TrustStoreLoader load() method because we want to avoid a dependency on the caller class to load the password. This leaves it open to the Loader implementation to read the required configuration or use another mechanism to fetch the password. Typically, if you have a Key Manager solution, you might be using some sort of 'auth token' to access the Key Manager's API and might not require a key/trust store password (you will still need a password to unlock the keys, though).
Both the Kafka Client library and the Kafka Broker use the SslEngineBuilder class to load the KeyStore and TrustStore from the file-based configurations.
- As documented in the public interfaces section, we will introduce two interfaces that allow pluggable implementations to provide key/trust store loading.
- We will change the SslEngineBuilder#createSSLContext() method to invoke key/trust store loading from the new ssl configurations we introduce.
- Pseudocode for the changes in SslEngineBuilder#createSSLContext() is shown below.
- We will change the SslEngineBuilder#shouldBeRebuilt() method appropriately.
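The pseudocode below is a sketch of the intended change, not the actual patch; the helper names (instantiate, loadKeyStoreFromFile) and field names are illustrative.

```java
// Pseudocode: how SslEngineBuilder#createSSLContext() could consult the new configs.
private SSLContext createSSLContext() throws Exception {
    KeyStore keyStore;
    if (keystoreLoaderClass != null) {
        // New path: ssl.keystore.loader is set, so delegate to the plugin.
        KeyStoreLoader loader = instantiate(keystoreLoaderClass);
        loader.configure(sslConfigs);
        keyStore = loader.load();            // custom source, no file on disk
    } else if (keystoreLocation != null) {
        // Existing path: file-based store, behavior unchanged.
        keyStore = loadKeyStoreFromFile(keystoreLocation, keystorePassword);
    } else {
        keyStore = null;
    }
    // ... same pattern for the trust store via ssl.truststore.loader ...
    // Then build the KeyManagerFactory / TrustManagerFactory and initialize
    // the SSLContext exactly as today.
}
```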
Compatibility, Deprecation, and Migration Plan
- What impact (if any) will there be on existing users?
Existing users of file-based key/trust stores will not be impacted at all.
- If we are changing behavior how will we phase out the older behavior?
Older behavior does not change, so there is no need to phase it out.
- If we need special migration tools, describe them here.
No special migration tools needed.
- When will we remove the existing behavior?
We will keep the existing behavior and add optional new behavior.
Using the existing ssl.provider config
We experimented with the ssl.provider config and wrote a sample provider like the one below. However, it did not work for us, since our provider does not implement SSLContext.TLS/TLSv1.1/TLSv1.2, etc.
We should not have to implement the SSLContext classes in our provider, since we only intend to customize the TrustManager.
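An illustrative reconstruction of that kind of sample provider follows (the class and SPI names are hypothetical). It registers only a custom TrustManagerFactory and no SSLContext services, which is exactly why plugging it in via ssl.provider failed:

```java
import java.security.Provider;

// Hypothetical provider that supplies only a custom TrustManagerFactory.
public class CustomTrustStoreProvider extends Provider {
    public CustomTrustStoreProvider() {
        super("CustomTrustStore", "1.0", "Provider with a custom TrustManagerFactory only");
        // Register our SPI under the platform's default algorithm name.
        // (The SPI class name here is illustrative.)
        put("TrustManagerFactory.PKIX", "com.example.CustomTrustManagerFactorySpi");
        // Note: no "SSLContext.TLSv1.2" (etc.) entries, so
        // SSLContext.getInstance("TLSv1.2", thisProvider) throws
        // NoSuchAlgorithmException.
    }
}
```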
Writing a Java security provider and registering it in the JRE
An alternative to the ssl.provider configuration is to register the Java security provider in the JRE's jre/lib/security/java.security file. This way we would not run into the limitation mentioned in the rejected approach above.
However, there are the following challenges:
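For reference, such a registration entry in java.security looks like the following (the index and class name are illustrative):

```
# jre/lib/security/java.security -- the index N must be unique on the box
security.provider.10=com.example.CustomTrustStoreProvider
```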
- We have to modify the java.security file on the system, which creates a challenge similar to hosting the JKS files on the local file system: per-box maintenance, deployment, etc.
- Assuming modifying the java.security file is not a challenge (see KIP-492), we still have to write a Provider with a custom algorithm for TrustManagerFactory and KeyManagerFactory.
- When we write those factory implementations, there is no easy way to re-use the validation logic (example: OpenJDK's TrustManagerImpl) done by existing Providers.
- All the methods of the X509ExtendedTrustManager class are abstract, so we cannot easily re-use any standard implementations.
- We would end up copying the validation logic (example: OpenJDK's TrustManagerImpl), which is "security domain" centric.
- The validation logic currently deals with four things:
- client side cert checks
- server side cert checks
- certificate path validations
- end point identification verification
- We should not have to deal with the above validation logic in the first place just to load keys/certs from a source other than the file-based key/trust stores.
NOTE: You can only fully appreciate the above challenges once you try to write a Provider with Trust/Key Manager factories. We would highly encourage you to try writing such a provider (using another open-source library's provider as an example may not convey the difficulty) before you decide to comment on this approach.
One suggestion could be: why not use Java's built-in rails to "use any provider's implementation" for the key/trust manager AND just plug in our own keys/certs?
That is exactly what we are suggesting. Below is an example from our pseudocode using TrustManagerFactory.getInstance(); the same applies to KeyManagerFactory.
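A sketch of that pseudocode follows (the class and method names here are illustrative): we keep whatever provider the JVM selects for the default trust manager algorithm, and only swap in a KeyStore loaded from our custom source.

```java
import java.security.KeyStore;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;

public class CustomTrustStoreExample {
    // Builds trust managers from a KeyStore obtained elsewhere (e.g. from a
    // Key Manager API) while re-using the provider's full validation logic.
    public static TrustManager[] trustManagersFrom(KeyStore customTrustStore) throws Exception {
        // Default algorithm (usually "PKIX") from whichever provider the JVM
        // selects; we do not re-implement any certificate validation.
        TrustManagerFactory tmf =
            TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(customTrustStore);   // our certs, the provider's validation
        return tmf.getTrustManagers();
    }
}
```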
Another reason for rejecting this approach
Providers for "standard algorithms" are written in order to be re-used. Writing a Provider tied to one specific way of loading trust/key stores defeats the purpose of re-usable Providers.
Provide a way to delegate SSLContext creation
We could create a new configuration like ssl.context.loader/ssl.context.initializer and use the implementation class to obtain the object of javax.net.ssl.SSLContext instead of using SslEngineBuilder#createSSLContext().
This is more work than actually needed. We looked at Keys/Secrets Managers like HashiCorp's Vault as a sample integration and realized that the Vault API gives us a way to get keys/secrets/certs, but not the SSLContext object we need. This will be true for most Key Managers, since their primary responsibility is to manage keys/secrets/certs, not the SSLContext.
Also, if we do provide a custom way to create the SSLContext, we must still honor the "provider" value used in SslEngineBuilder#createSSLContext().
Overall, we didn't find enough justification to follow this path.
Generating the required SSL configuration values from the Key Manager API
If we have a Key Manager solution that provides APIs, like HashiCorp's Vault, we need a way to generate the required ssl configurations for Kafka (key/trust store files, passwords, etc.) from it. We could:
- Build a mechanism to download the required keys/certs from the Key Manager API and create the required key/trust store files
- Standardize the path to the key/trust store files on disk
- Protect the key/trust store files & Kafka Broker/Client files with appropriate permissions
- Generate the required configurations for the Kafka Client boxes and Kafka Brokers
... AND we would not need any of the customization we are talking about here.
However, this approach does not scale when managing deployments across many brokers and client boxes, and it does not solve the challenges mentioned in the "Motivation" section.
Hence this approach was rejected.