Current state: Under Discussion
Discussion thread: here (<- link to https://mail-archives.apache.org/mod_mbox/flink-dev/)
JIRA: here (<- link to https://issues.apache.org/jira/browse/FLINK-XXXX)
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Table of Contents
Today, Flink relies entirely on network isolation to protect itself from unauthorized access. Unfettered access to Flink allows:
Job submission and control.
Access to job parameters which may include credentials and other secrets, and to the cluster’s keytab.
Access to Flink state including queryable state, ZooKeeper state, and checkpoint state.
Impersonation of JobManager and TaskManager instances.
Service authorization refers to hardening of a Flink cluster against unauthorized use with a minimal authentication and authorization layer. This proposal covers all Flink deployment modes, including standalone, K8s, YARN, and Mesos.
Note that multi-user support is out of the scope of this proposal; future features along those lines would be fine-grained authorization, improved job isolation, etc.
Our technical goal is to authorize only specific users to access the Flink cluster, and to authorize intra-cluster communication (e.g. JM-to-TM). This involves adding authentication and authorization logic across Flink’s various endpoints, on both the client and server side. We’ll also consider the evolution of the client occurring under FLIP-6, which is expected to be pure HTTP(S).
The overall approach is to use SSL client authentication wherever possible, plus some environment-specific (e.g. YARN, K8s, Mesos) handling for the web interface (API & UI). In a later section we’ll cover why special-case handling is needed (hint: proxies).
The following sections cover high-level design aspects.
Each deployed Flink cluster shall have an associated key-pair to authenticate internal communication. This key-pair may simply be a self-signed certificate, generated at an appropriate time.
Internal vs External Connectivity
It is helpful to divide connectivity into two categories: internal connectivity and external connectivity (see diagram 1). Internal connectivity (e.g. JM-to-TM) is governed purely by Flink code and configuration, e.g. the truststore is controlled, and both sides possess the cluster key-pair. External connectivity (e.g. web interface) is intended for use by user agents and other external components.
The legacy Flink client (pre FLIP-6) uses internal connectivity, whereas the new HTTP(S) client uses external connectivity.
The internal cluster key-pair isn’t well-suited for use as the SSL certificate on the web interface. We shall introduce new configuration elements to allow for a different certificate there. This will make it possible to use SSL certificates issued by external PKI infrastructure, and which are compatible with the truststore of an ordinary user agent.
Akka Support and Trust Stores
Akka supports mutual authentication as of Akka 2.4, on which Flink 1.4 is based. Akka’s authorization model is very simple: any authenticated peer is granted full access. The truststore (which is used to validate peer certificates) is effectively an access control list. This leads to some unexpected results. For example, configuring Akka to use your system’s default truststore would grant access to any peer holding a certificate issued by one of the many public authorities in that store.
For our purposes, we desire to grant access only to the internal cluster identity. This is easily accomplished with a truststore containing only that identity.
Other internal endpoints shall use the same approach for consistency. See RFC-2818 for additional background (see “narrow the scope of acceptable certificates”).
The hostname verification flag is of no practical use and shall be deprecated.
Configuring the JMX server for mutual authentication is left as a future exercise. Secure deployments should disable the remote management functionality of JMX.
SSL Client Authentication
For internal communication, Flink shall use SSL client authentication based on a trust store as described in the Akka section of the overview. Expect some code changes on both the server and client side of each endpoint.
We’ll enable client authentication across all internal endpoints with a single configuration option. The existing keystore and truststore options are probably reusable for this purpose.
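As a sketch, the configuration could build on Flink's existing SSL options; the client-auth flag shown last is purely illustrative (the actual option name would be decided during implementation):

```yaml
# Existing SSL material, reused for all internal endpoints.
security.ssl.enabled: true
security.ssl.keystore: /path/to/internal.keystore
security.ssl.keystore-password: ...
security.ssl.truststore: /path/to/internal.truststore   # contains only the cluster certificate
security.ssl.truststore-password: ...

# Hypothetical new flag: require client certificates on all internal endpoints.
security.ssl.require-client-auth: true
```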
Most of the internal endpoints are Netty-based. Some refactoring of the Netty bootstrap code would improve reuse. In any case, we’ll adjust the SSL engine to require client auth. Additional logging of the client identity should be added.
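Requiring client authentication amounts to a small change at the point where the server-side SSLEngine is created for each channel. A minimal sketch, independent of Flink's actual Netty bootstrap code (class and method names are illustrative):

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

public class ServerSslEngineFactory {

    /** Creates a server-mode engine that requires a client certificate. */
    public static SSLEngine createServerEngine(SSLContext ctx) {
        SSLEngine engine = ctx.createSSLEngine();
        engine.setUseClientMode(false);
        // Require (not merely request) a certificate from the peer;
        // the handshake fails if the client presents none.
        engine.setNeedClientAuth(true);
        return engine;
    }

    public static void main(String[] args) throws Exception {
        SSLEngine engine = createServerEngine(SSLContext.getDefault());
        System.out.println(engine.getNeedClientAuth()); // prints "true"
    }
}
```

The engine would then be handed to Netty's SslHandler as today; the only behavioral change is that handshakes without an acceptable client certificate are rejected, at which point the peer identity can also be logged.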
The queryable state interface shall be treated as intra-cluster communication.
Implement SSL on the queryable state client and server, and then implement SSL client authentication.
External Endpoints (Web, API, Mesos Artifact Server)
Adjust the external endpoints to have an SSL configuration that is independent from that of the internal endpoints. The rationale is to be able to use an SSL certificate provisioned by the cluster manager, e.g. the K8s PKI, whereas the internal endpoints shall use a self-signed certificate with a restrictive truststore.
Integration with Cluster Managers
Flink is deployable into diverse cluster environments - YARN, Kubernetes (K8s), Mesos, and standalone mode. Each system brings unique networking and security aspects that impact how a user interacts with the Flink cluster. See the Appendix for additional background.
Two recurring integration themes are:
Proxy-based Access - the Flink web interface (UI/API) may be accessed via a reverse proxy that imposes restrictions on the authentication method. For example:
SSL termination occurs at the proxy, crippling SSL client authentication.
The HTTP request is forwarded with low fidelity (e.g. headers are dropped).
The proxy authenticates the user in a proprietary way, and may or may not forward the user’s identity onto Flink.
Single Sign-On - the ability to make use of the management user’s established identity (according to the cluster manager) for seamless authenticated access to Flink's control plane.
Taken on its own, single sign-on may be out-of-scope, but the necessity of interoperating with the proxy makes it relevant. For example, see the YARN section which describes how the logged in user (as determined by the proxy) is involved in an authorization decision. It is likely that support for accessing a secure Flink cluster via a cluster's proxy will be developed in stages.
A section for each cluster manager follows.
Enable TLS client authentication on the web endpoint, based on a configured truststore (not the same truststore as is used for internal communication). In practice, the configuration should point to the K8s certificate bundle that is automatically provided to pods at /var/run/secrets/kubernetes.io/serviceaccount/ca.crt (ref).
We may expect the web endpoint to be configured with a TLS server certificate provisioned from the K8s CA.
Today the Flink UI may be accessed via the YARN RM proxy, as is typical for YARN applications. The RM proxy introduces complications, including that the authentication method is Kerberos and that the proxy imposes limitations on the HTTP requests that may be sent. The proposal is therefore split into two options.
Option 1 - Rely on Direct Web/API Access
Expect the user to access the web endpoint directly, bypassing the RM proxy. This approach is basically equivalent to that proposed for Kubernetes (see above).
A YARN AppMaster may advertise both a tracking URL (which is generally accessed via the RM proxy) and an RPC endpoint. Leverage the latter to discover the web/API endpoint.
Option 2 - Access via RM Proxy
Develop a Netty-based filter similar to the servlet-based AmIpFilter provided by Hadoop. The filter validates that the client request originated from the RM Proxy, by comparing the client IP address to the resolved IP(s) of the proxy host configured in yarn-site.xml.
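A Netty-based equivalent of AmIpFilter could resolve the configured proxy host(s) and compare the result against the channel's remote address. A simplified, framework-free sketch (the class name is illustrative, and the lookup of the proxy host from yarn-site.xml is omitted):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;

public class ProxyIpCheck {

    /**
     * Returns true if clientIp matches one of the addresses that the
     * configured proxy host currently resolves to.
     */
    public static boolean isFromProxy(String clientIp, String proxyHost) {
        try {
            return Arrays.stream(InetAddress.getAllByName(proxyHost))
                    .anyMatch(a -> a.getHostAddress().equals(clientIp));
        } catch (UnknownHostException e) {
            // Fail closed: an unresolvable proxy host rejects all requests.
            return false;
        }
    }
}
```

In a Netty pipeline this check would run in a handler ahead of the HTTP router, closing the channel or returning 403 when it fails; like AmIpFilter, the proxy addresses should be re-resolved periodically rather than cached forever.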
Flink should authorize the request by checking the incoming (proxy-provided) username against the name of the Hadoop user who launched the application.
The REST client must support SPNEGO (Kerberos) authentication to invoke API methods via the RM Proxy of a secure Hadoop cluster. Note that the proxy supports a limited set of HTTP verbs; consider passing the verb as a request parameter.
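Verb tunneling could look as follows; the `httpMethod` parameter name is purely illustrative, and the actual convention would need to be agreed between the REST client and the Flink endpoint:

```java
import java.net.URI;

public class VerbTunneling {

    /**
     * Rewrites a request URI so that a verb the proxy refuses to forward
     * (e.g. DELETE) travels as a query parameter instead; the request
     * itself would then be sent using a verb the proxy permits.
     */
    public static URI tunnel(URI uri, String verb) {
        String q = uri.getQuery();
        String tunneled = (q == null ? "" : q + "&") + "httpMethod=" + verb;
        return URI.create(uri.getScheme() + "://" + uri.getAuthority()
                + uri.getPath() + "?" + tunneled);
    }
}
```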
As with YARN, DCOS services are typically accessed via the DCOS Admin Router, which acts as a layer-7 proxy. The proposal is again split into two options.
Option 1 - Rely on Direct Web/API Access
Expect the user to access the web endpoint directly, bypassing the Admin Router. This approach is basically equivalent to that proposed for Kubernetes (see above).
Option 2 - Access via Admin Router
Develop an authentication filter that validates that the client request came from the admin router. A simple check of the client IP address against a configured whitelist should suffice.
The whitelist should be configured to include master.mesos in DCOS deployments, which resolves to all known master addresses. Note that the admin router runs on the master node(s).
The admin router will handle authorization. As of DCOS 1.10, it is possible to install services into a namespace or folder associated with a group of users. This will be the basis for isolating Flink clusters within a DCOS environment. See Integration with DC/OS access controls for more information.
The client should support bearer authentication, meaning a simple token for the Authorization header.
The artifact server, which is hosted in the Job Manager, enables the Task Manager to download the Flink distribution, configuration, and cluster keytab (!). A Mesos component called the Fetcher (ref) downloads these artifacts to the Task Manager container. The fetcher supports HTTPS using the system truststore and doesn’t support any form of authentication.
Use a pre-authenticated URL to protect the artifacts. A simple in-memory secret may be used to sign each URL with an HMAC (similar to how S3 requests are signed, ref). In the event of JM failover, the URLs will become invalid, not-yet-started tasks will fail, and the RM will react accordingly.
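A sketch of such signing, assuming the JobManager holds a random in-memory secret and appends the signature as a query parameter (the class and parameter names are illustrative):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class SignedUrls {

    private final SecretKeySpec key;

    public SignedUrls(byte[] secret) {
        this.key = new SecretKeySpec(secret, "HmacSHA256");
    }

    /** Appends an HMAC of the path to the URL, S3-style. */
    public String sign(String path) throws Exception {
        return path + "?sig=" + hmacHex(path);
    }

    /** Verifies a previously issued signature; constant-time comparison. */
    public boolean verify(String path, String sig) throws Exception {
        return MessageDigest.isEqual(
                hmacHex(path).getBytes(StandardCharsets.UTF_8),
                sig.getBytes(StandardCharsets.UTF_8));
    }

    private String hmacHex(String data) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(key);
        StringBuilder sb = new StringBuilder();
        for (byte b : mac.doFinal(data.getBytes(StandardCharsets.UTF_8))) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

Because the secret lives only in JobManager memory, a failed-over JobManager generates a fresh secret and previously issued URLs stop verifying, which matches the failover behavior described above.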
Use the same authentication and authorization strategy as with Kubernetes.
It makes sense to generate a self-signed certificate for internal cluster communication. It is possible but tricky to generate a certificate using pure Java (see OPENDJ-2382). A simple alternative may be to use keytool from the appropriate startup script.
Compatibility, Deprecation, and Migration Plan
- What impact (if any) will there be on existing users?
- If we are changing behavior how will we phase out the older behavior?
- If we need special migration tools, describe them here.
- When will we remove the existing behavior?
Describe in a few sentences how the FLIP will be tested. We are mostly interested in system tests (since unit tests are specific to implementation details). How will we know that the implementation works as expected? How will we know nothing broke?
Proxy-Based REST Access
This proposal suggests two options for accessing the REST API from the Flink client - direct and via the cluster's proxy (e.g. YARN RM Proxy). While proxy-based access is detailed in this proposal, direct access is considered the likely approach in the short term.
Some background information is provided in the following sections.
Accessing Services Running on Clusters
There are a few ways to connect to a service running in a Kubernetes cluster (ref). This is relevant when considering how the Flink UI and REST API would be accessed and which authentication methods would be useable.
Access through cluster IPs (i.e. the service abstraction).
Access via an apiserver-provided proxy.
Access via an Ingress resource.
Public IPs enable direct access and should support any authentication method.
The apiserver proxy acts as an HTTP reverse proxy (a layer-7 proxy). It authenticates the user using whatever authentication method is configured on the API server, then forwards the HTTP request to the service. The proxy terminates SSL.
Kubernetes provides a TLS infrastructure (ref) that acts as a certificate authority (CA) and provides an API for requesting certificates for various purposes. The primary scenarios appear to be:
To issue client certificates to kubelets.
To support TLS authentication on the API server.
Note that the CA certificate is automatically made available to pods. There’s an interesting discussion on how a service may obtain certificates for its own purposes (ref).
User Authentication / SSO
Kubernetes supports a variety of authentication methods for management user access to the API Server. There is no support for single sign-on to workloads.
Web Application Proxy
YARN provides a web application proxy through which users may access the web endpoint of a running application. The YARN UI links to the application UI via the proxy. See Securing YARN Application Web UIs and REST APIs for more information.
The proxy has a number of features and limitations relevant to authentication:
The proxy handles user authentication (Kerberos-based). Some distributions support other authentication methods also (e.g. Hortonworks via Apache Knox).
The proxy does not forward arbitrary HTTP headers (see whitelist).
The application must validate that incoming web requests originate from the proxy (to prevent bypassing proxy-based authentication). A servlet filter is provided for the purpose, and works by checking the client IP address against the proxy address in yarn-site.xml. Code references: (AmIpFilter, AmIpFilterInitializer, example)
The proxy provides a cookie to the application indicating the logged in user’s Hadoop user name.
Since the Flink web endpoints are purely Netty-based (not servlet-based), Flink cannot use the YARN-provided AmIpFilter directly. Netty provides an extensible IpFilter handler on which to develop an alternative.
Hadoop 2.x uses SPNEGO (Kerberos) to authenticate HTTP connections. Hadoop 3.x is working towards using OAuth tokens. See HDFS-8155 for example.
User Authentication / SSO
YARN supports single sign-on by passing the user’s identity from the RM proxy to the app master.
The below information pertains to Enterprise DCOS.
DCOS Admin Router
The admin router provides access to the web endpoint of each installed service such as Flink, functioning as a layer-7 reverse proxy. The proxy has some notable features and limitations:
The proxy handles user authentication (token-based).
The proxy forwards the user authentication token to the web endpoint; however, I see no examples of it being used for authentication at the service layer.
DCOS now provides a TLS infrastructure that is similar to that of Kubernetes, including a certificate authority and an API for provisioning certificates. The two intended scenarios are:
Providing a service account for installed services with which to authenticate to Mesos as a framework.
Providing TLS certificates for service endpoints (VIP-based).
The dcos-commons framework automatically requests HTTPS certificates for all VIP endpoints of the service.
User Authentication / SSO
[DCOS 1.9] DCOS does not support fine-grained access control over services. All user accounts are equally capable of managing the DCOS cluster (ref). When forwarding HTTP requests to the service, the admin router checks only that the incoming token refers to a known user account.
[DCOS 1.10] Improved support for namespaced services belonging to particular users/groups. The admin router validates that the target service is accessible by the requester. See Integration with DC/OS access controls for more information.
A service may be able to use the incoming user token for fine grained access control. See Out-of-band Verification of an RS256 Authentication JWT.
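Out-of-band verification of an RS256 token reduces to an RSA-SHA256 signature check over the token's `header.payload` portion using the authentication provider's public key. A self-contained sketch, in which a locally generated key pair stands in for the DCOS-provided key (class and helper names are illustrative):

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;

public class Rs256Check {

    /** Verifies the RS256 signature of a compact JWT (header.payload.signature). */
    public static boolean verify(String jwt, PublicKey key) throws Exception {
        int dot = jwt.lastIndexOf('.');
        byte[] signedPart = jwt.substring(0, dot).getBytes(StandardCharsets.US_ASCII);
        byte[] sig = Base64.getUrlDecoder().decode(jwt.substring(dot + 1));
        Signature rsa = Signature.getInstance("SHA256withRSA");
        rsa.initVerify(key);
        rsa.update(signedPart);
        return rsa.verify(sig);
    }

    /** Demo helper: issues a token locally, standing in for the auth provider. */
    static String issue(String payloadJson, PrivateKey key) throws Exception {
        Base64.Encoder enc = Base64.getUrlEncoder().withoutPadding();
        String signingInput =
                enc.encodeToString("{\"alg\":\"RS256\",\"typ\":\"JWT\"}"
                        .getBytes(StandardCharsets.UTF_8))
                + "." + enc.encodeToString(payloadJson.getBytes(StandardCharsets.UTF_8));
        Signature rsa = Signature.getInstance("SHA256withRSA");
        rsa.initSign(key);
        rsa.update(signingInput.getBytes(StandardCharsets.US_ASCII));
        return signingInput + "." + enc.encodeToString(rsa.sign());
    }

    public static void main(String[] args) throws Exception {
        KeyPairGenerator g = KeyPairGenerator.getInstance("RSA");
        g.initialize(2048);
        KeyPair kp = g.generateKeyPair();
        String jwt = issue("{\"uid\":\"alice\"}", kp.getPrivate());
        System.out.println(verify(jwt, kp.getPublic())); // prints "true"
    }
}
```

In practice the service would fetch the provider's public key once at startup and reject requests whose token fails this check; validating claims such as expiry and audience is a separate step on top of the signature check.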