Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Currently the Flink Kubernetes operator only supports running a Flink deployment in Kubernetes native mode, not standalone mode. One of the main concerns with native mode is that the JobManager needs access to the Kubernetes API, which can be seen as a security risk in multi-tenant Flink + Kubernetes setups. Some users may also want a static allocation of TaskManagers to cap the resources a single Flink cluster can consume.
Supporting standalone mode in the operator also means the operator can support older Flink versions that lack the Kubernetes native features. Supporting more Flink versions should increase adoption of the operator as a way to manage Flink clusters and give those users an easier path to upgrade their clusters.
The public interface is the FlinkDeployment custom resource definition (CRD), shown below.
apiVersion: flink.apache.org/v1alpha1
kind: FlinkDeployment
metadata:
  namespace: default
  name: basic-example
spec:
  image: flink:1.14.3
  flinkVersion: v1_14
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: "2"
  serviceAccount: flink
  jobManager:
    replicas: 1
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    replicas: 1  # only needed for standalone clusters
    resource:
      memory: "2048m"
      cpu: 1
  mode: native  # one of: native, standalone
We propose adding a mode field to the spec of the FlinkDeployment CRD to allow both standalone and native clusters to be deployed. This would allow two new types of Flink clusters to be created: standalone-application and standalone-session. The mode field will default to native to maintain compatibility.
A replicas field will be added to the taskManager spec to specify the number of TaskManager pods to spin up. It will only be used for standalone clusters.
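As an illustration of the two proposed fields, a standalone deployment might be declared as in the sketch below. The field names mode and taskManager.replicas come from this proposal; the resource name standalone-example and the replica count of 3 are hypothetical values chosen for the example, and the final schema may differ.

```yaml
apiVersion: flink.apache.org/v1alpha1
kind: FlinkDeployment
metadata:
  namespace: default
  name: standalone-example   # hypothetical name, for illustration only
spec:
  image: flink:1.14.3
  flinkVersion: v1_14
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: "2"
  serviceAccount: flink
  mode: standalone           # proposed field; native is the proposed default
  jobManager:
    replicas: 1
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    replicas: 3              # proposed field; illustrative value, standalone only
    resource:
      memory: "2048m"
      cpu: 1
```

With a fixed replicas value the number of TaskManager pods is set up front, which is the static allocation described in the motivation.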
All interactions with the Flink cluster are currently done via the FlinkService, which is tied to the Kubernetes native nature of the cluster. This will be forked into a FlinkStandaloneService to enable communication with both cluster types.
With standalone mode supported, the operator can also deploy Flink clusters older than 1.14 (as far back as 1.2). This increases the potential user base of the operator and gives those users an easier upgrade path.
Supported Flink images are available in the Docker repository from version 1.11 onward, so these versions can be supported by the operator in standalone mode. Earlier Flink versions could also be run in standalone mode, but would not be fully supported.
|Flink Version||Native Support (no change)||Standalone Support (application)||Standalone Support (session)|
✅: Fully supported
?: Compatible but not supported
?: Not supported
Note: native-mode support for 1.13 isn't implemented yet but should be possible.
Reactive Mode support
Standalone mode opens the door to supporting reactive mode for Flink clusters deployed by the operator. However, as reactive mode is currently an MVP (minimum viable product) feature and would be limited to application mode, this FLIP does not include support for it.
Compatibility, Deprecation, and Migration Plan
The mode field will default to native to maintain compatibility with the released 0.1.0 version.
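Put concretely, and assuming the default is applied as described, the two spec fragments below would be treated the same way. The first is an existing manifest with no mode field; the second sets the proposed field explicitly.

```yaml
# Existing 0.1.0-style manifest: no mode field.
spec:
  image: flink:1.14.3
  flinkVersion: v1_14
---
# Same spec with the proposed field set to its default value.
spec:
  image: flink:1.14.3
  flinkVersion: v1_14
  mode: native
```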