This page is a collection of material describing the architecture and internal functionality of Apache Flink.
It is intended as a reference both for advanced users who want to understand in more detail how their programs
are executed, and for developers and contributors who want to work on the Flink code base or build
applications on top of Flink.
High-level Overview
System Overview - A high-level sketch of the basic concepts behind Flink
Architecture and Program Representations - The journey of a program, from the fluent API to parallel data stream execution.
Parallelism and Scheduling - The parallel tasks that make up a program; cluster resources, scheduling, and distribution of parallel tasks
Data exchange between tasks - The JobGraph as a flexible representation for batch and streaming programs. Operators and intermediate results.
Processes and Communication - High level interactions between User Program, Client, JobManager, and TaskManager
Type System, Type Extraction, Serialization - Background and details about Flink's type handling and serialization
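To make the "Architecture and Program Representations" topic above concrete: a fluent API call chain does not execute anything directly; it only records operators into a graph representation that is later translated and scheduled for parallel execution. The sketch below illustrates that idea with invented names (`MiniStream` and its methods are hypothetical and much simpler than Flink's real `DataStream`/`StreamGraph` classes):

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of a fluent API that builds a plan
// instead of executing operations eagerly. Not Flink's actual API.
class MiniStream {
    // The recorded sequence of operators; a real system builds a DAG.
    final List<String> plan;

    MiniStream(List<String> plan) { this.plan = plan; }

    // Starting point: a source operator is recorded, nothing runs yet.
    static MiniStream source(String name) {
        List<String> p = new ArrayList<>();
        p.add("source:" + name);
        return new MiniStream(p);
    }

    // Each fluent call appends an operator description to the plan.
    MiniStream map(String fn) {
        plan.add("map:" + fn);
        return this;
    }

    MiniStream sink(String name) {
        plan.add("sink:" + name);
        return this;
    }
}
```

A call chain like `MiniStream.source("kafka").map("parse").sink("hdfs")` therefore only produces a three-operator plan; in Flink, an analogous graph is what the client hands to the JobManager for parallelization and scheduling.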
System Details
Project structure - Maven projects and artifacts, dependencies
Hadoop Versions and Dependency Shading
System Components - Internals of JobManager and TaskManager and the Actor Systems
Life Cycle of a Job - Detailed steps for the execution of a job. From the API through the parallel execution to the result
Memory Management (Batch API) - Details on Flink's custom memory management and memory guarantees
Optimizer Internals - Details on the optimizations and the mechanism of the batch program optimizer
Network Stack, Pluggable Data Exchange
Task Failures and Error Handling - How exceptions in tasks that involve data transfer are attributed to the root failure cause, rather than to follow-up exceptions that occur during failure handling or cancellation
Testing Utilities and Mini Clusters
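The error-attribution point under "Task Failures and Error Handling" can be sketched with plain JDK mechanics: when a task fails and its peers are cancelled, the cancel-time exceptions should be recorded alongside the root cause rather than reported as the primary failure. The helper below is a hypothetical illustration of that pattern (it is not Flink's actual error-handling code) using Java's standard suppressed-exception facility:

```java
// Hedged sketch: keep the first (root) failure primary and attach
// follow-up exceptions from cancellation as suppressed, so they are
// not mistaken for the real cause of the job failure.
class FailureAttribution {
    static Throwable attribute(Throwable root, Throwable followUp) {
        // addSuppressed is the standard JDK way to record secondary
        // exceptions without overwriting the primary one.
        root.addSuppressed(followUp);
        return root;
    }
}
```

With this shape, a report of the returned throwable always leads with the original failure (e.g. a data-transfer error), while the exceptions triggered by cancelling downstream tasks remain visible as suppressed details.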