Status
Current state: Accepted
Discussion thread: thread
JIRA: KAFKA-9366
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
In May 2012, the log4j development team released log4j 1.2.17, the last release of the 1.x line, which is no longer supported. Apache Kafka nevertheless still uses it in its core and in the other subprojects.
The problems caused by the obsolete log4j version are not limited to security issues such as CVE-2019-17571. Most users are now familiar with the log4j2 (2.x) syntax rather than 1.x. As a result, when they try to customize the logging of Apache Kafka or Kafka Connect, they have to work with an outdated, abandoned configuration format.
This KIP proposes upgrading the log4j 1.x dependencies to log4j2 on the server side of Kafka. (For the exact definition of 'server-side', please refer to the 'Which modules will be influenced?' subsection.)
Public Interfaces
This KIP proposes the following:
- Replace the server-side log4j dependency with log4j2, along with the corresponding slf4j bindings.
- For user-facing configurations (like the broker logging config), provide an additional log4j2-equivalent configuration while keeping backward compatibility.
- Migrate all non-user-facing configurations (like test configs) to log4j2.
- Migrate from the properties format to the YAML format.
Proposed Changes
0. Which modules will be influenced?
The following modules will be updated:
- clients: the core, metadata, raft, server-common, and storage modules depend directly on the clients module, so it must be included.
- connect
- core
- metadata
- raft
- storage
- streams: this module directly depends on clients.
- embedded zookeeper
The following modules are out of the scope of this proposal, for the reasons given:
- log4j-appender: This module should be left untouched for its existing users, and its log4j2 equivalent should be provided independently. However, that is beyond the scope of this proposal.
- tools: VerifiableLog4jAppender depends on log4j-appender, so it can't be migrated until a log4j2-appender is ready.
- trogdor: When this KIP was passed, trogdor was a part of tools, so it was excluded.
1. clients
The slf4j and log4j dependencies (org.slf4j:slf4j-log4j12, log4j:log4j) will be upgraded to log4j2 (org.apache.logging.log4j:log4j-slf4j-impl, org.apache.logging.log4j). The test logging configuration (src/test/resources/log4j.properties) will be migrated to log4j2. (Backward compatibility is not a concern in this case.)
2. connect
The slf4j and log4j 1.x dependencies will be upgraded to log4j2, and an additional log4j2 configuration file will be provided.
For backward compatibility, Kafka Connect will use the log4j 1.x configuration file (connect-log4j.properties) by default. For informational purposes, however, the following message will be shown when the user launches connect-standalone.sh, connect-mirror-maker.sh, or connect-distributed.sh:
DEPRECATED: using log4j 1.x configuration. To use log4j 2.x configuration, run with: 'export KAFKA_LOG4J_OPTS="-Dlog4j.configurationFile=file:$base_dir/../config/connect-log4j2.properties"'
As the message above states, the user can run Kafka Connect with a log4j2 config file by setting `export KAFKA_LOG4J_OPTS="-Dlog4j.configurationFile={log4j2-config-file-path}"`. Thanks to log4j-1.2-api, the compatibility bridge between log4j and log4j2, Kafka Connect can run without any other changes. Since a log4j2 equivalent of the traditional built-in log4j config (log4j2.properties) will be provided, users can make use of it if they want.
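As a rough illustration of the override described above, the following shell sketch builds the KAFKA_LOG4J_OPTS value for a given log4j2 configuration file. The helper name log4j2_opts and the example path are assumptions made for illustration only; just KAFKA_LOG4J_OPTS and -Dlog4j.configurationFile come from the text above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kafka): prints the JVM option that points
# log4j2 at the given configuration file.
log4j2_opts() {
  config_path="$1"
  printf '%s' "-Dlog4j.configurationFile=file:${config_path}"
}

# The path below is an assumption; substitute the real location of your config.
KAFKA_LOG4J_OPTS="$(log4j2_opts /opt/kafka/config/connect-log4j2.properties)"
export KAFKA_LOG4J_OPTS
echo "KAFKA_LOG4J_OPTS=$KAFKA_LOG4J_OPTS"
```

With the variable exported, the launcher script can be started from the same shell as usual.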
The test logging configuration (src/test/resources/log4j.properties) will be migrated to log4j2.
3. core
As with connect, the slf4j and log4j 1.x dependencies will be upgraded to log4j2, and an additional log4j2 configuration file will be provided.
For backward compatibility, the Kafka broker will use the log4j 1.x configuration file (log4j.properties) by default. For informational purposes, however, the following message will be shown when the user launches kafka-server-start.sh:
DEPRECATED: using log4j 1.x configuration. To use log4j 2.x configuration, run with: 'export KAFKA_LOG4J_OPTS="-Dlog4j.configurationFile=file:$base_dir/../config/log4j2.properties"'
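The default-selection behavior described above could be sketched in a launcher script roughly as follows. This is not the actual Kafka script: the function name select_log4j_opts and the example base directory are assumptions, while the message text and the -Dlog4j.configuration / -Dlog4j.configurationFile options are taken from this KIP.

```shell
#!/bin/sh
# Sketch only: fall back to the log4j 1.x config unless the user has already
# set KAFKA_LOG4J_OPTS. select_log4j_opts is a hypothetical name.
select_log4j_opts() {
  base_dir="$1"
  if [ -z "${KAFKA_LOG4J_OPTS:-}" ]; then
    # No user override: keep the 1.x default and explain how to opt in to 2.x.
    echo "DEPRECATED: using log4j 1.x configuration. To use log4j 2.x configuration, run with: 'export KAFKA_LOG4J_OPTS=\"-Dlog4j.configurationFile=file:$base_dir/../config/log4j2.properties\"'"
    KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:$base_dir/../config/log4j.properties"
  fi
  export KAFKA_LOG4J_OPTS
}

# Example invocation; /opt/kafka/bin is an assumed install location.
select_log4j_opts /opt/kafka/bin
echo "$KAFKA_LOG4J_OPTS"
```

A value the user has already exported is left untouched, which is what keeps existing deployments working while allowing an opt-in to the log4j2 file.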
The test logging configuration (src/test/resources/log4j.properties) will also be migrated to log4j2.
4. metadata
The slf4j and log4j 1.x dependencies will be upgraded to log4j2, and the test logging configuration (src/test/resources/log4j.properties) will be migrated to log4j2.
5. raft
Similar to connect and core. If the user launches test-kraft-server-start.sh, the following message will be shown:
DEPRECATED: using log4j 1.x configuration. To use log4j 2.x configuration, run with: 'export KAFKA_LOG4J_OPTS="-Dlog4j.configurationFile=file:$base_dir/../config/kraft-log4j2.properties"'
6. storage
The slf4j and log4j 1.x dependencies will be upgraded to log4j2, and the test logging configuration (src/test/resources/log4j.properties) will be migrated to log4j2.
7. streams
The slf4j and log4j 1.x dependencies will be upgraded to log4j2, and the test logging configuration (src/test/resources/log4j.properties) will be migrated to log4j2.
The archetype's log4j configuration will be updated to its log4j2 equivalent (log4j2.properties).
8. embedded zookeeper
Kafka provides embedded zookeeper functionality through zookeeper-server-start.[sh|bat]. Since zookeeper's dynamic log level change feature depends on log4j 1.x (in particular, the Log4j MBean registration feature; see the 'Run a JMX console' section here), this functionality will no longer be supported. If the user runs zookeeper-server-start.[sh|bat], the following message will be displayed:
Running with log4j 2.x - Log4j MBean registration is not supported.
9. configuration file
The properties configuration format, introduced in log4j 2.4, is neither the default nor one of the original formats. While it offers some readability benefits, its verbosity and quirks make it harder to maintain. Moreover, the hierarchical structure of a log4j Core 2.x configuration is better suited to formats like XML, JSON, or YAML, whereas *.properties struggles to reflect this structure effectively.
Adopting YAML requires only adding jackson-dataformat-yaml, with minimal overhead (~400 KiB). Although parsers like SnakeYAML may raise CVE concerns, the jackson team has consistently addressed vulnerabilities promptly. log4j also mitigates risks by internally managing configuration binding and limiting instantiation to @Plugin-annotated classes.
Switching to YAML improves readability, better supports log4j’s structure, and enhances maintainability, making it a more practical and secure choice.
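To make the structural point concrete, the sketch below writes a minimal log4j2 YAML configuration to a temporary file and points KAFKA_LOG4J_OPTS at it. It is not the configuration shipped with Kafka: the file location, appender name, pattern, and logger entries are illustrative assumptions.

```shell
#!/bin/sh
# Write a minimal log4j2 YAML configuration to a temporary file.
# File location, appender name, and pattern are illustrative assumptions.
config_file="${TMPDIR:-/tmp}/log4j2-sketch.yaml"

cat > "$config_file" <<'EOF'
Configuration:
  Appenders:
    Console:
      name: STDOUT
      PatternLayout:
        pattern: "[%d] %p %m (%c)%n"
  Loggers:
    Root:
      level: INFO
      AppenderRef:
        ref: STDOUT
    Logger:
      - name: org.apache.kafka
        level: INFO
EOF

# log4j2 picks the configuration parser from the file extension,
# so the file should keep a .yaml or .yml suffix.
export KAFKA_LOG4J_OPTS="-Dlog4j.configurationFile=file:$config_file"
echo "$KAFKA_LOG4J_OPTS"
```

The nesting of appenders, layouts, and loggers is visible directly in the indentation, whereas the properties format has to encode the same hierarchy in dotted key prefixes.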
Compatibility, Deprecation, and Migration Plan
Logging configuration file compatibility
At some point, the default logging configuration format will be switched to log4j2. At that time, the informational message printed by the launcher scripts of core, connect, and raft will also be changed to something like the following:
Using log4j 2.x configuration. To use log4j 1.x configuration, run with: 'export KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:$base_dir/../config/log4j.properties"'
Root Logger name compatibility
Apache Kafka 3.9.0 supports dynamic logger configuration for brokers and Connect servers, as shown below. Since these features were originally designed for log4j 1.x, they use 'root' as the root logger's name.
# Kafka broker dynamic configuration
bin/kafka-configs.sh --bootstrap-server {kafka-cluster-endpoint} \
  --entity-type broker-loggers --entity-name {broker-id} \
  --alter --add-config {logger-name}=TRACE

# Kafka connect REST API
curl -s -X PUT -H "Content-Type:application/json" \
  http://{kafka-connect-endpoint}/admin/loggers/{logger-name} \
  -d '{"level": "TRACE"}'
Although log4j2 changed the root logger's name to '' (empty string), the root logger name will remain as is for the following reasons:
- Kafka Connect's REST API can't support it, because /admin/loggers is already used for another purpose.
- From the user's perspective, defining a non-root logger named 'root' is unreasonable. Logger names generally follow the {package-name}.{class-name} form, so this scenario is neither practical nor realistic.
So, we will keep the root logger's name as 'root' even with log4j2.
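The REST path argument above can be illustrated with a small shell sketch. The helper logger_path is hypothetical and only builds the request path; it does not contact a Connect server. It shows that an empty logger name would collapse the per-logger path into /admin/loggers itself, while 'root' yields an addressable resource.

```shell
#!/bin/sh
# Hypothetical helper: builds the Connect REST path for a single logger.
logger_path() {
  logger_name="$1"
  if [ -z "$logger_name" ]; then
    # An empty name would reduce the path to /admin/loggers, which is
    # already used for another purpose, so it cannot address one logger.
    echo "error: logger name must not be empty; use 'root' for the root logger" >&2
    return 1
  fi
  printf '/admin/loggers/%s' "$logger_name"
}

logger_path root; echo
logger_path org.apache.kafka.connect; echo
```

Passing an empty string makes the helper fail, which mirrors why the empty-string root logger name of log4j2 is not exposed to users.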
Rejected Alternatives
Following log4j2's root logger name (empty string)
It breaks API compatibility and removes the ability to control the root logger. (REST APIs inherently assume that every resource has a non-empty name.) So it was rejected.
Co-work note
This KIP was initially proposed, designed, and worked on by Lee Dongjin, and TengYao Chi finalized the implementation.