Status
Current state: Under Discussion
Discussion thread: here
JIRA: KAFKA-10299
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
A member of Confluent suggested that a previous contribution I made to https://github.com/aiven/aiven-kafka-connect-transforms would be a nice addition to the out-of-the-box Kafka Connect SMTs. The discussion is at https://github.com/aiven/aiven-kafka-connect-transforms/issues/9#issuecomment-662378057. The proposed change adds a new Kafka Connect SMT that hashes keys or values using a configured algorithm. This would allow sensitive fields to be obfuscated, preventing private information such as Social Security numbers or other identifying information from flowing downstream.
Public Interfaces
Two additions are proposed: a new class, connect/transforms/src/main/java/org/apache/kafka/connect/transforms/Hash.java, and a helper class, connect/transforms/src/main/java/org/apache/kafka/connect/transforms/util/Hex.java. No modifications to existing interfaces are required.
Proposed Changes
The proposed change can be viewed at https://github.com/apache/kafka/pull/9057. It would allow hashing specific fields within a Kafka Connect message value, or the entire value; the key could also be hashed if desired. The configuration would look something like the following, where type is either Key or Value (Hash$Value in this example).
transforms=HashEmail
transforms.HashEmail.type=org.apache.kafka.connect.transforms.Hash$Value
transforms.HashEmail.field.name=email
transforms.HashEmail.function=sha1
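To illustrate what the configuration above asks for, here is a minimal sketch of hashing one named field of a record value and hex-encoding the digest. It is not the code from the pull request: the class HashFieldSketch, its methods, the use of a plain Map in place of a Connect record, and the mapping from the configured function name to a JCA algorithm name are all assumptions made for illustration. Only sha1 appears in this proposal; the md5 and sha256 entries are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class HashFieldSketch {

    // Hypothetical mapping from the "function" config value to a JCA algorithm name.
    // Only "sha1" is taken from the proposal; the other names are assumptions.
    static String jcaName(String function) {
        switch (function.toLowerCase(Locale.ROOT)) {
            case "md5": return "MD5";
            case "sha1": return "SHA-1";
            case "sha256": return "SHA-256";
            default:
                throw new IllegalArgumentException("Unsupported hash function: " + function);
        }
    }

    // Lower-case hex encoding of a byte array (stand-in for the proposed Hex helper).
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Hash the UTF-8 bytes of a string and return the digest as hex.
    static String hash(String function, String value) {
        try {
            MessageDigest digest = MessageDigest.getInstance(jcaName(function));
            return hex(digest.digest(value.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Return a copy of the value with the named string field replaced by its hash.
    // Missing or non-string fields are left untouched in this sketch.
    static Map<String, Object> hashField(Map<String, Object> value, String fieldName, String function) {
        Map<String, Object> updated = new LinkedHashMap<>(value);
        Object field = updated.get(fieldName);
        if (field instanceof String) {
            updated.put(fieldName, hash(function, (String) field));
        }
        return updated;
    }

    public static void main(String[] args) {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("id", 42);
        record.put("email", "user@example.com");
        // Prints the record with "email" replaced by a 40-character SHA-1 hex digest.
        System.out.println(hashField(record, "email", "sha1"));
    }
}
```

Because the same input always yields the same digest, hashed fields can still be used for joins or deduplication downstream. An unsalted hash of low-entropy data such as an SSN can be recovered by brute force, so this obfuscates rather than encrypts.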
Compatibility, Deprecation, and Migration Plan
NA
Rejected Alternatives
NA