This section helps you set up quick-start jobs for ingesting data from HDFS into a Kafka topic. We do not currently support writing from HDFS to multiple Kafka topics, nor do we support partitioning by key when writing to Kafka. We do allow topics with multiple partitions; in that case, the data is distributed across partitions in a round-robin manner.
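To make the round-robin distribution concrete, here is a minimal conceptual sketch. It is not the writer's or the Kafka client's actual partitioning code; the function name `assign_round_robin` and its inputs are hypothetical and only model the behavior described above, in which records are spread over partitions in rotation without regard to any key.

```python
from collections import defaultdict
from itertools import cycle


def assign_round_robin(records, num_partitions):
    """Assign records to partitions in strict rotation, ignoring record content.

    Hypothetical helper for illustration only; it models keyless
    round-robin distribution and is not part of any real library.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")

    assignment = defaultdict(list)
    # cycle() yields 0, 1, ..., num_partitions - 1, 0, 1, ... indefinitely;
    # zip() stops as soon as the records run out.
    for record, partition in zip(records, cycle(range(num_partitions))):
        assignment[partition].append(record)
    return dict(assignment)


if __name__ == "__main__":
    sample = ["rec-1", "rec-2", "rec-3", "rec-4", "rec-5"]
    for partition, items in sorted(assign_round_robin(sample, 3).items()):
        print(partition, items)
```

Because no key is consulted, records that belong together logically (for example, edits to the same Wikipedia page) may land in different partitions, so per-key ordering should not be assumed by downstream consumers.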
We will illustrate both the Standalone and MapReduce modes of operation.
Standalone
This example assumes Wikipedia data has been written to HDFS by following the instructions in the Wikipedia example (with some minor modifications to write to HDFS).
...