Status

Current state: [One of "Under Discussion", "Accepted", "Rejected"]

Discussion thread: https://lists.apache.org/thread/nkn6vn1dp5r9t2wy65qy2lt1chmph46n

JIRA or GitHub Issue: 

Released: <Doris Version>

Google Doc: <If the design in question is unclear or needs to be discussed and reviewed, a Google Doc can be used first to facilitate comments from others.>

Motivation

Doris-kafka-connector is a scalable and reliable data transfer tool that accurately transfers data from Kafka to Doris.

Related Research

Detailed Design

Currently, doris-kafka-connector supports only sinking data into Doris. It offers the following advantages:

  1. Easy to scale. It can be deployed on a single machine or scaled out to a distributed environment, depending on the volume of data in Kafka.
  2. Easy to use. Kafka natively includes the Kafka Connect framework, which makes deployment and use simple. Kafka Connect currently offers two deployment modes:
    1. Standalone mode: the simplest mode, in which a single process runs all connectors and tasks.
    2. Distributed mode: provides scalability and automatic fault tolerance for Kafka Connect.
  3. Supports multiple data formats. So far, sinking JSON, CSV, and Avro data into Doris through doris-kafka-connector has been tested.
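
To make the deployment modes above more concrete, here is a minimal Python sketch that builds the JSON body a distributed-mode Kafka Connect worker expects when a connector is registered through its REST API. The `name`/`config` envelope, `connector.class`, `topics`, `tasks.max`, and the converter settings are standard Kafka Connect. The connector class name and the `doris.*` property names are assumptions based on the connector's naming and should be checked against the doris-kafka-connector documentation. The helper name, host, database, and credentials are hypothetical, and a real deployment will likely need further settings (for example, Doris ports).

```python
import json


def build_sink_connector_payload(name, topics, doris_urls, database,
                                 user, password, tasks_max=1):
    """Build the JSON body for registering a sink connector in distributed mode.

    Hypothetical helper: the connector class and the doris.* keys are
    assumptions and are not confirmed by this proposal.
    """
    config = {
        # Assumed class name of the Doris sink connector.
        "connector.class": "org.apache.doris.kafka.connector.DorisSinkConnector",
        # Standard Kafka Connect settings.
        "topics": ",".join(topics),
        "tasks.max": str(tasks_max),
        "key.converter": "org.apache.kafka.connect.storage.StringConverter",
        "value.converter": "org.apache.kafka.connect.json.JsonConverter",
        "value.converter.schemas.enable": "false",
        # Assumed Doris connection settings.
        "doris.urls": doris_urls,
        "doris.database": database,
        "doris.user": user,
        "doris.password": password,
    }
    return {"name": name, "config": config}


payload = build_sink_connector_payload(
    name="doris-sink-demo",          # hypothetical connector name
    topics=["orders", "payments"],   # hypothetical Kafka topics
    doris_urls="127.0.0.1",          # hypothetical Doris FE host
    database="demo_db",              # hypothetical target database
    user="root",
    password="",
    tasks_max=2,
)
print(json.dumps(payload, indent=2))
```

In distributed mode, a payload like this would be sent with `POST /connectors` to a Kafka Connect worker's REST endpoint. In standalone mode, the same key-value pairs would instead go into a properties file passed to `connect-standalone.sh` alongside the worker configuration.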

Scheduling

