Abstract
TubeMQ is a distributed messaging queue (MQ) system that Tencent Big Data has developed since 2013. It focuses on high-performance storage and transmission of massive data in big data scenarios. After nearly seven years of handling massive data volumes in production, TubeMQ offers advantages in production practice (stability and performance) and in cost compared to many open source MQ projects.
Proposal
TubeMQ is suited to scenarios with high concurrency and massive data volumes that can tolerate a small amount of data loss under abnormal conditions, such as massive log collection, indicator statistics, and monitoring.
TubeMQ does not yet support highly reliable data transmission services. As with many other MQs, this could be added to a future project roadmap, but it is not available today.
Rationale
Just like other message queue systems, TubeMQ is built on the publish-subscribe pattern, aka pub-sub.
In this pattern, producers publish messages to topics while consumers subscribe to those topics. After incoming messages are processed, consumers send an acknowledgement back to the producer. Once a subscription has been created, all messages are tracked and retained by TubeMQ, even if the consumer goes offline for some reason. Retained messages are discarded only when a consumer acknowledges that they have been successfully processed.
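The retain-until-acknowledged behavior described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `ToyBroker` and all of its methods are hypothetical names invented for illustration, not the TubeMQ API, and the rule that a message is dropped once every subscriber has acknowledged it is an assumption of this sketch.

```python
class ToyBroker:
    """Hypothetical in-memory pub-sub model; NOT the TubeMQ API."""

    def __init__(self):
        self._messages = {}     # topic -> {offset: message} still retained
        self._next_offset = {}  # topic -> offset assigned to the next message
        self._cursors = {}      # topic -> {subscriber: first unacknowledged offset}

    def subscribe(self, topic, subscriber):
        # A new subscription starts tracking at the current end of the topic.
        cursors = self._cursors.setdefault(topic, {})
        cursors.setdefault(subscriber, self._next_offset.get(topic, 0))

    def publish(self, topic, message):
        offset = self._next_offset.get(topic, 0)
        self._messages.setdefault(topic, {})[offset] = message
        self._next_offset[topic] = offset + 1
        return offset

    def pending(self, topic, subscriber):
        # Everything the subscriber has not acknowledged yet, in offset order.
        cursor = self._cursors[topic][subscriber]
        retained = self._messages.get(topic, {})
        return [(o, retained[o]) for o in sorted(retained) if o >= cursor]

    def ack(self, topic, subscriber, offset):
        cursors = self._cursors[topic]
        cursors[subscriber] = max(cursors[subscriber], offset + 1)
        # Assumption of this sketch: discard a message only after every
        # subscriber of the topic has acknowledged it.
        low_water = min(cursors.values())
        retained = self._messages.get(topic, {})
        for o in [o for o in retained if o < low_water]:
            del retained[o]

    def retained_count(self, topic):
        return len(self._messages.get(topic, {}))


broker = ToyBroker()
broker.subscribe("logs", "consumer-1")
broker.publish("logs", "line 1")
broker.publish("logs", "line 2")

# The consumer is "offline": nothing is acknowledged, so both messages stay.
print(broker.retained_count("logs"))

# Once the first message is acknowledged, only the second remains pending.
broker.ack("logs", "consumer-1", 0)
print(broker.pending("logs", "consumer-1"))
```

In this model, retention is driven entirely by acknowledgements, which is why an offline consumer finds its unprocessed messages waiting when it returns.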
The overall architecture of TubeMQ is as follows:
Portal is responsible for interacting with users and the admin system. It includes two parts: the API and the web portal.
Master is the controller of the cluster. It consists of one or more master nodes, which are responsible for managing cluster state, resource scheduling, authentication checks, and metadata maintenance. As a reliable system, TubeMQ provides an HA solution for master nodes.
Broker is responsible for data storage and consists of a cluster of broker nodes. Each broker node manages a set of topics, including appending, deleting, updating, and querying topic information. In TubeMQ, brokers can be scaled horizontally, and the broker cluster can grow very large for massive data cases.
Client is responsible for producing and consuming messages. Once a pub-sub topic is set up, two delivery modes (push and pull) are supported for delivering messages from producers to consumers.
ZooKeeper stores message offsets, which are used to recover topics when some components fail.
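The role of stored offsets in recovery can be sketched as follows. This is an illustrative model only: `InMemoryOffsetStore` is a hypothetical stand-in for an external coordination service such as ZooKeeper, `consume` is an invented helper, and committing after every message is an assumption of the sketch rather than a description of how TubeMQ actually checkpoints.

```python
class InMemoryOffsetStore:
    """Hypothetical stand-in for an external offset store such as ZooKeeper."""

    def __init__(self):
        self._offsets = {}  # (group, topic) -> next offset to consume

    def commit(self, group, topic, offset):
        self._offsets[(group, topic)] = offset

    def fetch(self, group, topic):
        # A group that has never committed starts from the beginning.
        return self._offsets.get((group, topic), 0)


def consume(log, store, group, topic, max_messages=None):
    """Process messages from the stored offset onward, committing after each one.

    `max_messages` simulates a consumer that stops (or fails) part-way through.
    """
    start = store.fetch(group, topic)
    end = len(log) if max_messages is None else min(len(log), start + max_messages)
    processed = []
    for offset in range(start, end):
        processed.append(log[offset])           # "process" the message
        store.commit(group, topic, offset + 1)  # record progress externally
    return processed


log = ["m0", "m1", "m2", "m3"]
store = InMemoryOffsetStore()

# The first consumer instance handles two messages and then fails.
first_run = consume(log, store, "group-a", "logs", max_messages=2)

# A replacement instance reads the stored offset and resumes where the first stopped.
second_run = consume(log, store, "group-a", "logs")
print(first_run, second_run)
```

Because progress lives outside the consumer process, a restarted or replacement component does not need any state from the failed one to pick up at the right position.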
Initial Goals
The initial goal is to move the current codebase from the GitHub repository under the Tencent account to Apache and to integrate it with the Apache development process and infrastructure.
A primary goal of incubation will be to grow and diversify the TubeMQ community. We are well aware that the project community is largely comprised of individuals from a single company. We aim to change that during incubation.
Current Status
As previously mentioned, TubeMQ is under active development at Tencent and is used to process large volumes of data for most of its services and products.
Meritocracy
We value meritocracy and we understand that it is the basis for an open community that encourages multiple companies and individuals to contribute and be invested in the project’s future. We will encourage and monitor participation and make sure to extend privileges and responsibilities to all contributors.
Community
TubeMQ is currently used by developers at Tencent, and a growing number of users are running it in production environments. TubeMQ has received contributions from developers working outside of Tencent since it was open sourced on GitHub in September 2019. By bringing TubeMQ to Apache, we aim to assure current and future contributors that the TubeMQ community is neutral, meritocratic, and open, in order to broaden and diversify the user and developer community.
Core Developers
TubeMQ was initially developed at Tencent and is under active development. We believe TubeMQ will be of interest to a broad range of users and developers and that incubating the project at the ASF will help us build a diverse, sustainable community.
Alignment
TubeMQ utilizes other Apache projects such as Hadoop, HBase and ZooKeeper. We anticipate integration with additional Apache projects as the TubeMQ community and interest in the project grow.
Known Risks
Orphaned Products
Tencent is committed to the future development of TubeMQ and understands that graduation to a TLP, while preferable, is not the only positive outcome of incubation.
Should the TubeMQ project be accepted by the Incubator, the prospective PPMC would be willing to agree to a target incubation period of 2 years or less, knowing that every Incubator project incurs a certain cost in terms of ASF infrastructure and volunteer time.
Inexperience with Open Source
Five of the initial mentors/committers are Apache Members and Incubator PMC Members. They will work with the other community members to teach them the Apache Way.
Homogenous Developers
The majority of the committers work at Tencent, though we are committed to recruiting and developing additional committers from a wide spectrum of industries and backgrounds. Since the project was open sourced, many contributors from outside Tencent have engaged and begun contributing to it.
Reliance on Salaried Developers
It is expected that TubeMQ development will occur both on salaried time and on volunteer time, after hours. Most of the initial committers are paid by Tencent to contribute to this project. However, they are all passionate about the project, and we are both confident and hopeful that the project will continue even if no salaried developers contribute to it.
Relationships with Other Apache Products
As mentioned in the Alignment section, TubeMQ utilizes a number of existing Apache projects (Avro, ZooKeeper, etc.), and we expect that list to expand as the community grows and diversifies. Any Apache project in the big data space that needs to process data in a streaming way is potentially relevant, such as Flink and Spark Streaming. To provide convenient access to event streams served by TubeMQ, we plan to open source connectors to different stream computation engines. With these connectors, Spark and Flink can read data from and write data to topics served by TubeMQ directly.
An Excessive Fascination with the Apache Brand
We are applying to the Incubator process because we think it is the next logical step for the TubeMQ project after open-sourcing the code. This proposal is not for the purpose of generating publicity. Rather, we want to create a very inclusive and meritocratic community outside the umbrella of a single company. Tencent has a long history of contributing to Apache projects, and the TubeMQ developers and contributors understand the implications of making it an Apache project.
Required Resources
Mailing lists
- dev@tubemq.incubator.apache.org
- commits@tubemq.incubator.apache.org
- private@tubemq.incubator.apache.org
The podling may also create a user mailing list, if needed.
Source Control and Issue Tracking
The TubeMQ podling would use Apache’s GitBox integration to sync between GitHub and Apache infrastructure. The podling would use GitHub issues and pull requests for community engagement.
Current Resources
Source and Intellectual Property Submission Plan
The TubeMQ source code on GitHub is currently licensed under Apache License v2.0, and the copyright is assigned to Tencent. If TubeMQ becomes an Incubator project at the ASF, Tencent will transfer the source code and trademark ownership to the Apache Software Foundation via a Software Grant Agreement.
External Dependencies
External dependencies licensed under Apache License 2.0
- citrus r3.1.4 https://github.com/webx/citrus
- commons-cli 1.2 https://github.com/apache/commons-cli
- commons-codec 1.10 https://github.com/apache/commons-codec
- commons-lang 2.6 https://github.com/apache/commons-lang
- commons-io 2.1 https://github.com/apache/commons-io
- easymock 2.5.2 https://github.com/easymock/easymock
- fastjson 1.2.60 https://github.com/alibaba/fastjson
- guava 13.0 https://github.com/google/guava
- hbase 0.94.27 https://github.com/apache/hbase
- ini4j 0.5.1 https://sourceforge.net/projects/ini4j
- mina apache-2.0.12-src https://github.com/apache/mina
- netty 3.8.0.Final https://github.com/netty/netty
- openmct v0.9.0 https://gitee.com/ford25v6/openmct
- powermock 1.6.5 https://github.com/powermock/powermock
- velocity 1.7 https://github.com/apache/velocity-engine
- velocity-tools 2.0 https://github.com/apache/velocity-tools
- zookeeper 3.4.3 https://github.com/apache/zookeeper
- Apache Avro 1.7.6 https://github.com/apache/avro
- jetty 6.1.26 http://central.maven.org/maven2/org/mortbay/jetty
- Berkeley DB Java Edition 7.3.7 http://download.oracle.com/otn/berkeley-db
- spring-core 4.1.6.RELEASE https://github.com/spring-projects/spring-framework
- spring-context 4.1.6.RELEASE https://github.com/spring-projects/spring-framework
- spring-jdbc 4.1.6.RELEASE https://github.com/spring-projects/spring-framework
- spring-orm 4.1.6.RELEASE https://github.com/spring-projects/spring-framework
- servlet-api 2.5 http://central.maven.org/maven2/org/mortbay/jetty/servlet-api
- jackson-mapper-asl 1.9.13 http://www.java2s.com/Code/JarDownload/jackson-mapper
- Metamorphosis metamorphosis-all-1.4.4 https://github.com/killme2008/Metamorphosis
External dependencies licensed under the MIT License
- datatables 1.10.7 https://github.com/DataTables/DataTables
- JustWriting 1.0.0 https://github.com/GingJan/JustWriting
- jquery 1.11.3 https://github.com/jquery/jquery
- slf4j 1.6.2 https://github.com/qos-ch/slf4j
- mockito 2.0.2-beta https://github.com/mockito/mockito
External dependencies licensed under the New BSD License
- protobuf 2.5.0 https://github.com/google/protobuf
External dependencies licensed under the Eclipse Public License 1.0
- junit 4.11 https://github.com/junit-team/junit4
Cryptography
Not applicable.
Initial Committers
- David Nalley
- Goson Zhang
- Guangxu Cheng gxcheng@apache.org
- Jerry Shao jshao@apache.org
- Jie Jiang
- Junjie Chen
- Justin Mclean
- Junping Du junping_du@apache.org
- Kayne Wu
- Lamber Liu
- Osgoo Li
- Peng Chen
- Sijie Guo
- Xiang Li xiangli@apache.org
- Yiheng Wang
- Yuhong Liu
- Zak Wu
- Zili Chen tison@apache.org
Sponsors
- Champion and mentor: David Nalley ke4qqq@apache.org (ASF Member, Incubator PMC)
- Mentor: Junping Du junping_du@apache.org (ASF Member, Incubator PMC)
- Mentor: Justin Mclean jmclean@apache.org (ASF Member, VP of Incubator)
- Mentor: Sijie Guo sijie@apache.org (ASF Member, Incubator PMC)
- Mentor: Zhijie Shen zjshen@apache.org (ASF Member, Incubator PMC)
Sponsoring Entity
- The Apache Incubator