Motivation
CatalogTable and SeaTunnelRowType are currently disconnected, and SeaTunnelRowType cannot carry certain specific information. The data type object that connectors operate on needs to transition smoothly from SeaTunnelRowType to CatalogTable.
This solution primarily addresses the following issues:
- Moving the creation of CatalogTable inside SeaTunnelSource, so that the Source determines the CatalogTable to handle from its config information and returns it to the framework layer.
- Supporting the capability for the same Source to handle different Catalogs, allowing access to CatalogTable information through Catalog or enabling the Source to generate CatalogTable using its own config.
- Establishing a binding between Catalog and Source, with the Source creating Catalog through a provided unified utility class.
- Removing sub-configurations related to Catalog.
Process Comparison
The main change is to move the logic for creating the Catalog and generating the CatalogTable into SeaTunnelSource. How the Catalog and CatalogTable are generated is then determined by the Source's own Config and its type.
Before the STIP:
After the STIP:
Config Change
We have removed the dedicated Catalog sub-configuration module from Source/Sink.
Before the STIP:
|
After the STIP:
|
How to implement this?
Some connectors currently create catalogs through a dedicated catalog sub-configuration, but there is no support for selecting a catalog: the catalog factory is bound to the data source and follows a one-to-many relationship, so every CatalogFactory selection can be obtained from the current Source/Sink configuration. With that in mind, the implementation takes two steps:
- Remove the instantiation of CatalogFactory from the MultipleTableJobConfigParser, thereby eliminating the dependency on a unified catalog sub-configuration.
- Extract the logic for instantiating the corresponding Catalog and provide it for direct use by Source/Sink. This way, Source/Sink can obtain the corresponding Catalog without the need to modify existing configurations.
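The two steps above can be sketched roughly as follows. This is a minimal illustration, not the actual SeaTunnel code: `CatalogLookup`, `ExampleCatalogFactory`, `ExampleSource`, the `Map<String, String>` options and the `origin` field are hypothetical stand-ins, and the in-memory registry replaces whatever factory discovery the framework really uses. The point is only that the Source asks a shared helper for a Catalog using its own options and falls back to building the CatalogTable from config when no matching factory exists.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Simplified stand-in for the real CatalogTable (assumption for illustration).
final class CatalogTable {
    final String tablePath;
    final String origin; // "catalog" or "config", only used to show which path was taken
    CatalogTable(String tablePath, String origin) {
        this.tablePath = tablePath;
        this.origin = origin;
    }
}

interface Catalog {
    CatalogTable getTable(String tablePath);
}

interface CatalogFactory {
    String factoryIdentifier();
    Catalog createCatalog(String catalogName, Map<String, String> options);
}

// Hypothetical unified helper that Source/Sink call directly,
// instead of the job config parser instantiating CatalogFactory.
final class CatalogLookup {
    private final Map<String, CatalogFactory> factories = new HashMap<>();

    void register(CatalogFactory factory) {
        factories.put(factory.factoryIdentifier(), factory);
    }

    Optional<Catalog> createOptionalCatalog(
            String catalogName, Map<String, String> options, String factoryIdentifier) {
        CatalogFactory factory = factories.get(factoryIdentifier);
        if (factory == null) {
            return Optional.empty();
        }
        return Optional.of(factory.createCatalog(catalogName, options));
    }
}

// Hypothetical factory bound to the "Example" data source.
final class ExampleCatalogFactory implements CatalogFactory {
    @Override
    public String factoryIdentifier() {
        return "Example";
    }

    @Override
    public Catalog createCatalog(String catalogName, Map<String, String> options) {
        return tablePath -> new CatalogTable(tablePath, "catalog");
    }
}

// The Source decides how its CatalogTable is produced, using only its own options.
final class ExampleSource {
    private final CatalogTable catalogTable;

    ExampleSource(Map<String, String> options, CatalogLookup lookup) {
        String tablePath = options.get("table");
        this.catalogTable = lookup
                .createOptionalCatalog("example", options, "Example")
                .map(catalog -> catalog.getTable(tablePath))
                // No catalog available: generate the table from the Source config.
                .orElseGet(() -> new CatalogTable(tablePath, "config"));
    }

    CatalogTable getProducedCatalogTable() {
        return catalogTable;
    }
}
```

Because the helper takes the Source's existing options, no separate catalog sub-configuration is needed in this sketch.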
Public Interfaces
SeaTunnelSource
Add a `getProducedCatalogTable` method with a default implementation to SeaTunnelSource, and deprecate `getProducedType`.
|
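One possible shape for this change is sketched below. The types are simplified stand-ins rather than the real SeaTunnel classes, and two details are assumptions made for the illustration: the default `getProducedCatalogTable` wraps the legacy row type under a placeholder table name (`"default"`), and `getProducedType` is given a throwing default so that migrated sources do not have to implement it.

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for the real types (assumptions for illustration).
final class SeaTunnelRowType {
    final List<String> fieldNames;
    SeaTunnelRowType(List<String> fieldNames) {
        this.fieldNames = fieldNames;
    }
}

final class CatalogTable {
    final String tableName;
    final SeaTunnelRowType rowType;
    CatalogTable(String tableName, SeaTunnelRowType rowType) {
        this.tableName = tableName;
        this.rowType = rowType;
    }
}

interface SeaTunnelSource {
    // Deprecated entry point; the throwing default is an assumption of this sketch.
    @Deprecated
    default SeaTunnelRowType getProducedType() {
        throw new UnsupportedOperationException("getProducedType is not implemented");
    }

    // New entry point: by default, wrap the legacy row type so that
    // existing sources keep working without modification.
    default CatalogTable getProducedCatalogTable() {
        return new CatalogTable("default", getProducedType());
    }
}

// A source that has not been migrated: it only knows about the row type.
final class LegacySource implements SeaTunnelSource {
    @Override
    @SuppressWarnings("deprecation")
    public SeaTunnelRowType getProducedType() {
        return new SeaTunnelRowType(Arrays.asList("id", "name"));
    }
}

// A migrated source: it returns the CatalogTable directly.
final class MigratedSource implements SeaTunnelSource {
    @Override
    public CatalogTable getProducedCatalogTable() {
        return new CatalogTable(
                "db.users", new SeaTunnelRowType(Arrays.asList("id", "name")));
    }
}
```

With this arrangement the framework layer can call `getProducedCatalogTable` on every source, whether or not the connector has been migrated.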
TableFactoryContext
The current `TableFactoryContext` is provided to both Source and Sink for passing CatalogTable information. It needs to be split into `SourceTableFactoryContext` and `SinkTableFactoryContext`, and the `catalogTables` information needs to be removed from `SourceTableFactoryContext`. Note: the classloader information is currently unused in the design and may be considered for removal.
|
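A rough sketch of the split is shown below. The field sets are assumptions: options are modelled as a plain `Map<String, String>` rather than the framework's own config type, `CatalogTable` is a one-field stand-in, and the class loader is left out because the note above says it is unused.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Simplified stand-in for the real CatalogTable (assumption for illustration).
final class CatalogTable {
    final String tableName;
    CatalogTable(String tableName) {
        this.tableName = tableName;
    }
}

// Source side: options only. The Source produces its own CatalogTable,
// so no catalogTables are passed in.
final class SourceTableFactoryContext {
    private final Map<String, String> options;

    SourceTableFactoryContext(Map<String, String> options) {
        this.options = Collections.unmodifiableMap(options);
    }

    Map<String, String> getOptions() {
        return options;
    }
}

// Sink side: still receives the upstream CatalogTable information.
final class SinkTableFactoryContext {
    private final List<CatalogTable> catalogTables;
    private final Map<String, String> options;

    SinkTableFactoryContext(List<CatalogTable> catalogTables, Map<String, String> options) {
        this.catalogTables = Collections.unmodifiableList(catalogTables);
        this.options = Collections.unmodifiableMap(options);
    }

    List<CatalogTable> getCatalogTables() {
        return catalogTables;
    }

    Map<String, String> getOptions() {
        return options;
    }
}
```

Keeping the two contexts as separate types means a Source factory cannot accidentally depend on table information that the framework no longer supplies.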
Pseudo code after modification
|
Compatibility
The Catalog section needs adjustments, including partial modifications to connectors whose Source supports Catalog-related functionality. User-level compatibility will be maintained.