Status
Current state: "Under Discussion"
...
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
...
- make it visible whether a schema is resolved or unresolved and when the resolution happens
- offer a unified API for FLIP-129, FLIP-136, and catalogs
- allow arbitrary data types and expressions in the schema for watermark spec or columns
- have access to other catalogs for declaring a data type or expression via CatalogManager
- clean up TableSchema
- remain backwards compatible in the persisted properties and API
...
Schema
--> stores Expression, AbstractDataType
ResolvedSchema
--> stores ResolvedExpression, DataType
CatalogBaseTable.getResolvedSchema(): ResolvedSchema
--> stores ResolvedSchema
Catalog.getTable(ObjectPath, SchemaResolver): CatalogBaseTable
--> creates Schema and resolves it with the help of SchemaResolver
SchemaResolver.resolve(Schema): ResolvedSchema
--> references parser, catalog manager, etc. for resolving SQL and Table API expressions
CatalogTableImpl.fromProperties(Map<String, String> properties, SchemaResolver): CatalogTableImpl
--> construct Schema -> create ResolvedSchema -> verify against remaining properties
CatalogTableImpl.toProperties(): Map<String, String>
--> no change in properties yet
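The split between an unresolved and a resolved schema can be sketched roughly as follows. This is a minimal, self-contained illustration and not the proposed Flink code: the classes only borrow the names `Schema`, `ResolvedSchema`, and `SchemaResolver` from the overview above, data types are modeled as plain strings, and `RegistryResolver`, the type registry, and the `other_cat.db.Address` type are hypothetical stand-ins for the parser and catalog manager.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration; these are NOT the Flink classes.
final class SchemaResolutionSketch {

    /** Unresolved schema: keeps data types exactly as declared, here as strings. */
    static final class Schema {
        final Map<String, String> declaredColumns = new LinkedHashMap<>();

        Schema column(String name, String declaredType) {
            declaredColumns.put(name, declaredType);
            return this;
        }
    }

    /** Resolved schema: every column type has been looked up and validated. */
    static final class ResolvedSchema {
        final List<String> columnNames;
        final List<String> columnTypes;

        ResolvedSchema(List<String> columnNames, List<String> columnTypes) {
            this.columnNames = Collections.unmodifiableList(columnNames);
            this.columnTypes = Collections.unmodifiableList(columnTypes);
        }
    }

    /** Mirrors the shape of SchemaResolver.resolve(Schema): ResolvedSchema. */
    interface SchemaResolver {
        ResolvedSchema resolve(Schema schema);
    }

    /** Toy resolver backed by a type registry; stands in for parser and catalog manager. */
    static final class RegistryResolver implements SchemaResolver {
        private final Map<String, String> typeRegistry;

        RegistryResolver(Map<String, String> typeRegistry) {
            this.typeRegistry = typeRegistry;
        }

        @Override
        public ResolvedSchema resolve(Schema schema) {
            List<String> names = new ArrayList<>();
            List<String> types = new ArrayList<>();
            for (Map.Entry<String, String> column : schema.declaredColumns.entrySet()) {
                String resolved = typeRegistry.get(column.getValue());
                if (resolved == null) {
                    // Resolution is the point at which unknown types become visible.
                    throw new IllegalArgumentException(
                            "Cannot resolve data type '" + column.getValue()
                                    + "' of column '" + column.getKey() + "'");
                }
                names.add(column.getKey());
                types.add(resolved);
            }
            return new ResolvedSchema(names, types);
        }
    }

    /** Builds an unresolved schema and resolves it against a made-up registry. */
    static ResolvedSchema example() {
        Map<String, String> registry = new LinkedHashMap<>();
        registry.put("INT", "INT");
        // A user-defined type registered in another (made-up) catalog.
        registry.put("other_cat.db.Address", "ROW<street STRING, zip INT>");

        Schema unresolved = new Schema()
                .column("id", "INT")
                .column("addr", "other_cat.db.Address");
        return new RegistryResolver(registry).resolve(unresolved);
    }

    public static void main(String[] args) {
        ResolvedSchema resolved = example();
        for (int i = 0; i < resolved.columnNames.size(); i++) {
            System.out.println(resolved.columnNames.get(i) + ": " + resolved.columnTypes.get(i));
        }
    }
}
```

In this sketch the unresolved `Schema` can be built without any catalog access, and all validation is deferred to the single `resolve` call, which is what makes the resolution point visible to callers.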
TableSchema extends Schema or ResolvedSchema
Since TableSchema is a hybrid of both, it would not be a smooth experience in any case.
It is safer and more user-friendly to build up a separate stack with a better class hierarchy in the correct package.
1) Reuse `table.api.TableColumn`, `table.api.WatermarkSpec`, `table.api.constraints.UniqueConstraint` but move them to `table.catalog`.
2) Introduce two Schema classes `table.api.Schema` for FLIP-129/FLIP-136 and `table.catalog.ResolvedSchema`.
3) Let `table.catalog.ResolvedSchema` implement most methods of `TableSchema` and resolve it with the help of a schema resolver.
4) Let `TableSchema` extend from `ResolvedSchema`.
5) Update API to properly split into `Schema` and `ResolvedSchema`. `Schema` will be returned by `CatalogBaseTable` such that expressions and data types can reference objects from other catalogs as well. `ResolvedSchema` will be returned by `QueryOperation`.
6) `DynamicTableFactory` will provide the resolved physical row data type and constraints as dedicated methods, to avoid confusion about whether to call `TableSchema.toRowDataType` or `TableSchema.toPhysicalRowDataType`.
7) `TableColumn` and `WatermarkSpec` will be gradually reworked to remove the hybrid resolved/unresolved properties in their members.
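The `DynamicTableFactory` item in the list above exists because a row type built from all columns differs from one built from physical columns only. The following sketch illustrates that difference under simplifying assumptions: `Column`, `rowType`, and the example columns are hypothetical, types are plain strings, and nothing here is the actual `TableSchema` implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not the Flink TableSchema or DynamicTableFactory API.
final class PhysicalRowTypeSketch {

    /** A column that is either physical (stored by the connector) or computed. */
    static final class Column {
        final String name;
        final String type;
        final boolean physical;

        Column(String name, String type, boolean physical) {
            this.name = name;
            this.type = type;
            this.physical = physical;
        }
    }

    /** Renders a ROW type string from all columns or from physical columns only. */
    static String rowType(List<Column> columns, boolean physicalOnly) {
        List<String> fields = new ArrayList<>();
        for (Column column : columns) {
            if (!physicalOnly || column.physical) {
                fields.add(column.name + " " + column.type);
            }
        }
        return "ROW<" + String.join(", ", fields) + ">";
    }

    /** Two physical columns plus one computed column, e.g. a processing-time attribute. */
    static List<Column> exampleColumns() {
        List<Column> columns = new ArrayList<>();
        columns.add(new Column("id", "INT", true));
        columns.add(new Column("ts", "TIMESTAMP(3)", true));
        columns.add(new Column("proc", "TIMESTAMP_LTZ(3)", false));
        return columns;
    }

    public static void main(String[] args) {
        List<Column> columns = exampleColumns();
        System.out.println("all columns:      " + rowType(columns, false));
        System.out.println("physical columns: " + rowType(columns, true));
    }
}
```

A connector that reads or writes external data only cares about the physical variant, so handing it that type through a dedicated method removes the need to pick between two similarly named conversions.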
We aim to leave `table.api.TableSchema.Builder` untouched for backwards compatibility. However, implementations that use the rather new `add(TableColumn)` would need an update due to the relocation and gradual refactoring.