...
Proposed Changes
Nested notation
Dotted notation tends to be the most intuitive way to describe paths to nested fields in a record structure, and will cover most scenarios; e.g. jq already uses it[1].
However, dots are already allowed as part of field names in JSON (i.e. schemaless) records, e.g. {'nested.field': {'value': 42}}.
Therefore, the nested notation must support escaping dots that could be valid field names.
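To make the ambiguity concrete, here is a minimal Python sketch. The record and the `naive_lookup` helper are hypothetical and only illustrate what happens when a path is split on every dot; they are not part of Connect.

```python
# Hypothetical schemaless record: one top-level key contains a dot,
# and a different branch happens to match the same dotted path.
record = {
    "nested.field": {"value": 42},
    "nested": {"field": {"value": 7}},
}


def naive_lookup(data, path):
    """Walk `data` by splitting `path` on every dot (no escaping support)."""
    current = data
    for step in path.split("."):
        current = current[step]
    return current


# The path can only resolve through record["nested"]["field"];
# the top-level key "nested.field" is unreachable with plain dot splitting.
result = naive_lookup(record, "nested.field.value")
print(result)
```

With plain splitting there is no way to say "the field literally named nested.field", which is why the notation needs an escaping mechanism.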
Instead of escaping dots with backslashes, which leads to unfriendly JSON configurations, it's proposed to follow an approach similar to JSONata[2], where backticks are used to wrap field names that contain dots, e.g. `nested.field`.
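As a rough illustration of why backslash escaping is unfriendly in JSON, the sketch below serializes two hypothetical configuration values with Python's standard `json` module. The key name `field` and both path strings are assumptions made for the example.

```python
import json

# Hypothetical config values; the key "field" is illustrative only.
# The Python literal "nested\\.field.value" is the text: nested\.field.value
backslash_style = {"field": "nested\\.field.value"}
backtick_style = {"field": "`nested.field`.value"}

# JSON requires backslashes inside strings to be escaped themselves,
# so the backslash variant ends up with a doubled backslash on the wire.
backslash_json = json.dumps(backslash_style)
backtick_json = json.dumps(backtick_style)

print(backslash_json)
print(backtick_json)
```

The backtick variant is written in JSON exactly as the user would type the path, while the backslash variant needs a second level of escaping.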
[1] https://stedolan.github.io/jq/manual/#Basicfilters
...
> Field references containing whitespace or reserved tokens can be enclosed in backticks
Rules
- 1. If field names do not contain dots (.), then only use dots to represent nested field paths.
- 2. If field names contain dots, then:
  - wrap the field name with a backtick pair (`...`) by
    - adding an opening backtick at the beginning of the field name (beginning of the path, or after a dot)
    - adding a closing backtick at the end of the field name (end of the path, or before the next dot)
  - if a field is wrapped but doesn't contain dots, it is processed the same way: the field name within the wrapping backticks is used
- 3. If a field name includes backticks, then:
  - if the backticks are in a wrapping position (opening or closing a field name), then they need to be escaped with a backslash
    - Backslashes (\) do not need to be escaped. If a backslash happens to be part of the field name and sits before a backtick to be escaped, then add another backslash.
  - else, backticks do not require escaping
- 4. If wrapping backtick pairs are incomplete, the Connect configuration must fail fast to avoid deploying ambiguous paths.
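The rules above can be sketched as a small path parser. This is one possible reading of the rules, written in Python for brevity; it is not Connect's implementation. The names `parse_path` and `_read_wrapped` are hypothetical. The sketch covers rules 1, 2 and 4, plus escaped backticks in a closing position inside a wrapped name. Other corner cases of rule 3 (an escaped backtick opening an unwrapped name, or a doubled backslash) are deliberately left out.

```python
from typing import List, Tuple

BACKTICK = "`"
BACKSLASH = "\\"
DOT = "."


def _read_wrapped(path: str, start: int) -> Tuple[str, int]:
    """Read a backtick-wrapped field name whose opening backtick is at `start`.

    Returns the field name and the index where the next step would begin.
    """
    name = []
    i, n = start + 1, len(path)
    while i < n:
        ch = path[i]
        if ch == BACKSLASH and i + 1 < n and path[i + 1] == BACKTICK \
                and (i + 2 == n or path[i + 2] == DOT):
            # Assumed reading of rule 3: a backslash-escaped backtick in a
            # closing position is kept as a literal backtick.
            name.append(BACKTICK)
            i += 2
        elif ch == BACKTICK and (i + 1 == n or path[i + 1] == DOT):
            # Rule 2: closing backtick at the end of the path or before a dot.
            return "".join(name), i + 2  # skip the backtick and the dot
        else:
            # Any other character, including an inner backtick, is literal.
            name.append(ch)
            i += 1
    # Rule 4: no closing backtick was found, so fail fast.
    raise ValueError(f"Incomplete backtick pair in path: {path!r}")


def parse_path(path: str) -> List[str]:
    """Split a nested-notation path into its field-name steps."""
    steps = []
    i, n = 0, len(path)
    while i < n:
        if path[i] == BACKTICK:
            name, i = _read_wrapped(path, i)
            steps.append(name)
        else:
            # Rule 1: an unwrapped step runs up to the next dot or path end.
            j = path.find(DOT, i)
            if j == -1:
                steps.append(path[i:])
                break
            steps.append(path[i:j])
            i = j + 1
    return steps


print(parse_path("nested.field"))
print(parse_path("`nested.field`.value"))
```

Under this reading, `nested.field` yields two steps, while `` `nested.field`.value `` yields the single dotted field name followed by `value`. A path such as `` `nested.field `` with no closing backtick raises an error instead of being silently interpreted.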
...
scenario | input | smt | output
---|---|---|---
1. Nested field. | | |
2. Nested field, when field names include dots | | |
...
Existing SMT configurations will not be affected by these changes, as the default field.style is plain, which represents the current behavior; users will need to opt in to the new notation.
Rejected Alternatives
Keep ExtractField as it is and use it multiple times until reaching nested fields
...