Thanks, Chris!

Please, find my comments below:

On Tue, 7 Jun 2022 at 04:39, Chris Egerton <fearthecel...@gmail.com> wrote:

> Hi Jorge,
>
> Thanks! Sorry for the delay; here are my thoughts:
>
> 1. Under the "Accessing multiple values by deep-scan" header it's stated
> that "If deep-scan is used, it must have only one field after the asterisk
> level.". However, in example 3 for the Cast SMT and other examples for
> other SMTs, the spec contains a field of "*.child.k2", which appears to
> have two fields after the asterisk level. I may be misunderstanding the
> proposal, but it seems like the two contradict each other.
>

Thanks for catching this. I have clarified it by removing this restriction.
Also, have extended the deep-scan scenarios.


>
> 2. I'm a little unclear on why we need the special handling for arrays
> where, for an array field "a", the field name "a" can be treated as either
> the array itself, or every element in the array. Is there a reason we can't
> use the field name "a.*" to handle the latter case, and "a" to handle the
> former?
>

Agree, this is confusing. I like the `a.*` approach to access array items.
I have added this to the proposal.


>
> 3. How would a user specify that they'd like to access a field with the
> literal name "*"?
>

Good one. I'm proposing an approach similar to how it's proposed to escape
dots, with a double-asterisk. Curious on your thoughts around this.


>
> 4. For the Cast SMT, do you think it might bite some people if fields that
> can't be cast correctly are silently ignored? I'm imagining the case where
> none of the fields in a multi-path expression can be cast correctly and it
> ends up eating half of someone's day to track down why their SMT isn't
> doing anything.
>

If I understand correctly, this challenge could be relevant across SMTs.
At the moment, most/all? SMTs just silently ignore.
Was thinking about adding a flag `field.on.path.not.found` to either ignore
or fail when no paths are found. What do your think?


>
> 5. For the ExtractField and ValueToKey SMTs, what happens if a deep-scan
> field name is used, but only one field is found? Is the resulting field
> still an array, or is it just the single field that was found? (FWIW I'm
> leaning towards keeping it an array just to keep schemas consistent in a
> pipeline in case the number of fields found fluctuates across records.)
>
> Agree. Will clarify that an array is always produced even for 1 or 0
fields are found.


> 6. (Nit) For the HeaderFrom SMT, it's stated that "As this SMT affects only
> existing fields, additional configurations will not be required.". Given
> the new "field.syntax.version" property, this part should probably be
> removed.
>
Agree.


>
> 7. Is recursive descent intentionally excluded? That was an important part
> of Joshua's KIP and his feedback on this KIP; I think it's worth pursuing
> if we can.
>

My understanding from Joshua's feedback is that by including support for
deep-scan, we are already covering the recursive functionality. Though, I
may be missing something.

>
>
> Cheers,
>
> Chris
>
> On Tue, May 24, 2022 at 3:49 AM Jorge Esteban Quilcate Otoya <
> quilcate.jo...@gmail.com> wrote:
>
> > Thanks, Chris!
> >
> > I have updated the KIP with the rejected alternatives updated. Also, I
> have
> > drafted the support for arrays and deep scans as part of the proposed
> > notation to make it more complete, even though these can be implemented
> in
> > multiple PRs.
> >
> > Looking forward to your feedback.
> >
> > Cheers,
> > Jorge.
> >
> > On Sat, 21 May 2022 at 17:39, Chris Egerton <fearthecel...@gmail.com>
> > wrote:
> >
> > > Hi Jorge,
> > >
> > > I really appreciate the effort you've made to simplify the syntax and
> > > feature set of a JSONPath-based approach as much as possible. I'm still
> > > hesitant to continue with it, though.
> > >
> > > 1. The syntax is much less friendly. Just compare "top.mid.bottom" to
> > > "$['top']['mid']['bottom']"... not everyone uses JSONPath or even JSON,
> > and
> > > the learning curve for the former is going to be steeper. The examples
> in
> > > the new KIP draft you published demonstrate this pretty well, and this
> is
> > > without diving into the details of what escape syntax would look like.
> > >
> > > 2. The three new major features that this syntax adds (array accesses,
> > deep
> > > scans, and multi-value paths) could all be added pretty easily to the
> dot
> > > notation syntax without introducing brackets and dollar signs. Array
> > > accesses can be described using the same syntax as struct/map field
> > access,
> > > deep scans can be described using '*', and multi-value paths can be
> > > described by referencing the name of a field that's expected to have
> > > children. These are all top-of-the-head ideas and can probably be
> > refined,
> > > but hopefully they demonstrate that we can keep the syntax simple
> without
> > > sacrificing features. Of course, the question of leaving room for
> future
> > > features might arise... given that these are the out-of-the-box SMTs
> that
> > > are likely to be the first that many people encounter, I'd err on the
> > side
> > > of simplicity and a gentle learning curve; if people need to get more
> out
> > > of transforms, the option to implement their own is still there. If we
> > can
> > > address 95% of use cases with something easy to use, it's not worth
> > making
> > > the feature harder for everyone to use just to accommodate the
> remaining
> > > 5%.
> > >
> > > 3. The advantage of leveraging an existing syntax is twofold: users who
> > are
> > > already familiar with that syntax don't need to learn a new syntax, and
> > > maintainers of the syntax get to leverage existing libraries and
> > > documentation for the syntax. With the current proposal, reusing
> > libraries
> > > is off the table, which means that we're going to have to parse this
> > > ourselves (including all the escape syntax edge cases), and we won't be
> > > able to automatically leverage new features added to that syntax. And
> > given
> > > how stripped-down the syntax is in comparison to full JSONPath, there's
> > > still going to be a learning curve for users who are already familiar
> > with
> > > it, and we'll still have to document how Connect's variant of JSONPath
> > > works either instead of or in addition to just linking to a
> > well-maintained
> > > third-party docs site.
> > >
> > > On the topic of "field.path" and "include.path" vs. "field.style", I
> > > actually think that a single property per SMT is cleaner and simpler.
> > > Allowing users to mix and match styles within the same SMT config seems
> > > like a recipe for confusion, and with a single property that dictates
> > field
> > > syntax behavior, we leave the door open to change the default at a
> later
> > > date. We could even fully remove support for plain field notation at
> some
> > > point and still be able to retain the simple property names of "field",
> > > "include", etc. instead of forcing people to use "field.path" and the
> > like.
> > > That said, the term "field.style" and the permitted values might be a
> > > little ambiguous. One alternative, though a little heavy-handed, is to
> > > change it to "field.syntax.version" with permitted values of "V1"
> > (default,
> > > equivalent to "field.style = plain") and "V2" (equivalent to
> > "field.style =
> > > nested"). This would leave us room in the future to make further
> changes
> > to
> > > the syntax without having to come up with new names, although it does
> > > sacrifice a little bit in that the permitted values are no longer
> > > self-describing.
> > >
> > > Cheers,
> > >
> > > Chris
> > >
> > > On Sun, May 15, 2022 at 4:51 PM Jorge Esteban Quilcate Otoya <
> > > quilcate.jo...@gmail.com> wrote:
> > >
> > > > Thank you all for your feedback, and sorry for the long wait for a
> > reply.
> > > >
> > > > I would like to explore the idea of JSONPath-inspired/subset
> notation a
> > > bit
> > > > further:
> > > >
> > > > It will need to be a much-reduced version of JSONPath:
> > > > - No full support for JsonPath therefore an additional dependency.
> > > > - All paths must start with '$'
> > > > - No functions support or other operators allowed.
> > > > - JsonPath dotted and square-bracket notations can be supported to
> > avoid
> > > > escaping dots or other characters: `$a.b.c` and `$['a.b']['c']` - or
> we
> > > > could only support the second one as it's more complete.
> > > > - Add support for arrays with `[<integer>]`, e.g. `$a.[1].b`
> > > > - Add support for multiple-value paths using array access `$a.*.b` or
> > > deep
> > > > scan `$a..b`.
> > > >     - Some SMTs that will benefit from this: `MaskField`, `Cast`,
> > > > `ReplaceField`.
> > > > - We could introduce a `path[s]` config under the current
> > configurations
> > > to
> > > > apply this feature so no compatibility issues are introduced: e.g.
> > > > `field.path`, `fields.paths`, `exclude.path`.
> > > >
> > > > With these,  100, 101, 102, and 103 will be effectively solved.
> > > >
> > > > The main challenge that I see at the moment is that by being JSON
> > > > path-like, there may be some edge cases that I can't foresee at the
> > > moment,
> > > > that could make this hard to implement, test, and maintain in the
> long
> > > run.
> > > >
> > > > I'll appreciate your feedback on how this JSONPath-based alternative
> > > > compares to the dotted notation initially proposed, and if it matches
> > > your
> > > > feedback.
> > > > I have drafted a copy of the KIP to change the examples to JSONPath
> > > > approach and validate some differences:
> > > > https://cwiki.apache.org/confluence/x/9BihD
> > > >
> > > > About 104. I agreed with Chris. We can handle this as part of a new
> > KIP.
> > > >
> > > > Thanks,
> > > > Jorge.
> > > >
> > > > On Sun, 24 Apr 2022 at 17:27, Chris Egerton <fearthecel...@gmail.com
> >
> > > > wrote:
> > > >
> > > > > Hi Joshua,
> > > > >
> > > > > I have a few reservations about using JsonPath notation here.
> > > > >
> > > > > 1. There's likely to be a substantial performance penalty for
> > > converting
> > > > > between the Kafka Connect format and something that a JsonPath
> > library
> > > > > would understand.
> > > > >
> > > > > 2. The complexity of the feature will be significantly higher. It
> > will
> > > be
> > > > > harder to test, document, and implement. There will be many more
> edge
> > > > cases
> > > > > to consider and support, and we'll be on the hook to handle any
> bugs
> > or
> > > > > inconsistencies that arise, either as a result of our use of the
> > > JsonPath
> > > > > library we choose, or as a result of bugs in that library.
> > > > >
> > > > > 3. It's not clear that JsonPath is superior or even suitable for
> some
> > > of
> > > > > the SMTs proposed here. What would be the advantages of JsonPath
> with
> > > the
> > > > > InsertField or HoistField SMTs?
> > > > >
> > > > > I also don't think that adding dot notation is unfriendly to users;
> > > many
> > > > > have proposed this type of syntax in the past, and it's frequently
> > used
> > > > in
> > > > > informal discussions to refer to nested fields. If the proposed
> > syntax
> > > > was
> > > > > not already intuitive then a case against deciding on our own might
> > be
> > > > more
> > > > > convincing, but as things stand, simple dot notation is likely
> going
> > to
> > > > be
> > > > > easier for users to understand than JsonPath syntax.
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Chris
> > > > >
> > > > > On Fri, Apr 22, 2022 at 9:31 AM Joshua Grisham <
> > grishamj...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hello all!
> > > > > >
> > > > > > Sorry that I come a bit later to the party here, but I am the one
> > who
> > > > > wrote
> > > > > > KIP-683 [1] for recursive support (just simply looping through
> all
> > > > child
> > > > > > non-primitive structures for the same matching name(s)) which is
> a
> > > > > slightly
> > > > > > different way to try and solve a similar requirement --
> > unfortunately
> > > > at
> > > > > > the time the dev community was not quite as active and then I
> also
> > > got
> > > > > busy
> > > > > > with work and just life in general so wasn't able to follow up or
> > > push
> > > > > it.
> > > > > >
> > > > > > I do think it is a very good idea to have some kind of path-like
> > > > > expression
> > > > > > to be able to specifically address a nested field, as I can see
> > that
> > > > the
> > > > > > simple "recursive" case could potentially result in unwanted or
> > > > > unexpected
> > > > > > behavior, plus there is the potential to introduce a bit of a
> > > > performance
> > > > > > hit to always loop through everything in cases where the
> > > schemas/values
> > > > > > might be quite large.
> > > > > >
> > > > > > One thing I wanted to ask: instead of creating a new "path
> parser"
> > > > > > including some bespoke or "borrowed" syntax, why not just use
> > > something
> > > > > > that already exists? Specifically here I am thinking about
> > JsonPath (
> > > > > > https://github.com/json-path/JsonPath)
> > > > > >
> > > > > > There is already quite nice support in JsonPath for handling
> > special
> > > > > > characters in field names, for handling different non-primitive
> > types
> > > > > > (arrays etc), for handling multiple levels of nesting, etc etc.
> > > Would
> > > > it
> > > > > > be possible to instead to re-think this and maybe have some kind
> of
> > > > > > JsonPath-based Schema selector / updater and/or JsonPath-based
> > Value
> > > > > > selector / updater? Conceptually this feels like it makes sense
> to
> > > me,
> > > > as
> > > > > > from the top of my head it would be quite a natural fit to map a
> > Json
> > > > > data
> > > > > > structure to the Connect API data structure (and you could
> > > potentially
> > > > > even
> > > > > > try to leverage the existing Json-to-Connect
> > serializer/deserializer
> > > to
> > > > > > help out with this even in a more "out of the box"-feeling kind
> of
> > > > way).
> > > > > >
> > > > > > Maybe also as Tom mentioned, this part (in my example, this
> > > > > JsonPath-based
> > > > > > "thing") could even be a generic API that could be used by any
> SMT,
> > > > > > including used in custom ones built by the community.  Then I
> think
> > > to
> > > > > use
> > > > > > a completely separate config property somehow related to "path"
> (as
> > > Tom
> > > > > > also mentioned) would also make a lot of sense here as well. This
> > > way,
> > > > if
> > > > > > you select based on "path" then this JsonPath-based API would be
> > > used,
> > > > > > otherwise it could use something similar to the existing
> get-field
> > > > based
> > > > > > approach (which I guess could also be refactored into some kind
> of
> > > > > utility
> > > > > > / API as well if it made sense?)
> > > > > >
> > > > > > And with that in mind, if this was the kind of direction to go,
> > then
> > > a
> > > > > > "recursive" capability like I pitched in KIP-683 would also
> become
> > > > > > unnecessary because you could easily write a JsonPath expression
> > like
> > > > > > "$..someRecuriveField" and it would do the same thing (on top of
> > > > anything
> > > > > > else you would want to do that is already supported by JsonPath).
> > > Then
> > > > we
> > > > > > could also kill that older KIP and do a bit of clean-up :)
> > > > > >
> > > > > > [1] -
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-683%3A+Add+recursive+support+to+Connect+Cast+and+ReplaceField+transforms%2C+and+support+for+casting+complex+types+to+either+a+native+or+JSON+string
> > > > > >
> > > > > > Just some extra food for thought. All-in-all I think this is a
> > super
> > > > > great
> > > > > > initiative!
> > > > > >
> > > > > > Best,
> > > > > >
> > > > > > Joshua
> > > > > >
> > > > > >
> > > > > > Den fre 22 apr. 2022 kl 14:50 skrev Chris Egerton <
> > > > > fearthecel...@gmail.com
> > > > > > >:
> > > > > >
> > > > > > > Hi Tom,
> > > > > > >
> > > > > > > Thanks for taking a look at this, and for your thoughtful
> > comments.
> > > > > I'll
> > > > > > > leave it up to Jorge to address most of your comments but I
> > wanted
> > > to
> > > > > > share
> > > > > > > a couple quick thoughts I had regarding 103 and 104.
> > > > > > >
> > > > > > > 103. Like you, I was envisioning a possible syntax for array
> > access
> > > > > that
> > > > > > > used classic C-style brackets; e.g., `arr[index]`. However, I
> > > wonder
> > > > if
> > > > > > we
> > > > > > > could keep things simple and use the same syntax that we're
> > > proposing
> > > > > for
> > > > > > > nested field access? In other words, instead of `arr[index]`,
> > you'd
> > > > > write
> > > > > > > `arr.index`. It'd save us and users the headache of reserving
> > > > > characters
> > > > > > > now that would need to be escaped even if their unescaped
> > brethren
> > > > > aren't
> > > > > > > used for anything, and also avoid the question of what exactly
> we
> > > > > should
> > > > > > do
> > > > > > > when we see a config that uses reserved characters that aren't
> > yet
> > > > > > > supported (throwing an exception seems pretty unfriendly for
> new
> > > > > users).
> > > > > > >
> > > > > > > 104. This would probably be useful, but it would come with some
> > > nasty
> > > > > > > compatibility questions that would need to be addressed if we'd
> > > want
> > > > > SMTs
> > > > > > > that leverage this new API to be viable for older versions of
> > > > Connect.
> > > > > If
> > > > > > > we package and distribute this feature as a library (either via
> > an
> > > > > > entirely
> > > > > > > new artifact, or as part of the existing connect-transforms or
> > > > > > connect-api
> > > > > > > artifacts), then we'd have to either sidestep the existing
> plugin
> > > > > > isolation
> > > > > > > logic [1] that basically makes it impossible for Connect
> plugins
> > to
> > > > > ship
> > > > > > > their own versions of Connect artifacts, or issue a big warning
> > to
> > > > > people
> > > > > > > that any SMT that uses this API won't work with any older
> > versions
> > > of
> > > > > > > Connect. There's also some other features we might want to add
> in
> > > an
> > > > > > > SMT-utils library such as the existing, internal, utils that
> > > Connect
> > > > > uses
> > > > > > > right now [2]. It may be worth exploring this in a separate KIP
> > of
> > > > its
> > > > > > own.
> > > > > > >
> > > > > > > [1] -
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/blob/d480c4aa6e513e36050d8e067931de2270525d18/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/isolation/PluginUtils.java#L46-L143
> > > > > > >
> > > > > > > [2] -
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/tree/d480c4aa6e513e36050d8e067931de2270525d18/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/util
> > > > > > >
> > > > > > > Cheers,
> > > > > > >
> > > > > > > Chris
> > > > > > >
> > > > > > > On Fri, Apr 22, 2022 at 6:55 AM Tom Bentley <
> tbent...@redhat.com
> > >
> > > > > wrote:
> > > > > > >
> > > > > > > > Hi Jorge,
> > > > > > > >
> > > > > > > > Thanks for the KIP, especially for the examples which are
> > > > > super-clear.
> > > > > > > >
> > > > > > > > 100. The name `field.style` isn't so clear for something like
> > > > > > > ReplaceField:
> > > > > > > > it's not so obvious that field.style applies to `include` and
> > > > > > `exclude`.
> > > > > > > >
> > > > > > > > 101. The permitted values for `field.style` don't seem
> terribly
> > > > > > intuitive
> > > > > > > > (to me anyway): the meaning of `plain` isn't very guessable.
> > Why
> > > > not
> > > > > > > > `top-level` or `root` instead? Also `nested` could be
> > > misconstrued
> > > > to
> > > > > > > mean
> > > > > > > > nested-but-not-top-level, so perhaps `recursive` or
> `cascading`
> > > > might
> > > > > > be
> > > > > > > > better?
> > > > > > > >
> > > > > > > > 102. I'm torn on whether making the interpretation of
> existing
> > > > > configs
> > > > > > > like
> > > > > > > > `field` be dependent on `field.style` is a good idea. I can
> see
> > > > that
> > > > > > it's
> > > > > > > > the simplest thing to do, but it just feels a bit odd that
> > > > sometimes
> > > > > > the
> > > > > > > > `field` would actually be a path and have different escaping
> > > rules.
> > > > > An
> > > > > > > > alternative would be to come up with a parallel set of config
> > > names
> > > > > > (e.g.
> > > > > > > > as well as "field" an SMT might support "path") which were
> > > defined
> > > > to
> > > > > > > > always take paths, thus avoiding the changeable
> interpretation
> > of
> > > > the
> > > > > > > > existing configs. The SMT's #configure() would need to throw
> in
> > > the
> > > > > > case
> > > > > > > > that both configs were given. I can see that that would be
> more
> > > > work
> > > > > in
> > > > > > > > implementation, but it feels cleaner.
> > > > > > > >
> > > > > > > > 103. I think in order to allow for supporting arrays in a
> later
> > > KIP
> > > > > > > (which
> > > > > > > > certainly seems like it could be useful), we'd want to
> specify
> > > the
> > > > > > syntax
> > > > > > > > now, even if it wasn't implemented under this KIP. That's
> > > because I
> > > > > > don't
> > > > > > > > think you can't exclude the possibility that characters such
> as
> > > `[`
> > > > > and
> > > > > > > `]`
> > > > > > > > appear in field names. So you'd have a compatibility problem
> if
> > > > > people
> > > > > > > > started using the features of this KIP to access such fields,
> > > only
> > > > > for
> > > > > > > > those characters to change their meaning under a later KIP.
> > > > > > > >
> > > > > > > > 104. I also wonder whether making paths into a public Java
> API,
> > > for
> > > > > use
> > > > > > > by
> > > > > > > > 3rd party SMTs, would be valuable.
> > > > > > > >
> > > > > > > > Thanks again,
> > > > > > > >
> > > > > > > > Tom
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, 20 Apr 2022 at 17:53, Chris Egerton <
> > > > fearthecel...@gmail.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > 💯 Thanks Jorge, LGTM!
> > > > > > > > >
> > > > > > > > > On Wed, Apr 20, 2022, 12:40 Jorge Esteban Quilcate Otoya <
> > > > > > > > > quilcate.jo...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Thank you, Chris! Not possible without your feedback.
> > > > > > > > > >
> > > > > > > > > > On Tue, 19 Apr 2022 at 23:04, Chris Egerton <
> > > > > > fearthecel...@gmail.com
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Jorge,
> > > > > > > > > > >
> > > > > > > > > > > Thank you for sticking through this. I have one small
> > > remark
> > > > > and
> > > > > > > one
> > > > > > > > > > small
> > > > > > > > > > > clarification; assuming you agree with me on them then
> > I'm
> > > > > ready
> > > > > > to
> > > > > > > > > vote
> > > > > > > > > > on
> > > > > > > > > > > the KIP.
> > > > > > > > > > >
> > > > > > > > > > > 1. InsertField: The "field.on.missing.parent" and
> > > > > > > > > > "field.on.existing.field"
> > > > > > > > > > > docs both mention a permitted value of "ingore"; this
> > > should
> > > > be
> > > > > > > > > "ignore",
> > > > > > > > > > > right?
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Of course, one more typo :)
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > 2. InsertField: The examples are still missing the
> > > > > "field.style"
> > > > > > > > > property
> > > > > > > > > > > from the configurations. They should all include the
> > > property
> > > > > > > > > > > "transforms.smt1.field.style": "nested", correct?
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Yes, it is there. I think I know what you mean now, seems
> > > that
> > > > > > > > Confluence
> > > > > > > > > > is putting everything in one line when it's in separate
> > lines
> > > > in
> > > > > > the
> > > > > > > > > > editor.
> > > > > > > > > > Hopefully, it's fixed now.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Thanks again for working through this, and
> > congratulations
> > > > on a
> > > > > > > > > > > well-written KIP!
> > > > > > > > > > >
> > > > > > > > > > > Cheers,
> > > > > > > > > > >
> > > > > > > > > > > Chris
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Apr 19, 2022 at 2:06 PM Jorge Esteban Quilcate
> > > Otoya
> > > > <
> > > > > > > > > > > quilcate.jo...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Thank you, Chris! I apply these improvements to the
> > KIP,
> > > > let
> > > > > me
> > > > > > > > know
> > > > > > > > > > how
> > > > > > > > > > > it
> > > > > > > > > > > > looks now.
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, 11 Apr 2022 at 23:43, Chris Egerton <
> > > > > > > > fearthecel...@gmail.com
> > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Jorge,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Wow, those examples are great! A few more remarks,
> > but
> > > I
> > > > > > think
> > > > > > > > > we're
> > > > > > > > > > > > > getting close:
> > > > > > > > > > > > >
> > > > > > > > > > > > > 1. The examples differ across SMTs with the name of
> > the
> > > > > > > > > > > newly-introduced
> > > > > > > > > > > > > style property; some of them use "field.style", and
> > > some
> > > > > use
> > > > > > > > > > > > > "fields.style". I think for consistency's sake we
> > > should
> > > > > > stick
> > > > > > > > with
> > > > > > > > > > > just
> > > > > > > > > > > > > "field.style"; otherwise it could be painful for
> > users
> > > to
> > > > > > have
> > > > > > > to
> > > > > > > > > > > > remember
> > > > > > > > > > > > > which to use.
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Great catch. Agree, I fixed the config names to
> > > > > `field.style`.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2. Some of the examples are off:
> > > > > > > > > > > > > - TimestampConverter: the input in the second
> example
> > > > > ("when
> > > > > > > > field
> > > > > > > > > > > names
> > > > > > > > > > > > > include dots") doesn't contain a field with a
> dotted
> > > name
> > > > > > > > > > > > > - ValueToKey: the config in the third example
> ("when
> > > > field
> > > > > > > names
> > > > > > > > > > > include
> > > > > > > > > > > > > dots") should probably use "parent..child.k2" as
> the
> > > > > > > > > > > > > "transforms.smt1.fields" property
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Fixed. Thanks!
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > 3. RE changes to InsertField:
> > > > > > > > > > > > > - The InsertField SMT should also come with the new
> > > > > > > "field.style"
> > > > > > > > > > > > property
> > > > > > > > > > > > > in order to preserve backwards compatibility,
> right?
> > I
> > > > > don't
> > > > > > > see
> > > > > > > > it
> > > > > > > > > > > > > included in the example configs for that one, just
> > want
> > > > to
> > > > > > make
> > > > > > > > > sure
> > > > > > > > > > > > > - I don't know of any cases where we use snake_case
> > for
> > > > > > > property
> > > > > > > > > > names
> > > > > > > > > > > in
> > > > > > > > > > > > > Kafka; we should probably use "on.missing.parent"
> and
> > > > > > > > > > > "on.existing.field"
> > > > > > > > > > > > > as the new property names for InsertField.
> > > > > > > > > > > > > - Why is the "on_existing_field" (or
> > > "on.existing.field")
> > > > > > > > property
> > > > > > > > > > only
> > > > > > > > > > > > > applied when the field style is nested? Couldn't it
> > be
> > > > > useful
> > > > > > > for
> > > > > > > > > > > > > non-nested fields as well?
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Great points! I have applied these suggestions to the
> > > KIP.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Chris
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Sat, Apr 9, 2022 at 12:40 PM Jorge Esteban
> > Quilcate
> > > > > Otoya
> > > > > > <
> > > > > > > > > > > > > quilcate.jo...@gmail.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Again, great feedback Chris. Much appreciated.
> > > > > > > > > > > > > > Added my comments below:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Tue, 5 Apr 2022 at 20:22, Chris Egerton <
> > > > > > > > > > fearthecel...@gmail.com>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi Jorge,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Looking good! I have a few comments left but
> all
> > > but
> > > > > one
> > > > > > or
> > > > > > > > two
> > > > > > > > > > are
> > > > > > > > > > > > > > minor.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 1. The motivation section states "This KIP is
> > aimed
> > > > to
> > > > > > > > include
> > > > > > > > > > > > support
> > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > nested structures on the existing SMTs... and
> to
> > > > > include
> > > > > > > the
> > > > > > > > > > > > > abstractions
> > > > > > > > > > > > > > > to reuse this in future SMTs". A good
> > > implementation
> > > > of
> > > > > > > this
> > > > > > > > > KIP
> > > > > > > > > > > will
> > > > > > > > > > > > > > > definitely isolate reusable logic into a
> separate
> > > > > > > abstraction
> > > > > > > > > > that
> > > > > > > > > > > > can
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > > easily pulled in to the SMTs we want to add
> > nested
> > > > > field
> > > > > > > > > support
> > > > > > > > > > > to,
> > > > > > > > > > > > > but
> > > > > > > > > > > > > > > unless we plan on making this kind of
> abstraction
> > > > > > publicly
> > > > > > > > > > > available
> > > > > > > > > > > > as
> > > > > > > > > > > > > > > some kind of utility method or class that
> > external
> > > > SMT
> > > > > > > > > developers
> > > > > > > > > > > can
> > > > > > > > > > > > > > > leverage, we can probably leave this part out
> as
> > > it's
> > > > > > more
> > > > > > > of
> > > > > > > > > an
> > > > > > > > > > > > > > > implementation detail.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Make sense, will leave this out of the KIP.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2. The Cast example is a little misleading,
> isn't
> > > it?
> > > > > It
> > > > > > > > > > > demonstrates
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > escape syntax for fields with dot literals in
> > their
> > > > > > names,
> > > > > > > > but
> > > > > > > > > it
> > > > > > > > > > > > > doesn't
> > > > > > > > > > > > > > > demonstrate a way to actually use the Cast (or
> > any
> > > > > other)
> > > > > > > SMT
> > > > > > > > > to
> > > > > > > > > > > > > access a
> > > > > > > > > > > > > > > nested field in a record, which is the whole
> > point
> > > of
> > > > > the
> > > > > > > > KIP.
> > > > > > > > > I
> > > > > > > > > > > like
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > example of escape syntax but we should probably
> > > also
> > > > > add
> > > > > > > one
> > > > > > > > > for
> > > > > > > > > > > > nested
> > > > > > > > > > > > > > > field access.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Agree. I have added examples to each SMT to be
> more
> > > > clear
> > > > > > > about
> > > > > > > > > how
> > > > > > > > > > > it
> > > > > > > > > > > > > > affects each function
> > > > > > > > > > > > > > .
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 3. With the InsertField SMT, I'm wondering what
> > the
> > > > > > > specific
> > > > > > > > > > > behavior
> > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > be when some or all of the "middle layer"
> nested
> > > > fields
> > > > > > are
> > > > > > > > > > > missing.
> > > > > > > > > > > > > For
> > > > > > > > > > > > > > > example, if I have a record with a value of {
> > "k1":
> > > > > "v1 }
> > > > > > > > and I
> > > > > > > > > > > apply
> > > > > > > > > > > > > > > InsertField with topic.field = n1.n2.n3.topic,
> > what
> > > > > will
> > > > > > > > > happen?
> > > > > > > > > > > Will
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > updated value become { "k1": "v1", "n1": {
> "n2":
> > {
> > > > > "n3":
> > > > > > > > > "topic"
> > > > > > > > > > > }}},
> > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > an exception be thrown, or something else? This
> > > seems
> > > > > > > > analogous
> > > > > > > > > > to
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > command line mkdir command, which (at least on
> > some
> > > > > > > operating
> > > > > > > > > > > > systems)
> > > > > > > > > > > > > > > fails by default if you try to create a new
> > nested
> > > > > > > directory
> > > > > > > > > > where
> > > > > > > > > > > > > > anything
> > > > > > > > > > > > > > > but the last element in the path doesn't exist,
> > but
> > > > can
> > > > > > be
> > > > > > > > > > invoked
> > > > > > > > > > > > > with a
> > > > > > > > > > > > > > > flag that instructs it to go ahead and create
> all
> > > > > levels
> > > > > > of
> > > > > > > > > > nested
> > > > > > > > > > > > > > > directory instead. I'm leaning on the side of
> > "just
> > > > > > create
> > > > > > > > > > > > everything"
> > > > > > > > > > > > > > but
> > > > > > > > > > > > > > > would be interested in your thoughts, and
> either
> > > way,
> > > > > we
> > > > > > > > should
> > > > > > > > > > > > > probably
> > > > > > > > > > > > > > > make sure the intended behavior is well-defined
> > > > before
> > > > > > > > voting.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > This is an interesting case, thanks for catching
> > > this!
> > > > > > > > > > > > > > The default behavior I see is to create parents
> if
> > > they
> > > > > are
> > > > > > > > > missing
> > > > > > > > > > > and
> > > > > > > > > > > > > > overwrite fields
> > > > > > > > > > > > > > if they already exist.
> > > > > > > > > > > > > > I'm planning to include the following two flags
> if
> > > > there
> > > > > > is a
> > > > > > > > > need
> > > > > > > > > > to
> > > > > > > > > > > > > > overwrite this behavior:
> > > > > > > > > > > > > > - `on_missing_parent` = (CREATE|IGNORE),
> > > default=CREATE
> > > > > > > > > > > > > > - `on_existing_field` = (OVERWRITE|IGNORE),
> > > > > > default=OVERWRITE
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 4. Similarly, what will the behavior be if any
> of
> > > the
> > > > > > field
> > > > > > > > > > > elements
> > > > > > > > > > > > > > > specified with InsertField already exist in the
> > > > record
> > > > > > > value?
> > > > > > > > > > Will
> > > > > > > > > > > we
> > > > > > > > > > > > > > just
> > > > > > > > > > > > > > > overwrite them? What's the behavior of
> > InsertField
> > > > > today
> > > > > > > > under
> > > > > > > > > > > > similar
> > > > > > > > > > > > > > > circumstances?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The current behavior is to overwrite the value.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Chris
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Thu, Mar 31, 2022 at 4:15 PM Jorge Esteban
> > > > Quilcate
> > > > > > > Otoya
> > > > > > > > <
> > > > > > > > > > > > > > > quilcate.jo...@gmail.com> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks, Chris! Much appreciated all the
> > feedback
> > > > > here.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 1. You nailed it setting the design goal
> here:
> > > "it
> > > > > > > > shouldn't
> > > > > > > > > be
> > > > > > > > > > > > > > > impossible
> > > > > > > > > > > > > > > > to use this new feature for any field name,
> no
> > > > matter
> > > > > > how
> > > > > > > > > > > > convoluted.
> > > > > > > > > > > > > > > It's
> > > > > > > > > > > > > > > > fine if edge cases introduce difficulty (such
> > as
> > > > > > > > > less-readable
> > > > > > > > > > > > > > > > configurations), but it's not fine if they
> > can't
> > > be
> > > > > > > > addressed
> > > > > > > > > > at
> > > > > > > > > > > > > all."
> > > > > > > > > > > > > > > > Back to the previous proposals (using only
> dots
> > > as
> > > > > > > > > separators)
> > > > > > > > > > we
> > > > > > > > > > > > > have
> > > > > > > > > > > > > > 2
> > > > > > > > > > > > > > > > alternatives:
> > > > > > > > > > > > > > > > 1. escaping with backslashes
> > > > > > > > > > > > > > > > 2. escaping with dots itself
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I'll lean for alternative 2, as you proposed
> > > > before.
> > > > > > > Feels
> > > > > > > > to
> > > > > > > > > > me
> > > > > > > > > > > > that
> > > > > > > > > > > > > > > > backslashes have more potential to lead to
> > > > confusion
> > > > > in
> > > > > > > > JSON
> > > > > > > > > > > > configs,
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > CSV seems like a good precedent to use the
> same
> > > > > > character
> > > > > > > > to
> > > > > > > > > > > escape
> > > > > > > > > > > > > > > itself.
> > > > > > > > > > > > > > > > KIP is updated to reflect this.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 2. Thanks! I'll add an example, and stick
> with
> > > the
> > > > > > > current
> > > > > > > > > > > approach
> > > > > > > > > > > > > > > > defining the style per individual transform
> > > > > > > configuration.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 3. Yes, thanks! KIP updated.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 4. Of course. KIP updated.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Mon, 28 Mar 2022 at 21:59, Chris Egerton <
> > > > > > > > > > > > fearthecel...@gmail.com
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hi Jorge,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks for addressing my comments; the KIP
> > > looks
> > > > > > > > up-to-date
> > > > > > > > > > and
> > > > > > > > > > > > > > pretty
> > > > > > > > > > > > > > > > > readable now, and the rejected alternatives
> > > > section
> > > > > > > does
> > > > > > > > a
> > > > > > > > > > > great
> > > > > > > > > > > > > job
> > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > outlining the discussion so far and
> providing
> > > > > context
> > > > > > > for
> > > > > > > > > > > anyone
> > > > > > > > > > > > > else
> > > > > > > > > > > > > > > who
> > > > > > > > > > > > > > > > > might want to join in.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 1. Thoughts on choice of delimiter:
> > > > > > > > > > > > > > > > > - I like the optimization for simple cases,
> > > but I
> > > > > > think
> > > > > > > > the
> > > > > > > > > > new
> > > > > > > > > > > > > > > proposal
> > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > a little too restrictive. What if there's a
> > > field
> > > > > > whose
> > > > > > > > > name
> > > > > > > > > > > > > contains
> > > > > > > > > > > > > > > all
> > > > > > > > > > > > > > > > > of the permitted options (currently just
> ".",
> > > > ",",
> > > > > > and
> > > > > > > > > "/")?
> > > > > > > > > > > > > > > > > - If we expand the set of permitted
> > delimiters
> > > to
> > > > > > allow
> > > > > > > > for
> > > > > > > > > > any
> > > > > > > > > > > > > > > > > single-character string, configuration
> > > complexity
> > > > > > will
> > > > > > > > > > increase
> > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > readability may decrease
> > > > > > > > > > > > > > > > > - Also worth pointing out that there is
> some
> > > > > > convention
> > > > > > > > for
> > > > > > > > > > > > > doubling
> > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > delimiter character as an escape mechanism
> > with
> > > > > > formats
> > > > > > > > > like
> > > > > > > > > > > CSV
> > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > - Overall I think we may be approaching the
> > > > > > saturation
> > > > > > > > > point
> > > > > > > > > > > for
> > > > > > > > > > > > > > > > productive
> > > > > > > > > > > > > > > > > discussion on delimiter syntax so I don't
> > want
> > > to
> > > > > > spend
> > > > > > > > too
> > > > > > > > > > > much
> > > > > > > > > > > > > more
> > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > your time on it. I think the one point I'd
> > like
> > > > to
> > > > > > > leave
> > > > > > > > > for
> > > > > > > > > > > now
> > > > > > > > > > > > is
> > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > shouldn't be impossible to use this new
> > feature
> > > > for
> > > > > > any
> > > > > > > > > field
> > > > > > > > > > > > name,
> > > > > > > > > > > > > > no
> > > > > > > > > > > > > > > > > matter how convoluted. It's fine if edge
> > cases
> > > > > > > introduce
> > > > > > > > > > > > difficulty
> > > > > > > > > > > > > > > (such
> > > > > > > > > > > > > > > > > as less-readable configurations), but it's
> > not
> > > > fine
> > > > > > if
> > > > > > > > they
> > > > > > > > > > > can't
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > addressed at all.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 2.
> > > > > > > > > > > > > > > > > The configuration style where you define
> > > > > > > > > > > "transforms.field.style"
> > > > > > > > > > > > > in
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > connector config, and then this applies to
> > all
> > > > SMTs
> > > > > > for
> > > > > > > > the
> > > > > > > > > > > > > > connector,
> > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > very interesting. However, it doesn't
> follow
> > > > > > convention
> > > > > > > > for
> > > > > > > > > > > > > existing
> > > > > > > > > > > > > > > > SMTs.
> > > > > > > > > > > > > > > > > Right now, if you want to configure an SMT,
> > you
> > > > > > define
> > > > > > > > its
> > > > > > > > > > name
> > > > > > > > > > > > in
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > connector config (for example,
> "transforms":
> > > > > "smt1"),
> > > > > > > and
> > > > > > > > > > then
> > > > > > > > > > > > > define
> > > > > > > > > > > > > > > all
> > > > > > > > > > > > > > > > > of the properties for that SMT in the
> > connector
> > > > > > config
> > > > > > > > > using
> > > > > > > > > > a
> > > > > > > > > > > > > > > > namespacing
> > > > > > > > > > > > > > > > > mechanism specific to that SMT (for
> example,
> > > > > > > > > > > > > "transforms.smt1.prop1":
> > > > > > > > > > > > > > > > > "val1"). That SMT then sees only the
> > properties
> > > > > > defined
> > > > > > > > in
> > > > > > > > > > that
> > > > > > > > > > > > > > > > namespace,
> > > > > > > > > > > > > > > > > with the prefix stripped (for example,
> > "prop1":
> > > > > > "val1")
> > > > > > > > in
> > > > > > > > > > its
> > > > > > > > > > > > > > > configure
> > > > > > > > > > > > > > > > > [2] [3] method.
> > > > > > > > > > > > > > > > > If we want to continue to follow this
> > > convention,
> > > > > > then
> > > > > > > > > > instead
> > > > > > > > > > > of
> > > > > > > > > > > > > > > > > specifying "transforms.field.style" in a
> > > > connector
> > > > > > > > config,
> > > > > > > > > we
> > > > > > > > > > > > would
> > > > > > > > > > > > > > > > expect
> > > > > > > > > > > > > > > > > users to configure
> > > > "transforms.<name>.field.style",
> > > > > > for
> > > > > > > > > each
> > > > > > > > > > > SMT
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > they
> > > > > > > > > > > > > > > > > want to configure a field style for. This
> > would
> > > > > > require
> > > > > > > > > more
> > > > > > > > > > > work
> > > > > > > > > > > > > on
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > part of the user, but would be simpler to
> > > reason
> > > > > > about
> > > > > > > > and
> > > > > > > > > > > easier
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > implement.
> > > > > > > > > > > > > > > > > If we want to explore an alternative where
> > > users
> > > > > can
> > > > > > > > > specify
> > > > > > > > > > > > global
> > > > > > > > > > > > > > > > > properties that apply to all transforms in
> a
> > > > > > connector
> > > > > > > > > > config,
> > > > > > > > > > > > then
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > semantics for this need to be defined in
> the
> > > KIP.
> > > > > > This
> > > > > > > > > would
> > > > > > > > > > > have
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > include whether this will apply only for
> the
> > > > > special
> > > > > > > case
> > > > > > > > > of
> > > > > > > > > > > the
> > > > > > > > > > > > > > > > > "field.style" and possibly
> "field.separator"
> > > > > > properties
> > > > > > > > or
> > > > > > > > > if
> > > > > > > > > > > it
> > > > > > > > > > > > > > would
> > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > available more generally for other
> > properties,
> > > > > > whether
> > > > > > > it
> > > > > > > > > > will
> > > > > > > > > > > > > apply
> > > > > > > > > > > > > > > only
> > > > > > > > > > > > > > > > > for the SMTs outlined in the KIP or if the
> > > > > > > "field.style"
> > > > > > > > > and
> > > > > > > > > > > > > possibly
> > > > > > > > > > > > > > > > > "field.separator" properties would also be
> > > passed
> > > > > > into
> > > > > > > > > custom
> > > > > > > > > > > > SMTs
> > > > > > > > > > > > > so
> > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > they could choose to act on them if
> > applicable,
> > > > how
> > > > > > > edge
> > > > > > > > > > cases
> > > > > > > > > > > > like
> > > > > > > > > > > > > > > > having
> > > > > > > > > > > > > > > > > an SMT named "field" in your connector
> config
> > > > would
> > > > > > be
> > > > > > > > > > handled,
> > > > > > > > > > > > > etc.
> > > > > > > > > > > > > > > > > Either way, it might help to have an
> example
> > in
> > > > the
> > > > > > KIP
> > > > > > > > > > > outlining
> > > > > > > > > > > > > how
> > > > > > > > > > > > > > > one
> > > > > > > > > > > > > > > > > of the to-be-augmented SMTs can be
> configured
> > > > with
> > > > > > this
> > > > > > > > new
> > > > > > > > > > > > feature
> > > > > > > > > > > > > > > and a
> > > > > > > > > > > > > > > > > before/after of how a record value would be
> > > > > > transformed
> > > > > > > > > with
> > > > > > > > > > > that
> > > > > > > > > > > > > > > > > configuration.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 3. The docstring for the
> > > "transforms.field.style"
> > > > > > > > property
> > > > > > > > > > > > mentions
> > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > the permitted values are "plain" and
> > "nested",
> > > > but
> > > > > > then
> > > > > > > > > > > describes
> > > > > > > > > > > > > > > > behavior
> > > > > > > > > > > > > > > > > with a value of "root". Should that be
> > "plain"
> > > > > > instead?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 4. The docstring for the
> > > > > "transforms.field.separator"
> > > > > > > > > > property
> > > > > > > > > > > > > > > > exclusively
> > > > > > > > > > > > > > > > > mentions structs, but the feature is
> intended
> > > to
> > > > > work
> > > > > > > > with
> > > > > > > > > > maps
> > > > > > > > > > > > as
> > > > > > > > > > > > > > > well.
> > > > > > > > > > > > > > > > > Can we update it to reflect this?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > References:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > [1] - https://stackoverflow.com/a/17808731
> > > > > > > > > > > > > > > > > [2] -
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/blob/7243facb8d69a7252e6b9556b5eaee13e41bab7f/connect/api/src/main/java/org/apache/kafka/connect/transforms/Transformation.java#L30
> > > > > > > > > > > > > > > > > [3] -
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/blob/7243facb8d69a7252e6b9556b5eaee13e41bab7f/clients/src/main/java/org/apache/kafka/common/Configurable.java#L26-L29
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Chris
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Mon, Mar 28, 2022 at 1:32 PM Jorge
> Esteban
> > > > > > Quilcate
> > > > > > > > > Otoya
> > > > > > > > > > <
> > > > > > > > > > > > > > > > > quilcate.jo...@gmail.com> wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks, Chris!
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 1. I'd argue "this..field.child" could be
> > > > harder
> > > > > to
> > > > > > > > grasp
> > > > > > > > > > > than
> > > > > > > > > > > > > > > > > > "this.field/child" + separator: "/".
> > > > > > > > > > > > > > > > > > Even though this represents additional
> > > > > information,
> > > > > > > it
> > > > > > > > > > > follows
> > > > > > > > > > > > a
> > > > > > > > > > > > > > > > similar
> > > > > > > > > > > > > > > > > > approach as the "Flatten#delimeter"
> > > > > configuration.
> > > > > > > > > > > > > > > > > > I want to give the separator approach
> > another
> > > > > try,
> > > > > > > so I
> > > > > > > > > > have
> > > > > > > > > > > > > > updated
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > KIP with the separator proposal, sticking
> > to
> > > > > only 2
> > > > > > > > > > > > alternatives
> > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > should hopefully cover most scenarios.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 2. Agree. KIP has been updated with this
> > > > > > improvement.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 3. You're right. I have updated this
> > section
> > > > > > > > accordingly.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 4. Good catch! I've replaced it with
> > > > > `DropHeaders`.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Looking forward to your feedback.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > Jorge.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Wed, 9 Mar 2022 at 21:33, Chris
> Egerton
> > <
> > > > > > > > > > > > > > fearthecel...@gmail.com>
> > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Hi Jorge,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Looking good! Got a few more thoughts.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 1. Sorry to revisit this, but I think
> we
> > > may
> > > > > want
> > > > > > > to
> > > > > > > > > > adopt
> > > > > > > > > > > a
> > > > > > > > > > > > > > > slightly
> > > > > > > > > > > > > > > > > > > different escape syntax style.
> > Backslashes
> > > > are
> > > > > > > great,
> > > > > > > > > but
> > > > > > > > > > > > since
> > > > > > > > > > > > > > > > they're
> > > > > > > > > > > > > > > > > > > already used by JSON, using them as an
> > > escape
> > > > > > > > sequence
> > > > > > > > > in
> > > > > > > > > > > > field
> > > > > > > > > > > > > > > > > notation
> > > > > > > > > > > > > > > > > > > would also lead to some pretty ugly
> > > connector
> > > > > > > > configs.
> > > > > > > > > > > Anyone
> > > > > > > > > > > > > > who's
> > > > > > > > > > > > > > > > had
> > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > write regular expressions with
> > backslashes
> > > in
> > > > > > Java
> > > > > > > is
> > > > > > > > > > > > probably
> > > > > > > > > > > > > > > > already
> > > > > > > > > > > > > > > > > > > familiar with this:
> > > > > > > > > > > > > "this\\\\.is\\\\.not\\\\.very\\\\.readable".
> > > > > > > > > > > > > > > What
> > > > > > > > > > > > > > > > > do
> > > > > > > > > > > > > > > > > > > you think about using the dot character
> > to
> > > > > escape
> > > > > > > > > itself?
> > > > > > > > > > > In
> > > > > > > > > > > > > > other
> > > > > > > > > > > > > > > > > words,
> > > > > > > > > > > > > > > > > > > to access a single field named
> > > "this.field",
> > > > > > > instead
> > > > > > > > of
> > > > > > > > > > > using
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > syntax
> > > > > > > > > > > > > > > > > > > "this\.field" (which in JSON would have
> > to
> > > be
> > > > > > > > expressed
> > > > > > > > > > as
> > > > > > > > > > > > > > > > > > "this\\.field"),
> > > > > > > > > > > > > > > > > > > we could use "this..field", and for a
> > > single
> > > > > > field
> > > > > > > > > named
> > > > > > > > > > > > > > > > "this\field",
> > > > > > > > > > > > > > > > > > > instead of using the syntax
> "this\\field"
> > > > (or,
> > > > > in
> > > > > > > > JSON,
> > > > > > > > > > > > > > > > > "this\\\\field"),
> > > > > > > > > > > > > > > > > > > we could use "this\field" (or, in JSON,
> > > > > > > > "this\\field").
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 2. Could you flesh out the details on
> the
> > > new
> > > > > > > > > > "field.style"
> > > > > > > > > > > > > > > property,
> > > > > > > > > > > > > > > > > > > including the type, default value,
> > > > importance,
> > > > > > and
> > > > > > > a
> > > > > > > > > > > > > preliminary
> > > > > > > > > > > > > > > > > > docstring?
> > > > > > > > > > > > > > > > > > > See
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-618%3A+Exactly-Once+Support+for+Source+Connectors#KIP618:ExactlyOnceSupportforSourceConnectors-Newproperties
> > > > > > > > > > > > > > > > > > > for an example.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 3. Is the "Compatibility, Deprecation,
> > and
> > > > > > > Migration
> > > > > > > > > > Plan"
> > > > > > > > > > > > > > section
> > > > > > > > > > > > > > > > > still
> > > > > > > > > > > > > > > > > > > accurate after the latest update? Seems
> > > like
> > > > > it's
> > > > > > > > still
> > > > > > > > > > > > written
> > > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > assumption that nested field syntax
> will
> > be
> > > > > > > hardcoded
> > > > > > > > > or
> > > > > > > > > > > > > opt-in,
> > > > > > > > > > > > > > > > which
> > > > > > > > > > > > > > > > > > IIUC
> > > > > > > > > > > > > > > > > > > isn't the case anymore.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 4. Nit: The "These SMTs do not require
> > > nested
> > > > > > > > structure
> > > > > > > > > > > > > support"
> > > > > > > > > > > > > > > > > section
> > > > > > > > > > > > > > > > > > > mentions a "Drop" SMT. I think this may
> > be
> > > > > > > referring
> > > > > > > > to
> > > > > > > > > > the
> > > > > > > > > > > > > > > Confluent
> > > > > > > > > > > > > > > > > > Drop
> > > > > > > > > > > > > > > > > > > SMT, which isn't a part of Apache
> Kafka.
> > > > Should
> > > > > > we
> > > > > > > > drop
> > > > > > > > > > > (heh)
> > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > SMT
> > > > > > > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > > > > the list? Or perhaps just replace it
> with
> > > > > > > > > "DropHeaders",
> > > > > > > > > > > > which
> > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > currently
> > > > > > > > > > > > > > > > > > > missing from the list and shouldn't
> > require
> > > > any
> > > > > > > > > > > nested-field
> > > > > > > > > > > > > > > related
> > > > > > > > > > > > > > > > > > > updates?
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Chris
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Mon, Feb 28, 2022 at 2:12 PM Jorge
> > > Esteban
> > > > > > > > Quilcate
> > > > > > > > > > > Otoya
> > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > quilcate.jo...@gmail.com> wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Thank you, Chris! and sorry for the
> > > delayed
> > > > > > > > response.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Please, find my comments below:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Mon, 14 Feb 2022 at 17:34, Chris
> > > Egerton
> > > > > > > > > > > > > > > > > > <chr...@confluent.io.invalid
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Hi Jorge,
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Thanks for the KIP! I'd love to see
> > > > support
> > > > > > for
> > > > > > > > > > nested
> > > > > > > > > > > > > fields
> > > > > > > > > > > > > > > > added
> > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > out-of-the-box SMTs provided with
> > > > Connect.
> > > > > > Here
> > > > > > > > are
> > > > > > > > > > my
> > > > > > > > > > > > > > initial
> > > > > > > > > > > > > > > > > > > thoughts:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 1. I agree that there's a case to
> be
> > > made
> > > > > for
> > > > > > > > > > expanding
> > > > > > > > > > > > > > > > HoistField
> > > > > > > > > > > > > > > > > > > with a
> > > > > > > > > > > > > > > > > > > > > new config property for
> identifying a
> > > > > nested,
> > > > > > > > > > > > to-be-hoisted
> > > > > > > > > > > > > > > > field,
> > > > > > > > > > > > > > > > > > but
> > > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > example in the KIP doesn't really
> > > > > demonstrate
> > > > > > > why
> > > > > > > > > > this
> > > > > > > > > > > > > would
> > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > valuable. I
> > > > > > > > > > > > > > > > > > > > > think it'd be helpful to expand the
> > > > example
> > > > > > to
> > > > > > > > add
> > > > > > > > > > > other
> > > > > > > > > > > > > > fields
> > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > order
> > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > show how adding nested field
> support
> > > > > enables
> > > > > > > > users
> > > > > > > > > to
> > > > > > > > > > > > > hoist a
> > > > > > > > > > > > > > > > > nested
> > > > > > > > > > > > > > > > > > > > field
> > > > > > > > > > > > > > > > > > > > > without dropping other fields from
> > the
> > > > > value.
> > > > > > > > Maybe
> > > > > > > > > > > > > something
> > > > > > > > > > > > > > > > like
> > > > > > > > > > > > > > > > > > > this:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >     source = nested.val
> > > > > > > > > > > > > > > > > > > > >     field = line
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >     value (before):
> > > > > > > > > > > > > > > > > > > > >         {
> > > > > > > > > > > > > > > > > > > > >             "nested": {
> > > > > > > > > > > > > > > > > > > > >                 "val": 42,
> > > > > > > > > > > > > > > > > > > > >                 "other val": 96
> > > > > > > > > > > > > > > > > > > > >             }
> > > > > > > > > > > > > > > > > > > > >         }
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >     value (after):
> > > > > > > > > > > > > > > > > > > > >         {
> > > > > > > > > > > > > > > > > > > > >             "nested": {
> > > > > > > > > > > > > > > > > > > > >                 "line": {
> > > > > > > > > > > > > > > > > > > > >                     "val": 42,
> > > > > > > > > > > > > > > > > > > > >                 }
> > > > > > > > > > > > > > > > > > > > >                 "other val": 96
> > > > > > > > > > > > > > > > > > > > >             }
> > > > > > > > > > > > > > > > > > > > >         }
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 2. Nit: I think "source" is a
> little
> > > > > strange
> > > > > > > for
> > > > > > > > > the
> > > > > > > > > > > new
> > > > > > > > > > > > > > > > HoistField
> > > > > > > > > > > > > > > > > > > > > property name. Maybe "hoisted" or
> > > > > > > "hoisted.field"
> > > > > > > > > > would
> > > > > > > > > > > > be
> > > > > > > > > > > > > > more
> > > > > > > > > > > > > > > > > > > > > descriptive?
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > About 1. and 2.:
> > > > > > > > > > > > > > > > > > > > Agree. The example for this SMT is
> > > updated
> > > > > and
> > > > > > > have
> > > > > > > > > > added
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > `hoisted`
> > > > > > > > > > > > > > > > > > > > configuration.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 3. Is there a reasonable use case
> for
> > > > > > expanding
> > > > > > > > > > Flatten
> > > > > > > > > > > > to
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > able
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > flatten specific fields? My
> > > understanding
> > > > > is
> > > > > > > that
> > > > > > > > > > it's
> > > > > > > > > > > > > mostly
> > > > > > > > > > > > > > > > > useful
> > > > > > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > > > writing to systems like databases
> > that
> > > > > don't
> > > > > > > > > support
> > > > > > > > > > > > nested
> > > > > > > > > > > > > > > > values
> > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > > require everything to be a flat
> list
> > of
> > > > > > > key-value
> > > > > > > > > > > pairs.
> > > > > > > > > > > > > > Being
> > > > > > > > > > > > > > > > able
> > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > flatten a nested field wouldn't
> > provide
> > > > any
> > > > > > > > > advantage
> > > > > > > > > > > for
> > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > use
> > > > > > > > > > > > > > > > > > > case.
> > > > > > > > > > > > > > > > > > > > > Are there other cases where it
> would?
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 4. I don't think we should
> > > > unconditionally
> > > > > > > change
> > > > > > > > > the
> > > > > > > > > > > > > default
> > > > > > > > > > > > > > > > > > delimiter
> > > > > > > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > > > Flatten. It's a
> > backwards-incompatible,
> > > > > > > breaking
> > > > > > > > > > change
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > > > cause
> > > > > > > > > > > > > > > > > > > > > headaches for users. It might be
> > > > reasonable
> > > > > > to
> > > > > > > > > change
> > > > > > > > > > > the
> > > > > > > > > > > > > > > default
> > > > > > > > > > > > > > > > > > value
> > > > > > > > > > > > > > > > > > > > > dynamically based on whether the
> user
> > > has
> > > > > > > > > specified a
> > > > > > > > > > > > value
> > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > "field"
> > > > > > > > > > > > > > > > > > > > > property, but considering the
> > > motivation
> > > > > for
> > > > > > > > > changing
> > > > > > > > > > > the
> > > > > > > > > > > > > > > default
> > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > > > it creates conflicts with the
> > > > > > to-be-introduced
> > > > > > > > > nested
> > > > > > > > > > > > field
> > > > > > > > > > > > > > > > syntax
> > > > > > > > > > > > > > > > > > > (which
> > > > > > > > > > > > > > > > > > > > > could arise with downstream SMTs
> > > > regardless
> > > > > > of
> > > > > > > > > > whether
> > > > > > > > > > > > the
> > > > > > > > > > > > > > user
> > > > > > > > > > > > > > > > has
> > > > > > > > > > > > > > > > > > > > > explicitly configured Flatten with
> > the
> > > > > > "field"
> > > > > > > > > > > > property), I
> > > > > > > > > > > > > > > don't
> > > > > > > > > > > > > > > > > > know
> > > > > > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > > > this would be too useful either. I
> > have
> > > > > some
> > > > > > > > > thoughts
> > > > > > > > > > > > below
> > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > how
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > handle possible conflicts between
> > names
> > > > > with
> > > > > > > dots
> > > > > > > > > in
> > > > > > > > > > > > their
> > > > > > > > > > > > > > > names
> > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > dotted
> > > > > > > > > > > > > > > > > > > > > syntax for nested field references
> > that
> > > > > > should
> > > > > > > > > > > hopefully
> > > > > > > > > > > > > make
> > > > > > > > > > > > > > > > > either
> > > > > > > > > > > > > > > > > > > > change
> > > > > > > > > > > > > > > > > > > > > unnecessary.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Fair enough. With the support for
> > nested
> > > > > fields
> > > > > > > in
> > > > > > > > > > other
> > > > > > > > > > > > > SMTs,
> > > > > > > > > > > > > > > > > Flatten
> > > > > > > > > > > > > > > > > > > > could stay as it is.
> > > > > > > > > > > > > > > > > > > > This removes the need for (4)
> changing
> > > > > Flatten
> > > > > > > > config
> > > > > > > > > > as
> > > > > > > > > > > > > well.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 5. I think it's fine to expand
> > > > ExtractField
> > > > > > to
> > > > > > > > > > support
> > > > > > > > > > > > > nested
> > > > > > > > > > > > > > > > > > notation,
> > > > > > > > > > > > > > > > > > > > but
> > > > > > > > > > > > > > > > > > > > > it might be worth noting in the
> > > rejected
> > > > > > > > > alternatives
> > > > > > > > > > > > > section
> > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > > > > isn't strictly necessary since you
> > can
> > > > > > replace
> > > > > > > > any
> > > > > > > > > > > single
> > > > > > > > > > > > > > > > > invocation
> > > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > > > that SMT that uses nested field
> > > notation
> > > > > with
> > > > > > > > > > multiple
> > > > > > > > > > > > > > > > invocations
> > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > > > > > that use non-nested notation.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Agree. Adding it.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 6. Nit: "RegerRouter" should be
> > > > > "RegexRouter"
> > > > > > > in
> > > > > > > > > the
> > > > > > > > > > > list
> > > > > > > > > > > > > of
> > > > > > > > > > > > > > > SMTs
> > > > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > do
> > > > > > > > > > > > > > > > > > > > > not require nested structure
> support.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Ack. Fixing it.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 7. It may be rare for dots in field
> > > names
> > > > > to
> > > > > > > > occur
> > > > > > > > > in
> > > > > > > > > > > the
> > > > > > > > > > > > > > wild
> > > > > > > > > > > > > > > > > > > (although
> > > > > > > > > > > > > > > > > > > > I
> > > > > > > > > > > > > > > > > > > > > wouldn't be so certain of this),
> but
> > > > unless
> > > > > > we
> > > > > > > > want
> > > > > > > > > > to
> > > > > > > > > > > > > > inflict
> > > > > > > > > > > > > > > > > > > headaches
> > > > > > > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > > > > users of Flatten, I think we're
> going
> > > to
> > > > > have
> > > > > > > to
> > > > > > > > > > think
> > > > > > > > > > > > > about
> > > > > > > > > > > > > > > > > > conflicts
> > > > > > > > > > > > > > > > > > > > > between dotted notation and
> > non-nested
> > > > > fields
> > > > > > > > whose
> > > > > > > > > > > names
> > > > > > > > > > > > > > > contain
> > > > > > > > > > > > > > > > > > > dots. I
> > > > > > > > > > > > > > > > > > > > > don't think this is actually such a
> > bad
> > > > > > thing,
> > > > > > > > > > though.
> > > > > > > > > > > I
> > > > > > > > > > > > > > agree
> > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > > dotted
> > > > > > > > > > > > > > > > > > > > > notation is intuitive and pretty
> > > > > commonplace
> > > > > > > (in
> > > > > > > > > > tools
> > > > > > > > > > > > like
> > > > > > > > > > > > > > jq,
> > > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > > > example), so I'd like it if we
> could
> > > > stick
> > > > > to
> > > > > > > it.
> > > > > > > > > > What
> > > > > > > > > > > > > about
> > > > > > > > > > > > > > > > > > > introducing
> > > > > > > > > > > > > > > > > > > > > escape syntax, using a backslash?
> > That
> > > > way,
> > > > > > > users
> > > > > > > > > > could
> > > > > > > > > > > > > > > > > disambiguate
> > > > > > > > > > > > > > > > > > > > > between "this.field" (which would
> > refer
> > > > to
> > > > > > the
> > > > > > > > > nested
> > > > > > > > > > > > field
> > > > > > > > > > > > > > > > "field"
> > > > > > > > > > > > > > > > > > > under
> > > > > > > > > > > > > > > > > > > > > the top-level "this" field), and
> > > > > > "this\.field"
> > > > > > > > > (which
> > > > > > > > > > > > would
> > > > > > > > > > > > > > > refer
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > field named "this.field"). Like
> with
> > > most
> > > > > > > > languages
> > > > > > > > > > > that
> > > > > > > > > > > > > use
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > backslash
> > > > > > > > > > > > > > > > > > > > > for escape sequences, it could also
> > be
> > > > used
> > > > > > to
> > > > > > > > > escape
> > > > > > > > > > > > > itself,
> > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > event
> > > > > > > > > > > > > > > > > > > > > that a field name contains a
> > > backslash. I
> > > > > > think
> > > > > > > > > this
> > > > > > > > > > is
> > > > > > > > > > > > > more
> > > > > > > > > > > > > > > > > > intuitive
> > > > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > > simpler than, e.g., adding a new
> > config
> > > > > > > property
> > > > > > > > to
> > > > > > > > > > > > toggle
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > delimiter
> > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > be used when parsing nested field
> > > > > references.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > I like this approach. Adding to the
> > KIP.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 8. I don't think we can
> > unconditionally
> > > > > turn
> > > > > > > this
> > > > > > > > > > > feature
> > > > > > > > > > > > > on.
> > > > > > > > > > > > > > > The
> > > > > > > > > > > > > > > > > > risk
> > > > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > > > breaking existing pipelines
> > (especially
> > > > > ones
> > > > > > > that
> > > > > > > > > > > > involve,
> > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > example, a
> > > > > > > > > > > > > > > > > > > > > combination of the Flatten and Cast
> > > SMTs)
> > > > > is
> > > > > > > > pretty
> > > > > > > > > > > > high. I
> > > > > > > > > > > > > > > think
> > > > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > > > > should be an opt-in feature, at
> least
> > > > until
> > > > > > the
> > > > > > > > > next
> > > > > > > > > > > > major
> > > > > > > > > > > > > > > > release.
> > > > > > > > > > > > > > > > > > One
> > > > > > > > > > > > > > > > > > > > way
> > > > > > > > > > > > > > > > > > > > > we could accomplish this is by
> > > > introducing
> > > > > a
> > > > > > > new
> > > > > > > > > > > > > > "field.style"
> > > > > > > > > > > > > > > > > (name
> > > > > > > > > > > > > > > > > > > > > obviously subject to change)
> property
> > > > with
> > > > > > > values
> > > > > > > > > of
> > > > > > > > > > > > > "plain"
> > > > > > > > > > > > > > > > > > (default)
> > > > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > > "nested". If set to "plain" then
> the
> > > > > current
> > > > > > > > > > non-nested
> > > > > > > > > > > > > > > behavior
> > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > used,
> > > > > > > > > > > > > > > > > > > > > and if set to "nested", then the
> > > proposed
> > > > > > > nested
> > > > > > > > > > > behavior
> > > > > > > > > > > > > is.
> > > > > > > > > > > > > > > We
> > > > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > > > > > consider updating the default value
> > to
> > > > > > "nested"
> > > > > > > > in
> > > > > > > > > a
> > > > > > > > > > > > future
> > > > > > > > > > > > > > > major
> > > > > > > > > > > > > > > > > > > release
> > > > > > > > > > > > > > > > > > > > > (or even codify that plan in this
> > KIP,
> > > if
> > > > > > > there's
> > > > > > > > > > > enough
> > > > > > > > > > > > > > > support
> > > > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > it).
> > > > > > > > > > > > > > > > > > > > > This would also leave the door open
> > for
> > > > > > adding
> > > > > > > > > > > recursive
> > > > > > > > > > > > > > > support
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > SMTs
> > > > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > > the future by adding a permitted
> > value
> > > of
> > > > > > > > > > "recursive".
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 9. One of the linked tickets in the
> > > > > > > "Motivation"
> > > > > > > > > > > section,
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > https://issues.apache.org/jira/browse/KAFKA-10640,
> > > > > > > > > > has
> > > > > > > > > > > > an
> > > > > > > > > > > > > > open
> > > > > > > > > > > > > > > > PR
> > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > KIP
> > > > > > > > > > > > > > > > > > > > > that propose adding recursive
> support
> > > to
> > > > > some
> > > > > > > > SMTs.
> > > > > > > > > > > Have
> > > > > > > > > > > > > you
> > > > > > > > > > > > > > > > > > considered
> > > > > > > > > > > > > > > > > > > > > this type of functionality for your
> > > KIP?
> > > > Or
> > > > > > is
> > > > > > > > your
> > > > > > > > > > aim
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > stick
> > > > > > > > > > > > > > > > > > solely
> > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > nested field support?
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > I like the `field.style`
> configuration
> > > flag
> > > > > > > > approach.
> > > > > > > > > > > > > > > > > > > > Thanks for pointing out the
> `recursive`
> > > > > > approach.
> > > > > > > > > Will
> > > > > > > > > > > add
> > > > > > > > > > > > > > > `nested`
> > > > > > > > > > > > > > > > > at
> > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > moment, let's check the demand for
> > > > > `recursive`
> > > > > > to
> > > > > > > > > > > consider
> > > > > > > > > > > > it
> > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > part
> > > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > > this or another KIP.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > I have added the following on the
> KIP:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > ```
> > > > > > > > > > > > > > > > > > > > Future KIPs could extend this support
> > > for:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > - Recursive notation: name a field
> and
> > > > apply
> > > > > it
> > > > > > > to
> > > > > > > > > all
> > > > > > > > > > > > fields
> > > > > > > > > > > > > > > > across
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > schema matching that name.
> > > > > > > > > > > > > > > > > > > > - Access to arrays: Adding `[]`
> > notation
> > > to
> > > > > > > > represent
> > > > > > > > > > > > access
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > arrays
> > > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > applying SMTs to fields within an
> > array.
> > > > > > > > > > > > > > > > > > > > ```
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Chris
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On Tue, Feb 8, 2022 at 1:23 PM
> Jorge
> > > > > Esteban
> > > > > > > > > Quilcate
> > > > > > > > > > > > > Otoya <
> > > > > > > > > > > > > > > > > > > > > quilcate.jo...@gmail.com> wrote:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Hi Dev team,
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > I'd like to start a new
> discussion
> > > > thread
> > > > > > on
> > > > > > > > > Kafka
> > > > > > > > > > > > > Connect
> > > > > > > > > > > > > > > > > KIP-821:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-821%3A+Connect+Transforms+support+for+nested+structures
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > This KIP is aimed to include
> > support
> > > > for
> > > > > > > nested
> > > > > > > > > > > > > structures
> > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > existing
> > > > > > > > > > > > > > > > > > > > > > SMTs — where this make sense.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Looking forward to your feedback.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > > > > > > > > Jorge.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to