Thanks, Chris! Please, find my comments below:
On Tue, 7 Jun 2022 at 04:39, Chris Egerton <fearthecel...@gmail.com> wrote: > Hi Jorge, > > Thanks! Sorry for the delay; here are my thoughts: > > 1. Under the "Accessing multiple values by deep-scan" header it's stated > that "If deep-scan is used, it must have only one field after the asterisk > level.". However, in example 3 for the Cast SMT and other examples for > other SMTs, the spec contains a field of "*.child.k2", which appears to > have two fields after the asterisk level. I may be misunderstanding the > proposal, but it seems like the two contradict each other. > Thanks for catching this. I have clarified it by removing this restriction. Also, have extended the deep-scan scenarios. > > 2. I'm a little unclear on why we need the special handling for arrays > where, for an array field "a", the field name "a" can be treated as either > the array itself, or every element in the array. Is there a reason we can't > use the field name "a.*" to handle the latter case, and "a" to handle the > former? > Agree, this is confusing. I like the `a.*` approach to access array items. I have added this to the proposal. > > 3. How would a user specify that they'd like to access a field with the > literal name "*"? > Good one. I'm proposing an approach similar to how it's proposed to escape dots, with a double-asterisk. Curious on your thoughts around this. > > 4. For the Cast SMT, do you think it might bite some people if fields that > can't be cast correctly are silently ignored? I'm imagining the case where > none of the fields in a multi-path expression can be cast correctly and it > ends up eating half of someone's day to track down why their SMT isn't > doing anything. > If I understand correctly, this challenge could be relevant across SMTs. At the moment, most/all? SMTs just silently ignore. Was thinking about adding a flag `field.on.path.not.found` to either ignore or fail when no paths are found. What do your think? > > 5. For the ExtractField and ValueToKey SMTs, what happens if a deep-scan > field name is used, but only one field is found? Is the resulting field > still an array, or is it just the single field that was found? (FWIW I'm > leaning towards keeping it an array just to keep schemas consistent in a > pipeline in case the number of fields found fluctuates across records.) > > Agree. Will clarify that an array is always produced even for 1 or 0 fields are found. > 6. (Nit) For the HeaderFrom SMT, it's stated that "As this SMT affects only > existing fields, additional configurations will not be required.". Given > the new "field.syntax.version" property, this part should probably be > removed. > Agree. > > 7. Is recursive descent intentionally excluded? That was an important part > of Joshua's KIP and his feedback on this KIP; I think it's worth pursuing > if we can. > My understanding from Joshua's feedback is that by including support for deep-scan, we are already covering the recursive functionality. Though, I may be missing something. > > > Cheers, > > Chris > > On Tue, May 24, 2022 at 3:49 AM Jorge Esteban Quilcate Otoya < > quilcate.jo...@gmail.com> wrote: > > > Thanks, Chris! > > > > I have updated the KIP with the rejected alternatives updated. Also, I > have > > drafted the support for arrays and deep scans as part of the proposed > > notation to make it more complete, even though these can be implemented > in > > multiple PRs. > > > > Looking forward to your feedback. > > > > Cheers, > > Jorge. > > > > On Sat, 21 May 2022 at 17:39, Chris Egerton <fearthecel...@gmail.com> > > wrote: > > > > > Hi Jorge, > > > > > > I really appreciate the effort you've made to simplify the syntax and > > > feature set of a JSONPath-based approach as much as possible. I'm still > > > hesitant to continue with it, though. > > > > > > 1. The syntax is much less friendly. Just compare "top.mid.bottom" to > > > "$['top']['mid']['bottom']"... not everyone uses JSONPath or even JSON, > > and > > > the learning curve for the former is going to be steeper. The examples > in > > > the new KIP draft you published demonstrate this pretty well, and this > is > > > without diving into the details of what escape syntax would look like. > > > > > > 2. The three new major features that this syntax adds (array accesses, > > deep > > > scans, and multi-value paths) could all be added pretty easily to the > dot > > > notation syntax without introducing brackets and dollar signs. Array > > > accesses can be described using the same syntax as struct/map field > > access, > > > deep scans can be described using '*', and multi-value paths can be > > > described by referencing the name of a field that's expected to have > > > children. These are all top-of-the-head ideas and can probably be > > refined, > > > but hopefully they demonstrate that we can keep the syntax simple > without > > > sacrificing features. Of course, the question of leaving room for > future > > > features might arise... given that these are the out-of-the-box SMTs > that > > > are likely to be the first that many people encounter, I'd err on the > > side > > > of simplicity and a gentle learning curve; if people need to get more > out > > > of transforms, the option to implement their own is still there. If we > > can > > > address 95% of use cases with something easy to use, it's not worth > > making > > > the feature harder for everyone to use just to accommodate the > remaining > > > 5%. > > > > > > 3. The advantage of leveraging an existing syntax is twofold: users who > > are > > > already familiar with that syntax don't need to learn a new syntax, and > > > maintainers of the syntax get to leverage existing libraries and > > > documentation for the syntax. With the current proposal, reusing > > libraries > > > is off the table, which means that we're going to have to parse this > > > ourselves (including all the escape syntax edge cases), and we won't be > > > able to automatically leverage new features added to that syntax. And > > given > > > how stripped-down the syntax is in comparison to full JSONPath, there's > > > still going to be a learning curve for users who are already familiar > > with > > > it, and we'll still have to document how Connect's variant of JSONPath > > > works either instead of or in addition to just linking to a > > well-maintained > > > third-party docs site. > > > > > > On the topic of "field.path" and "include.path" vs. "field.style", I > > > actually think that a single property per SMT is cleaner and simpler. > > > Allowing users to mix and match styles within the same SMT config seems > > > like a recipe for confusion, and with a single property that dictates > > field > > > syntax behavior, we leave the door open to change the default at a > later > > > date. We could even fully remove support for plain field notation at > some > > > point and still be able to retain the simple property names of "field", > > > "include", etc. instead of forcing people to use "field.path" and the > > like. > > > That said, the term "field.style" and the permitted values might be a > > > little ambiguous. One alternative, though a little heavy-handed, is to > > > change it to "field.syntax.version" with permitted values of "V1" > > (default, > > > equivalent to "field.style = plain") and "V2" (equivalent to > > "field.style = > > > nested"). This would leave us room in the future to make further > changes > > to > > > the syntax without having to come up with new names, although it does > > > sacrifice a little bit in that the permitted values are no longer > > > self-describing. > > > > > > Cheers, > > > > > > Chris > > > > > > On Sun, May 15, 2022 at 4:51 PM Jorge Esteban Quilcate Otoya < > > > quilcate.jo...@gmail.com> wrote: > > > > > > > Thank you all for your feedback, and sorry for the long wait for a > > reply. > > > > > > > > I would like to explore the idea of JSONPath-inspired/subset > notation a > > > bit > > > > further: > > > > > > > > It will need to be a much-reduced version of JSONPath: > > > > - No full support for JsonPath therefore an additional dependency. > > > > - All paths must start with '$' > > > > - No functions support or other operators allowed. > > > > - JsonPath dotted and square-bracket notations can be supported to > > avoid > > > > escaping dots or other characters: `$a.b.c` and `$['a.b']['c']` - or > we > > > > could only support the second one as it's more complete. > > > > - Add support for arrays with `[<integer>]`, e.g. `$a.[1].b` > > > > - Add support for multiple-value paths using array access `$a.*.b` or > > > deep > > > > scan `$a..b`. > > > > - Some SMTs that will benefit from this: `MaskField`, `Cast`, > > > > `ReplaceField`. > > > > - We could introduce a `path[s]` config under the current > > configurations > > > to > > > > apply this feature so no compatibility issues are introduced: e.g. > > > > `field.path`, `fields.paths`, `exclude.path`. > > > > > > > > With these, 100, 101, 102, and 103 will be effectively solved. > > > > > > > > The main challenge that I see at the moment is that by being JSON > > > > path-like, there may be some edge cases that I can't foresee at the > > > moment, > > > > that could make this hard to implement, test, and maintain in the > long > > > run. > > > > > > > > I'll appreciate your feedback on how this JSONPath-based alternative > > > > compares to the dotted notation initially proposed, and if it matches > > > your > > > > feedback. > > > > I have drafted a copy of the KIP to change the examples to JSONPath > > > > approach and validate some differences: > > > > https://cwiki.apache.org/confluence/x/9BihD > > > > > > > > About 104. I agreed with Chris. We can handle this as part of a new > > KIP. > > > > > > > > Thanks, > > > > Jorge. > > > > > > > > On Sun, 24 Apr 2022 at 17:27, Chris Egerton <fearthecel...@gmail.com > > > > > > wrote: > > > > > > > > > Hi Joshua, > > > > > > > > > > I have a few reservations about using JsonPath notation here. > > > > > > > > > > 1. There's likely to be a substantial performance penalty for > > > converting > > > > > between the Kafka Connect format and something that a JsonPath > > library > > > > > would understand. > > > > > > > > > > 2. The complexity of the feature will be significantly higher. It > > will > > > be > > > > > harder to test, document, and implement. There will be many more > edge > > > > cases > > > > > to consider and support, and we'll be on the hook to handle any > bugs > > or > > > > > inconsistencies that arise, either as a result of our use of the > > > JsonPath > > > > > library we choose, or as a result of bugs in that library. > > > > > > > > > > 3. It's not clear that JsonPath is superior or even suitable for > some > > > of > > > > > the SMTs proposed here. What would be the advantages of JsonPath > with > > > the > > > > > InsertField or HoistField SMTs? > > > > > > > > > > I also don't think that adding dot notation is unfriendly to users; > > > many > > > > > have proposed this type of syntax in the past, and it's frequently > > used > > > > in > > > > > informal discussions to refer to nested fields. If the proposed > > syntax > > > > was > > > > > not already intuitive then a case against deciding on our own might > > be > > > > more > > > > > convincing, but as things stand, simple dot notation is likely > going > > to > > > > be > > > > > easier for users to understand than JsonPath syntax. > > > > > > > > > > Cheers, > > > > > > > > > > Chris > > > > > > > > > > On Fri, Apr 22, 2022 at 9:31 AM Joshua Grisham < > > grishamj...@gmail.com> > > > > > wrote: > > > > > > > > > > > Hello all! > > > > > > > > > > > > Sorry that I come a bit later to the party here, but I am the one > > who > > > > > wrote > > > > > > KIP-683 [1] for recursive support (just simply looping through > all > > > > child > > > > > > non-primitive structures for the same matching name(s)) which is > a > > > > > slightly > > > > > > different way to try and solve a similar requirement -- > > unfortunately > > > > at > > > > > > the time the dev community was not quite as active and then I > also > > > got > > > > > busy > > > > > > with work and just life in general so wasn't able to follow up or > > > push > > > > > it. > > > > > > > > > > > > I do think it is a very good idea to have some kind of path-like > > > > > expression > > > > > > to be able to specifically address a nested field, as I can see > > that > > > > the > > > > > > simple "recursive" case could potentially result in unwanted or > > > > > unexpected > > > > > > behavior, plus there is the potential to introduce a bit of a > > > > performance > > > > > > hit to always loop through everything in cases where the > > > schemas/values > > > > > > might be quite large. > > > > > > > > > > > > One thing I wanted to ask: instead of creating a new "path > parser" > > > > > > including some bespoke or "borrowed" syntax, why not just use > > > something > > > > > > that already exists? Specifically here I am thinking about > > JsonPath ( > > > > > > https://github.com/json-path/JsonPath) > > > > > > > > > > > > There is already quite nice support in JsonPath for handling > > special > > > > > > characters in field names, for handling different non-primitive > > types > > > > > > (arrays etc), for handling multiple levels of nesting, etc etc. > > > Would > > > > it > > > > > > be possible to instead to re-think this and maybe have some kind > of > > > > > > JsonPath-based Schema selector / updater and/or JsonPath-based > > Value > > > > > > selector / updater? Conceptually this feels like it makes sense > to > > > me, > > > > as > > > > > > from the top of my head it would be quite a natural fit to map a > > Json > > > > > data > > > > > > structure to the Connect API data structure (and you could > > > potentially > > > > > even > > > > > > try to leverage the existing Json-to-Connect > > serializer/deserializer > > > to > > > > > > help out with this even in a more "out of the box"-feeling kind > of > > > > way). > > > > > > > > > > > > Maybe also as Tom mentioned, this part (in my example, this > > > > > JsonPath-based > > > > > > "thing") could even be a generic API that could be used by any > SMT, > > > > > > including used in custom ones built by the community. Then I > think > > > to > > > > > use > > > > > > a completely separate config property somehow related to "path" > (as > > > Tom > > > > > > also mentioned) would also make a lot of sense here as well. This > > > way, > > > > if > > > > > > you select based on "path" then this JsonPath-based API would be > > > used, > > > > > > otherwise it could use something similar to the existing > get-field > > > > based > > > > > > approach (which I guess could also be refactored into some kind > of > > > > > utility > > > > > > / API as well if it made sense?) > > > > > > > > > > > > And with that in mind, if this was the kind of direction to go, > > then > > > a > > > > > > "recursive" capability like I pitched in KIP-683 would also > become > > > > > > unnecessary because you could easily write a JsonPath expression > > like > > > > > > "$..someRecuriveField" and it would do the same thing (on top of > > > > anything > > > > > > else you would want to do that is already supported by JsonPath). > > > Then > > > > we > > > > > > could also kill that older KIP and do a bit of clean-up :) > > > > > > > > > > > > [1] - > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-683%3A+Add+recursive+support+to+Connect+Cast+and+ReplaceField+transforms%2C+and+support+for+casting+complex+types+to+either+a+native+or+JSON+string > > > > > > > > > > > > Just some extra food for thought. All-in-all I think this is a > > super > > > > > great > > > > > > initiative! > > > > > > > > > > > > Best, > > > > > > > > > > > > Joshua > > > > > > > > > > > > > > > > > > Den fre 22 apr. 2022 kl 14:50 skrev Chris Egerton < > > > > > fearthecel...@gmail.com > > > > > > >: > > > > > > > > > > > > > Hi Tom, > > > > > > > > > > > > > > Thanks for taking a look at this, and for your thoughtful > > comments. > > > > > I'll > > > > > > > leave it up to Jorge to address most of your comments but I > > wanted > > > to > > > > > > share > > > > > > > a couple quick thoughts I had regarding 103 and 104. > > > > > > > > > > > > > > 103. Like you, I was envisioning a possible syntax for array > > access > > > > > that > > > > > > > used classic C-style brackets; e.g., `arr[index]`. However, I > > > wonder > > > > if > > > > > > we > > > > > > > could keep things simple and use the same syntax that we're > > > proposing > > > > > for > > > > > > > nested field access? In other words, instead of `arr[index]`, > > you'd > > > > > write > > > > > > > `arr.index`. It'd save us and users the headache of reserving > > > > > characters > > > > > > > now that would need to be escaped even if their unescaped > > brethren > > > > > aren't > > > > > > > used for anything, and also avoid the question of what exactly > we > > > > > should > > > > > > do > > > > > > > when we see a config that uses reserved characters that aren't > > yet > > > > > > > supported (throwing an exception seems pretty unfriendly for > new > > > > > users). > > > > > > > > > > > > > > 104. This would probably be useful, but it would come with some > > > nasty > > > > > > > compatibility questions that would need to be addressed if we'd > > > want > > > > > SMTs > > > > > > > that leverage this new API to be viable for older versions of > > > > Connect. > > > > > If > > > > > > > we package and distribute this feature as a library (either via > > an > > > > > > entirely > > > > > > > new artifact, or as part of the existing connect-transforms or > > > > > > connect-api > > > > > > > artifacts), then we'd have to either sidestep the existing > plugin > > > > > > isolation > > > > > > > logic [1] that basically makes it impossible for Connect > plugins > > to > > > > > ship > > > > > > > their own versions of Connect artifacts, or issue a big warning > > to > > > > > people > > > > > > > that any SMT that uses this API won't work with any older > > versions > > > of > > > > > > > Connect. There's also some other features we might want to add > in > > > an > > > > > > > SMT-utils library such as the existing, internal, utils that > > > Connect > > > > > uses > > > > > > > right now [2]. It may be worth exploring this in a separate KIP > > of > > > > its > > > > > > own. > > > > > > > > > > > > > > [1] - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/kafka/blob/d480c4aa6e513e36050d8e067931de2270525d18/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/isolation/PluginUtils.java#L46-L143 > > > > > > > > > > > > > > [2] - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/kafka/tree/d480c4aa6e513e36050d8e067931de2270525d18/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/util > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > Chris > > > > > > > > > > > > > > On Fri, Apr 22, 2022 at 6:55 AM Tom Bentley < > tbent...@redhat.com > > > > > > > > wrote: > > > > > > > > > > > > > > > Hi Jorge, > > > > > > > > > > > > > > > > Thanks for the KIP, especially for the examples which are > > > > > super-clear. > > > > > > > > > > > > > > > > 100. The name `field.style` isn't so clear for something like > > > > > > > ReplaceField: > > > > > > > > it's not so obvious that field.style applies to `include` and > > > > > > `exclude`. > > > > > > > > > > > > > > > > 101. The permitted values for `field.style` don't seem > terribly > > > > > > intuitive > > > > > > > > (to me anyway): the meaning of `plain` isn't very guessable. > > Why > > > > not > > > > > > > > `top-level` or `root` instead? Also `nested` could be > > > misconstrued > > > > to > > > > > > > mean > > > > > > > > nested-but-not-top-level, so perhaps `recursive` or > `cascading` > > > > might > > > > > > be > > > > > > > > better? > > > > > > > > > > > > > > > > 102. I'm torn on whether making the interpretation of > existing > > > > > configs > > > > > > > like > > > > > > > > `field` be dependent on `field.style` is a good idea. I can > see > > > > that > > > > > > it's > > > > > > > > the simplest thing to do, but it just feels a bit odd that > > > > sometimes > > > > > > the > > > > > > > > `field` would actually be a path and have different escaping > > > rules. > > > > > An > > > > > > > > alternative would be to come up with a parallel set of config > > > names > > > > > > (e.g. > > > > > > > > as well as "field" an SMT might support "path") which were > > > defined > > > > to > > > > > > > > always take paths, thus avoiding the changeable > interpretation > > of > > > > the > > > > > > > > existing configs. The SMT's #configure() would need to throw > in > > > the > > > > > > case > > > > > > > > that both configs were given. I can see that that would be > more > > > > work > > > > > in > > > > > > > > implementation, but it feels cleaner. > > > > > > > > > > > > > > > > 103. I think in order to allow for supporting arrays in a > later > > > KIP > > > > > > > (which > > > > > > > > certainly seems like it could be useful), we'd want to > specify > > > the > > > > > > syntax > > > > > > > > now, even if it wasn't implemented under this KIP. That's > > > because I > > > > > > don't > > > > > > > > think you can't exclude the possibility that characters such > as > > > `[` > > > > > and > > > > > > > `]` > > > > > > > > appear in field names. So you'd have a compatibility problem > if > > > > > people > > > > > > > > started using the features of this KIP to access such fields, > > > only > > > > > for > > > > > > > > those characters to change their meaning under a later KIP. > > > > > > > > > > > > > > > > 104. I also wonder whether making paths into a public Java > API, > > > for > > > > > use > > > > > > > by > > > > > > > > 3rd party SMTs, would be valuable. > > > > > > > > > > > > > > > > Thanks again, > > > > > > > > > > > > > > > > Tom > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, 20 Apr 2022 at 17:53, Chris Egerton < > > > > fearthecel...@gmail.com > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > 💯 Thanks Jorge, LGTM! > > > > > > > > > > > > > > > > > > On Wed, Apr 20, 2022, 12:40 Jorge Esteban Quilcate Otoya < > > > > > > > > > quilcate.jo...@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > Thank you, Chris! Not possible without your feedback. > > > > > > > > > > > > > > > > > > > > On Tue, 19 Apr 2022 at 23:04, Chris Egerton < > > > > > > fearthecel...@gmail.com > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > Hi Jorge, > > > > > > > > > > > > > > > > > > > > > > Thank you for sticking through this. I have one small > > > remark > > > > > and > > > > > > > one > > > > > > > > > > small > > > > > > > > > > > clarification; assuming you agree with me on them then > > I'm > > > > > ready > > > > > > to > > > > > > > > > vote > > > > > > > > > > on > > > > > > > > > > > the KIP. > > > > > > > > > > > > > > > > > > > > > > 1. InsertField: The "field.on.missing.parent" and > > > > > > > > > > "field.on.existing.field" > > > > > > > > > > > docs both mention a permitted value of "ingore"; this > > > should > > > > be > > > > > > > > > "ignore", > > > > > > > > > > > right? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Of course, one more typo :) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 2. InsertField: The examples are still missing the > > > > > "field.style" > > > > > > > > > property > > > > > > > > > > > from the configurations. They should all include the > > > property > > > > > > > > > > > "transforms.smt1.field.style": "nested", correct? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Yes, it is there. I think I know what you mean now, seems > > > that > > > > > > > > Confluence > > > > > > > > > > is putting everything in one line when it's in separate > > lines > > > > in > > > > > > the > > > > > > > > > > editor. > > > > > > > > > > Hopefully, it's fixed now. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks again for working through this, and > > congratulations > > > > on a > > > > > > > > > > > well-written KIP! > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > > > > > > > > > Chris > > > > > > > > > > > > > > > > > > > > > > On Tue, Apr 19, 2022 at 2:06 PM Jorge Esteban Quilcate > > > Otoya > > > > < > > > > > > > > > > > quilcate.jo...@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > Thank you, Chris! I apply these improvements to the > > KIP, > > > > let > > > > > me > > > > > > > > know > > > > > > > > > > how > > > > > > > > > > > it > > > > > > > > > > > > looks now. > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 11 Apr 2022 at 23:43, Chris Egerton < > > > > > > > > fearthecel...@gmail.com > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > Hi Jorge, > > > > > > > > > > > > > > > > > > > > > > > > > > Wow, those examples are great! A few more remarks, > > but > > > I > > > > > > think > > > > > > > > > we're > > > > > > > > > > > > > getting close: > > > > > > > > > > > > > > > > > > > > > > > > > > 1. The examples differ across SMTs with the name of > > the > > > > > > > > > > > newly-introduced > > > > > > > > > > > > > style property; some of them use "field.style", and > > > some > > > > > use > > > > > > > > > > > > > "fields.style". I think for consistency's sake we > > > should > > > > > > stick > > > > > > > > with > > > > > > > > > > > just > > > > > > > > > > > > > "field.style"; otherwise it could be painful for > > users > > > to > > > > > > have > > > > > > > to > > > > > > > > > > > > remember > > > > > > > > > > > > > which to use. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Great catch. Agree, I fixed the config names to > > > > > `field.style`. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 2. Some of the examples are off: > > > > > > > > > > > > > - TimestampConverter: the input in the second > example > > > > > ("when > > > > > > > > field > > > > > > > > > > > names > > > > > > > > > > > > > include dots") doesn't contain a field with a > dotted > > > name > > > > > > > > > > > > > - ValueToKey: the config in the third example > ("when > > > > field > > > > > > > names > > > > > > > > > > > include > > > > > > > > > > > > > dots") should probably use "parent..child.k2" as > the > > > > > > > > > > > > > "transforms.smt1.fields" property > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fixed. Thanks! > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 3. RE changes to InsertField: > > > > > > > > > > > > > - The InsertField SMT should also come with the new > > > > > > > "field.style" > > > > > > > > > > > > property > > > > > > > > > > > > > in order to preserve backwards compatibility, > right? > > I > > > > > don't > > > > > > > see > > > > > > > > it > > > > > > > > > > > > > included in the example configs for that one, just > > want > > > > to > > > > > > make > > > > > > > > > sure > > > > > > > > > > > > > - I don't know of any cases where we use snake_case > > for > > > > > > > property > > > > > > > > > > names > > > > > > > > > > > in > > > > > > > > > > > > > Kafka; we should probably use "on.missing.parent" > and > > > > > > > > > > > "on.existing.field" > > > > > > > > > > > > > as the new property names for InsertField. > > > > > > > > > > > > > - Why is the "on_existing_field" (or > > > "on.existing.field") > > > > > > > > property > > > > > > > > > > only > > > > > > > > > > > > > applied when the field style is nested? Couldn't it > > be > > > > > useful > > > > > > > for > > > > > > > > > > > > > non-nested fields as well? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Great points! I have applied these suggestions to the > > > KIP. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > > > > > > > > > > > > > Chris > > > > > > > > > > > > > > > > > > > > > > > > > > On Sat, Apr 9, 2022 at 12:40 PM Jorge Esteban > > Quilcate > > > > > Otoya > > > > > > < > > > > > > > > > > > > > quilcate.jo...@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > Again, great feedback Chris. Much appreciated. > > > > > > > > > > > > > > Added my comments below: > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, 5 Apr 2022 at 20:22, Chris Egerton < > > > > > > > > > > fearthecel...@gmail.com> > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi Jorge, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Looking good! I have a few comments left but > all > > > but > > > > > one > > > > > > or > > > > > > > > two > > > > > > > > > > are > > > > > > > > > > > > > > minor. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 1. The motivation section states "This KIP is > > aimed > > > > to > > > > > > > > include > > > > > > > > > > > > support > > > > > > > > > > > > > > for > > > > > > > > > > > > > > > nested structures on the existing SMTs... and > to > > > > > include > > > > > > > the > > > > > > > > > > > > > abstractions > > > > > > > > > > > > > > > to reuse this in future SMTs". A good > > > implementation > > > > of > > > > > > > this > > > > > > > > > KIP > > > > > > > > > > > will > > > > > > > > > > > > > > > definitely isolate reusable logic into a > separate > > > > > > > abstraction > > > > > > > > > > that > > > > > > > > > > > > can > > > > > > > > > > > > > be > > > > > > > > > > > > > > > easily pulled in to the SMTs we want to add > > nested > > > > > field > > > > > > > > > support > > > > > > > > > > > to, > > > > > > > > > > > > > but > > > > > > > > > > > > > > > unless we plan on making this kind of > abstraction > > > > > > publicly > > > > > > > > > > > available > > > > > > > > > > > > as > > > > > > > > > > > > > > > some kind of utility method or class that > > external > > > > SMT > > > > > > > > > developers > > > > > > > > > > > can > > > > > > > > > > > > > > > leverage, we can probably leave this part out > as > > > it's > > > > > > more > > > > > > > of > > > > > > > > > an > > > > > > > > > > > > > > > implementation detail. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Make sense, will leave this out of the KIP. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 2. The Cast example is a little misleading, > isn't > > > it? > > > > > It > > > > > > > > > > > demonstrates > > > > > > > > > > > > > the > > > > > > > > > > > > > > > escape syntax for fields with dot literals in > > their > > > > > > names, > > > > > > > > but > > > > > > > > > it > > > > > > > > > > > > > doesn't > > > > > > > > > > > > > > > demonstrate a way to actually use the Cast (or > > any > > > > > other) > > > > > > > SMT > > > > > > > > > to > > > > > > > > > > > > > access a > > > > > > > > > > > > > > > nested field in a record, which is the whole > > point > > > of > > > > > the > > > > > > > > KIP. > > > > > > > > > I > > > > > > > > > > > like > > > > > > > > > > > > > the > > > > > > > > > > > > > > > example of escape syntax but we should probably > > > also > > > > > add > > > > > > > one > > > > > > > > > for > > > > > > > > > > > > nested > > > > > > > > > > > > > > > field access. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Agree. I have added examples to each SMT to be > more > > > > clear > > > > > > > about > > > > > > > > > how > > > > > > > > > > > it > > > > > > > > > > > > > > affects each function > > > > > > > > > > > > > > . > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 3. With the InsertField SMT, I'm wondering what > > the > > > > > > > specific > > > > > > > > > > > behavior > > > > > > > > > > > > > > will > > > > > > > > > > > > > > > be when some or all of the "middle layer" > nested > > > > fields > > > > > > are > > > > > > > > > > > missing. > > > > > > > > > > > > > For > > > > > > > > > > > > > > > example, if I have a record with a value of { > > "k1": > > > > > "v1 } > > > > > > > > and I > > > > > > > > > > > apply > > > > > > > > > > > > > > > InsertField with topic.field = n1.n2.n3.topic, > > what > > > > > will > > > > > > > > > happen? > > > > > > > > > > > Will > > > > > > > > > > > > > the > > > > > > > > > > > > > > > updated value become { "k1": "v1", "n1": { > "n2": > > { > > > > > "n3": > > > > > > > > > "topic" > > > > > > > > > > > }}}, > > > > > > > > > > > > > > will > > > > > > > > > > > > > > > an exception be thrown, or something else? This > > > seems > > > > > > > > analogous > > > > > > > > > > to > > > > > > > > > > > > the > > > > > > > > > > > > > > > command line mkdir command, which (at least on > > some > > > > > > > operating > > > > > > > > > > > > systems) > > > > > > > > > > > > > > > fails by default if you try to create a new > > nested > > > > > > > directory > > > > > > > > > > where > > > > > > > > > > > > > > anything > > > > > > > > > > > > > > > but the last element in the path doesn't exist, > > but > > > > can > > > > > > be > > > > > > > > > > invoked > > > > > > > > > > > > > with a > > > > > > > > > > > > > > > flag that instructs it to go ahead and create > all > > > > > levels > > > > > > of > > > > > > > > > > nested > > > > > > > > > > > > > > > directory instead. I'm leaning on the side of > > "just > > > > > > create > > > > > > > > > > > > everything" > > > > > > > > > > > > > > but > > > > > > > > > > > > > > > would be interested in your thoughts, and > either > > > way, > > > > > we > > > > > > > > should > > > > > > > > > > > > > probably > > > > > > > > > > > > > > > make sure the intended behavior is well-defined > > > > before > > > > > > > > voting. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > This is an interesting case, thanks for catching > > > this! > > > > > > > > > > > > > > The default behavior I see is to create parents > if > > > they > > > > > are > > > > > > > > > missing > > > > > > > > > > > and > > > > > > > > > > > > > > overwrite fields > > > > > > > > > > > > > > if they already exist. > > > > > > > > > > > > > > I'm planning to include the following two flags > if > > > > there > > > > > > is a > > > > > > > > > need > > > > > > > > > > to > > > > > > > > > > > > > > overwrite this behavior: > > > > > > > > > > > > > > - `on_missing_parent` = (CREATE|IGNORE), > > > default=CREATE > > > > > > > > > > > > > > - `on_existing_field` = (OVERWRITE|IGNORE), > > > > > > default=OVERWRITE > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 4. Similarly, what will the behavior be if any > of > > > the > > > > > > field > > > > > > > > > > > elements > > > > > > > > > > > > > > > specified with InsertField already exist in the > > > > record > > > > > > > value? > > > > > > > > > > Will > > > > > > > > > > > we > > > > > > > > > > > > > > just > > > > > > > > > > > > > > > overwrite them? What's the behavior of > > InsertField > > > > > today > > > > > > > > under > > > > > > > > > > > > similar > > > > > > > > > > > > > > > circumstances? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The current behavior is to overwrite the value. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Chris > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Mar 31, 2022 at 4:15 PM Jorge Esteban > > > > Quilcate > > > > > > > Otoya > > > > > > > > < > > > > > > > > > > > > > > > quilcate.jo...@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, Chris! Much appreciated all the > > feedback > > > > > here. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 1. You nailed it setting the design goal > here: > > > "it > > > > > > > > shouldn't > > > > > > > > > be > > > > > > > > > > > > > > > impossible > > > > > > > > > > > > > > > > to use this new feature for any field name, > no > > > > matter > > > > > > how > > > > > > > > > > > > convoluted. > > > > > > > > > > > > > > > It's > > > > > > > > > > > > > > > > fine if edge cases introduce difficulty (such > > as > > > > > > > > > less-readable > > > > > > > > > > > > > > > > configurations), but it's not fine if they > > can't > > > be > > > > > > > > addressed > > > > > > > > > > at > > > > > > > > > > > > > all." > > > > > > > > > > > > > > > > Back to the previous proposals (using only > dots > > > as > > > > > > > > > separators) > > > > > > > > > > we > > > > > > > > > > > > > have > > > > > > > > > > > > > > 2 > > > > > > > > > > > > > > > > alternatives: > > > > > > > > > > > > > > > > 1. escaping with backslashes > > > > > > > > > > > > > > > > 2. escaping with dots itself > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I'll lean for alternative 2, as you proposed > > > > before. > > > > > > > Feels > > > > > > > > to > > > > > > > > > > me > > > > > > > > > > > > that > > > > > > > > > > > > > > > > backslashes have more potential to lead to > > > > confusion > > > > > in > > > > > > > > JSON > > > > > > > > > > > > configs, > > > > > > > > > > > > > > and > > > > > > > > > > > > > > > > CSV seems like a good precedent to use the > same > > > > > > character > > > > > > > > to > > > > > > > > > > > escape > > > > > > > > > > > > > > > itself. > > > > > > > > > > > > > > > > KIP is updated to reflect this. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 2. Thanks! I'll add an example, and stick > with > > > the > > > > > > > current > > > > > > > > > > > approach > > > > > > > > > > > > > > > > defining the style per individual transform > > > > > > > configuration. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 3. Yes, thanks! KIP updated. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 4. Of course. KIP updated. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 28 Mar 2022 at 21:59, Chris Egerton < > > > > > > > > > > > > fearthecel...@gmail.com > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi Jorge, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for addressing my comments; the KIP > > > looks > > > > > > > > up-to-date > > > > > > > > > > and > > > > > > > > > > > > > > pretty > > > > > > > > > > > > > > > > > readable now, and the rejected alternatives > > > > section > > > > > > > does > > > > > > > > a > > > > > > > > > > > great > > > > > > > > > > > > > job > > > > > > > > > > > > > > of > > > > > > > > > > > > > > > > > outlining the discussion so far and > providing > > > > > context > > > > > > > for > > > > > > > > > > > anyone > > > > > > > > > > > > > else > > > > > > > > > > > > > > > who > > > > > > > > > > > > > > > > > might want to join in. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 1. Thoughts on choice of delimiter: > > > > > > > > > > > > > > > > > - I like the optimization for simple cases, > > > but I > > > > > > think > > > > > > > > the > > > > > > > > > > new > > > > > > > > > > > > > > > proposal > > > > > > > > > > > > > > > > is > > > > > > > > > > > > > > > > > a little too restrictive. What if there's a > > > field > > > > > > whose > > > > > > > > > name > > > > > > > > > > > > > contains > > > > > > > > > > > > > > > all > > > > > > > > > > > > > > > > > of the permitted options (currently just > ".", > > > > ",", > > > > > > and > > > > > > > > > "/")? > > > > > > > > > > > > > > > > > - If we expand the set of permitted > > delimiters > > > to > > > > > > allow > > > > > > > > for > > > > > > > > > > any > > > > > > > > > > > > > > > > > single-character string, configuration > > > complexity > > > > > > will > > > > > > > > > > increase > > > > > > > > > > > > and > > > > > > > > > > > > > > > > > readability may decrease > > > > > > > > > > > > > > > > > - Also worth pointing out that there is > some > > > > > > convention > > > > > > > > for > > > > > > > > > > > > > doubling > > > > > > > > > > > > > > a > > > > > > > > > > > > > > > > > delimiter character as an escape mechanism > > with > > > > > > formats > > > > > > > > > like > > > > > > > > > > > CSV > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > - Overall I think we may be approaching the > > > > > > saturation > > > > > > > > > point > > > > > > > > > > > for > > > > > > > > > > > > > > > > productive > > > > > > > > > > > > > > > > > discussion on delimiter syntax so I don't > > want > > > to > > > > > > spend > > > > > > > > too > > > > > > > > > > > much > > > > > > > > > > > > > more > > > > > > > > > > > > > > > of > > > > > > > > > > > > > > > > > your time on it. I think the one point I'd > > like > > > > to > > > > > > > leave > > > > > > > > > for > > > > > > > > > > > now > > > > > > > > > > > > is > > > > > > > > > > > > > > > that > > > > > > > > > > > > > > > > it > > > > > > > > > > > > > > > > > shouldn't be impossible to use this new > > feature > > > > for > > > > > > any > > > > > > > > > field > > > > > > > > > > > > name, > > > > > > > > > > > > > > no > > > > > > > > > > > > > > > > > matter how convoluted. It's fine if edge > > cases > > > > > > > introduce > > > > > > > > > > > > difficulty > > > > > > > > > > > > > > > (such > > > > > > > > > > > > > > > > > as less-readable configurations), but it's > > not > > > > fine > > > > > > if > > > > > > > > they > > > > > > > > > > > can't > > > > > > > > > > > > > be > > > > > > > > > > > > > > > > > addressed at all. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 2. > > > > > > > > > > > > > > > > > The configuration style where you define > > > > > > > > > > > "transforms.field.style" > > > > > > > > > > > > > in > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > connector config, and then this applies to > > all > > > > SMTs > > > > > > for > > > > > > > > the > > > > > > > > > > > > > > connector, > > > > > > > > > > > > > > > is > > > > > > > > > > > > > > > > > very interesting. However, it doesn't > follow > > > > > > convention > > > > > > > > for > > > > > > > > > > > > > existing > > > > > > > > > > > > > > > > SMTs. > > > > > > > > > > > > > > > > > Right now, if you want to configure an SMT, > > you > > > > > > define > > > > > > > > its > > > > > > > > > > name > > > > > > > > > > > > in > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > connector config (for example, > "transforms": > > > > > "smt1"), > > > > > > > and > > > > > > > > > > then > > > > > > > > > > > > > define > > > > > > > > > > > > > > > all > > > > > > > > > > > > > > > > > of the properties for that SMT in the > > connector > > > > > > config > > > > > > > > > using > > > > > > > > > > a > > > > > > > > > > > > > > > > namespacing > > > > > > > > > > > > > > > > > mechanism specific to that SMT (for > example, > > > > > > > > > > > > > "transforms.smt1.prop1": > > > > > > > > > > > > > > > > > "val1"). That SMT then sees only the > > properties > > > > > > defined > > > > > > > > in > > > > > > > > > > that > > > > > > > > > > > > > > > > namespace, > > > > > > > > > > > > > > > > > with the prefix stripped (for example, > > "prop1": > > > > > > "val1") > > > > > > > > in > > > > > > > > > > its > > > > > > > > > > > > > > > configure > > > > > > > > > > > > > > > > > [2] [3] method. > > > > > > > > > > > > > > > > > If we want to continue to follow this > > > convention, > > > > > > then > > > > > > > > > > instead > > > > > > > > > > > of > > > > > > > > > > > > > > > > > specifying "transforms.field.style" in a > > > > connector > > > > > > > > config, > > > > > > > > > we > > > > > > > > > > > > would > > > > > > > > > > > > > > > > expect > > > > > > > > > > > > > > > > > users to configure > > > > "transforms.<name>.field.style", > > > > > > for > > > > > > > > > each > > > > > > > > > > > SMT > > > > > > > > > > > > > that > > > > > > > > > > > > > > > > they > > > > > > > > > > > > > > > > > want to configure a field style for. This > > would > > > > > > require > > > > > > > > > more > > > > > > > > > > > work > > > > > > > > > > > > > on > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > part of the user, but would be simpler to > > > reason > > > > > > about > > > > > > > > and > > > > > > > > > > > easier > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > implement. > > > > > > > > > > > > > > > > > If we want to explore an alternative where > > > users > > > > > can > > > > > > > > > specify > > > > > > > > > > > > global > > > > > > > > > > > > > > > > > properties that apply to all transforms in > a > > > > > > connector > > > > > > > > > > config, > > > > > > > > > > > > then > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > semantics for this need to be defined in > the > > > KIP. > > > > > > This > > > > > > > > > would > > > > > > > > > > > have > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > include whether this will apply only for > the > > > > > special > > > > > > > case > > > > > > > > > of > > > > > > > > > > > the > > > > > > > > > > > > > > > > > "field.style" and possibly > "field.separator" > > > > > > properties > > > > > > > > or > > > > > > > > > if > > > > > > > > > > > it > > > > > > > > > > > > > > would > > > > > > > > > > > > > > > be > > > > > > > > > > > > > > > > > available more generally for other > > properties, > > > > > > whether > > > > > > > it > > > > > > > > > > will > > > > > > > > > > > > > apply > > > > > > > > > > > > > > > only > > > > > > > > > > > > > > > > > for the SMTs outlined in the KIP or if the > > > > > > > "field.style" > > > > > > > > > and > > > > > > > > > > > > > possibly > > > > > > > > > > > > > > > > > "field.separator" properties would also be > > > passed > > > > > > into > > > > > > > > > custom > > > > > > > > > > > > SMTs > > > > > > > > > > > > > so > > > > > > > > > > > > > > > > that > > > > > > > > > > > > > > > > > they could choose to act on them if > > applicable, > > > > how > > > > > > > edge > > > > > > > > > > cases > > > > > > > > > > > > like > > > > > > > > > > > > > > > > having > > > > > > > > > > > > > > > > > an SMT named "field" in your connector > config > > > > would > > > > > > be > > > > > > > > > > handled, > > > > > > > > > > > > > etc. > > > > > > > > > > > > > > > > > Either way, it might help to have an > example > > in > > > > the > > > > > > KIP > > > > > > > > > > > outlining > > > > > > > > > > > > > how > > > > > > > > > > > > > > > one > > > > > > > > > > > > > > > > > of the to-be-augmented SMTs can be > configured > > > > with > > > > > > this > > > > > > > > new > > > > > > > > > > > > feature > > > > > > > > > > > > > > > and a > > > > > > > > > > > > > > > > > before/after of how a record value would be > > > > > > transformed > > > > > > > > > with > > > > > > > > > > > that > > > > > > > > > > > > > > > > > configuration. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 3. The docstring for the > > > "transforms.field.style" > > > > > > > > property > > > > > > > > > > > > mentions > > > > > > > > > > > > > > > that > > > > > > > > > > > > > > > > > the permitted values are "plain" and > > "nested", > > > > but > > > > > > then > > > > > > > > > > > describes > > > > > > > > > > > > > > > > behavior > > > > > > > > > > > > > > > > > with a value of "root". Should that be > > "plain" > > > > > > instead? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 4. The docstring for the > > > > > "transforms.field.separator" > > > > > > > > > > property > > > > > > > > > > > > > > > > exclusively > > > > > > > > > > > > > > > > > mentions structs, but the feature is > intended > > > to > > > > > work > > > > > > > > with > > > > > > > > > > maps > > > > > > > > > > > > as > > > > > > > > > > > > > > > well. > > > > > > > > > > > > > > > > > Can we update it to reflect this? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > References: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > [1] - https://stackoverflow.com/a/17808731 > > > > > > > > > > > > > > > > > [2] - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/kafka/blob/7243facb8d69a7252e6b9556b5eaee13e41bab7f/connect/api/src/main/java/org/apache/kafka/connect/transforms/Transformation.java#L30 > > > > > > > > > > > > > > > > > [3] - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/kafka/blob/7243facb8d69a7252e6b9556b5eaee13e41bab7f/clients/src/main/java/org/apache/kafka/common/Configurable.java#L26-L29 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Chris > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 28, 2022 at 1:32 PM Jorge > Esteban > > > > > > Quilcate > > > > > > > > > Otoya > > > > > > > > > > < > > > > > > > > > > > > > > > > > quilcate.jo...@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, Chris! > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 1. I'd argue "this..field.child" could be > > > > harder > > > > > to > > > > > > > > grasp > > > > > > > > > > > than > > > > > > > > > > > > > > > > > > "this.field/child" + separator: "/". > > > > > > > > > > > > > > > > > > Even though this represents additional > > > > > information, > > > > > > > it > > > > > > > > > > > follows > > > > > > > > > > > > a > > > > > > > > > > > > > > > > similar > > > > > > > > > > > > > > > > > > approach as the "Flatten#delimeter" > > > > > configuration. > > > > > > > > > > > > > > > > > > I want to give the separator approach > > another > > > > > try, > > > > > > > so I > > > > > > > > > > have > > > > > > > > > > > > > > updated > > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > > KIP with the separator proposal, sticking > > to > > > > > only 2 > > > > > > > > > > > > alternatives > > > > > > > > > > > > > > that > > > > > > > > > > > > > > > > > > should hopefully cover most scenarios. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 2. Agree. KIP has been updated with this > > > > > > improvement. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 3. You're right. I have updated this > > section > > > > > > > > accordingly. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 4. Good catch! I've replaced it with > > > > > `DropHeaders`. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Looking forward to your feedback. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > Jorge. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, 9 Mar 2022 at 21:33, Chris > Egerton > > < > > > > > > > > > > > > > > fearthecel...@gmail.com> > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi Jorge, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Looking good! Got a few more thoughts. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 1. Sorry to revisit this, but I think > we > > > may > > > > > want > > > > > > > to > > > > > > > > > > adopt > > > > > > > > > > > a > > > > > > > > > > > > > > > slightly > > > > > > > > > > > > > > > > > > > different escape syntax style. > > Backslashes > > > > are > > > > > > > great, > > > > > > > > > but > > > > > > > > > > > > since > > > > > > > > > > > > > > > > they're > > > > > > > > > > > > > > > > > > > already used by JSON, using them as an > > > escape > > > > > > > > sequence > > > > > > > > > in > > > > > > > > > > > > field > > > > > > > > > > > > > > > > > notation > > > > > > > > > > > > > > > > > > > would also lead to some pretty ugly > > > connector > > > > > > > > configs. > > > > > > > > > > > Anyone > > > > > > > > > > > > > > who's > > > > > > > > > > > > > > > > had > > > > > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > > > write regular expressions with > > backslashes > > > in > > > > > > Java > > > > > > > is > > > > > > > > > > > > probably > > > > > > > > > > > > > > > > already > > > > > > > > > > > > > > > > > > > familiar with this: > > > > > > > > > > > > > "this\\\\.is\\\\.not\\\\.very\\\\.readable". > > > > > > > > > > > > > > > What > > > > > > > > > > > > > > > > > do > > > > > > > > > > > > > > > > > > > you think about using the dot character > > to > > > > > escape > > > > > > > > > itself? > > > > > > > > > > > In > > > > > > > > > > > > > > other > > > > > > > > > > > > > > > > > words, > > > > > > > > > > > > > > > > > > > to access a single field named > > > "this.field", > > > > > > > instead > > > > > > > > of > > > > > > > > > > > using > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > syntax > > > > > > > > > > > > > > > > > > > "this\.field" (which in JSON would have > > to > > > be > > > > > > > > expressed > > > > > > > > > > as > > > > > > > > > > > > > > > > > > "this\\.field"), > > > > > > > > > > > > > > > > > > > we could use "this..field", and for a > > > single > > > > > > field > > > > > > > > > named > > > > > > > > > > > > > > > > "this\field", > > > > > > > > > > > > > > > > > > > instead of using the syntax > "this\\field" > > > > (or, > > > > > in > > > > > > > > JSON, > > > > > > > > > > > > > > > > > "this\\\\field"), > > > > > > > > > > > > > > > > > > > we could use "this\field" (or, in JSON, > > > > > > > > "this\\field"). > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 2. Could you flesh out the details on > the > > > new > > > > > > > > > > "field.style" > > > > > > > > > > > > > > > property, > > > > > > > > > > > > > > > > > > > including the type, default value, > > > > importance, > > > > > > and > > > > > > > a > > > > > > > > > > > > > preliminary > > > > > > > > > > > > > > > > > > docstring? > > > > > > > > > > > > > > > > > > > See > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-618%3A+Exactly-Once+Support+for+Source+Connectors#KIP618:ExactlyOnceSupportforSourceConnectors-Newproperties > > > > > > > > > > > > > > > > > > > for an example. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 3. Is the "Compatibility, Deprecation, > > and > > > > > > > Migration > > > > > > > > > > Plan" > > > > > > > > > > > > > > section > > > > > > > > > > > > > > > > > still > > > > > > > > > > > > > > > > > > > accurate after the latest update? Seems > > > like > > > > > it's > > > > > > > > still > > > > > > > > > > > > written > > > > > > > > > > > > > > > with > > > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > > > assumption that nested field syntax > will > > be > > > > > > > hardcoded > > > > > > > > > or > > > > > > > > > > > > > opt-in, > > > > > > > > > > > > > > > > which > > > > > > > > > > > > > > > > > > IIUC > > > > > > > > > > > > > > > > > > > isn't the case anymore. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 4. Nit: The "These SMTs do not require > > > nested > > > > > > > > structure > > > > > > > > > > > > > support" > > > > > > > > > > > > > > > > > section > > > > > > > > > > > > > > > > > > > mentions a "Drop" SMT. I think this may > > be > > > > > > > referring > > > > > > > > to > > > > > > > > > > the > > > > > > > > > > > > > > > Confluent > > > > > > > > > > > > > > > > > > Drop > > > > > > > > > > > > > > > > > > > SMT, which isn't a part of Apache > Kafka. > > > > Should > > > > > > we > > > > > > > > drop > > > > > > > > > > > (heh) > > > > > > > > > > > > > > that > > > > > > > > > > > > > > > > SMT > > > > > > > > > > > > > > > > > > from > > > > > > > > > > > > > > > > > > > the list? Or perhaps just replace it > with > > > > > > > > > "DropHeaders", > > > > > > > > > > > > which > > > > > > > > > > > > > is > > > > > > > > > > > > > > > > > > currently > > > > > > > > > > > > > > > > > > > missing from the list and shouldn't > > require > > > > any > > > > > > > > > > > nested-field > > > > > > > > > > > > > > > related > > > > > > > > > > > > > > > > > > > updates? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Chris > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Feb 28, 2022 at 2:12 PM Jorge > > > Esteban > > > > > > > > Quilcate > > > > > > > > > > > Otoya > > > > > > > > > > > > < > > > > > > > > > > > > > > > > > > > quilcate.jo...@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you, Chris! and sorry for the > > > delayed > > > > > > > > response. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Please, find my comments below: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 14 Feb 2022 at 17:34, Chris > > > Egerton > > > > > > > > > > > > > > > > > > <chr...@confluent.io.invalid > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi Jorge, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for the KIP! I'd love to see > > > > support > > > > > > for > > > > > > > > > > nested > > > > > > > > > > > > > fields > > > > > > > > > > > > > > > > added > > > > > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > > > > > out-of-the-box SMTs provided with > > > > Connect. > > > > > > Here > > > > > > > > are > > > > > > > > > > my > > > > > > > > > > > > > > initial > > > > > > > > > > > > > > > > > > > thoughts: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 1. I agree that there's a case to > be > > > made > > > > > for > > > > > > > > > > expanding > > > > > > > > > > > > > > > > HoistField > > > > > > > > > > > > > > > > > > > with a > > > > > > > > > > > > > > > > > > > > > new config property for > identifying a > > > > > nested, > > > > > > > > > > > > to-be-hoisted > > > > > > > > > > > > > > > > field, > > > > > > > > > > > > > > > > > > but > > > > > > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > > > > > example in the KIP doesn't really > > > > > demonstrate > > > > > > > why > > > > > > > > > > this > > > > > > > > > > > > > would > > > > > > > > > > > > > > be > > > > > > > > > > > > > > > > > > > > valuable. I > > > > > > > > > > > > > > > > > > > > > think it'd be helpful to expand the > > > > example > > > > > > to > > > > > > > > add > > > > > > > > > > > other > > > > > > > > > > > > > > fields > > > > > > > > > > > > > > > > in > > > > > > > > > > > > > > > > > > > order > > > > > > > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > > > > > show how adding nested field > support > > > > > enables > > > > > > > > users > > > > > > > > > to > > > > > > > > > > > > > hoist a > > > > > > > > > > > > > > > > > nested > > > > > > > > > > > > > > > > > > > > field > > > > > > > > > > > > > > > > > > > > > without dropping other fields from > > the > > > > > value. > > > > > > > > Maybe > > > > > > > > > > > > > something > > > > > > > > > > > > > > > > like > > > > > > > > > > > > > > > > > > > this: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > source = nested.val > > > > > > > > > > > > > > > > > > > > > field = line > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > value (before): > > > > > > > > > > > > > > > > > > > > > { > > > > > > > > > > > > > > > > > > > > > "nested": { > > > > > > > > > > > > > > > > > > > > > "val": 42, > > > > > > > > > > > > > > > > > > > > > "other val": 96 > > > > > > > > > > > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > value (after): > > > > > > > > > > > > > > > > > > > > > { > > > > > > > > > > > > > > > > > > > > > "nested": { > > > > > > > > > > > > > > > > > > > > > "line": { > > > > > > > > > > > > > > > > > > > > > "val": 42, > > > > > > > > > > > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > "other val": 96 > > > > > > > > > > > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 2. Nit: I think "source" is a > little > > > > > strange > > > > > > > for > > > > > > > > > the > > > > > > > > > > > new > > > > > > > > > > > > > > > > HoistField > > > > > > > > > > > > > > > > > > > > > property name. Maybe "hoisted" or > > > > > > > "hoisted.field" > > > > > > > > > > would > > > > > > > > > > > > be > > > > > > > > > > > > > > more > > > > > > > > > > > > > > > > > > > > > descriptive? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > About 1. and 2.: > > > > > > > > > > > > > > > > > > > > Agree. The example for this SMT is > > > updated > > > > > and > > > > > > > have > > > > > > > > > > added > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > `hoisted` > > > > > > > > > > > > > > > > > > > > configuration. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 3. Is there a reasonable use case > for > > > > > > expanding > > > > > > > > > > Flatten > > > > > > > > > > > > to > > > > > > > > > > > > > be > > > > > > > > > > > > > > > > able > > > > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > > > > > flatten specific fields? My > > > understanding > > > > > is > > > > > > > that > > > > > > > > > > it's > > > > > > > > > > > > > mostly > > > > > > > > > > > > > > > > > useful > > > > > > > > > > > > > > > > > > > for > > > > > > > > > > > > > > > > > > > > > writing to systems like databases > > that > > > > > don't > > > > > > > > > support > > > > > > > > > > > > nested > > > > > > > > > > > > > > > > values > > > > > > > > > > > > > > > > > > and > > > > > > > > > > > > > > > > > > > > > require everything to be a flat > list > > of > > > > > > > key-value > > > > > > > > > > > pairs. > > > > > > > > > > > > > > Being > > > > > > > > > > > > > > > > able > > > > > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > > > > > flatten a nested field wouldn't > > provide > > > > any > > > > > > > > > advantage > > > > > > > > > > > for > > > > > > > > > > > > > > that > > > > > > > > > > > > > > > > use > > > > > > > > > > > > > > > > > > > case. > > > > > > > > > > > > > > > > > > > > > Are there other cases where it > would? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 4. I don't think we should > > > > unconditionally > > > > > > > change > > > > > > > > > the > > > > > > > > > > > > > default > > > > > > > > > > > > > > > > > > delimiter > > > > > > > > > > > > > > > > > > > > for > > > > > > > > > > > > > > > > > > > > > Flatten. It's a > > backwards-incompatible, > > > > > > > breaking > > > > > > > > > > change > > > > > > > > > > > > > that > > > > > > > > > > > > > > > > could > > > > > > > > > > > > > > > > > > > cause > > > > > > > > > > > > > > > > > > > > > headaches for users. It might be > > > > reasonable > > > > > > to > > > > > > > > > change > > > > > > > > > > > the > > > > > > > > > > > > > > > default > > > > > > > > > > > > > > > > > > value > > > > > > > > > > > > > > > > > > > > > dynamically based on whether the > user > > > has > > > > > > > > > specified a > > > > > > > > > > > > value > > > > > > > > > > > > > > for > > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > > > > "field" > > > > > > > > > > > > > > > > > > > > > property, but considering the > > > motivation > > > > > for > > > > > > > > > changing > > > > > > > > > > > the > > > > > > > > > > > > > > > default > > > > > > > > > > > > > > > > > is > > > > > > > > > > > > > > > > > > > that > > > > > > > > > > > > > > > > > > > > > it creates conflicts with the > > > > > > to-be-introduced > > > > > > > > > nested > > > > > > > > > > > > field > > > > > > > > > > > > > > > > syntax > > > > > > > > > > > > > > > > > > > (which > > > > > > > > > > > > > > > > > > > > > could arise with downstream SMTs > > > > regardless > > > > > > of > > > > > > > > > > whether > > > > > > > > > > > > the > > > > > > > > > > > > > > user > > > > > > > > > > > > > > > > has > > > > > > > > > > > > > > > > > > > > > explicitly configured Flatten with > > the > > > > > > "field" > > > > > > > > > > > > property), I > > > > > > > > > > > > > > > don't > > > > > > > > > > > > > > > > > > know > > > > > > > > > > > > > > > > > > > > that > > > > > > > > > > > > > > > > > > > > > this would be too useful either. I > > have > > > > > some > > > > > > > > > thoughts > > > > > > > > > > > > below > > > > > > > > > > > > > > on > > > > > > > > > > > > > > > > how > > > > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > > > > > handle possible conflicts between > > names > > > > > with > > > > > > > dots > > > > > > > > > in > > > > > > > > > > > > their > > > > > > > > > > > > > > > names > > > > > > > > > > > > > > > > > and > > > > > > > > > > > > > > > > > > > > dotted > > > > > > > > > > > > > > > > > > > > > syntax for nested field references > > that > > > > > > should > > > > > > > > > > > hopefully > > > > > > > > > > > > > make > > > > > > > > > > > > > > > > > either > > > > > > > > > > > > > > > > > > > > change > > > > > > > > > > > > > > > > > > > > > unnecessary. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fair enough. With the support for > > nested > > > > > fields > > > > > > > in > > > > > > > > > > other > > > > > > > > > > > > > SMTs, > > > > > > > > > > > > > > > > > Flatten > > > > > > > > > > > > > > > > > > > > could stay as it is. > > > > > > > > > > > > > > > > > > > > This removes the need for (4) > changing > > > > > Flatten > > > > > > > > config > > > > > > > > > > as > > > > > > > > > > > > > well. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 5. I think it's fine to expand > > > > ExtractField > > > > > > to > > > > > > > > > > support > > > > > > > > > > > > > nested > > > > > > > > > > > > > > > > > > notation, > > > > > > > > > > > > > > > > > > > > but > > > > > > > > > > > > > > > > > > > > > it might be worth noting in the > > > rejected > > > > > > > > > alternatives > > > > > > > > > > > > > section > > > > > > > > > > > > > > > > that > > > > > > > > > > > > > > > > > > this > > > > > > > > > > > > > > > > > > > > > isn't strictly necessary since you > > can > > > > > > replace > > > > > > > > any > > > > > > > > > > > single > > > > > > > > > > > > > > > > > invocation > > > > > > > > > > > > > > > > > > of > > > > > > > > > > > > > > > > > > > > > that SMT that uses nested field > > > notation > > > > > with > > > > > > > > > > multiple > > > > > > > > > > > > > > > > invocations > > > > > > > > > > > > > > > > > of > > > > > > > > > > > > > > > > > > > it > > > > > > > > > > > > > > > > > > > > > that use non-nested notation. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Agree. Adding it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 6. Nit: "RegerRouter" should be > > > > > "RegexRouter" > > > > > > > in > > > > > > > > > the > > > > > > > > > > > list > > > > > > > > > > > > > of > > > > > > > > > > > > > > > SMTs > > > > > > > > > > > > > > > > > > that > > > > > > > > > > > > > > > > > > > do > > > > > > > > > > > > > > > > > > > > > not require nested structure > support. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Ack. Fixing it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 7. It may be rare for dots in field > > > names > > > > > to > > > > > > > > occur > > > > > > > > > in > > > > > > > > > > > the > > > > > > > > > > > > > > wild > > > > > > > > > > > > > > > > > > > (although > > > > > > > > > > > > > > > > > > > > I > > > > > > > > > > > > > > > > > > > > > wouldn't be so certain of this), > but > > > > unless > > > > > > we > > > > > > > > want > > > > > > > > > > to > > > > > > > > > > > > > > inflict > > > > > > > > > > > > > > > > > > > headaches > > > > > > > > > > > > > > > > > > > > on > > > > > > > > > > > > > > > > > > > > > users of Flatten, I think we're > going > > > to > > > > > have > > > > > > > to > > > > > > > > > > think > > > > > > > > > > > > > about > > > > > > > > > > > > > > > > > > conflicts > > > > > > > > > > > > > > > > > > > > > between dotted notation and > > non-nested > > > > > fields > > > > > > > > whose > > > > > > > > > > > names > > > > > > > > > > > > > > > contain > > > > > > > > > > > > > > > > > > > dots. I > > > > > > > > > > > > > > > > > > > > > don't think this is actually such a > > bad > > > > > > thing, > > > > > > > > > > though. > > > > > > > > > > > I > > > > > > > > > > > > > > agree > > > > > > > > > > > > > > > > that > > > > > > > > > > > > > > > > > > > > dotted > > > > > > > > > > > > > > > > > > > > > notation is intuitive and pretty > > > > > commonplace > > > > > > > (in > > > > > > > > > > tools > > > > > > > > > > > > like > > > > > > > > > > > > > > jq, > > > > > > > > > > > > > > > > for > > > > > > > > > > > > > > > > > > > > > example), so I'd like it if we > could > > > > stick > > > > > to > > > > > > > it. > > > > > > > > > > What > > > > > > > > > > > > > about > > > > > > > > > > > > > > > > > > > introducing > > > > > > > > > > > > > > > > > > > > > escape syntax, using a backslash? > > That > > > > way, > > > > > > > users > > > > > > > > > > could > > > > > > > > > > > > > > > > > disambiguate > > > > > > > > > > > > > > > > > > > > > between "this.field" (which would > > refer > > > > to > > > > > > the > > > > > > > > > nested > > > > > > > > > > > > field > > > > > > > > > > > > > > > > "field" > > > > > > > > > > > > > > > > > > > under > > > > > > > > > > > > > > > > > > > > > the top-level "this" field), and > > > > > > "this\.field" > > > > > > > > > (which > > > > > > > > > > > > would > > > > > > > > > > > > > > > refer > > > > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > > > > > field named "this.field"). Like > with > > > most > > > > > > > > languages > > > > > > > > > > > that > > > > > > > > > > > > > use > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > > > > backslash > > > > > > > > > > > > > > > > > > > > > for escape sequences, it could also > > be > > > > used > > > > > > to > > > > > > > > > escape > > > > > > > > > > > > > itself, > > > > > > > > > > > > > > > in > > > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > > > > event > > > > > > > > > > > > > > > > > > > > > that a field name contains a > > > backslash. I > > > > > > think > > > > > > > > > this > > > > > > > > > > is > > > > > > > > > > > > > more > > > > > > > > > > > > > > > > > > intuitive > > > > > > > > > > > > > > > > > > > > and > > > > > > > > > > > > > > > > > > > > > simpler than, e.g., adding a new > > config > > > > > > > property > > > > > > > > to > > > > > > > > > > > > toggle > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > > > delimiter > > > > > > > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > > > > > be used when parsing nested field > > > > > references. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I like this approach. Adding to the > > KIP. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 8. I don't think we can > > unconditionally > > > > > turn > > > > > > > this > > > > > > > > > > > feature > > > > > > > > > > > > > on. > > > > > > > > > > > > > > > The > > > > > > > > > > > > > > > > > > risk > > > > > > > > > > > > > > > > > > > of > > > > > > > > > > > > > > > > > > > > > breaking existing pipelines > > (especially > > > > > ones > > > > > > > that > > > > > > > > > > > > involve, > > > > > > > > > > > > > > for > > > > > > > > > > > > > > > > > > > example, a > > > > > > > > > > > > > > > > > > > > > combination of the Flatten and Cast > > > SMTs) > > > > > is > > > > > > > > pretty > > > > > > > > > > > > high. I > > > > > > > > > > > > > > > think > > > > > > > > > > > > > > > > > > this > > > > > > > > > > > > > > > > > > > > > should be an opt-in feature, at > least > > > > until > > > > > > the > > > > > > > > > next > > > > > > > > > > > > major > > > > > > > > > > > > > > > > release. > > > > > > > > > > > > > > > > > > One > > > > > > > > > > > > > > > > > > > > way > > > > > > > > > > > > > > > > > > > > > we could accomplish this is by > > > > introducing > > > > > a > > > > > > > new > > > > > > > > > > > > > > "field.style" > > > > > > > > > > > > > > > > > (name > > > > > > > > > > > > > > > > > > > > > obviously subject to change) > property > > > > with > > > > > > > values > > > > > > > > > of > > > > > > > > > > > > > "plain" > > > > > > > > > > > > > > > > > > (default) > > > > > > > > > > > > > > > > > > > > and > > > > > > > > > > > > > > > > > > > > > "nested". If set to "plain" then > the > > > > > current > > > > > > > > > > non-nested > > > > > > > > > > > > > > > behavior > > > > > > > > > > > > > > > > is > > > > > > > > > > > > > > > > > > > used, > > > > > > > > > > > > > > > > > > > > > and if set to "nested", then the > > > proposed > > > > > > > nested > > > > > > > > > > > behavior > > > > > > > > > > > > > is. > > > > > > > > > > > > > > > We > > > > > > > > > > > > > > > > > can > > > > > > > > > > > > > > > > > > > > > consider updating the default value > > to > > > > > > "nested" > > > > > > > > in > > > > > > > > > a > > > > > > > > > > > > future > > > > > > > > > > > > > > > major > > > > > > > > > > > > > > > > > > > release > > > > > > > > > > > > > > > > > > > > > (or even codify that plan in this > > KIP, > > > if > > > > > > > there's > > > > > > > > > > > enough > > > > > > > > > > > > > > > support > > > > > > > > > > > > > > > > > for > > > > > > > > > > > > > > > > > > > it). > > > > > > > > > > > > > > > > > > > > > This would also leave the door open > > for > > > > > > adding > > > > > > > > > > > recursive > > > > > > > > > > > > > > > support > > > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > > > SMTs > > > > > > > > > > > > > > > > > > > > in > > > > > > > > > > > > > > > > > > > > > the future by adding a permitted > > value > > > of > > > > > > > > > > "recursive". > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 9. One of the linked tickets in the > > > > > > > "Motivation" > > > > > > > > > > > section, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://issues.apache.org/jira/browse/KAFKA-10640, > > > > > > > > > > has > > > > > > > > > > > > an > > > > > > > > > > > > > > open > > > > > > > > > > > > > > > > PR > > > > > > > > > > > > > > > > > > and > > > > > > > > > > > > > > > > > > > > KIP > > > > > > > > > > > > > > > > > > > > > that propose adding recursive > support > > > to > > > > > some > > > > > > > > SMTs. > > > > > > > > > > > Have > > > > > > > > > > > > > you > > > > > > > > > > > > > > > > > > considered > > > > > > > > > > > > > > > > > > > > > this type of functionality for your > > > KIP? > > > > Or > > > > > > is > > > > > > > > your > > > > > > > > > > aim > > > > > > > > > > > > to > > > > > > > > > > > > > > > stick > > > > > > > > > > > > > > > > > > solely > > > > > > > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > > > > > nested field support? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I like the `field.style` > configuration > > > flag > > > > > > > > approach. > > > > > > > > > > > > > > > > > > > > Thanks for pointing out the > `recursive` > > > > > > approach. > > > > > > > > > Will > > > > > > > > > > > add > > > > > > > > > > > > > > > `nested` > > > > > > > > > > > > > > > > > at > > > > > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > > > > moment, let's check the demand for > > > > > `recursive` > > > > > > to > > > > > > > > > > > consider > > > > > > > > > > > > it > > > > > > > > > > > > > > as > > > > > > > > > > > > > > > > part > > > > > > > > > > > > > > > > > > of > > > > > > > > > > > > > > > > > > > > this or another KIP. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I have added the following on the > KIP: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ``` > > > > > > > > > > > > > > > > > > > > Future KIPs could extend this support > > > for: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Recursive notation: name a field > and > > > > apply > > > > > it > > > > > > > to > > > > > > > > > all > > > > > > > > > > > > fields > > > > > > > > > > > > > > > > across > > > > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > > > > schema matching that name. > > > > > > > > > > > > > > > > > > > > - Access to arrays: Adding `[]` > > notation > > > to > > > > > > > > represent > > > > > > > > > > > > access > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > arrays > > > > > > > > > > > > > > > > > > > and > > > > > > > > > > > > > > > > > > > > applying SMTs to fields within an > > array. > > > > > > > > > > > > > > > > > > > > ``` > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Chris > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Feb 8, 2022 at 1:23 PM > Jorge > > > > > Esteban > > > > > > > > > Quilcate > > > > > > > > > > > > > Otoya < > > > > > > > > > > > > > > > > > > > > > quilcate.jo...@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi Dev team, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I'd like to start a new > discussion > > > > thread > > > > > > on > > > > > > > > > Kafka > > > > > > > > > > > > > Connect > > > > > > > > > > > > > > > > > KIP-821: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-821%3A+Connect+Transforms+support+for+nested+structures > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > This KIP is aimed to include > > support > > > > for > > > > > > > nested > > > > > > > > > > > > > structures > > > > > > > > > > > > > > on > > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > > > > > existing > > > > > > > > > > > > > > > > > > > > > > SMTs — where this make sense. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Looking forward to your feedback. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > > > > > > > > > > > > > > Jorge. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >