Thank you, Chris! Not possible without your feedback. On Tue, 19 Apr 2022 at 23:04, Chris Egerton <fearthecel...@gmail.com> wrote:
> Hi Jorge, > > Thank you for sticking through this. I have one small remark and one small > clarification; assuming you agree with me on them then I'm ready to vote on > the KIP. > > 1. InsertField: The "field.on.missing.parent" and "field.on.existing.field" > docs both mention a permitted value of "ingore"; this should be "ignore", > right? > Of course, one more typo :) > 2. InsertField: The examples are still missing the "field.style" property > from the configurations. They should all include the property > "transforms.smt1.field.style": "nested", correct? > Yes, it is there. I think I know what you mean now, seems that Confluence is putting everything in one line when it's in separate lines in the editor. Hopefully, it's fixed now. > > Thanks again for working through this, and congratulations on a > well-written KIP! > > Cheers, > > Chris > > On Tue, Apr 19, 2022 at 2:06 PM Jorge Esteban Quilcate Otoya < > quilcate.jo...@gmail.com> wrote: > > > Thank you, Chris! I apply these improvements to the KIP, let me know how > it > > looks now. > > > > On Mon, 11 Apr 2022 at 23:43, Chris Egerton <fearthecel...@gmail.com> > > wrote: > > > > > Hi Jorge, > > > > > > Wow, those examples are great! A few more remarks, but I think we're > > > getting close: > > > > > > 1. The examples differ across SMTs with the name of the > newly-introduced > > > style property; some of them use "field.style", and some use > > > "fields.style". I think for consistency's sake we should stick with > just > > > "field.style"; otherwise it could be painful for users to have to > > remember > > > which to use. > > > > > > > Great catch. Agree, I fixed the config names to `field.style`. > > > > > > > > > > 2. Some of the examples are off: > > > - TimestampConverter: the input in the second example ("when field > names > > > include dots") doesn't contain a field with a dotted name > > > - ValueToKey: the config in the third example ("when field names > include > > > dots") should probably use "parent..child.k2" as the > > > "transforms.smt1.fields" property > > > > > > > Fixed. Thanks! > > > > > > > > > > 3. RE changes to InsertField: > > > - The InsertField SMT should also come with the new "field.style" > > property > > > in order to preserve backwards compatibility, right? I don't see it > > > included in the example configs for that one, just want to make sure > > > - I don't know of any cases where we use snake_case for property names > in > > > Kafka; we should probably use "on.missing.parent" and > "on.existing.field" > > > as the new property names for InsertField. > > > - Why is the "on_existing_field" (or "on.existing.field") property only > > > applied when the field style is nested? Couldn't it be useful for > > > non-nested fields as well? > > > > > > > Great points! I have applied these suggestions to the KIP. > > > > > > > > > > Cheers, > > > > > > Chris > > > > > > On Sat, Apr 9, 2022 at 12:40 PM Jorge Esteban Quilcate Otoya < > > > quilcate.jo...@gmail.com> wrote: > > > > > > > Again, great feedback Chris. Much appreciated. > > > > Added my comments below: > > > > > > > > On Tue, 5 Apr 2022 at 20:22, Chris Egerton <fearthecel...@gmail.com> > > > > wrote: > > > > > > > > > Hi Jorge, > > > > > > > > > > Looking good! I have a few comments left but all but one or two are > > > > minor. > > > > > > > > > > 1. The motivation section states "This KIP is aimed to include > > support > > > > for > > > > > nested structures on the existing SMTs... and to include the > > > abstractions > > > > > to reuse this in future SMTs". A good implementation of this KIP > will > > > > > definitely isolate reusable logic into a separate abstraction that > > can > > > be > > > > > easily pulled in to the SMTs we want to add nested field support > to, > > > but > > > > > unless we plan on making this kind of abstraction publicly > available > > as > > > > > some kind of utility method or class that external SMT developers > can > > > > > leverage, we can probably leave this part out as it's more of an > > > > > implementation detail. > > > > > > > > > > > > > Make sense, will leave this out of the KIP. > > > > > > > > > > > > > > > > > > 2. The Cast example is a little misleading, isn't it? It > demonstrates > > > the > > > > > escape syntax for fields with dot literals in their names, but it > > > doesn't > > > > > demonstrate a way to actually use the Cast (or any other) SMT to > > > access a > > > > > nested field in a record, which is the whole point of the KIP. I > like > > > the > > > > > example of escape syntax but we should probably also add one for > > nested > > > > > field access. > > > > > > > > > > > > > Agree. I have added examples to each SMT to be more clear about how > it > > > > affects each function > > > > . > > > > > > > > > > > > > > > > > > 3. With the InsertField SMT, I'm wondering what the specific > behavior > > > > will > > > > > be when some or all of the "middle layer" nested fields are > missing. > > > For > > > > > example, if I have a record with a value of { "k1": "v1 } and I > apply > > > > > InsertField with topic.field = n1.n2.n3.topic, what will happen? > Will > > > the > > > > > updated value become { "k1": "v1", "n1": { "n2": { "n3": "topic" > }}}, > > > > will > > > > > an exception be thrown, or something else? This seems analogous to > > the > > > > > command line mkdir command, which (at least on some operating > > systems) > > > > > fails by default if you try to create a new nested directory where > > > > anything > > > > > but the last element in the path doesn't exist, but can be invoked > > > with a > > > > > flag that instructs it to go ahead and create all levels of nested > > > > > directory instead. I'm leaning on the side of "just create > > everything" > > > > but > > > > > would be interested in your thoughts, and either way, we should > > > probably > > > > > make sure the intended behavior is well-defined before voting. > > > > > > > > > > > > > This is an interesting case, thanks for catching this! > > > > The default behavior I see is to create parents if they are missing > and > > > > overwrite fields > > > > if they already exist. > > > > I'm planning to include the following two flags if there is a need to > > > > overwrite this behavior: > > > > - `on_missing_parent` = (CREATE|IGNORE), default=CREATE > > > > - `on_existing_field` = (OVERWRITE|IGNORE), default=OVERWRITE > > > > > > > > > > > > > > > > > > 4. Similarly, what will the behavior be if any of the field > elements > > > > > specified with InsertField already exist in the record value? Will > we > > > > just > > > > > overwrite them? What's the behavior of InsertField today under > > similar > > > > > circumstances? > > > > > > > > > > > > > The current behavior is to overwrite the value. > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > Chris > > > > > > > > > > On Thu, Mar 31, 2022 at 4:15 PM Jorge Esteban Quilcate Otoya < > > > > > quilcate.jo...@gmail.com> wrote: > > > > > > > > > > > Thanks, Chris! Much appreciated all the feedback here. > > > > > > > > > > > > 1. You nailed it setting the design goal here: "it shouldn't be > > > > > impossible > > > > > > to use this new feature for any field name, no matter how > > convoluted. > > > > > It's > > > > > > fine if edge cases introduce difficulty (such as less-readable > > > > > > configurations), but it's not fine if they can't be addressed at > > > all." > > > > > > Back to the previous proposals (using only dots as separators) we > > > have > > > > 2 > > > > > > alternatives: > > > > > > 1. escaping with backslashes > > > > > > 2. escaping with dots itself > > > > > > > > > > > > I'll lean for alternative 2, as you proposed before. Feels to me > > that > > > > > > backslashes have more potential to lead to confusion in JSON > > configs, > > > > and > > > > > > CSV seems like a good precedent to use the same character to > escape > > > > > itself. > > > > > > KIP is updated to reflect this. > > > > > > > > > > > > 2. Thanks! I'll add an example, and stick with the current > approach > > > > > > defining the style per individual transform configuration. > > > > > > > > > > > > 3. Yes, thanks! KIP updated. > > > > > > > > > > > > 4. Of course. KIP updated. > > > > > > > > > > > > On Mon, 28 Mar 2022 at 21:59, Chris Egerton < > > fearthecel...@gmail.com > > > > > > > > > > wrote: > > > > > > > > > > > > > Hi Jorge, > > > > > > > > > > > > > > Thanks for addressing my comments; the KIP looks up-to-date and > > > > pretty > > > > > > > readable now, and the rejected alternatives section does a > great > > > job > > > > of > > > > > > > outlining the discussion so far and providing context for > anyone > > > else > > > > > who > > > > > > > might want to join in. > > > > > > > > > > > > > > 1. Thoughts on choice of delimiter: > > > > > > > - I like the optimization for simple cases, but I think the new > > > > > proposal > > > > > > is > > > > > > > a little too restrictive. What if there's a field whose name > > > contains > > > > > all > > > > > > > of the permitted options (currently just ".", ",", and "/")? > > > > > > > - If we expand the set of permitted delimiters to allow for any > > > > > > > single-character string, configuration complexity will increase > > and > > > > > > > readability may decrease > > > > > > > - Also worth pointing out that there is some convention for > > > doubling > > > > a > > > > > > > delimiter character as an escape mechanism with formats like > CSV > > > [1] > > > > > > > - Overall I think we may be approaching the saturation point > for > > > > > > productive > > > > > > > discussion on delimiter syntax so I don't want to spend too > much > > > more > > > > > of > > > > > > > your time on it. I think the one point I'd like to leave for > now > > is > > > > > that > > > > > > it > > > > > > > shouldn't be impossible to use this new feature for any field > > name, > > > > no > > > > > > > matter how convoluted. It's fine if edge cases introduce > > difficulty > > > > > (such > > > > > > > as less-readable configurations), but it's not fine if they > can't > > > be > > > > > > > addressed at all. > > > > > > > > > > > > > > 2. > > > > > > > The configuration style where you define > "transforms.field.style" > > > in > > > > > the > > > > > > > connector config, and then this applies to all SMTs for the > > > > connector, > > > > > is > > > > > > > very interesting. However, it doesn't follow convention for > > > existing > > > > > > SMTs. > > > > > > > Right now, if you want to configure an SMT, you define its name > > in > > > > the > > > > > > > connector config (for example, "transforms": "smt1"), and then > > > define > > > > > all > > > > > > > of the properties for that SMT in the connector config using a > > > > > > namespacing > > > > > > > mechanism specific to that SMT (for example, > > > "transforms.smt1.prop1": > > > > > > > "val1"). That SMT then sees only the properties defined in that > > > > > > namespace, > > > > > > > with the prefix stripped (for example, "prop1": "val1") in its > > > > > configure > > > > > > > [2] [3] method. > > > > > > > If we want to continue to follow this convention, then instead > of > > > > > > > specifying "transforms.field.style" in a connector config, we > > would > > > > > > expect > > > > > > > users to configure "transforms.<name>.field.style", for each > SMT > > > that > > > > > > they > > > > > > > want to configure a field style for. This would require more > work > > > on > > > > > the > > > > > > > part of the user, but would be simpler to reason about and > easier > > > to > > > > > > > implement. > > > > > > > If we want to explore an alternative where users can specify > > global > > > > > > > properties that apply to all transforms in a connector config, > > then > > > > the > > > > > > > semantics for this need to be defined in the KIP. This would > have > > > to > > > > > > > include whether this will apply only for the special case of > the > > > > > > > "field.style" and possibly "field.separator" properties or if > it > > > > would > > > > > be > > > > > > > available more generally for other properties, whether it will > > > apply > > > > > only > > > > > > > for the SMTs outlined in the KIP or if the "field.style" and > > > possibly > > > > > > > "field.separator" properties would also be passed into custom > > SMTs > > > so > > > > > > that > > > > > > > they could choose to act on them if applicable, how edge cases > > like > > > > > > having > > > > > > > an SMT named "field" in your connector config would be handled, > > > etc. > > > > > > > Either way, it might help to have an example in the KIP > outlining > > > how > > > > > one > > > > > > > of the to-be-augmented SMTs can be configured with this new > > feature > > > > > and a > > > > > > > before/after of how a record value would be transformed with > that > > > > > > > configuration. > > > > > > > > > > > > > > 3. The docstring for the "transforms.field.style" property > > mentions > > > > > that > > > > > > > the permitted values are "plain" and "nested", but then > describes > > > > > > behavior > > > > > > > with a value of "root". Should that be "plain" instead? > > > > > > > > > > > > > > 4. The docstring for the "transforms.field.separator" property > > > > > > exclusively > > > > > > > mentions structs, but the feature is intended to work with maps > > as > > > > > well. > > > > > > > Can we update it to reflect this? > > > > > > > > > > > > > > References: > > > > > > > > > > > > > > [1] - https://stackoverflow.com/a/17808731 > > > > > > > [2] - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/kafka/blob/7243facb8d69a7252e6b9556b5eaee13e41bab7f/connect/api/src/main/java/org/apache/kafka/connect/transforms/Transformation.java#L30 > > > > > > > [3] - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/kafka/blob/7243facb8d69a7252e6b9556b5eaee13e41bab7f/clients/src/main/java/org/apache/kafka/common/Configurable.java#L26-L29 > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > Chris > > > > > > > > > > > > > > On Mon, Mar 28, 2022 at 1:32 PM Jorge Esteban Quilcate Otoya < > > > > > > > quilcate.jo...@gmail.com> wrote: > > > > > > > > > > > > > > > Thanks, Chris! > > > > > > > > > > > > > > > > 1. I'd argue "this..field.child" could be harder to grasp > than > > > > > > > > "this.field/child" + separator: "/". > > > > > > > > Even though this represents additional information, it > follows > > a > > > > > > similar > > > > > > > > approach as the "Flatten#delimeter" configuration. > > > > > > > > I want to give the separator approach another try, so I have > > > > updated > > > > > > the > > > > > > > > KIP with the separator proposal, sticking to only 2 > > alternatives > > > > that > > > > > > > > should hopefully cover most scenarios. > > > > > > > > > > > > > > > > 2. Agree. KIP has been updated with this improvement. > > > > > > > > > > > > > > > > 3. You're right. I have updated this section accordingly. > > > > > > > > > > > > > > > > 4. Good catch! I've replaced it with `DropHeaders`. > > > > > > > > > > > > > > > > Looking forward to your feedback. > > > > > > > > > > > > > > > > Thanks, > > > > > > > > Jorge. > > > > > > > > > > > > > > > > On Wed, 9 Mar 2022 at 21:33, Chris Egerton < > > > > fearthecel...@gmail.com> > > > > > > > > wrote: > > > > > > > > > > > > > > > > > Hi Jorge, > > > > > > > > > > > > > > > > > > Looking good! Got a few more thoughts. > > > > > > > > > > > > > > > > > > 1. Sorry to revisit this, but I think we may want to adopt > a > > > > > slightly > > > > > > > > > different escape syntax style. Backslashes are great, but > > since > > > > > > they're > > > > > > > > > already used by JSON, using them as an escape sequence in > > field > > > > > > > notation > > > > > > > > > would also lead to some pretty ugly connector configs. > Anyone > > > > who's > > > > > > had > > > > > > > > to > > > > > > > > > write regular expressions with backslashes in Java is > > probably > > > > > > already > > > > > > > > > familiar with this: > > > "this\\\\.is\\\\.not\\\\.very\\\\.readable". > > > > > What > > > > > > > do > > > > > > > > > you think about using the dot character to escape itself? > In > > > > other > > > > > > > words, > > > > > > > > > to access a single field named "this.field", instead of > using > > > the > > > > > > > syntax > > > > > > > > > "this\.field" (which in JSON would have to be expressed as > > > > > > > > "this\\.field"), > > > > > > > > > we could use "this..field", and for a single field named > > > > > > "this\field", > > > > > > > > > instead of using the syntax "this\\field" (or, in JSON, > > > > > > > "this\\\\field"), > > > > > > > > > we could use "this\field" (or, in JSON, "this\\field"). > > > > > > > > > > > > > > > > > > 2. Could you flesh out the details on the new "field.style" > > > > > property, > > > > > > > > > including the type, default value, importance, and a > > > preliminary > > > > > > > > docstring? > > > > > > > > > See > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-618%3A+Exactly-Once+Support+for+Source+Connectors#KIP618:ExactlyOnceSupportforSourceConnectors-Newproperties > > > > > > > > > for an example. > > > > > > > > > > > > > > > > > > 3. Is the "Compatibility, Deprecation, and Migration Plan" > > > > section > > > > > > > still > > > > > > > > > accurate after the latest update? Seems like it's still > > written > > > > > with > > > > > > > the > > > > > > > > > assumption that nested field syntax will be hardcoded or > > > opt-in, > > > > > > which > > > > > > > > IIUC > > > > > > > > > isn't the case anymore. > > > > > > > > > > > > > > > > > > 4. Nit: The "These SMTs do not require nested structure > > > support" > > > > > > > section > > > > > > > > > mentions a "Drop" SMT. I think this may be referring to the > > > > > Confluent > > > > > > > > Drop > > > > > > > > > SMT, which isn't a part of Apache Kafka. Should we drop > (heh) > > > > that > > > > > > SMT > > > > > > > > from > > > > > > > > > the list? Or perhaps just replace it with "DropHeaders", > > which > > > is > > > > > > > > currently > > > > > > > > > missing from the list and shouldn't require any > nested-field > > > > > related > > > > > > > > > updates? > > > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > > > > > Chris > > > > > > > > > > > > > > > > > > On Mon, Feb 28, 2022 at 2:12 PM Jorge Esteban Quilcate > Otoya > > < > > > > > > > > > quilcate.jo...@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > Thank you, Chris! and sorry for the delayed response. > > > > > > > > > > > > > > > > > > > > Please, find my comments below: > > > > > > > > > > > > > > > > > > > > On Mon, 14 Feb 2022 at 17:34, Chris Egerton > > > > > > > > <chr...@confluent.io.invalid > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > Hi Jorge, > > > > > > > > > > > > > > > > > > > > > > Thanks for the KIP! I'd love to see support for nested > > > fields > > > > > > added > > > > > > > > to > > > > > > > > > > the > > > > > > > > > > > out-of-the-box SMTs provided with Connect. Here are my > > > > initial > > > > > > > > > thoughts: > > > > > > > > > > > > > > > > > > > > > > 1. I agree that there's a case to be made for expanding > > > > > > HoistField > > > > > > > > > with a > > > > > > > > > > > new config property for identifying a nested, > > to-be-hoisted > > > > > > field, > > > > > > > > but > > > > > > > > > > the > > > > > > > > > > > example in the KIP doesn't really demonstrate why this > > > would > > > > be > > > > > > > > > > valuable. I > > > > > > > > > > > think it'd be helpful to expand the example to add > other > > > > fields > > > > > > in > > > > > > > > > order > > > > > > > > > > to > > > > > > > > > > > show how adding nested field support enables users to > > > hoist a > > > > > > > nested > > > > > > > > > > field > > > > > > > > > > > without dropping other fields from the value. Maybe > > > something > > > > > > like > > > > > > > > > this: > > > > > > > > > > > > > > > > > > > > > > source = nested.val > > > > > > > > > > > field = line > > > > > > > > > > > > > > > > > > > > > > value (before): > > > > > > > > > > > { > > > > > > > > > > > "nested": { > > > > > > > > > > > "val": 42, > > > > > > > > > > > "other val": 96 > > > > > > > > > > > } > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > value (after): > > > > > > > > > > > { > > > > > > > > > > > "nested": { > > > > > > > > > > > "line": { > > > > > > > > > > > "val": 42, > > > > > > > > > > > } > > > > > > > > > > > "other val": 96 > > > > > > > > > > > } > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > 2. Nit: I think "source" is a little strange for the > new > > > > > > HoistField > > > > > > > > > > > property name. Maybe "hoisted" or "hoisted.field" would > > be > > > > more > > > > > > > > > > > descriptive? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > About 1. and 2.: > > > > > > > > > > Agree. The example for this SMT is updated and have added > > the > > > > > > > `hoisted` > > > > > > > > > > configuration. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 3. Is there a reasonable use case for expanding Flatten > > to > > > be > > > > > > able > > > > > > > to > > > > > > > > > > > flatten specific fields? My understanding is that it's > > > mostly > > > > > > > useful > > > > > > > > > for > > > > > > > > > > > writing to systems like databases that don't support > > nested > > > > > > values > > > > > > > > and > > > > > > > > > > > require everything to be a flat list of key-value > pairs. > > > > Being > > > > > > able > > > > > > > > to > > > > > > > > > > > flatten a nested field wouldn't provide any advantage > for > > > > that > > > > > > use > > > > > > > > > case. > > > > > > > > > > > Are there other cases where it would? > > > > > > > > > > > > > > > > > > > > > > 4. I don't think we should unconditionally change the > > > default > > > > > > > > delimiter > > > > > > > > > > for > > > > > > > > > > > Flatten. It's a backwards-incompatible, breaking change > > > that > > > > > > could > > > > > > > > > cause > > > > > > > > > > > headaches for users. It might be reasonable to change > the > > > > > default > > > > > > > > value > > > > > > > > > > > dynamically based on whether the user has specified a > > value > > > > for > > > > > > the > > > > > > > > > > "field" > > > > > > > > > > > property, but considering the motivation for changing > the > > > > > default > > > > > > > is > > > > > > > > > that > > > > > > > > > > > it creates conflicts with the to-be-introduced nested > > field > > > > > > syntax > > > > > > > > > (which > > > > > > > > > > > could arise with downstream SMTs regardless of whether > > the > > > > user > > > > > > has > > > > > > > > > > > explicitly configured Flatten with the "field" > > property), I > > > > > don't > > > > > > > > know > > > > > > > > > > that > > > > > > > > > > > this would be too useful either. I have some thoughts > > below > > > > on > > > > > > how > > > > > > > to > > > > > > > > > > > handle possible conflicts between names with dots in > > their > > > > > names > > > > > > > and > > > > > > > > > > dotted > > > > > > > > > > > syntax for nested field references that should > hopefully > > > make > > > > > > > either > > > > > > > > > > change > > > > > > > > > > > unnecessary. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fair enough. With the support for nested fields in other > > > SMTs, > > > > > > > Flatten > > > > > > > > > > could stay as it is. > > > > > > > > > > This removes the need for (4) changing Flatten config as > > > well. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 5. I think it's fine to expand ExtractField to support > > > nested > > > > > > > > notation, > > > > > > > > > > but > > > > > > > > > > > it might be worth noting in the rejected alternatives > > > section > > > > > > that > > > > > > > > this > > > > > > > > > > > isn't strictly necessary since you can replace any > single > > > > > > > invocation > > > > > > > > of > > > > > > > > > > > that SMT that uses nested field notation with multiple > > > > > > invocations > > > > > > > of > > > > > > > > > it > > > > > > > > > > > that use non-nested notation. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Agree. Adding it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 6. Nit: "RegerRouter" should be "RegexRouter" in the > list > > > of > > > > > SMTs > > > > > > > > that > > > > > > > > > do > > > > > > > > > > > not require nested structure support. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Ack. Fixing it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 7. It may be rare for dots in field names to occur in > the > > > > wild > > > > > > > > > (although > > > > > > > > > > I > > > > > > > > > > > wouldn't be so certain of this), but unless we want to > > > > inflict > > > > > > > > > headaches > > > > > > > > > > on > > > > > > > > > > > users of Flatten, I think we're going to have to think > > > about > > > > > > > > conflicts > > > > > > > > > > > between dotted notation and non-nested fields whose > names > > > > > contain > > > > > > > > > dots. I > > > > > > > > > > > don't think this is actually such a bad thing, though. > I > > > > agree > > > > > > that > > > > > > > > > > dotted > > > > > > > > > > > notation is intuitive and pretty commonplace (in tools > > like > > > > jq, > > > > > > for > > > > > > > > > > > example), so I'd like it if we could stick to it. What > > > about > > > > > > > > > introducing > > > > > > > > > > > escape syntax, using a backslash? That way, users could > > > > > > > disambiguate > > > > > > > > > > > between "this.field" (which would refer to the nested > > field > > > > > > "field" > > > > > > > > > under > > > > > > > > > > > the top-level "this" field), and "this\.field" (which > > would > > > > > refer > > > > > > > to > > > > > > > > > the > > > > > > > > > > > field named "this.field"). Like with most languages > that > > > use > > > > > the > > > > > > > > > > backslash > > > > > > > > > > > for escape sequences, it could also be used to escape > > > itself, > > > > > in > > > > > > > the > > > > > > > > > > event > > > > > > > > > > > that a field name contains a backslash. I think this is > > > more > > > > > > > > intuitive > > > > > > > > > > and > > > > > > > > > > > simpler than, e.g., adding a new config property to > > toggle > > > > the > > > > > > > > > delimiter > > > > > > > > > > to > > > > > > > > > > > be used when parsing nested field references. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I like this approach. Adding to the KIP. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 8. I don't think we can unconditionally turn this > feature > > > on. > > > > > The > > > > > > > > risk > > > > > > > > > of > > > > > > > > > > > breaking existing pipelines (especially ones that > > involve, > > > > for > > > > > > > > > example, a > > > > > > > > > > > combination of the Flatten and Cast SMTs) is pretty > > high. I > > > > > think > > > > > > > > this > > > > > > > > > > > should be an opt-in feature, at least until the next > > major > > > > > > release. > > > > > > > > One > > > > > > > > > > way > > > > > > > > > > > we could accomplish this is by introducing a new > > > > "field.style" > > > > > > > (name > > > > > > > > > > > obviously subject to change) property with values of > > > "plain" > > > > > > > > (default) > > > > > > > > > > and > > > > > > > > > > > "nested". If set to "plain" then the current non-nested > > > > > behavior > > > > > > is > > > > > > > > > used, > > > > > > > > > > > and if set to "nested", then the proposed nested > behavior > > > is. > > > > > We > > > > > > > can > > > > > > > > > > > consider updating the default value to "nested" in a > > future > > > > > major > > > > > > > > > release > > > > > > > > > > > (or even codify that plan in this KIP, if there's > enough > > > > > support > > > > > > > for > > > > > > > > > it). > > > > > > > > > > > This would also leave the door open for adding > recursive > > > > > support > > > > > > to > > > > > > > > > SMTs > > > > > > > > > > in > > > > > > > > > > > the future by adding a permitted value of "recursive". > > > > > > > > > > > > > > > > > > > > > > 9. One of the linked tickets in the "Motivation" > section, > > > > > > > > > > > https://issues.apache.org/jira/browse/KAFKA-10640, has > > an > > > > open > > > > > > PR > > > > > > > > and > > > > > > > > > > KIP > > > > > > > > > > > that propose adding recursive support to some SMTs. > Have > > > you > > > > > > > > considered > > > > > > > > > > > this type of functionality for your KIP? Or is your aim > > to > > > > > stick > > > > > > > > solely > > > > > > > > > > to > > > > > > > > > > > nested field support? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I like the `field.style` configuration flag approach. > > > > > > > > > > Thanks for pointing out the `recursive` approach. Will > add > > > > > `nested` > > > > > > > at > > > > > > > > > the > > > > > > > > > > moment, let's check the demand for `recursive` to > consider > > it > > > > as > > > > > > part > > > > > > > > of > > > > > > > > > > this or another KIP. > > > > > > > > > > > > > > > > > > > > I have added the following on the KIP: > > > > > > > > > > > > > > > > > > > > ``` > > > > > > > > > > Future KIPs could extend this support for: > > > > > > > > > > > > > > > > > > > > - Recursive notation: name a field and apply it to all > > fields > > > > > > across > > > > > > > > the > > > > > > > > > > schema matching that name. > > > > > > > > > > - Access to arrays: Adding `[]` notation to represent > > access > > > to > > > > > > > arrays > > > > > > > > > and > > > > > > > > > > applying SMTs to fields within an array. > > > > > > > > > > ``` > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > > > > > > > > > Chris > > > > > > > > > > > > > > > > > > > > > > On Tue, Feb 8, 2022 at 1:23 PM Jorge Esteban Quilcate > > > Otoya < > > > > > > > > > > > quilcate.jo...@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > Hi Dev team, > > > > > > > > > > > > > > > > > > > > > > > > I'd like to start a new discussion thread on Kafka > > > Connect > > > > > > > KIP-821: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-821%3A+Connect+Transforms+support+for+nested+structures > > > > > > > > > > > > > > > > > > > > > > > > This KIP is aimed to include support for nested > > > structures > > > > on > > > > > > the > > > > > > > > > > > existing > > > > > > > > > > > > SMTs — where this make sense. > > > > > > > > > > > > > > > > > > > > > > > > Looking forward to your feedback. > > > > > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > > > > Jorge. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >