Thanks Joshua for this KIP, it looks like a useful addition.

I have a few questions:

1. In Public Interfaces and Proposed Changes, the KIP mentions the
current "field" setting will be renamed to "fields". Do you instead
mean a new setting "fields" will be added and the existing setting
will be marked deprecated?

2. What happens if a user specifies both "field" and "fields"? Is one
setting taking precedence? Is it a bad configuration that will be
rejected? Same for the format and format.input/format.output settings.

3. Can you also add the proposed documentation for the new settings in the KIP?

Best,
Mickael

On Mon, Nov 22, 2021 at 11:37 AM Joshua Grisham <grishamj...@gmail.com> wrote:
>
> Hi Randall, thanks again for the great feedback! Sorry it took me a minute
> to get back on this one--I have been a bit full up over the last few
> months--but now I will try to brush it off and see if we can figure out if
> it makes sense to move forward and possibly close this one out.
>
> Regarding backward compatibility -- the intention of my proposed changes
> (including the candidate PR) was that it still is backward compatible, but
> I realize that what I wrote in the KIP was more theoretical about what
> could happen instead of what I actually did in the PR (that I actually made
> it backward-compatible).  So now I have updated the text in the KIP to
> hopefully more accurately reflect this and I think/hope that it, along with
> what is in the candidate PR, addresses everything you mentioned in your
> first and second paragraph.
>
> For clarity here is exactly what I have done in the PR:
>
> First I saw that the way the config properties for TimestampConverter
> lacked the same structure as it did for other SMTs (e.g. ReplaceField as
> one example), especially since I was adding several new properties, so I
> added the same ConfigName and ConfigDefault interfaces (instead of just
> having various string attributes on the TimestampConverter class itself).
> The existing public class  string attributes FIELD_CONFIG,
> TARGET_TYPE_CONFIG, and FORMAT_CONFIG all still exist and all still can be
> continued to used by anyone else, they will just get warnings at build time
> since I marked them with @Deprecated.  I did remove the private class
> strings since they were no longer used in the class but this should not
> have any impact on compatibility.
>
> Second I added the new properties to CONFIG_DEF but kept the existing
> properties there (so they still exist and can be used) and in the
> configure() method I have used ConfigUtils.translateDeprecatedConfigs() to
> alias the old "field" property to the new "fields" property automatically
> just like was done in ReplaceField from the change from KIP-629 (
> https://github.com/apache/kafka/commit/eab61cad2c418786ab5e58aa1267f689d82a61d1
> )
> In the end it means that a user can still have their connector use the
> "field" property but when the configure() method is called it will just put
> whatever they set in "field" to "fields" automatically.
>
> Also the property "format" still behaves exactly the same as before if it
> is used (it continues to function as a SimpleDateFormatter input and
> output) it is just that there is an additional capability added by
> voluntarily adopting the two new properties "format.input" and
> "format.output".
>
> As far as I can tell, in theory someone could have an existing connector,
> using TimestampConverter (or something else which uses the API even),
> upgrade to this version, and not change anything else, and it still behaves
> the same as before.  But perhaps it should not be marked as @Deprecated to
> avoid the warnings or otherwise is there a clearly defined deprecation
> policy that can be followed? I tried to search and look in the Kafka docs
> but could not find anything after several attempts and rabbit-holes
> unfortunately (closest I seem to find is just a list or documentation on
> what has been deprecated in each release).
>
> However please do say if there is anything else you think is missing either
> in the implementation or now in the updated KIP regarding this!
>
> Also hopefully with my updates to the KIP it has addressed your other
> comments or questions, but please feel free to give it another look when
> you have time and I would welcome any feedback that you have.
>
> I also noticed there was one small change where my PR now has a merge
> conflict but it should be pretty minimal to update as long as everything
> else is still looking ok.. I will try to take a look at this soon as well
> if I am able.
>
> Thanks again and have a great week!
>
> Best,
> Joshua
>
>
>
> Den ons 28 juli 2021 kl 16:05 skrev Randall Hauch <rha...@gmail.com>:
>
> > Thanks for the contribution, Joshua. Overall I think this is a really good
> > improvement to this connector, for all the reasons cited in the motivation
> > section. I do have some requests, though.
> >
> > The biggest issue I have is that the change is not backward compatible.
> > We've worked really hard to make sure that all Connect APIs (including
> > configuration properties) are always backward compatible because this makes
> > it trivial for users to upgrade from one Connect release to a newer one.
> > After all, any users that are already using this SMT in their connectors
> > should be able to upgrade Connect and still have the SMT behave exactly as
> > it did in earlier releases -- unless they opt in to changing their
> > configuration to use the new feature. So I would request that the KIP and
> > proposed implementation be changed to not *change* existing configuration
> > properties. We can of course introduce new config properties, but they
> > should have a default that results in exactly the same old behavior as in
> > earlier releases. A third option is to create a new SMT, though we should
> > try really hard to just improve the existing SMT since that's generally a
> > much better UX. Assuming we can make the existing transformation
> > work, the option to create a new SMT should be documented in the Rejected
> > Alternatives section, too.
> >
> > For example, there are (at least?) two options with regard to specifying
> > the field names. 1) just keep `field` as the name even though a
> > comma-separated list is allowed, or 2) add `fields` but keep (and
> > deprecate?) the existing `field` configuration, using `fields` if defined
> > or if `field` is not defined, and maybe even supporting comma separated
> > lists of names in both fields. The same goes for `format` and
> > `format.output`. In general, I think 2 *might* provide a better experience
> > at the cost of higher implementation complexity. Regardless of which you
> > choose, the KIP should explicitly propose one approach, and document the
> > other(s) in the rejected alternatives section. After all, the KIP should be
> > very clear about what will be proposed.
> >
> > In terms of deprecating the old configs in option 2, it'd be fine to do
> > this in the documentation (and code, if applicable) as part of this
> > feature. That means they could be removed at the earliest in the next major
> > release after the feature is merged, which would be AK 4.0. And it would be
> > good to mention this explicitly in the KIP so that we don't need another
> > KIP just to remove the deprecation; so something like "This configuration
> > property will be deprecated and documented as such, and eventually removed
> > in a subsequent major release."
> >
> > Supporting nested fields would also be a nice touch here, too. However, as
> > you mention, we'd likely want to make the same change to other SMTs, and as
> > such we might consider adding support for nested names on all SMTs with a
> > separate KIP. The two KIPs could be written such that they are independent.
> >
> > Finally, a KIP need not go into much detail about the implementation, and
> > should instead focus on the API and behavioral changes. For example, it's
> > probably not necessary for the KIP to mention that the implementation would
> > use DateTimeFormatter. However, it would be worth mentioning explicitly
> > that the `format.input` configuration property will take a regular
> > expression that is compatible with the JDK's DateTimeFormatter.ofPattern
> > method
> > -- this very clearly states the constraint on the input values.
> >
> > Again, very nice job on the KIP so far!
> >
> > Best regards,
> >
> > Randall
> >
> > On Sat, May 15, 2021 at 6:50 AM Joshua Grisham <grishamj...@gmail.com>
> > wrote:
> >
> > > Hello again Kafka Developers community!
> > >
> > > I thought it has been some time and I would try to dust KIP-682 off and
> > see
> > > if anyone wanted to take a discussion on them, if it still makes sense or
> > > if this should go in a different direction.
> > >
> > > Anyone have thoughts on this?  That Connect's TimestampConverter could be
> > > updated to support converting multiple fields in one pass of the
> > > transformation (instead of chaining multiple instances together if you
> > have
> > > multiple timestamps to convert) and that it could support a regex-like
> > > pattern for matching string input instead of only allowing one string
> > > format and expect that all producers are sending in 100% the same format?
> > >
> > > Since some time has gone there are a few conflicts in my proposed PR but
> > > they seem quite trivial (some error message and formatting of a test
> > result
> > > from a quick glance) so it should not be hard to resolve them.
> > >
> > > Thanks!
> > >
> > > Best,
> > > Joshua
> > >
> > >
> > > On Sat, Nov 7, 2020 at 9:24 PM Joshua Grisham <grishamj...@gmail.com>
> > > wrote:
> > >
> > > > Hi Brandon,
> > > >
> > > > I have added what i called "recursive" support to find child fields on
> > > any
> > > > level to the Cast and ReplaceField transforms as well, the PR is here
> > > > https://github.com/apache/kafka/pull/9493
> > > >
> > > > I plan to try and write up a KIP for that one as well maybe in the next
> > > > day or two. It is not exactly the same as specifically targeting one
> > > nested
> > > > field based on its full path, but instead will look for the field name
> > at
> > > > any level of the structure and cast/replace all of them if they match.
> > > >
> > > > The same could be added in theory to almost all of the transforms i
> > guess
> > > > but maybe that is for another discussion 😊
> > > >
> > > > But for now with this TimestampConverter the only thing which I have
> > > > proposed are the two mentioned before - multiple fields and support for
> > > > pattern like matching of string timestamps.
> > > >
> > > > Joshua
> > > >
> > > > Den lör 7 nov. 2020 18:42Brandon Brown <bran...@bbrownsound.com>
> > skrev:
> > > >
> > > >> I love this idea! I have a current KIP for a hash transform and am
> > > >> working adding multi field/nested support to that one. I can think of
> > a
> > > few
> > > >> times we’ve needed functionality like that. I’d add that adding
> > support
> > > for
> > > >> transforming nested fields would be a great feature for this.
> > > >>
> > > >> -Brandon
> > > >>
> > > >> Brandon Brown
> > > >> > On Nov 7, 2020, at 11:11 AM, Joshua Grisham <grishamj...@gmail.com>
> > > >> wrote:
> > > >> >
> > > >> > Hello everyone!
> > > >> > (Thanks to Tom Bentley for directing me in this direction!)
> > > >> >
> > > >> > I have made a series of changes to some of the standard Connect
> > > >> transforms
> > > >> > to meet some of the challenges at my company to consume messages
> > using
> > > >> > Connect, and have been running them for a few weeks now as custom
> > > SMTs.
> > > >> >
> > > >> > I realized that several of these changes might actually be really
> > good
> > > >> > features to be included in the standard transforms so I will open
> > some
> > > >> KIPs
> > > >> > to go along with PRs which I already submitted (apologize for going
> > a
> > > >> bit
> > > >> > backwards in the process as this was my first time working with the
> > > >> Kafka
> > > >> > project).
> > > >> >
> > > >> > The first one I have created a KIP for is KIP-682 on the
> > > >> TimestampConverter
> > > >> > transform.
> > > >> >
> > > >> > Basically the 2 limitations that I am proposing to remove are the
> > > >> ability
> > > >> > to only convert one field at a time, but instead convert multiple
> > > fields
> > > >> > using a common config, and that if any producer sends a slightly
> > > >> different
> > > >> > date/timestamp string format on any message then the transformation
> > > >> > fails... so to instead add the ability to support variations in the
> > > >> string
> > > >> > input format.
> > > >> >
> > > >> > More info here:
> > > >> >
> > > >> >
> > > >>
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-682%3A+Connect+TimestampConverter+support+for+multiple+fields+and+multiple+input+formats
> > > >> >
> > > >> > I would appreciate any kind of discussion to make improvements and
> > it
> > > >> would
> > > >> > be great to see changes like this being pushed up.
> > > >> >
> > > >> > Thanks for now!
> > > >> >
> > > >> > Best regards,
> > > >> > Joshua Grisham
> > > >>
> > > >
> > >
> >

Reply via email to