Re: [DISCUSS] KIP-470: TopologyTestDriver test input and output usability improvements

Bill Bejeck Thu, 01 Aug 2019 07:42:57 -0700

Hi Jukka,

I also think 3, 4, and 5 are all good options.


My personal preference is 4, but I also wouldn't mind going with 5 if that
is what others want to do.

Thanks,
Bill

On Tue, Jul 30, 2019 at 9:31 AM John Roesler <j...@confluent.io> wrote:

> Hey Jukka,
>
> Sorry for the delay.
>
> For what it's worth, I think 3, 4, and 5 are all good options. I guess my
> own preference is 5.
>
> It seems like the migration pain is a one-time concern vs. having more
> maintainable code for years thereafter.
>
> Thanks,
> -John
>
>
>
> On Tue, Jul 2, 2019 at 4:03 AM Jukka Karvanen <jukka.karva...@jukinimi.com
> >
> wrote:
>
> > Hi Matthias,
> >
> > Generally I think using Instant and Duration make the test more readable
> > and that's why I modified KIP based on your suggestion.
> > Now a lot of code use time with long or Long and that make the change
> more
> > complicated.
> >
> > What I tried to say about the migration is the lines without timestamp or
> > if long timestamp is supported can be migrated mainly with search &
> > recplace:
> >
> >
> >
> testDriver.pipeInput(recordFactory.create(WordCountLambdaExample.inputTopic,
> > nullKey, "Hello", 1L));
> >
> > ->
> >
> > *inputTopic*.pipeInput(nullKey, "Hello", 1L);
> >
> > If long is not supported as timestamp, the same is not so easy:
> >
> >
> >
> testDriver.pipeInput(recordFactory.create(WordCountLambdaExample.inputTopic,
> > nullKey, "Hello", 1L));
> >
> > ->
> >
> > *inputTopic1*.pipeInput(nullKey, "Hello", Instant.ofEpochMilli(1L));
> >
> > Also if you need to convert arbitrary long timestamps to proper time
> > Instants, it require you need to understand the logic of the test. So
> > mechanical search & replace is not possible.
> >
> >
> > I see there are several alternatives for long vs Instant / Duration:
> >
> > 1. All times as long/Long like in this version:
> >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=119550011
> >
> > (startTimestampMs, autoAdvanceMs as parameter of  createInputTopic
> > instead of configureTiming)
> >
> > 2. Auto timestamping configured with Instant and Duration, pipeInput
> > and TestRecord with long:
> >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120722523
> >
> >
> > 3. (CURRENT) Auto timestamping configured with Instant and Duration,
> > pipeInput and TestRecord with Instant, version with long deprecated:
> >
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-470%3A+TopologyTestDriver+test+input+and+output+usability+improvements
> >
> >
> > 4. Auto timestamping configured with Instant and Duration, pipeInput
> > and TestRecord with Instant and long parallel (with long not
> > deprecated):
> >
> > 5. Auto timestamping configured with Instant and Duration, pipeInput
> > and TestRecord with Instant only
> >
> > I hope to get feedback.
> >
> > My own preference currently is alternative 3. or 4.
> >
> >
> > If somebody want to test, there is a implementation of this version
> > available in Github:
> >
> > https://github.com/jukkakarvanen/kafka-streams-test-topics
> >
> > which can be used directly from public Maven repository:
> >
> >     <dependency>
> >         <groupId>com.github.jukkakarvanen</groupId>
> >         <artifactId>kafka-streams-test-topics</artifactId>
> >         <version>0.0.1-beta3</version>
> >         <scope>test</scope>
> >     </dependency>
> >
> > Also is this approach in KIP-470 preferred over KIP-456, so can we close
> > it:
> >
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-456%3A+Helper+classes+to+make+it+simpler+to+write+test+logic+with+TopologyTestDriver
> >
> > Jukka
> >
> > .
> >
> >
> > pe 28. kesäk. 2019 klo 1.10 Matthias J. Sax (matth...@confluent.io)
> > kirjoitti:
> >
> > > Thanks Jukka!
> > >
> > > The idea to use `Instant/Duration` was a proposal. If we think it's not
> > > a good one, we could still stay with `long`. Because `ProducerRecord`
> > > and `ConsumerRecord` are both based on `long,` it might make sense to
> > > keep `long`?
> > >
> > > > The result of converting millis to Instant directly generates
> > > >> rather non readable test code and changing from long to Instant
> > > correctly
> > > >> require understand what is the case it is testing.
> > >
> > > This might be a good indicator the using `Instant/Duration` might not
> be
> > > a good idea.
> > >
> > > Would be nice to get feedback from others.
> > >
> > > About adding new methods that we deprecate immediately: I don't think
> we
> > > should do this. IMHO, there are two kind of users, one that immediately
> > > rewrite their code to move off deprecated methods. Those won't use the
> > > new+deprecated ones anyway. Other uses migrate their code slowly and
> > > would just not rewrite their code at all, and thus also not use the
> > > new+deprecated methods.
> > >
> > > > Checking my own tests I was able to migrate the most of my code with
> > > > search&replace without further thinking about the logic to this new
> > > > approach. The result of converting millis to Instant directly
> generates
> > > > rather non readable test code and changing from long to Instant
> > correctly
> > > > require understand what is the case it is testing.
> > >
> > > Not sure if I can follow here. You first say, you could easily migrate
> > > your code, but than you say it was not easily possible? Can you clarify
> > > your experience upgrading your test code?
> > >
> > >
> > > -Matthias
> > >
> > >
> > > On 6/27/19 12:21 AM, Jukka Karvanen wrote:
> > > > Hi,
> > > >
> > > >>> (4) Should we switch from `long` for timestamps to `Instant` and
> > > > `Duration` ?
> > > >> This version startTimestamp is Instant and autoAdvance Duration in
> > > > Initialization and with manual configured collection pipe methods.
> > > >> Now timestamp of TestRecord is still Long and similarly single
> record
> > > > pipeInput still has long as parameter.
> > > >> Should these also converted to to Instant type?
> > > >> Should there be both long and Instant parallel?
> > > >> I expect there are existing test codebase which would be easier to
> > > migrate
> > > > if long could be still used.
> > > > Now added Instant version to TestRecord and pipeInput method.
> > > >
> > > > Checking my own tests I was able to migrate the most of my code with
> > > > search&replace without further thinking about the logic to this new
> > > > approach. The result of converting millis to Instant directly
> generates
> > > > rather non readable test code and changing from long to Instant
> > correctly
> > > > require understand what is the case it is testing.
> > > >
> > > > That is why version with long left still as deprecated for easier
> > > migration
> > > > for existing test.
> > > > Also TopologyTestDriver constructor and advanceWallClockTime  method
> > > > modified with same approach.
> > > >
> > > > Jukka
> > > >
> > > >
> > > > ma 24. kesäk. 2019 klo 16.47 Bill Bejeck (bbej...@gmail.com)
> > kirjoitti:
> > > >
> > > >> Hi Jukka
> > > >>
> > > >>> These topic objects are only interfacing TopologyTestDriver, not
> > > >> affecting
> > > >>> the internal functionality of it. In my plan the internal data
> > > structures
> > > >>> are using those Producer/ConsumerRecords as earlier. That way I
> don't
> > > see
> > > >>> how those could be affected.
> > > >>
> > > >> I mistakenly thought the KIP was proposing to completely remove
> > > >> Producer/ConsumerRecords in favor of TestRecord.  But taking another
> > > quick
> > > >> look I can see the plan for using the TestRecord objects.  Thanks
> for
> > > the
> > > >> clarification.
> > > >>
> > > >> -Bill
> > > >>
> > > >> On Sat, Jun 22, 2019 at 2:26 AM Jukka Karvanen <
> > > >> jukka.karva...@jukinimi.com>
> > > >> wrote:
> > > >>
> > > >>> Hi All,
> > > >>> Hi Matthias,
> > > >>>
> > > >>>> (1) It's a little confusing that you list all method (existing,
> > > proposed
> > > >>>> to deprecate, and new one) of `TopologyTestDriver` in the KIP.
> Maybe
> > > >>>> only list the ones you propose to deprecate and the new ones you
> > want
> > > to
> > > >>>> add?
> > > >>>
> > > >>> Ok. Unmodified methods removed.
> > > >>>
> > > >>>> (2) `TopologyTestDriver#createInputTopic`: might it be worth to
> add
> > > >>>> overload to initialize the timetamp and auto-advance feature
> > directly?
> > > >>>> Otherwise, uses always need to call `configureTiming` as an extra
> > > call?
> > > >>>
> > > >>> Added with Instant and Duration parameters.
> > > >>>
> > > >>>> (3) `TestInputTopic#configureTiming()`: maybe rename to
> > > >>> `reconfigureTiming()` ?
> > > >>>
> > > >>> I removed this method when we have now initialization in
> constructor.
> > > >>> You can recreate TestInputTopic if needing to reconfigure timing.
> > > >>>
> > > >>>
> > > >>>> (4) Should we switch from `long` for timestamps to `Instant` and
> > > >>> `Duration` ?
> > > >>> This version startTimestamp is Instant and autoAdvance Duration in
> > > >>> Initialization and with manual configured collection pipe methods.
> > > >>> Now timestamp of TestRecord is still Long and similarly single
> record
> > > >>> pipeInput still has long as parameter.
> > > >>> Should these also converted to to Instant type?
> > > >>> Should there be both long and Instant parallel?
> > > >>> I expect there are existing test codebase which would be easier to
> > > >> migrate
> > > >>> if long could be still used.
> > > >>>
> > > >>> Also should advanceWallClockTime  in TopologyTestDriver changed(or
> > > added
> > > >>> alternative) for Duration parameter.
> > > >>>
> > > >>>
> > > >>>> (5) Why do we have redundant getters? Or set with `getX()` and one
> > > >>> set without `get`-prefix?
> > > >>>
> > > >>> The methods without get-prefix are for compatibility with
> > > >> ProducerRecord /
> > > >>> ConsumerRecord and I expect would make migration to TestRecord
> > easier.
> > > >>> Standard getters in TestRecord enable writing test ignoring
> selected
> > > >> fields
> > > >>> with hamcrest like this:
> > > >>>
> > > >>> assertThat(outputTopic.readRecord(), allOf(
> > > >>>         hasProperty("key", equalTo(1L)),
> > > >>>         hasProperty("value", equalTo("Hello")),
> > > >>>         hasProperty("headers", equalTo(headers))));
> > > >>>
> > > >>>
> > > >>> That's why I have currently both methods.
> > > >>>
> > > >>> Jukka
> > > >>>
> > > >>>
> > > >>> pe 21. kesäk. 2019 klo 22.20 Matthias J. Sax (
> matth...@confluent.io)
> > > >>> kirjoitti:
> > > >>>
> > > >>>> Thanks for the KIP. The idea to add InputTopic and OutputTopic
> > > >>>> abstractions is really neat!
> > > >>>>
> > > >>>>
> > > >>>> Couple of minor comment:
> > > >>>>
> > > >>>> (1) It's a little confusing that you list all method (existing,
> > > >> proposed
> > > >>>> to deprecate, and new one) of `TopologyTestDriver` in the KIP.
> Maybe
> > > >>>> only list the ones you propose to deprecate and the new ones you
> > want
> > > >> to
> > > >>>> add?
> > > >>>>
> > > >>>> (Or mark all existing methods clearly -- atm, I need to got back
> to
> > > the
> > > >>>> code to read the KIP and to extract what changes are proposed).
> > > >>>>
> > > >>>>
> > > >>>> (2) `TopologyTestDriver#createInputTopic`: might it be worth to
> add
> > > >>>> overload to initialize the timetamp and auto-advance feature
> > directly?
> > > >>>> Otherwise, uses always need to call `configureTiming` as an extra
> > > call?
> > > >>>>
> > > >>>>
> > > >>>> (3) `TestInputTopic#configureTiming()`: maybe rename to
> > > >>>> `reconfigureTiming()` ?
> > > >>>>
> > > >>>>
> > > >>>> (4) Should we switch from `long` for timestamps to `Instant` and
> > > >>>> `Duration` ?
> > > >>>>
> > > >>>>
> > > >>>> (5) Why do we have redundant getters? Or set with `getX()` and one
> > set
> > > >>>> without `get`-prefix?
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> -Matthias
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> On 6/21/19 10:57 AM, Bill Bejeck wrote:
> > > >>>>> Jukka,
> > > >>>>>
> > > >>>>> Thanks for the KIP. I like the changes overall.
> > > >>>>> One thing I wanted to confirm, and this may be me being paranoid,
> > but
> > > >>>> will
> > > >>>>> the changes for input/output topic affect how the
> > TopologyTestDriver
> > > >>>> works
> > > >>>>> with internal topics when there are sub-topologies created?
> > > >>>>>
> > > >>>>> On Fri, Jun 21, 2019 at 12:05 PM Guozhang Wang <
> wangg...@gmail.com
> > >
> > > >>>> wrote:
> > > >>>>>
> > > >>>>>> 1) Got it, could you list this class along with all its
> functions
> > in
> > > >>> the
> > > >>>>>> proposed public APIs as well?
> > > >>>>>>
> > > >>>>>> 2) Ack, thanks!
> > > >>>>>>
> > > >>>>>> On Thu, Jun 20, 2019 at 11:27 PM Jukka Karvanen <
> > > >>>>>> jukka.karva...@jukinimi.com>
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>> Hi  Guozhang,
> > > >>>>>>>
> > > >>>>>>> 1) This TestRecord is new class in my proposal. So it is a
> > > >> simplified
> > > >>>>>>> version of ProducerRecord and ConsumerRecord containing only
> the
> > > >>> fields
> > > >>>>>>> needed to test record content.
> > > >>>>>>>
> > > >>>>>>> 2)
> > > >>>>>>> public final <K, V> TestInputTopic<K, V> createInputTopic(final
> > > >>> String
> > > >>>>>>> topicName, final Serde<K> keySerde, final Serde<V> valueSerde);
> > > >>>>>>> public final <K, V> TestOutputTopic<K, V>
> createOutputTopic(final
> > > >>>> String
> > > >>>>>>> topicName, final Serde<K> keySerde, final Serde<V> valueSerde);
> > > >>>>>>> The purpose is to create separate object for each input and
> > output
> > > >>>> topic
> > > >>>>>>> you are using. The topic name is given to
> createInput/OutputTopic
> > > >>> when
> > > >>>>>>> initialize topic object.
> > > >>>>>>>
> > > >>>>>>> For example:
> > > >>>>>>>
> > > >>>>>>> final TestInputTopic<Long, String> inputTopic1 =
> > > >>>>>>> testDriver.createInputTopic( INPUT_TOPIC, longSerde,
> > stringSerde);
> > > >>>>>>> final TestInputTopic<Long, String> inputTopic2 =
> > > >>>>>>> testDriver.createInputTopic( INPUT_TOPIC_MAP, longSerde,
> > > >>> stringSerde);
> > > >>>>>>> final TestOutputTopic<Long, String> outputTopic1 =
> > > >>>>>>> testDriver.createOutputTopic(OUTPUT_TOPIC, longSerde,
> > stringSerde);
> > > >>>>>>> final TestOutputTopic<String, Long> outputTopic2 =
> > > >>>>>>> testDriver.createOutputTopic(OUTPUT_TOPIC_MAP, stringSerde,
> > > >>>>>>> longSerde);
> > > >>>>>>> inputTopic1.pipeInput(1L, "Hello");
> > > >>>>>>> assertThat(outputTopic1.readKeyValue(), equalTo(new
> > KeyValue<>(1L,
> > > >>>>>>> "Hello")));
> > > >>>>>>> assertThat(outputTopic2.readKeyValue(), equalTo(new
> > > >>> KeyValue<>("Hello",
> > > >>>>>>> 1L)));
> > > >>>>>>> inputTopic2.pipeInput(1L, "Hello");
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> Jukka
> > > >>>>>>>
> > > >>>>>>> to 20. kesäk. 2019 klo 23.52 Guozhang Wang (wangg...@gmail.com
> )
> > > >>>>>> kirjoitti:
> > > >>>>>>>
> > > >>>>>>>> Hello Jukka,
> > > >>>>>>>>
> > > >>>>>>>> Thanks for writing the KIP, I have a couple of quick
> questions:
> > > >>>>>>>>
> > > >>>>>>>> 1) Is "TestRecord" an existing class that you propose to
> > > >> piggy-back
> > > >>>> on?
> > > >>>>>>>> Right now we have a scala TestRecord case class but I doubt
> that
> > > >> was
> > > >>>>>> your
> > > >>>>>>>> proposal, or are you proposing to add a new Java class?
> > > >>>>>>>>
> > > >>>>>>>> 2) Would the new API only allow a single input / output topic
> > with
> > > >>>>>>>> `createInput/OutputTopic`? If not, when we call pipeInput how
> to
> > > >>>>>>> determine
> > > >>>>>>>> which topic this record should be pipe to?
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> Guozhang
> > > >>>>>>>>
> > > >>>>>>>> On Mon, Jun 17, 2019 at 1:34 PM John Roesler <
> j...@confluent.io
> > >
> > > >>>>>> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Woah, I wasn't aware of that Hamcrest test style. Awesome!
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks for the updates. I look forward to hearing what others
> > > >>> think.
> > > >>>>>>>>>
> > > >>>>>>>>> -John
> > > >>>>>>>>>
> > > >>>>>>>>> On Mon, Jun 17, 2019 at 4:12 AM Jukka Karvanen
> > > >>>>>>>>> <jukka.karva...@jukinimi.com> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>> Wiki page updated:
> > > >>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>
> > > >>>
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-470%3A+TopologyTestDriver+test+input+and+output+usability+improvements
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>> ClientRecord removed and replaced with TestRecord in method
> > > >> calls.
> > > >>>>>>>>>> TestRecordFactory removed (time tracking functionality to be
> > > >>>>>> included
> > > >>>>>>>> to
> > > >>>>>>>>>> TestInputTopic)
> > > >>>>>>>>>> OutputVerifier deprecated
> > > >>>>>>>>>> TestRecord topic removed and getters added
> > > >>>>>>>>>>
> > > >>>>>>>>>> Getters in TestRecord enable writing test ignoring selected
> > > >> fields
> > > >>>>>>> with
> > > >>>>>>>>>> hamcrest like this:
> > > >>>>>>>>>>
> > > >>>>>>>>>> assertThat(outputTopic.readRecord(), allOf(
> > > >>>>>>>>>>         hasProperty("key", equalTo(1L)),
> > > >>>>>>>>>>         hasProperty("value", equalTo("Hello")),
> > > >>>>>>>>>>         hasProperty("headers", equalTo(headers))));
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>> Jukka
> > > >>>>>>>>>>
> > > >>>>>>>>>> la 15. kesäk. 2019 klo 1.10 John Roesler (j...@confluent.io
> )
> > > >>>>>>>> kirjoitti:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> Sounds good. Thanks as always for considering my feedback!
> > > >>>>>>>>>>> -John
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On Fri, Jun 14, 2019 at 12:12 PM Jukka Karvanen
> > > >>>>>>>>>>> <jukka.karva...@jukinimi.com> wrote:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Ok, I will modify KIP Public Interface in a wiki based on
> > the
> > > >>>>>>>>> feedback.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> TestRecordFactory / ConsumerRecordFactory was used by
> > > >>>>>>>> TestInputTopic
> > > >>>>>>>>> with
> > > >>>>>>>>>>>> the version I had with KIP456, but maybe I can merge That
> > > >>>>>>>>> functionality
> > > >>>>>>>>>>> to
> > > >>>>>>>>>>>> InputTopic or  TestRecordFactory   can kept non public
> maybe
> > > >>>>>>> moving
> > > >>>>>>>>> it to
> > > >>>>>>>>>>>> internals package.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I will make the proposal with a slim down interface.
> > > >>>>>>>>>>>> I don't want to go to so slim as you proposed with only
> > > >>>>>>> TestRecord
> > > >>>>>>>> or
> > > >>>>>>>>>>>> List<TestRecord>, because you then still end up doing
> helper
> > > >>>>>>>> methods
> > > >>>>>>>>> to
> > > >>>>>>>>>>>> construct List of TestRecord.
> > > >>>>>>>>>>>> The list of values is easier to write and clearer to read
> > than
> > > >>>>>> if
> > > >>>>>>>> you
> > > >>>>>>>>>>> need
> > > >>>>>>>>>>>> to contruct list of TestRecords.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> For example:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> final List<String> inputValues = Arrays.asList(
> > > >>>>>>>>>>>>         "Apache Kafka Streams Example",
> > > >>>>>>>>>>>>         "Using Kafka Streams Test Utils",
> > > >>>>>>>>>>>>         "Reading and Writing Kafka Topic"
> > > >>>>>>>>>>>> );
> > > >>>>>>>>>>>> inputTopic.pipeValueList(inputValues);
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Let's check after the next iteration is it still worth
> > > >> reducing
> > > >>>>>>> the
> > > >>>>>>>>>>> methods.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Jukka
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> pe 14. kesäk. 2019 klo 18.27 John Roesler (
> > j...@confluent.io)
> > > >>>>>>>>> kirjoitti:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>> Thanks, Jukka,
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Ok, I buy this reasoning.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Just to echo what I think I read, you would just drop
> > > >>>>>>>> ClientRecord
> > > >>>>>>>>>>>>> from the proposal, and TestRecord would stand on its own,
> > > >>>>>> with
> > > >>>>>>>> the
> > > >>>>>>>>>>>>> same methods and properties you proposed, and the "input
> > > >>>>>> topic"
> > > >>>>>>>>> would
> > > >>>>>>>>>>>>> accept TestRecord, and the "output topic" would produce
> > > >>>>>>>> TestRecord?
> > > >>>>>>>>>>>>> Further, the "input and output topic" classes would
> > > >>>>>> internally
> > > >>>>>>>>> handle
> > > >>>>>>>>>>>>> the conversion to and from ConsumerRecord and
> > ProducerRecord
> > > >>>>>> to
> > > >>>>>>>>> pass
> > > >>>>>>>>>>>>> to and from the TopologyTestDriver?
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> This seems good to me.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Since the object coming out of the "output topic" is much
> > > >>>>>> more
> > > >>>>>>>>>>>>> ergonomic, I suspect we won't need the OutputVerifier at
> > all.
> > > >>>>>>> It
> > > >>>>>>>>> was
> > > >>>>>>>>>>>>> mostly needed because of all the extra junk in
> > ProducerRecord
> > > >>>>>>> you
> > > >>>>>>>>> need
> > > >>>>>>>>>>>>> to ignore. It seems better to just deprecate it. If in
> the
> > > >>>>>>> future
> > > >>>>>>>>> it
> > > >>>>>>>>>>>>> turns out there is some actual use case for a verifier,
> we
> > > >>>>>> can
> > > >>>>>>>>> have a
> > > >>>>>>>>>>>>> very small KIP to add one. But reading your response, I
> > > >>>>>> suspect
> > > >>>>>>>>> that
> > > >>>>>>>>>>>>> existing test verification libraries would be sufficient
> on
> > > >>>>>>> their
> > > >>>>>>>>> own.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Similarly, it seems like we can shrink the total
> interface
> > by
> > > >>>>>>>>> removing
> > > >>>>>>>>>>>>> the TestRecordFactory from the proposal. If TestRecord
> > > >>>>>> already
> > > >>>>>>>>> offers
> > > >>>>>>>>>>>>> all the constructors you'd want, then the only benefit of
> > the
> > > >>>>>>>>> factory
> > > >>>>>>>>>>>>> is to auto-increment the timestamps, but then again, the
> > > >>>>>> "input
> > > >>>>>>>>> topic"
> > > >>>>>>>>>>>>> can already do that (e.g., it can do it if the record
> > > >>>>>> timestamp
> > > >>>>>>>> is
> > > >>>>>>>>> not
> > > >>>>>>>>>>>>> set).
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Likewise, if the TestRecords are easy to create, then we
> > > >>>>>> don't
> > > >>>>>>>> need
> > > >>>>>>>>>>>>> all the redundant methods in "input topic" to pipe
> values,
> > or
> > > >>>>>>>>>>>>> key/values, or key/value/timestamp, etc. We can do with
> > just
> > > >>>>>>> two
> > > >>>>>>>>>>>>> methods, one for a single TestRecord and one for a
> > collection
> > > >>>>>>> of
> > > >>>>>>>>> them.
> > > >>>>>>>>>>>>> This reduces API ambiguity and also dramatically
> decreases
> > > >>>>>> the
> > > >>>>>>>>> surface
> > > >>>>>>>>>>>>> area of the interface, which ultimately makes it much
> > easier
> > > >>>>>> to
> > > >>>>>>>>> use.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> It's best to start with the smallest interface that will
> do
> > > >>>>>> the
> > > >>>>>>>> job
> > > >>>>>>>>>>>>> and expand it upon request, rather than throwing in
> > > >>>>>> everything
> > > >>>>>>>> you
> > > >>>>>>>>> can
> > > >>>>>>>>>>>>> think of up front. The extra stuff would be confusing to
> > > >>>>>> people
> > > >>>>>>>>> facing
> > > >>>>>>>>>>>>> two practically identical paths to accomplish the same
> > goal,
> > > >>>>>>> and
> > > >>>>>>>>> it's
> > > >>>>>>>>>>>>> very difficult to slim the interface down later, because
> we
> > > >>>>>>> don't
> > > >>>>>>>>>>>>> really know which parts are more popular (i.e., no one
> > > >>>>>> submits
> > > >>>>>>>>>>>>> "feature requests" to _remove_ stuff they don't need,
> only
> > to
> > > >>>>>>>> _add_
> > > >>>>>>>>>>>>> stuff that they need.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> But overall, I really like the structure of this design.
> > I'm
> > > >>>>>>>> super
> > > >>>>>>>>>>>>> excited about this KIP.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Thanks,
> > > >>>>>>>>>>>>> -John
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> On Fri, Jun 14, 2019 at 2:55 AM Jukka Karvanen
> > > >>>>>>>>>>>>> <jukka.karva...@jukinimi.com> wrote:
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Hi,
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> I am not a fan of swapping only ProducerRecord and
> > > >>>>>>>>> ConsumerRecord.
> > > >>>>>>>>>>>>>> As a test writer point of view I do not want to care
> about
> > > >>>>>>> the
> > > >>>>>>>>>>> difference
> > > >>>>>>>>>>>>>> of those and
> > > >>>>>>>>>>>>>> that way I like to have object type which can be used to
> > > >>>>>> pipe
> > > >>>>>>>>>>> records in
> > > >>>>>>>>>>>>>> and compare outputs.
> > > >>>>>>>>>>>>>> That way avoid unnecessary conversions between
> > > >>>>>> ProducerRecord
> > > >>>>>>>> and
> > > >>>>>>>>>>>>>> ConsumerRecord.
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> My initial assumption was that ProducerRecord and
> > > >>>>>>>>>>> ConsumerRecord.could
> > > >>>>>>>>>>>>>> implement the same ClientRecord
> > > >>>>>>>>>>>>>> and that way test writer could have used either of
> those,
> > > >>>>>> but
> > > >>>>>>>>> seems
> > > >>>>>>>>>>> that
> > > >>>>>>>>>>>>>> return type of timestamp method long vs Long is not
> > > >>>>>>> compatible.
> > > >>>>>>>>>>>>>> Now the main advantage of ClientRecord is no need to
> > > >>>>>>> duplicate
> > > >>>>>>>>>>>>>> OutputVerifier when it is modified from ProducerRecord
> to
> > > >>>>>>>>>>> ClientRecord.
> > > >>>>>>>>>>>>>> Generally is there need for OutputVerifier. Can we
> > > >>>>>> deprecate
> > > >>>>>>>> the
> > > >>>>>>>>>>> existing
> > > >>>>>>>>>>>>>> and use standard assertion libraries for new test.
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> If you use hamcrest, assert-j or any other assertion
> > > >>>>>> library
> > > >>>>>>>>> for the
> > > >>>>>>>>>>>>> rest
> > > >>>>>>>>>>>>>> of the test, why not use it with these also.
> > > >>>>>>>>>>>>>> When we have these methods to access only needed fields
> it
> > > >>>>>> is
> > > >>>>>>>>> easier
> > > >>>>>>>>>>> to
> > > >>>>>>>>>>>>>> write test like this:
> > > >>>>>>>>>>>>>> assertThat(outputTopic.readValue()).isEqualTo("Hello");
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> or
> > > >>>>>>>>>>>>>>
> > > >>>>>>> assertThat(outputTopic.readRecord()).isEqualTo(expectedRecord);
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Only value for new OutputVerifier is when needing to
> > ignore
> > > >>>>>>>> some
> > > >>>>>>>>>>> fields
> > > >>>>>>>>>>>>>> ClientRecord actual = outputTopic.readRecord();
> > > >>>>>>>>>>>>>> assertThat(actual.value()).isEqualTo("Hello");
> > > >>>>>>>>>>>>>> assertThat(actual.key()).isEqualTo(expectedKey);
> > > >>>>>>>>>>>>>>
> > > >>>>>> assertThat(actual.timestamp()).isEqualTo(expectedTimestamp);
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> So if want to leave client package untouched, I would
> > > >>>>>> modify
> > > >>>>>>>> the
> > > >>>>>>>>>>> methods
> > > >>>>>>>>>>>>>> with ClientRecord now in InputTopic and OutputTopic to
> > pass
> > > >>>>>>> in
> > > >>>>>>>>> and
> > > >>>>>>>>>>> out
> > > >>>>>>>>>>>>> this
> > > >>>>>>>>>>>>>> TestRecord instead.
> > > >>>>>>>>>>>>>> In that case there would be possibility to add methods
> to
> > > >>>>>>>>> TestRecord
> > > >>>>>>>>>>> to
> > > >>>>>>>>>>>>>> help ignore some fields in assertions like:
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>
> > > >>>
> > > >>
> > >
> >
> assertThat(outputTopic.readRecord().getValueTimestamp()).isEqualTo(expectedRecord.get
> > > >>>>>>>>>>>>>> ValueTimestamp());
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> How about this alternative?
> > > >>>>>>>>>>>>>> If this way sounds better I will modify KIP page in
> wiki.
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Jukka
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> to 13. kesäk. 2019 klo 18.30 John Roesler (
> > > >>>>>> j...@confluent.io
> > > >>>>>>> )
> > > >>>>>>>>>>> kirjoitti:
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Hey, all, maybe we can jump-start this discussion.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> I think this approach would be very ergonomic for
> > > >>>>>> testing,
> > > >>>>>>>> and
> > > >>>>>>>>>>> would
> > > >>>>>>>>>>>>>>> help reduce boilerplate in tests.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> The think I like most about it is that it mirrors the
> > > >>>>>>> mental
> > > >>>>>>>>> model
> > > >>>>>>>>>>>>>>> that people already have from Kafka Streams, in which
> you
> > > >>>>>>>>> write to
> > > >>>>>>>>>>> an
> > > >>>>>>>>>>>>>>> "input topic" and then get your results from an "output
> > > >>>>>>>>> topic". As
> > > >>>>>>>>>>> a
> > > >>>>>>>>>>>>>>> side benefit, the input and output topics in the
> proposal
> > > >>>>>>>> also
> > > >>>>>>>>>>> close
> > > >>>>>>>>>>>>>>> over the serdes, which makes it much less boilerplate
> for
> > > >>>>>>>> test
> > > >>>>>>>>>>> code.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> If I can offer one suggestion for change, I'm not sure
> > > >>>>>> I'm
> > > >>>>>>>>> totally
> > > >>>>>>>>>>>>>>> sold on the need for a new abstraction "ClientRecord"
> > > >>>>>> with
> > > >>>>>>> an
> > > >>>>>>>>>>>>>>> implementation for tests "TestRecord". It seems like
> this
> > > >>>>>>>>>>>>>>> unnecessarily complicates the main (non-testing) data
> > > >>>>>>> model.
> > > >>>>>>>> It
> > > >>>>>>>>>>> seems
> > > >>>>>>>>>>>>>>> like it would be sufficient, and just as ergonomic, to
> > > >>>>>> have
> > > >>>>>>>> the
> > > >>>>>>>>>>> input
> > > >>>>>>>>>>>>>>> topic accept ProducerRecords and the output topic
> return
> > > >>>>>>>>>>>>>>> ConsumerRecords. I'm open to discussion on this point,
> > > >>>>>>>> though.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Thanks for the proposal, Jukka!
> > > >>>>>>>>>>>>>>> -John
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> On Mon, May 20, 2019 at 7:59 AM Jukka Karvanen
> > > >>>>>>>>>>>>>>> <jukka.karva...@jukinimi.com> wrote:
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Hi All,
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> I would like to start the discussion on KIP-470:
> > > >>>>>>>>>>> TopologyTestDriver
> > > >>>>>>>>>>>>> test
> > > >>>>>>>>>>>>>>>> input and output usability improvements:
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>
> > > >>>
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-470%3A+TopologyTestDriver+test+input+and+output+usability+improvements
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> This KIP is inspired by the Discussion in KIP-456:
> > > >>>>>> Helper
> > > >>>>>>>>>>> classes to
> > > >>>>>>>>>>>>> make
> > > >>>>>>>>>>>>>>>> it simpler to write test logic with
> TopologyTestDriver:
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-456
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>
> > > >>>
> > > >>
> > >
> >
> %3A+Helper+classes+to+make+it+simpler+to+write+test+logic+with+TopologyTestDriver
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> The proposal in KIP-456
> > > >>>>>>>>>>>>>>>> <
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>
> > > >>>
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-456%3A+Helper+classes+to+make+it+simpler+to+write+test+logic+with+TopologyTestDriver
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> was
> > > >>>>>>>>>>>>>>>> to add alternate way to input and output topic, but
> > > >>>>>> this
> > > >>>>>>>> KIP
> > > >>>>>>>>>>> enhance
> > > >>>>>>>>>>>>>>> those
> > > >>>>>>>>>>>>>>>> classes and deprecate old functionality to make clear
> > > >>>>>>>>> interface
> > > >>>>>>>>>>> for
> > > >>>>>>>>>>>>> test
> > > >>>>>>>>>>>>>>>> writer to use.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> In current KIP-470 proposal, topic objects are created
> > > >>>>>>> with
> > > >>>>>>>>>>>>> topicName and
> > > >>>>>>>>>>>>>>>> related serders.
> > > >>>>>>>>>>>>>>>>     public final <K, V> TestInputTopic<K, V>
> > > >>>>>>>>>>> createInputTopic(final
> > > >>>>>>>>>>>>>>> String
> > > >>>>>>>>>>>>>>>> topicName, final Serde<K> keySerde, final Serde<V>
> > > >>>>>>>>> valueSerde);
> > > >>>>>>>>>>>>>>>>     public final <K, V> TestOutputTopic<K, V>
> > > >>>>>>>>>>> createOutputTopic(final
> > > >>>>>>>>>>>>>>> String
> > > >>>>>>>>>>>>>>>> topicName, final Serde<K> keySerde, final Serde<V>
> > > >>>>>>>>> valueSerde);
> > > >>>>>>>>>>>>>>>> One thing I wondered if there way to find out topic
> > > >>>>>>> related
> > > >>>>>>>>> serde
> > > >>>>>>>>>>>>> from
> > > >>>>>>>>>>>>>>>> TopologyTestDriver topology, it would simply creation
> > > >>>>>> of
> > > >>>>>>>>> these
> > > >>>>>>>>>>> Topic
> > > >>>>>>>>>>>>>>>> objects:
> > > >>>>>>>>>>>>>>>>     public final <K, V> TestInputTopic<K, V>
> > > >>>>>>>>>>> createInputTopic(final
> > > >>>>>>>>>>>>>>> String
> > > >>>>>>>>>>>>>>>> topicName);
> > > >>>>>>>>>>>>>>>>     public final <K, V> TestOutputTopic<K, V>
> > > >>>>>>>>>>> createOutputTopic(final
> > > >>>>>>>>>>>>>>> String
> > > >>>>>>>>>>>>>>>> topicName);
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> KIP-456 can be discarded if this KIP get accepted.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Best Regards,
> > > >>>>>>>>>>>>>>>> Jukka
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> --
> > > >>>>>>>> -- Guozhang
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> --
> > > >>>>>> -- Guozhang
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>>>
> > > >>>
> > > >>
> > > >
> > >
> > >
> >
>

Re: [DISCUSS] KIP-470: TopologyTestDriver test input and output usability improvements

Reply via email to