Re: [DISCUSS] KIP-456: Helper classes to make it simpler to write test logic with TopologyTestDriver

Patrik Kleindl Thu, 09 May 2019 05:32:22 -0700

Hi Jukka
Regarding your comment
> If there would be a way to find out needed serders for the topic, it
would make API even simpler.
I was wondering if it wouldn't make more sense to have a "topic object"
including the Serdes and use this instead of only passing in the name as a
string everywhere.
>From a low-level perspective Kafka does and should not care what is inside
the topic, but from a user perspective this information usually belongs
together.
Sidenote: Having topics as objects would probably also make it easier to
track them from the outside.
regards
Patrik


On Thu, 9 May 2019 at 10:39, Jukka Karvanen <jukka.karva...@jukinimi.com>
wrote:

> Hi,
>
> I will write new KIP for the TestTopologyDriver Input and Output usability
> changes.
> It is out of the scope of the current title: "Helper classes to make it
> simpler to write test logic with TopologyTestDriver"
> and we can get back to this KIP if that alternative is not favored.
>
> So my original approach was not to modify existing classes, but if we end
> up modifing TTD, I would also change the
> way to instantiate these topics. We could add getInputTopic("my-topic") /
> getOutputTopic("my-topic") to TTD, so it would work
> same way as with getStateStore and related methods.
>
> If there would be a way to find out needed serders for the topic, it would
> make API even simpler.
>
> Generally still as a end user, I would prefer not only swapping the
> ConsumerRecord and ProducerRecord, but having
> interface accepting and returning Record, not needing to think about are
> those ConsumerRecord or ProducerRecords.
> and that way would could use same classes to pipe in and assert the
> result.Something similar than  "private final static class Record"
> in TopologyTestDriverTest.
>
> Jukka
>
> ke 8. toukok. 2019 klo 17.01 John Roesler (j...@confluent.io) kirjoitti:
>
> > Hi Jukka, thanks for the reply!
> >
> > I think this is a good summary (the discussion was getting a little
> > unwieldy. I'll reply inline.
> >
> > Also, thanks for clarify about your library vs. this KIP. That makes
> > perfect sense to me.
> > >
> > > 1. Add JavaDoc for KIP
> > >
> > > Is there a good example of KIP where Javadoc is included, so I can
> > follow?
> > > I create this KIP based on this as an example::
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-247%3A+Add+public+test+utils+for+Kafka+Streams
> > >
> > >
> > > Now added some comments to KIP page to clarify timestamp handling, but
> I
> > > did not want to add full JavaDoc of each methods.
> > > Is that enough?
> >
> > That's fine. I was just trying to make the review process more
> > efficient for other reviewers (which makes getting your KIP accepted
> > more efficient). I reviewed a few recent KIPs, and, indeed, I see that
> > javadocs are not actually as common as I thought.
> >
> > > 2. TTD usability changes and swapping ConsumerRecord and ProducerRecord
> > in
> > > APIs
> > >
> > > To my point of view only:
> > > - changing readRecord to return ConsumerRecord would cause we cannot
> use
> > > OutputVerifier
> >
> > Yes, we'd likely have to provide new methods in OutputVerifier to work
> > with ConsumerRecord. If you buy into the plan of deprecating most of
> > the current-style interactions, this wouldn't be that confusing, since
> > all the ProducerRecord verifications would be deprecated, and only the
> > ConsumerRecord verifications would remain "live".
> >
> > > - changing pipeInput to take in ProducerRecord, but not providing easy
> > way
> > > to contruct those like ConsumerRecordFactory
> >
> > I didn't follow this as well. The ConsumerRecordFactory is there
> > because it's a pain to construct ConsumerRecords. Conversely,
> > ProducerRecord has many convenience constructors, so we wouldn't need
> > a factory at all. This is a net win for users, since there's less
> > surface area for them to deal with. Under my proposal, we'd deprecate
> > the whole ConsumerRecordFactory.
> >
> > Note that there's an "idea parity check" here: ConsumerRecords are
> > hard to construct because developers aren't meant to ever construct
> > them. They are meant to construct ProducerRecords, which is why it's
> > made easy. TTD has inverted the relationships of these classes, which
> > is why the ConsumerRecordFactory is necessary, but if we correct it,
> > and return to a "normal" interaction with the Client library, then we
> > don't need special support classes.
> >
> > > - if initializing ConsumerRecord to/from  ProducerRecord  in these
> > classes
> > > field by field contructor, there are risk new fields are not added to
> > this
> > > classes if there are changes in ProducerRecord or ConsumerRecord
> >
> > This risk seems pretty low, to be honest. We will have tests that
> > exercise this testing framework, so if anyone changes ProducerRecord
> > or ConsumerRecord, our tests will break. Since both libraries are
> > build together, the problem would be fixed before the change is ever
> > merged to trunk.
> >
> > > I would propose a separate KIP for these and probably other
> > enhanchements:
> > > -superclass or common interface for ConsumerRecord and ProducerRecord
> > > -contructors to ConsumerRecord and ProducerRecord to initialize with
> this
> > > superclass
> > > -modify OutputVerifier to work with both ConsumerRecord and
> > ProducerRecord
> > > -create new RecordFactory to replace ConsumerRecordFactory
> >
> > I understand your motivation to control the scope of this change, but
> > I actually think that it's better for user-facing design changes to
> > occur in fewer, bigger, chunks, rather than many small changes. People
> > will get fatigued if multiple releases in a row change the
> > test-support library from under their feet. Better to do it in one
> > shot.
> >
> > Plus, this is a design discussion. We need to include the whole scope
> > of the system in the design, or we may realize in Phase 3 that there
> > was some design error in Phase 1, since we were only designing
> > locally. This doesn't mean that we only need one Jira ticket, there
> > can be many in support of this KIP, or that we only need one PR, it's
> > certainly better to send multiple small PRs to decrease risk and ease
> > reviews. But the design discussion doesn't need to be fragmented
> > similarly.
> >
> > > 3. return null vs NoSuchElementException when empty queue
> > >
> > > Should this be also included to the above TTD usability changes?
> > > If single row read methods is changed to throw expectiong, it would
> > require
> > > addition of hasRecords to able to verified the empty queue scenarios.
> > > I do not know how to implement it currently without modifying TTD to
> > > provide some kind way to get the queue size or peak items.
> >
> > Yes, it's absolutely within bounds to propose changes to TTD to
> > support the ergonomic API you're proposing.
> >
> > > 4. IllegalArgumentException("topic doesn't exist")
> > > Is this worth separate ticket?
> >
> > This is your call. That was just an idea in response to your experience.
> >
> > > 5. org.apache.kafka.streams.test vs org.apache.kafka.streams
> > >
> > > I was thinking org.apache.kafka.streams.test where also OutputVerifier
> > and
> > > ConsumerRecordFactory exist would be more logical place, but
> > > I do not know is there some technical reasons why TTD are in
> > > org.apache.kafka.streams, not in org.apache.kafka.streams.test where
> > other
> > > classes are.
> > >
> > > Did I skip something?
> >
> > Ah, no, you're right. I'm not sure why that is. I admit it's
> > confusing. I don't think the package matters *that* much, just keep it
> > wherever you think is appropriate.
> >
> >
> > That's all! Thanks for entertaining my thoughts.
> > -John
> >
>

Re: [DISCUSS] KIP-456: Helper classes to make it simpler to write test logic with TopologyTestDriver

Reply via email to