Hi, I will write new KIP for the TestTopologyDriver Input and Output usability changes. It is out of the scope of the current title: "Helper classes to make it simpler to write test logic with TopologyTestDriver" and we can get back to this KIP if that alternative is not favored.
So my original approach was not to modify existing classes, but if we end up modifing TTD, I would also change the way to instantiate these topics. We could add getInputTopic("my-topic") / getOutputTopic("my-topic") to TTD, so it would work same way as with getStateStore and related methods. If there would be a way to find out needed serders for the topic, it would make API even simpler. Generally still as a end user, I would prefer not only swapping the ConsumerRecord and ProducerRecord, but having interface accepting and returning Record, not needing to think about are those ConsumerRecord or ProducerRecords. and that way would could use same classes to pipe in and assert the result.Something similar than "private final static class Record" in TopologyTestDriverTest. Jukka ke 8. toukok. 2019 klo 17.01 John Roesler (j...@confluent.io) kirjoitti: > Hi Jukka, thanks for the reply! > > I think this is a good summary (the discussion was getting a little > unwieldy. I'll reply inline. > > Also, thanks for clarify about your library vs. this KIP. That makes > perfect sense to me. > > > > 1. Add JavaDoc for KIP > > > > Is there a good example of KIP where Javadoc is included, so I can > follow? > > I create this KIP based on this as an example:: > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-247%3A+Add+public+test+utils+for+Kafka+Streams > > > > > > Now added some comments to KIP page to clarify timestamp handling, but I > > did not want to add full JavaDoc of each methods. > > Is that enough? > > That's fine. I was just trying to make the review process more > efficient for other reviewers (which makes getting your KIP accepted > more efficient). I reviewed a few recent KIPs, and, indeed, I see that > javadocs are not actually as common as I thought. > > > 2. TTD usability changes and swapping ConsumerRecord and ProducerRecord > in > > APIs > > > > To my point of view only: > > - changing readRecord to return ConsumerRecord would cause we cannot use > > OutputVerifier > > Yes, we'd likely have to provide new methods in OutputVerifier to work > with ConsumerRecord. If you buy into the plan of deprecating most of > the current-style interactions, this wouldn't be that confusing, since > all the ProducerRecord verifications would be deprecated, and only the > ConsumerRecord verifications would remain "live". > > > - changing pipeInput to take in ProducerRecord, but not providing easy > way > > to contruct those like ConsumerRecordFactory > > I didn't follow this as well. The ConsumerRecordFactory is there > because it's a pain to construct ConsumerRecords. Conversely, > ProducerRecord has many convenience constructors, so we wouldn't need > a factory at all. This is a net win for users, since there's less > surface area for them to deal with. Under my proposal, we'd deprecate > the whole ConsumerRecordFactory. > > Note that there's an "idea parity check" here: ConsumerRecords are > hard to construct because developers aren't meant to ever construct > them. They are meant to construct ProducerRecords, which is why it's > made easy. TTD has inverted the relationships of these classes, which > is why the ConsumerRecordFactory is necessary, but if we correct it, > and return to a "normal" interaction with the Client library, then we > don't need special support classes. > > > - if initializing ConsumerRecord to/from ProducerRecord in these > classes > > field by field contructor, there are risk new fields are not added to > this > > classes if there are changes in ProducerRecord or ConsumerRecord > > This risk seems pretty low, to be honest. We will have tests that > exercise this testing framework, so if anyone changes ProducerRecord > or ConsumerRecord, our tests will break. Since both libraries are > build together, the problem would be fixed before the change is ever > merged to trunk. > > > I would propose a separate KIP for these and probably other > enhanchements: > > -superclass or common interface for ConsumerRecord and ProducerRecord > > -contructors to ConsumerRecord and ProducerRecord to initialize with this > > superclass > > -modify OutputVerifier to work with both ConsumerRecord and > ProducerRecord > > -create new RecordFactory to replace ConsumerRecordFactory > > I understand your motivation to control the scope of this change, but > I actually think that it's better for user-facing design changes to > occur in fewer, bigger, chunks, rather than many small changes. People > will get fatigued if multiple releases in a row change the > test-support library from under their feet. Better to do it in one > shot. > > Plus, this is a design discussion. We need to include the whole scope > of the system in the design, or we may realize in Phase 3 that there > was some design error in Phase 1, since we were only designing > locally. This doesn't mean that we only need one Jira ticket, there > can be many in support of this KIP, or that we only need one PR, it's > certainly better to send multiple small PRs to decrease risk and ease > reviews. But the design discussion doesn't need to be fragmented > similarly. > > > 3. return null vs NoSuchElementException when empty queue > > > > Should this be also included to the above TTD usability changes? > > If single row read methods is changed to throw expectiong, it would > require > > addition of hasRecords to able to verified the empty queue scenarios. > > I do not know how to implement it currently without modifying TTD to > > provide some kind way to get the queue size or peak items. > > Yes, it's absolutely within bounds to propose changes to TTD to > support the ergonomic API you're proposing. > > > 4. IllegalArgumentException("topic doesn't exist") > > Is this worth separate ticket? > > This is your call. That was just an idea in response to your experience. > > > 5. org.apache.kafka.streams.test vs org.apache.kafka.streams > > > > I was thinking org.apache.kafka.streams.test where also OutputVerifier > and > > ConsumerRecordFactory exist would be more logical place, but > > I do not know is there some technical reasons why TTD are in > > org.apache.kafka.streams, not in org.apache.kafka.streams.test where > other > > classes are. > > > > Did I skip something? > > Ah, no, you're right. I'm not sure why that is. I admit it's > confusing. I don't think the package matters *that* much, just keep it > wherever you think is appropriate. > > > That's all! Thanks for entertaining my thoughts. > -John >