Hi Jukka, thanks for the reply! I think this is a good summary (the discussion was getting a little unwieldy. I'll reply inline.
Also, thanks for clarify about your library vs. this KIP. That makes perfect sense to me. > > 1. Add JavaDoc for KIP > > Is there a good example of KIP where Javadoc is included, so I can follow? > I create this KIP based on this as an example:: > https://cwiki.apache.org/confluence/display/KAFKA/KIP-247%3A+Add+public+test+utils+for+Kafka+Streams > > > Now added some comments to KIP page to clarify timestamp handling, but I > did not want to add full JavaDoc of each methods. > Is that enough? That's fine. I was just trying to make the review process more efficient for other reviewers (which makes getting your KIP accepted more efficient). I reviewed a few recent KIPs, and, indeed, I see that javadocs are not actually as common as I thought. > 2. TTD usability changes and swapping ConsumerRecord and ProducerRecord in > APIs > > To my point of view only: > - changing readRecord to return ConsumerRecord would cause we cannot use > OutputVerifier Yes, we'd likely have to provide new methods in OutputVerifier to work with ConsumerRecord. If you buy into the plan of deprecating most of the current-style interactions, this wouldn't be that confusing, since all the ProducerRecord verifications would be deprecated, and only the ConsumerRecord verifications would remain "live". > - changing pipeInput to take in ProducerRecord, but not providing easy way > to contruct those like ConsumerRecordFactory I didn't follow this as well. The ConsumerRecordFactory is there because it's a pain to construct ConsumerRecords. Conversely, ProducerRecord has many convenience constructors, so we wouldn't need a factory at all. This is a net win for users, since there's less surface area for them to deal with. Under my proposal, we'd deprecate the whole ConsumerRecordFactory. Note that there's an "idea parity check" here: ConsumerRecords are hard to construct because developers aren't meant to ever construct them. They are meant to construct ProducerRecords, which is why it's made easy. TTD has inverted the relationships of these classes, which is why the ConsumerRecordFactory is necessary, but if we correct it, and return to a "normal" interaction with the Client library, then we don't need special support classes. > - if initializing ConsumerRecord to/from ProducerRecord in these classes > field by field contructor, there are risk new fields are not added to this > classes if there are changes in ProducerRecord or ConsumerRecord This risk seems pretty low, to be honest. We will have tests that exercise this testing framework, so if anyone changes ProducerRecord or ConsumerRecord, our tests will break. Since both libraries are build together, the problem would be fixed before the change is ever merged to trunk. > I would propose a separate KIP for these and probably other enhanchements: > -superclass or common interface for ConsumerRecord and ProducerRecord > -contructors to ConsumerRecord and ProducerRecord to initialize with this > superclass > -modify OutputVerifier to work with both ConsumerRecord and ProducerRecord > -create new RecordFactory to replace ConsumerRecordFactory I understand your motivation to control the scope of this change, but I actually think that it's better for user-facing design changes to occur in fewer, bigger, chunks, rather than many small changes. People will get fatigued if multiple releases in a row change the test-support library from under their feet. Better to do it in one shot. Plus, this is a design discussion. We need to include the whole scope of the system in the design, or we may realize in Phase 3 that there was some design error in Phase 1, since we were only designing locally. This doesn't mean that we only need one Jira ticket, there can be many in support of this KIP, or that we only need one PR, it's certainly better to send multiple small PRs to decrease risk and ease reviews. But the design discussion doesn't need to be fragmented similarly. > 3. return null vs NoSuchElementException when empty queue > > Should this be also included to the above TTD usability changes? > If single row read methods is changed to throw expectiong, it would require > addition of hasRecords to able to verified the empty queue scenarios. > I do not know how to implement it currently without modifying TTD to > provide some kind way to get the queue size or peak items. Yes, it's absolutely within bounds to propose changes to TTD to support the ergonomic API you're proposing. > 4. IllegalArgumentException("topic doesn't exist") > Is this worth separate ticket? This is your call. That was just an idea in response to your experience. > 5. org.apache.kafka.streams.test vs org.apache.kafka.streams > > I was thinking org.apache.kafka.streams.test where also OutputVerifier and > ConsumerRecordFactory exist would be more logical place, but > I do not know is there some technical reasons why TTD are in > org.apache.kafka.streams, not in org.apache.kafka.streams.test where other > classes are. > > Did I skip something? Ah, no, you're right. I'm not sure why that is. I admit it's confusing. I don't think the package matters *that* much, just keep it wherever you think is appropriate. That's all! Thanks for entertaining my thoughts. -John