pnowojski commented on PR #21690: URL: https://github.com/apache/flink/pull/21690#issuecomment-1386685819
> The other thing that looks strange if I take a step back, is that we now have to different while (true) loops. One in the MailboxProcessor#runMailboxLoop the other in SourceOperator and with [FLINK-30709](https://issues.apache.org/jira/browse/FLINK-30709) in the network source as well, doing basically the same thing. I have thought about this a bit more, and I don't see an easy to implement alternative, so let's keep it as it is. > After thinking about DataOutput#canEmitBatchOfRecords, I think that it looks strange to have DataOutput determine whether it can continue to emit records. The source of truth of canEmitBatchOfRecords() comes from things like mailbox, and it is more intuitive to keep it outside/decoupled from DataOutput. For me it's the opposite thing :) Take a look at the `PushingAsyncDataInput`, that defines the internal APIs for pushing/pulling/emitting records. It would be strange for me that in order to fully implement this interface and use it as it's designed, one would have to ask "mailbox" or anything else. Why does the implementations (source operator/network input) should care where does this information come from? IMO Implementations should just emit record and check if they can immediately emit next one, ideally from the returned value of the `DataOutput#emitRecord` method. This pattern would be also consistent with returned status from the `PushingAsyncDataInput#emitNext` method. And on top of that, it would save us one extra method call, so should be slightly faster :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org