pnowojski commented on PR #21690:
URL: https://github.com/apache/flink/pull/21690#issuecomment-1386685819

   > The other thing that looks strange if I take a step back, is that we now 
have to different while (true) loops. One in the 
MailboxProcessor#runMailboxLoop the other in SourceOperator and with 
[FLINK-30709](https://issues.apache.org/jira/browse/FLINK-30709) in the network 
source as well, doing basically the same thing.
   
   I have thought about this a bit more, and I don't see an easy to implement 
alternative, so let's keep it as it is.
   
   > After thinking about DataOutput#canEmitBatchOfRecords, I think that it 
looks strange to have DataOutput determine whether it can continue to emit 
records. The source of truth of canEmitBatchOfRecords() comes from things like 
mailbox, and it is more intuitive to keep it outside/decoupled from DataOutput.
   
   For me it's the opposite thing :) Take a look at the 
`PushingAsyncDataInput`, that defines the internal APIs for 
pushing/pulling/emitting records. It would be strange for me that in order to 
fully implement this interface and use it as it's designed, one would have to 
ask "mailbox" or anything else. Why does the implementations (source 
operator/network input) should care where does this information come from? IMO 
Implementations should just emit record and check if they can immediately emit 
next one, ideally from the returned value of the `DataOutput#emitRecord` 
method. This pattern would be also consistent with returned status from the 
`PushingAsyncDataInput#emitNext` method. And on top of that, it would save us 
one extra method call, so should be slightly faster :)
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to