Lianet Magrans created KAFKA-17696:
--------------------------------------

             Summary: New consumer background operations unaware of metadata 
errors
                 Key: KAFKA-17696
                 URL: https://issues.apache.org/jira/browse/KAFKA-17696
             Project: Kafka
          Issue Type: Bug
          Components: clients, consumer
            Reporter: Lianet Magrans


When a metadata error happens (ie. Unauthorized topic), the network layer is 
the one to detect it and it just propagates it to the app thread via en 
ErrorEvent.

[https://github.com/apache/kafka/blob/0edf5dbd204df9eb62bfea1b56993e95737df5a3/clients/src/main/java/org/apache/kafka/clients/consumer/internals/NetworkClientDelegate.java#L153]

That allows api calls that processBackgroundEvents to throw the error in the 
app thread (ex. poll, unsubscribe and close, which are the only api calls that 
currently processBackgroundEvents).

This means that all other api calls that do not processBackgroundEvent will 
never know about errors like Unauthorized topics. Moreover, it really means 
that the background operations are not notified/aborted when a metadata error 
happens (auth error). Ex. call to position block waiting for the 
updateFetchPositions 
([here|https://github.com/apache/kafka/blob/0edf5dbd204df9eb62bfea1b56993e95737df5a3/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AsyncKafkaConsumer.java#L1586]),
 will leave a
pendingOffsetFetchEvent waiting to complete, even when the background already 
got an Unauthorized exception (but it only passed it to the app thread via 
ErrorEvent)
I wonder if we should ensure that metadata errors are not only propagated to 
the app thread via ErrorEvents, but also ensure that we notify all request 
managers in the background (so that they can decide if completeExceptionally 
their outstanding events). Ex. OffsetsRequestManager.onMetadataError should 
completeExceptionally the pendingOffsetFetchEvent (just first thought, there 
could be other approaches, but note that calling processBackgroundEvent in api 
calls like positions will not do because we would block first on the 
CheckAndUpdatePositions, then processBackgroundEvents that would only happen 
after the CheckAndUpdate)

 

This behaviour can be repro with the integration test 
testOffsetFetchWithNoAccess with the new consumer enabled
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to