weeco commented on pull request #10807:
URL: https://github.com/apache/kafka/pull/10807#issuecomment-853223533


   Hey @tombentley , thanks for tackling this! I understand your performance 
concerns in regards to the quota implementation; I can't judge if a quota for 
that is actually too expensive. However here are my thoughts regarding your PR:
   
   Unless I missunderstand your proposed changes here, I believe that logging 
the fetch session slot evictor is currently not super helpful because:
   1. I believe fetch sessions are expected to be evicted. For me it seems like 
fetch sessions are not proactively relived from the session slots cache. In my 
clusters I can see the fetch session slots slowly growing over time (hours 
until all slots are taken) and then fetch sessions are regularly evicted at a 
rate of 1-3 evictions / second. I noticed this by tracking this via the JMX 
metrics `IncrementalFetchSessionEvictionsPerSec` and 
`NumIncrementalFetchSessions` (introduced in KIP-227)
   2. A potential attacker (or misbehaving client such as sarama v1.26.0) can 
quickly create new fetch sessions (thousands of fetch sessions per seconds) 
which would then cause one log line for each eviction. This again can cause 
high CPU usage (for printing the log line) or at least cause problems for many 
logging pipelines.
   
   **I could envision two other solutions to mitigate the issue around leaking 
fetch sessions:**
   
   1. Create one fetch session slot cache per client with a fixed, relatively 
low number of available fetch sessions. This way a single client can not 
negatively impact other clients. Possibly this is even a performance 
improvement because the lock contention as required in one large cache that is 
concurrently accessed from many fetch different fetch requests may be reduced?
   2. Introduce a new counter metric that logs the number of fetch sessions 
created per client id. Something like 
`IncrementalFetchSessionsCreatedPerSec<ClientId=([-.\w]+)>`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to