[ 
https://issues.apache.org/jira/browse/KAFKA-13576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17490617#comment-17490617
 ] 

RivenSun commented on KAFKA-13576:
----------------------------------

Hi [~rsivaram], [~ijuma], [~guozhang],

Could you share any suggestions on this issue?
Thanks.

> Processor.ConnectionQueueSize provides configuration & metrics, 
> SelectorMetrics adds connection-register related metrics
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-13576
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13576
>             Project: Kafka
>          Issue Type: Improvement
>          Components: metrics, network
>    Affects Versions: 3.0.0
>            Reporter: RivenSun
>            Assignee: Luke Chen
>            Priority: Major
>
> h1. Problem:
> After all client machines were switched to the company's private BYOIP, 
> producers that send messages frequently saw a significant increase in send 
> latency. Producers that send messages infrequently often threw timeout 
> exceptions while fetching metadata for a send. Everything worked normally 
> before the switch.
> h1. Root cause:
> 1. The client's BYOIP has no DNS PTR record configured.
> 2. When the listener port uses the SASL_SSL protocol, 
> SaslChannelBuilder#buildTransportLayer, which is called underneath 
> Processor#configureNewConnections, calls 
> socketChannel.socket().getInetAddress().getHostName() and triggers a reverse 
> DNS lookup. If the client IP has no PTR record, getHostName() becomes 
> time-consuming.
> 3. The steps in the processor's run method are executed serially. If 
> configureNewConnections is slow, completed responses are not sent to the 
> client in time, which increases the ack time the producer sees when sending 
> messages.
> 4. A slow configureNewConnections also means that the elements in 
> Processor.newConnections are not removed in time, which makes 
> Acceptor#assignNewConnection slower. assignNewConnection can even block in 
> newConnections.put(socketChannel). While it is blocked, the Acceptor thread 
> may reject any new TCP connection request.
> h1. Solution:
> 1. Add a DNS PTR record for the client's BYOIP.
> 2. Newer Kafka versions have already fixed this problem:
> https://issues.apache.org/jira/browse/KAFKA-8562
> [https://github.com/apache/kafka/pull/10059]
> 3. Add *connection-register* related metrics to the SelectorMetrics of each 
> processor’s selector.
> These metrics would be updated in 
> Selector#register(String id, SocketChannel socketChannel). The metric type 
> is expected to be a histogram (newHistogram), similar to the 
> *responseQueueTimeMs* field.
> 4.
> 1) Make the queue size of Processor.newConnections configurable.
> Source code:
> {code:scala}
> private[kafka] object Processor {
>   val IdlePercentMetricName = "IdlePercent"
>   val NetworkProcessorMetricTag = "networkProcessor"
>   val ListenerMetricTag = "listener"
>   val ConnectionQueueSize = 20
> }{code}
> The value is currently hard-coded to 20, perhaps for design reasons, but it 
> would still be useful to make it configurable, either with 
> *queued.max.connections*, which applies to the processors of all ports, 
> or with an independent setting for the processors of each listener port: 
> *listener.name.\{listenerName}.queued.max.connections*
> 2) Provide a metric for the size of each processor’s newConnections queue: 
> {*}ConnectionQueueSize{*}. This metric can be modeled on the 
> *ResponseQueueSize* metric maintained in RequestChannel.
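
To make root cause 2 in the description concrete, here is a minimal Java sketch of the difference between the two InetAddress calls. It is not Kafka code: the class and method names (ReverseLookupSketch, fromLiteral, cheapPeerName, reverseLookupMillis) are made up for illustration, and 192.0.2.10 is a documentation-range address standing in for a client IP without a PTR record. getHostAddress() only formats the IP, while getHostName() on an address built from raw bytes asks the resolver for a name, which is where the delay would come from.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Illustrative only; names here are hypothetical and not part of Kafka.
class ReverseLookupSketch {

    // Builds an InetAddress from raw bytes. No name service is consulted here.
    static InetAddress fromLiteral(int a, int b, int c, int d) {
        try {
            return InetAddress.getByAddress(
                new byte[] {(byte) a, (byte) b, (byte) c, (byte) d});
        } catch (UnknownHostException e) {
            // Only thrown when the byte array has an illegal length.
            throw new IllegalArgumentException(e);
        }
    }

    // Returns the textual IP address without any DNS traffic.
    static String cheapPeerName(InetAddress addr) {
        return addr.getHostAddress();
    }

    // Times getHostName(), which performs a reverse lookup for an address
    // that was created without a host name. How long this takes depends
    // entirely on the local resolver and DNS setup.
    static long reverseLookupMillis(InetAddress addr) {
        long start = System.nanoTime();
        addr.getHostName();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        InetAddress client = fromLiteral(192, 0, 2, 10);
        System.out.println("address only: " + cheapPeerName(client));
        System.out.println("reverse lookup took (ms): " + reverseLookupMillis(client));
    }
}
```

Running main on a broker host against a real client IP gives a rough idea of how long each new connection could hold up the processor thread in this situation.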

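For root cause 4 and solution 4, the following is a rough sketch of what a configurable, observable connection queue could look like. It is an assumption-laden illustration, not the broker implementation: the real Processor is Scala and queues SocketChannel objects, whereas this sketch uses plain strings, and ConnectionQueueSketch, tryAssign and the queuedMaxConnections parameter are hypothetical names. queued.max.connections is only the setting name proposed in this issue, not an existing broker config.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative only; a stand-in for a processor's newConnections queue.
class ConnectionQueueSketch {
    private final BlockingQueue<String> newConnections;

    // queuedMaxConnections plays the role of the proposed
    // queued.max.connections setting (hard-coded to 20 today).
    ConnectionQueueSketch(int queuedMaxConnections) {
        this.newConnections = new ArrayBlockingQueue<>(queuedMaxConnections);
    }

    // Non-blocking hand-off from the acceptor side. Returns false when the
    // queue is full; a blocking put() would stall the caller instead.
    boolean tryAssign(String connectionId) {
        return newConnections.offer(connectionId);
    }

    // Processor side: drains up to max queued connections and returns how
    // many were taken. If this step is slow, the queue stays full.
    int configureNewConnections(int max) {
        int drained = 0;
        while (drained < max && newConnections.poll() != null) {
            drained++;
        }
        return drained;
    }

    // Value that a ConnectionQueueSize gauge could report.
    int connectionQueueSize() {
        return newConnections.size();
    }

    public static void main(String[] args) {
        ConnectionQueueSketch queue = new ConnectionQueueSketch(2);
        System.out.println(queue.tryAssign("conn-1"));
        System.out.println(queue.tryAssign("conn-2"));
        System.out.println(queue.tryAssign("conn-3")); // queue is full here
        System.out.println("queue size: " + queue.connectionQueueSize());
    }
}
```

With a gauge on connectionQueueSize(), a queue that sits at its capacity would point directly at a slow configureNewConnections step, which is the situation described in this issue.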


--
This message was sent by Atlassian Jira
(v8.20.1#820001)
