[ https://issues.apache.org/jira/browse/KAFKA-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15533233#comment-15533233 ]
ASF GitHub Bot commented on KAFKA-4206:
---------------------------------------

GitHub user edoardocomar opened a pull request:

    https://github.com/apache/kafka/pull/1934

    KAFKA-4206 Improve handling of invalid credentials to mitigate DOS issue

    Delay closing channels for connections where a SaslException has been
    thrown.

    This PR is a proof of concept and is intended to stimulate feedback.
    The same approach has been used successfully in IBM MessageHub and
    **proved** capable of dramatically reducing the impact of SSL
    connections with wrong SASL credentials.
    Without this patch, a lot of CPU time is spent on SSL handshakes, many
    network threads are kept busy, and overall latencies suffer for clients
    that are already authenticated.

    This PR was co-developed with @mimaison.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/edoardocomar/kafka KAFKA-4206

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/1934.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1934

----
commit ee21cc5e09a4f8cd4fbd42689d85907f16ee204a
Author: Edoardo Comar <eco...@uk.ibm.com>
Date:   2016-09-29T16:16:40Z

    KAFKA-4206 Improve handling of invalid credentials to mitigate DOS issue

    Delay closing channels for connections where a SaslException has been thrown

----

> Improve handling of invalid credentials to mitigate DOS issue (especially on
> SSL listeners)
> -------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-4206
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4206
>             Project: Kafka
>          Issue Type: Improvement
>          Components: network, security
>    Affects Versions: 0.10.0.0, 0.10.0.1
>            Reporter: Edoardo Comar
>            Assignee: Edoardo Comar
>
> The current handling of invalid credentials (i.e. wrong user/password) is to
> let the {{SaslException}} thrown from an implementation of
> {{javax.security.sasl.SaslServer.evaluateResponse()}}
> bubble up the call stack until it is caught in
> {{org.apache.kafka.common.network.Selector.pollSelectionKeys()}},
> where the {{KafkaChannel}} is closed. This disconnects the client that made
> the request.
> However, this happens only after the server has used considerable resources,
> especially for the SSL handshake, which appears to be computationally
> expensive in Java.
> We have observed that if just a few clients keep repeating requests with the
> wrong credentials, it is quite easy to get all the network processing threads
> in the Kafka server busy doing SSL handshakes.
> This makes it easy for a Kafka cluster to suffer a denial-of-service attack,
> even an unintentional one.
> It can be unintentional, i.e. caused by friendly clients: for example, a
> Kafka Java client Producer supplied with the wrong credentials will not throw
> an exception on publishing, so it may keep attempting to connect without the
> caller realising.
> An easy fix, which we have implemented and will supply a PR for, is to
> considerably *delay* closing the {{KafkaChannel}} in the {{Selector}}, but
> obviously without blocking the processing thread.
> This has been tested to be very effective in reducing the CPU usage spikes
> caused by non-malicious clients using invalid SASL PLAIN credentials over SSL.
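
The idea of delaying the close without blocking the processing thread could be sketched roughly as follows. This is an illustration only and not the code in the pull request: `DelayedCloseQueue`, `closeLater` and `closeExpired` are hypothetical names, a plain `java.io.Closeable` stands in for `KafkaChannel`, and the sketch assumes it is used from a single network thread with a non-decreasing clock value passed in by the caller.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical helper, not part of Kafka: parks channels that failed
 * authentication and closes them only after a fixed delay.
 * Not thread-safe; intended to be owned by one processing thread.
 */
final class DelayedCloseQueue {

    /** A parked channel together with the time at which it may be closed. */
    private static final class Pending {
        final Closeable channel;
        final long closeAtMs;

        Pending(Closeable channel, long closeAtMs) {
            this.channel = channel;
            this.closeAtMs = closeAtMs;
        }
    }

    private final long delayMs;

    // The delay is constant and nowMs is assumed non-decreasing,
    // so insertion order is also deadline order.
    private final Deque<Pending> pending = new ArrayDeque<>();

    DelayedCloseQueue(long delayMs) {
        this.delayMs = delayMs;
    }

    /** Call where the SaslException is caught; returns immediately. */
    void closeLater(Closeable channel, long nowMs) {
        pending.addLast(new Pending(channel, nowMs + delayMs));
    }

    /**
     * Call once per poll-loop iteration. Closes only the channels whose
     * delay has elapsed and returns how many were closed. Never sleeps.
     */
    int closeExpired(long nowMs) {
        int closed = 0;
        while (!pending.isEmpty() && pending.peekFirst().closeAtMs <= nowMs) {
            Closeable channel = pending.pollFirst().channel;
            try {
                channel.close();
            } catch (IOException e) {
                // Best effort: the connection is being dropped anyway.
            }
            closed++;
        }
        return closed;
    }

    /** Number of channels still waiting to be closed. */
    int size() {
        return pending.size();
    }
}
```

In such a design the processing thread never sleeps: it records a deadline when authentication fails and checks the head of the queue on each pass through its poll loop. A client retrying with bad credentials is held on its existing connection for the length of the delay, so it cannot immediately trigger another SSL handshake.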