So the kafka performance tools seem to indicate that the problem is not in
the broker, but rather somewhere in librdkafka/OpenSSL.  I'm not completely
sure I got the configs right to try and eliminate any batching
considerations in the latency calculation (it seems like encrypting /
decrypting a batch of 1000 messages 1 time would be more efficient than
encrypting / decrypting 1 message 1000 times and I am interested in the
latter).  BUT doing a relative test of plaintext vs ssl seemed to show the
promised 20% - 50% overhead rather than the thousands of percent I am
seeing with librdkafka + OpenSSL.

Plaintext connection
/usr/local/kafka/kafka_2.11-0.10.1.0$ bin/kafka-run-class.sh
kafka.tools.EndToEndLatency <broker hostname>:9092 latency.test 10000 1 128
/usr/local/kafka/kafka_2.11-0.10.1.0/config/client.properties
...
Avg latency: 1.8739 ms

SSL Connection
bin/kafka-run-class.sh kafka.tools.EndToEndLatency <broker hostname>:9093
latency.test 10000 1 128
/usr/local/kafka/kafka_2.11-0.10.1.0/config/client-ssl.properties
...
Avg latency: 2.4234 ms

Also bin/kafka-producer-perf-test.sh --topic producer.latency.test
--num-records 20 --record-size 128 --throughput 1 --producer.config
/usr/local/kafka/kafka_2.11-0.10.1.0/config/client-ssl.properties
7 records sent, 1.3 records/sec (0.00 MB/sec), 94.7 ms avg latency, 591.0
max latency.
6 records sent, 1.0 records/sec (0.00 MB/sec), 1.8 ms avg latency, 2.0 max
latency.
5 records sent, 1.0 records/sec (0.00 MB/sec), 2.8 ms avg latency, 3.0 max
latency.
20 records sent, 1.022809 records/sec (0.00 MB/sec), 34.75 ms avg latency,
591.00 ms max latency, 3 ms 50th, 591 ms 95th, 591 ms 99th, 591 ms 99.9th.

Seems to show decent latency after the initial SSL handshake as well.

So I will try to look harder at how librdkafka + OpenSSL are doing SSL.  If
I figure anything out, I'll do one last follow up email to save someone
else with this stack a similar headache.

Thanks for teaching me about the command line tools, guys!
- Aaron



On Fri, Nov 18, 2016 at 2:59 PM, Aaron Wilkinson <aa...@modopayments.com>
wrote:

> Thank you both, Hans and Rajini.
>
> I will try out all the methods you suggested and report back.
>
> As an aside my investigation into the known, slow software implementation
> of the GCM class of cipher algorithms in java 8 was a bust.  I tried all of
> the default cipher suites common to OpenSSL (on the client) and java (on
> the broker) and they all gave consistent (slow) results of about 40 ms per
> hop.
>
> For posterity at the time of this writing those were (OpenSSL format):
> DHE-DSS-AES256-GCM-SHA384
> DHE-DSS-AES256-SHA256
> DHE-DSS-AES256-SHA
> DHE-DSS-AES128-GCM-SHA256
> DHE-DSS-AES128-SHA256
> DHE-DSS-AES128-SHA
> EDH-DSS-DES-CBC3-SHA
>
> I can't guarantee that I'm not looking at a problem where the java crypto
> module is not using hardware acceleration.  (I've verified that OpenSSL has
> access to the aesni hardware instructions, but I have no idea how to tell
> if the java crypto module is making use of them.)  However, it would appear
> that it is at least not a problem specific to the GCM algorithm.
>
> - Aaron
>
>
> On Fri, Nov 18, 2016 at 2:37 AM, Rajini Sivaram <
> rajinisiva...@googlemail.com> wrote:
>
>> You can use the tools shipped with Kafka to measure latency.
>>
>> For latency at low load, run:
>>
>>
>>    - bin/kafka-run-class.sh kafka.tools.EndToEndLatency
>>
>>
>> You may also find it useful to run producer performance test at different
>> throughputs. The tool prints out latency as well:
>>
>>
>>    - bin/kafka-producer-perf-test.sh
>>
>>
>> On Fri, Nov 18, 2016 at 1:25 AM, Hans Jespersen <h...@confluent.io>
>> wrote:
>>
>> > Publish lots of messages and measure in seconds or minutes. Otherwise
>> you
>> > are just benchmarking the initial SSL handshake setup time which should
>> > normally be a one time overhead, not a per message overhead. If you just
>> > send one message then of course SSL is much slower.
>> >
>> > -hans
>> >
>> > > On Nov 18, 2016, at 1:07 AM, Aaron Wilkinson <aa...@modopayments.com>
>> > wrote:
>> > >
>> > > Hi, Hans.  I was able to get the command line producer / consumer
>> working
>> > > with SSL but I'm not sure how to measure millisecond resolution
>> latency
>> > > with them.  I thought maybe the '--property print.timestamp=true'
>> > argument
>> > > would help, but only has second resolution.  Do you know of any way to
>> > get
>> > > the consumer to print out a receipt time-stamp with millisecond
>> > > resolution?  Or of any extended documentation on the command line
>> tools
>> > in
>> > > general?
>> > >
>> > > Oh also, a couple other tidbits that may help:
>> > > Ubuntu 16.04
>> > > Kafka 10.1.0
>> > > openjdk version "1.8.0_111"
>> > > TLS 1.2
>> > >
>> > > I was wondering if maybe this could be my problem:
>> > > http://stackoverflow.com/questions/25992131/slow-aes-
>> > gcm-encryption-and-decryption-with-java-8u20
>> > >
>> > > I didn't specify any cipher suites in either the broker or the client
>> > > config which I gather leaves it up to the broker/client to decide
>> during
>> > > TLS handshaking.  I'm not sure if there is an easy way to figure out
>> > which
>> > > one they ended up with...  I'll work on specifying which cipher suite
>> I
>> > > want and try to pick something with which java is simpatico.
>> > >
>> > >
>> > >> On Thu, Nov 17, 2016 at 4:04 PM, Hans Jespersen <h...@confluent.io>
>> > wrote:
>> > >>
>> > >> What is the difference using the bin/kafka-console-producer and
>> > >> kafka-console-consumer as pub/sub clients?
>> > >>
>> > >> see http://docs.confluent.io/3.1.0/kafka/ssl.html
>> > >>
>> > >> -hans
>> > >>
>> > >> /**
>> > >> * Hans Jespersen, Principal Systems Engineer, Confluent Inc.
>> > >> * h...@confluent.io (650)924-2670
>> > >> */
>> > >>
>> > >> On Thu, Nov 17, 2016 at 11:56 PM, Aaron Wilkinson <
>> > aa...@modopayments.com>
>> > >> wrote:
>> > >>
>> > >>> Pardon if this is a oft repeated issue, but all the information I
>> could
>> > >>> find said I should expect a 20-50% performance hit when using SSL
>> with
>> > >>> kafka, and I am seeing closer to 2000-3000%
>> > >>>
>> > >>> I'm trying to get kafka to behave like a fast, secured message bus.
>> > So I
>> > >>> am sending small messages, one at a time.  I have set up a simple, 2
>> > >>> machine experiment in AWS with 1 client machine and 1
>> zookeeper/broker
>> > >>> machine and I'm an running a very linear test.
>> > >>>
>> > >>> There are 2 topics: "request" and "response" and 2 threads on the
>> > client
>> > >>> machine each of which connects to those 2 topics.  Thread 1
>> produces a
>> > >>> "request", thread 2 consumes it and then produces a "response" which
>> > >> thread
>> > >>> 1 then consumes.  At that point thread 1 proceeds to send the next
>> > >>> "request" and the process repeats.
>> > >>>
>> > >>> So there are a total of 4 connections to the broker.
>> > >>>
>> > >>> I can run a sustained test without SSL and see 1 to 1.5 ms per
>> message
>> > >> hop
>> > >>> (where a "hop" means the message has traveled across 1 of the 4
>> > >>> connections- either a production or a consumption of either the
>> request
>> > >> or
>> > >>> the response).
>> > >>>
>> > >>> Each connection for which I turn on SSL increases the hop time 35
>> to 45
>> > >> ms.
>> > >>>
>> > >>> Now, the problem could be with the stack I'm using (PHP 7 talking to
>> > the
>> > >>> broker via the librdkafka C library).  But before I go about trying
>> to
>> > >>> reproduce this with a java client (which is not my forte) I was
>> > wondering
>> > >>> if anyone else has run into a similar issue either with PHP or any
>> > other
>> > >>> language / library.  Or does anyone know a direct way to figure out
>> > >> whether
>> > >>> this slow down is at the broker or at the client?
>> > >>>
>> > >>> Thanks in advance for your help!
>> > >>> Aaron
>> > >>>
>> > >>
>> >
>>
>>
>>
>> --
>> Regards,
>>
>> Rajini
>>
>
>

Reply via email to