Hi,
We recently enabled timestamp and security features in our production
clusters. We have 5 smaller clusters and 2 larger aggregation clusters
that mirror data from the 5 smaller ones.

The version of Kafka is 0.10.1.1.

For security, we enabled both PLAINTEXT and SASL_PLAINTEXT listeners on
the brokers, and also enabled inter-broker security and authorization.
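
For reference, a minimal sketch of the kind of broker settings this
describes. The ports, the PLAIN mechanism, and the SimpleAclAuthorizer are
assumptions for illustration, not values copied from our clusters:

```
# server.properties (illustrative sketch only)
# Ports below are placeholders.
listeners=PLAINTEXT://:9092,SASL_PLAINTEXT://:9093

# Inter-broker traffic goes over the SASL listener.
security.inter.broker.protocol=SASL_PLAINTEXT

# SASL mechanism is assumed to be PLAIN here; GSSAPI is the other common choice.
sasl.enabled.mechanisms=PLAIN
sasl.mechanism.inter.broker.protocol=PLAIN

# ACL-based authorization.
authorizer.class.name=kafka.security.auth.SimpleAclAuthorizer
```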

Enabling the above features did not have any impact on the smaller clusters,
but we saw a dramatic decrease in throughput and packets on each of the
broker servers in the aggregation clusters.
MirrorMaker was keeping up with the lag from the smaller clusters, but some
of the consumer clients that were consuming from the aggregation clusters
could no longer keep up with the load.

We also saw a lot of ISR shrinks and expands. Increasing
num.replica.fetchers and replica.lag.time.max.ms seemed to fix the ISR
issue, but we continued to see the throughput and packet issue. We then
disabled just inter-broker security, but that did not make a difference.
We finally rolled back all the security-related changes (no authentication
or authorization on the aggregation cluster), and that seemed to fix the
throughput and packet issue. Both of these metrics look normal again.
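
The replica settings mentioned above are the following broker properties.
The values shown are hypothetical examples of "increased" settings, not the
ones we actually used; the comments give the defaults for comparison:

```
# server.properties (illustrative values only)
# Default is 1 fetcher thread per source broker.
num.replica.fetchers=4

# Default is 10000 ms; a follower that lags longer than this is dropped from the ISR.
replica.lag.time.max.ms=30000
```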

Any ideas or thoughts on what could have gone wrong, or is this the
expected behavior?

Thanks,
Meghana
