Consuming data-skewed partitions using Kafka Streams causes a consumer load-balancing issue

2022-07-07 Thread ankit Soni
Hello kafka-users,

I have 50 topics, each with 32 partitions, into which data is being ingested
continuously.

Data is published to these 50 topics externally (we have no control over it),
which causes data skew among the partitions of each topic.

For example, in topic-1, partition-1 contains 100 events while partition-2 may
have 10K events, and so on for all 50 topics.

*Consuming data from all 50 topics using the Kafka Streams mechanism:*

   - Running 4 consumer instances, all within the same consumer group.
   - Number of threads per consumer process: 8


As data among partitions is not evenly distributed (data-skewed partitions
across topics), I see that 1 or 2 consumer instances (JVMs) process/consume
far fewer records than the other 2 instances. My guess is that these instances
are assigned the partitions with less data.
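
For reference, here is a minimal sketch of the Kafka Streams setup described
above, written against the plain Apache Kafka Streams API. The application id,
broker address, topic pattern and processing logic are placeholders, not our
real values; all 4 instances run the same code, so sharing the application.id
puts them into one consumer group, and num.stream.threads gives 8 threads per
process:

import java.util.Properties;
import java.util.regex.Pattern;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class SkewedTopicsConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Same application.id on all 4 instances -> one consumer group.
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "skewed-topics-app");   // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");      // placeholder
        // 8 stream threads per consumer process, as described above.
        props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 8);
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Subscribe to all 50 topics via a pattern (assumes a common naming convention).
        builder.stream(Pattern.compile("topic-.*"))
               .foreach(SkewedTopicsConsumer::process);

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }

    private static void process(Object key, Object value) {
        // placeholder for the per-record processing logic
    }
}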

*Can someone help with how to balance the consumers here (distribute the
consumer workload evenly across the 4 consumer instances)? The expectation is
that all 4 consumer instances should process approximately the same number of
events.*

Looking forward to your inputs.

Thanks in advance.

*Ankit.*


Inquiry about using SSL encryption and SASL authentication for Kafka without specifying the IP address in the SAN of the CA certificate

2022-07-07 Thread Deepak Jain
Hi Luke,

We are using a Kafka 2.8.1 broker/client setup in our production environment, with
SASL_SSL communication between the Kafka clients and the broker. We use the IP
address for the “bootstrap.servers” property when initiating the KafkaConsumer. For
some reason, one of our customers is unable to use the IP in the CA certificate and
has provided only the hostname in the certificate's SAN entry, due to which he is
getting the following exception in the logs:

org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
Caused by: javax.net.ssl.SSLHandshakeException: No subject alternative names 
matching IP address xx.xx.xx.xx found
at sun.security.ssl.Alert.createSSLException(Alert.java:131)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:324)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:267)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:262)
at sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkServerCerts(CertificateMessage.java:654)

Even after disabling the hostname verifier, he is unable to send data from the
client to the broker. He has also added the IP-to-hostname mapping for the
broker in the /etc/hosts file.
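
For completeness, a minimal sketch of the client-side configuration described
above, assuming the Java consumer and the PLAIN SASL mechanism; the IP, group
id, truststore path and credentials are placeholders. Setting
ssl.endpoint.identification.algorithm to an empty string is the standard way
to disable hostname verification on the Java client:

import java.util.Properties;

import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.config.SaslConfigs;
import org.apache.kafka.common.config.SslConfigs;
import org.apache.kafka.common.serialization.StringDeserializer;

public class SaslSslConsumerConfig {
    public static KafkaConsumer<String, String> buildConsumer() {
        Properties props = new Properties();
        // IP address used in bootstrap.servers, as in the setup above (placeholder IP).
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "10.0.0.1:9093");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "example-group");              // placeholder
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        // SASL_SSL between client and broker.
        props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");
        props.put(SaslConfigs.SASL_MECHANISM, "PLAIN");                          // placeholder mechanism
        props.put(SaslConfigs.SASL_JAAS_CONFIG,
                "org.apache.kafka.common.security.plain.PlainLoginModule required "
                        + "username=\"user\" password=\"secret\";");             // placeholder credentials

        // Truststore holding the CA that signed the broker certificate (placeholder path).
        props.put(SslConfigs.SSL_TRUSTSTORE_LOCATION_CONFIG, "/path/to/truststore.jks");
        props.put(SslConfigs.SSL_TRUSTSTORE_PASSWORD_CONFIG, "changeit");

        // Empty string disables hostname verification (the "hostname verifier" mentioned above).
        props.put(SslConfigs.SSL_ENDPOINT_IDENTIFICATION_ALGORITHM_CONFIG, "");

        return new KafkaConsumer<>(props);
    }
}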

Can you please let us know:


  1.  Are both the IP and DNS fields mandatory in the SAN for Kafka certificates?
  2.  If not, why is the communication failing without the IP?


Regards,
Deepak Jain
Cumulus Systems


Clarification: Kafka - MySQL

2022-07-07 Thread Alalasundaram Saravanan

Dear Team,

We wanted to use Kafka and therefore installed it on Ubuntu (Ubuntu
20.04.4 LTS): Kafka 3.2.0 and ZooKeeper
3.6.3--6401e4ad2087061bc6b9f80dec2d69f2e3c8660a.



With this installation, we want to connect to a MySQL server installed
on the same machine.


Is it possible? If so, what documentation should we follow?

Or do we need to install something additional, like Confluent, etc.?
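
To illustrate what we have in mind, here is a minimal sketch, assuming the
kafka-clients library and the MySQL JDBC driver are on the classpath, of a
plain Java program that reads rows from MySQL and produces them to a Kafka
topic; the database, table, topic and credentials below are hypothetical. Is
something like this a reasonable approach, or should we use Kafka Connect /
Confluent instead?

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class MySqlToKafka {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Kafka and MySQL both run on the same host in our setup.
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (Connection conn = DriverManager.getConnection(
                     "jdbc:mysql://localhost:3306/testdb", "user", "password");  // placeholder DB
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT id, payload FROM events"); // placeholder table
             KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {

            while (rs.next()) {
                // One Kafka record per MySQL row, keyed by the row id.
                producer.send(new ProducerRecord<>("mysql-events",               // placeholder topic
                        rs.getString("id"), rs.getString("payload")));
            }
            producer.flush();
        }
    }
}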


With warm regards

Alalasundaram Saravanan