Consuming data-skewed partitions with Kafka Streams causes a consumer load-balancing issue
Hello kafka-users,

I have 50 topics, each with 32 partitions, into which data is ingested continuously. The data is published by an external system (we have no control over it), which causes data skew among the partitions of each topic. For example, in topic-1, partition-1 may contain 100 events while partition-2 contains 10K events, and similarly for all 50 topics.

*We consume data from all 50 topics using Kafka Streams:*
- 4 consumer instances, all within the same consumer group
- Number of threads per consumer process: 8

Because data is not evenly distributed among partitions (data-skewed partitions across topics), I see that 1 or 2 of the consumer instances (JVMs) process far fewer records than the other 2. My guess is that these instances were assigned the partitions with less data.

*Can someone help me balance the consumers here, i.e. distribute the consumer workload evenly across the 4 instances? The expectation is that all 4 consumer instances process approximately the same number of events.*

Looking forward to hearing your inputs. Thanks in advance.

*Ankit*
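To illustrate the effect described above, here is a small self-contained Python sketch (no Kafka dependency; the per-partition event counts are made up for illustration). It compares a naive range-style assignment, where each consumer gets a contiguous block of partitions, with a greedy volume-aware assignment that always gives the next-heaviest partition to the currently least-loaded consumer:

```python
import heapq

def range_assign(loads, n_consumers):
    """Range-style: give each consumer a contiguous block of partitions."""
    per = len(loads) // n_consumers
    return [sum(loads[i * per:(i + 1) * per]) for i in range(n_consumers)]

def greedy_assign(loads, n_consumers):
    """Volume-aware: assign each partition (heaviest first) to the least-loaded consumer."""
    heap = [(0, i) for i in range(n_consumers)]  # (total_events, consumer_id)
    heapq.heapify(heap)
    for load in sorted(loads, reverse=True):
        total, cid = heapq.heappop(heap)
        heapq.heappush(heap, (total + load, cid))
    return [total for total, _ in sorted(heap, key=lambda t: t[1])]

# Hypothetical per-partition event counts for one 32-partition topic:
# 8 "hot" partitions with 10K events each, 24 with only 100 events each.
loads = [10_000] * 8 + [100] * 24

print("range assignment: ", range_assign(loads, 4))   # one consumer gets all hot partitions
print("greedy assignment:", greedy_assign(loads, 4))  # load spread evenly
```

Note that Kafka Streams assigns partitions with its own StreamsPartitionAssignor, which balances partition *counts* per instance, not record volume; the sketch only shows why a count-balanced assignment can still be load-skewed and what a volume-aware assignment would achieve.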
Inquiry about using SSL encryption and SASL authentication for Kafka without specifying an IP address in the SAN of the CA certificate
Hi Luke,

We are using Kafka 2.8.1 brokers/clients in our prod environment, with SASL_SSL communication between the Kafka clients and the broker. We use an IP address for the "bootstrap.servers" property when initiating the KafkaConsumer.

For some reason, one of our customers is unable to include the IP in the CA certificate and has provided only a hostname in the SAN entry, because of which he is getting the following exception in the logs:

org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
Caused by: javax.net.ssl.SSLHandshakeException: No subject alternative names matching IP address xx.xx.xx.xx found
    at sun.security.ssl.Alert.createSSLException(Alert.java:131)
    at sun.security.ssl.TransportContext.fatal(TransportContext.java:324)
    at sun.security.ssl.TransportContext.fatal(TransportContext.java:267)
    at sun.security.ssl.TransportContext.fatal(TransportContext.java:262)
    at sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkServerCerts(CertificateMessage.java:654)

Even after disabling the hostname verifier, he is unable to send data from the client to the broker. He has also added the IP-to-hostname mapping for the broker in the /etc/hosts file.

Can you please let us know:
1. Are both the IP and DNS fields mandatory in the SAN for Kafka certificates?
2. If not, why does the communication fail without the IP?

Regards,
Deepak Jain
Cumulus Systems
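For reference, when the certificate's SAN carries only a DNS entry, the usual approach is to point the client at that hostname rather than the IP, so the TLS endpoint check can match the DNS SAN. A minimal client configuration sketch (hostname, port, and truststore path are placeholders, not values from the thread above):

```properties
# Use the hostname from the certificate's SAN, not the IP address,
# so the TLS endpoint identification can match a DNS SAN entry.
bootstrap.servers=broker-1.example.com:9093
security.protocol=SASL_SSL
ssl.truststore.location=/path/to/truststore.jks
ssl.truststore.password=changeit
# Alternatively, to fully disable hostname verification (not recommended
# for production), set this property to an empty string:
#ssl.endpoint.identification.algorithm=
```

Note that /etc/hosts entries only affect name-to-IP resolution; if "bootstrap.servers" still contains the raw IP, the client performs an IP-based SAN check, which is consistent with the exception shown above.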
Clarification: Kafka - MySQL
Dear Team,

We want to use Kafka and have therefore installed it on Ubuntu (Ubuntu 20.04.4 LTS): Kafka 3.2.0 and ZooKeeper 3.6.3--6401e4ad2087061bc6b9f80dec2d69f2e3c8660a. With this installation, we want to connect to a MySQL server installed on the same machine. Is this possible? If so, which document should we follow? Or do we need to install something additional, such as Confluent?

With warm regards,
Alalasundaram Saravanan
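For context, the common way to connect Kafka to MySQL is Kafka Connect with a JDBC source connector (for example Confluent's kafka-connect-jdbc plugin, which is a separate download and not bundled with plain Apache Kafka). A hypothetical connector configuration, where the database name, credentials, and table are all placeholders, might look like:

```json
{
  "name": "mysql-source",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:mysql://localhost:3306/mydb",
    "connection.user": "kafka",
    "connection.password": "secret",
    "table.whitelist": "orders",
    "mode": "incrementing",
    "incrementing.column.name": "id",
    "topic.prefix": "mysql-"
  }
}
```

This would be POSTed to the Kafka Connect REST API of a running Connect worker. Debezium's MySQL connector is another commonly used option if change-data-capture from the MySQL binlog is needed rather than periodic table polling.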