Hi Andy,
I would suggest checking the Kafka logs and seeing whether anything useful comes out of the librdkafka stats (i.e. set "global, statistics.interval.ms, 60000" in your librdkafka.conf). Also check that, if you are adding load on top of existing load, the Kafka broker is not pegged at 100% CPU or maxing out some thread count (or, if this is a testing environment, remove the existing load and test with only the pmbgpd export; that may prove something too).
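For reference, a minimal sketch of that statistics setting, assuming it is appended to the file your kafka_config_file directive points to (/etc/pmacct/librdkafka.conf in your setup). The interval is in milliseconds, so 60000 asks librdkafka for one statistics report per minute:

```
global, statistics.interval.ms, 60000
```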
My first suggestion would have been to tune the librdkafka buffers, but you did that already. In any case this is, yes, an interaction between librdkafka and the Kafka broker. I am guessing a bit here: make sure you have recent versions of both the library and the broker.
Especially if the topic is newly provisioned, I would also suggest producing and consuming some data "by hand", for example with the kafka-console-producer.sh and kafka-console-consumer.sh scripts shipped with Kafka, to prove that data passes through without problems.
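A rough sketch of such a by-hand test, reusing the broker host, port and topic from your pmbgpd.conf. The function names produce_test and consume_test are hypothetical helpers, the scripts are assumed to be on your PATH (they normally live in the bin/ directory of the Kafka distribution), and the --bootstrap-server option on the producer assumes a reasonably recent Kafka release (older ones take --broker-list instead):

```shell
# Placeholders taken from the quoted pmbgpd.conf; adjust to your environment.
BROKER="xxxx.hostname:9092"
TOPIC="bgptest"

# Hypothetical helper: send one hand-written test message to the topic.
produce_test() {
  echo '{"test": "hello from console producer"}' |
    kafka-console-producer.sh --bootstrap-server "$BROKER" --topic "$TOPIC"
}

# Hypothetical helper: read the topic from the beginning, exit after one message.
consume_test() {
  kafka-console-consumer.sh --bootstrap-server "$BROKER" --topic "$TOPIC" \
    --from-beginning --max-messages 1
}
```

Run produce_test first and then consume_test from the same host that runs pmbgpd. If the test message comes back, the topic and the path to the broker are fine and the problem is more likely on the pmbgpd / librdkafka side.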
Paolo

On 02/09/2020 09:09, Andy Davidson wrote:
Hello!

I am feeding some BMP feeds via pmbmpd into Kafka and it’s working well. I now want to feed some BGP feeds into a Kafka topic using pmbgpd but similar configuration is causing a different behaviour.

Sep 1 22:48:00 bump pmbgpd[10992]: INFO ( default/core ): Reading configuration file '/etc/pmacct/pmbgpd.conf'.
Sep 1 22:48:00 bump pmbgpd[10992]: INFO ( default/core ): maximum BGP peers allowed: 100
Sep 1 22:48:00 bump pmbgpd[10992]: INFO ( default/core ): waiting for BGP data on 185.1.94.6:179
Sep 1 22:48:03 bump pmbgpd[10992]: INFO ( default/core ): [185.1.94.1] BGP peers usage: 1/100
Sep 1 22:48:03 bump pmbgpd[10992]: INFO ( default/core ): [185.1.94.1] Capability: MultiProtocol [1] AFI [1] SAFI [1]
Sep 1 22:48:03 bump pmbgpd[10992]: INFO ( default/core ): [185.1.94.1] Capability: 4-bytes AS [41] ASN [59964]
Sep 1 22:48:03 bump pmbgpd[10992]: INFO ( default/core ): [185.1.94.1] BGP_OPEN: Local AS: 43470 Remote AS: 59964 HoldTime: 240
Sep 1 22:48:05 bump pmbgpd[10992]: ERROR ( default/core ): Failed to produce to topic bgptest partition -1: Local: Queue full
Sep 1 22:48:05 bump pmbgpd[10992]: ERROR ( default/core ): Connection failed to Kafka: p_kafka_close()
Sep 1 22:48:05 bump systemd[1]: pmbgpd.service: Main process exited, code=killed, status=11/SEGV
Sep 1 22:48:05 bump systemd[1]: pmbgpd.service: Failed with result 'signal'.

I have verified that it's not connectivity - the topic is created at the Kafka end of the link, and I can open a tcp socket with telnet from the computer running pmbgpd and the Kafka server's port 9092

I have of course read some Github issues and list archives about the Local: Queue full fault and it suggests some librdkafka buffer and timer tweaking, I have played with various values (some of them insane) and I don't see any different behaviour logged by pmbgpd:

root@bump:/home/andy# cat /etc/pmacct/pmbgpd.conf
bgp_daemon_ip: 185.1.94.6
bgp_daemon_max_peers: 100
bgp_daemon_as: 43470
!
syslog: user
daemonize: true
!
kafka_config_file: /etc/pmacct/librdkafka.conf
!
bgp_daemon_msglog_kafka_output: json
bgp_daemon_msglog_kafka_broker_host: xxxx.hostname
bgp_daemon_msglog_kafka_broker_port: 9092
bgp_daemon_msglog_kafka_topic: bgptest

root@bump:/home/andy# cat /etc/pmacct/librdkafka.conf
global, queue.buffering.max.messages, 8000000
global, batch.num.messages, 100000
global, queue.buffering.max.messages, 20000
global, queue.buffering.max.ms, 100
global, queue.buffering.max.kbytes, 9000000
global, linger.ms, 100
global, socket.request.max.bytes, 104857600
global, socket.receive.buffer.bytes, 10485760
global, socket.send.buffer.bytes, 10485760
global, queued.max.requests, 1000

Any advice on where to troubleshoot next?

Thanks
Andy

_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists