Hi Andy,

I would suggest checking the Kafka logs and seeing whether anything useful comes out of the librdkafka stats (i.e. set "global, statistics.interval.ms, 60000" in your librdkafka.conf). Also check that, if you are adding load on top of existing load, the Kafka broker is not pegging the CPU at 100% or maxing out some thread count (or, if this is a testing environment, remove the existing load and test with only the pmbgpd export; that may prove something too).
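
For reference, this is how the stats line would sit in librdkafka.conf, in the same "global, key, value" form as your existing entries. Take it as a sketch only: 60000 ms (one stats dump per minute) is just the interval from my example, and I am assuming any stats output lands wherever pmbgpd is already logging.

```
global, statistics.interval.ms, 60000
```

A shorter interval gives more data points while you reproduce the problem, at the cost of noisier logs.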

My first suggestion would have been to tune the buffers in librdkafka, but you did that already. In any case this is, yes, an interaction between librdkafka and the Kafka broker. I am guessing a bit here: make sure you have recent versions of both the library and the broker.

Especially if the topic is newly provisioned, I would also suggest producing / consuming some data "by hand", for example using the kafka-console-producer.sh and kafka-console-consumer.sh scripts shipped with Kafka, to prove that data passes through with no problem.
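
A rough sketch of such a by-hand test follows. The assumptions are mine: Kafka's bin/ directory is on the PATH, the broker and topic are the ones from your pmbgpd.conf, and the function names and the test message are made up for illustration. Depending on the Kafka version the producer script takes --broker-list, --bootstrap-server, or both, so adjust as needed.

```shell
# Assumption: broker host/port and topic as in the quoted pmbgpd.conf.
BROKER="xxxx.hostname:9092"
TOPIC="bgptest"

# Produce a single test message (hypothetical payload).
# Older Kafka releases expect --broker-list; newer ones accept --bootstrap-server.
produce_test() {
  echo '{"test": "hello"}' |
    kafka-console-producer.sh --broker-list "$BROKER" --topic "$TOPIC"
}

# Read the topic from the beginning; stop after one message or 10 seconds.
consume_test() {
  kafka-console-consumer.sh --bootstrap-server "$BROKER" --topic "$TOPIC" \
    --from-beginning --max-messages 1 --timeout-ms 10000
}
```

Run produce_test from the host running pmbgpd, then consume_test. If the message comes back, the topic and the path to the broker are fine and the problem is more likely on the pmbgpd / librdkafka side.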

Paolo

On 02/09/2020 09:09, Andy Davidson wrote:
Hello!

I am feeding some BMP feeds via pmbmpd into Kafka and it’s working well.  I now 
want to feed some BGP feeds into a Kafka topic using pmbgpd but similar 
configuration is causing a different behaviour.

Sep  1 22:48:00 bump pmbgpd[10992]: INFO ( default/core ): Reading 
configuration file '/etc/pmacct/pmbgpd.conf'.
Sep  1 22:48:00 bump pmbgpd[10992]: INFO ( default/core ): maximum BGP peers 
allowed: 100
Sep  1 22:48:00 bump pmbgpd[10992]: INFO ( default/core ): waiting for BGP data 
on 185.1.94.6:179
Sep  1 22:48:03 bump pmbgpd[10992]: INFO ( default/core ): [185.1.94.1] BGP 
peers usage: 1/100
Sep  1 22:48:03 bump pmbgpd[10992]: INFO ( default/core ): [185.1.94.1] 
Capability: MultiProtocol [1] AFI [1] SAFI [1]
Sep  1 22:48:03 bump pmbgpd[10992]: INFO ( default/core ): [185.1.94.1] 
Capability: 4-bytes AS [41] ASN [59964]
Sep  1 22:48:03 bump pmbgpd[10992]: INFO ( default/core ): [185.1.94.1] 
BGP_OPEN: Local AS: 43470 Remote AS: 59964 HoldTime: 240
Sep  1 22:48:05 bump pmbgpd[10992]: ERROR ( default/core ): Failed to produce 
to topic bgptest partition -1: Local: Queue full
Sep  1 22:48:05 bump pmbgpd[10992]: ERROR ( default/core ): Connection failed 
to Kafka: p_kafka_close()
Sep  1 22:48:05 bump systemd[1]: pmbgpd.service: Main process exited, 
code=killed, status=11/SEGV
Sep  1 22:48:05 bump systemd[1]: pmbgpd.service: Failed with result 'signal'.

I have verified that it's not connectivity: the topic is created at the Kafka 
end of the link, and I can open a TCP socket with telnet from the computer 
running pmbgpd to the Kafka server's port 9092.

I have of course read some GitHub issues and list archives about the "Local: 
Queue full" fault, and they suggest some librdkafka buffer and timer tweaking. 
I have played with various values (some of them insane) and I don't see any 
different behaviour logged by pmbgpd:

root@bump:/home/andy# cat /etc/pmacct/pmbgpd.conf
bgp_daemon_ip: 185.1.94.6
bgp_daemon_max_peers: 100
bgp_daemon_as: 43470
!
syslog: user
daemonize: true
!
kafka_config_file: /etc/pmacct/librdkafka.conf
!
bgp_daemon_msglog_kafka_output: json
bgp_daemon_msglog_kafka_broker_host: xxxx.hostname
bgp_daemon_msglog_kafka_broker_port: 9092
bgp_daemon_msglog_kafka_topic: bgptest

root@bump:/home/andy# cat /etc/pmacct/librdkafka.conf
global, queue.buffering.max.messages, 8000000
global, batch.num.messages, 100000
global, queue.buffering.max.messages, 20000
global, queue.buffering.max.ms, 100
global, queue.buffering.max.kbytes, 9000000
global, linger.ms, 100
global, socket.request.max.bytes, 104857600
global, socket.receive.buffer.bytes, 10485760
global, socket.send.buffer.bytes, 10485760
global, queued.max.requests, 1000

Any advice on where to troubleshoot next?


Thanks
Andy

_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists


