Hi All,

I have been running a Streams application for some time. It runs fine at first, but after a day or two I see the log below being printed continuously to the console.

WARN 2018-02-05 02:50:04.060 [kafka-producer-network-thread | producer-1] org.apache.kafka.clients.NetworkClient - Connection to node -1 could not be established. Broker may not be available.
WARN 2018-02-05 02:50:04.160 [kafka-producer-network-thread | producer-1] org.apache.kafka.clients.NetworkClient - Connection to node -1 could not be established. Broker may not be available.
WARN 2018-02-05 02:50:04.261 [kafka-producer-network-thread | producer-1] org.apache.kafka.clients.NetworkClient - Connection to node -1 could not be established. Broker may not be available.
WARN 2018-02-05 02:50:04.311 [kafka-producer-network-thread | producer-1] org.apache.kafka.clients.NetworkClient - Connection to node -1 could not be established. Broker may not be available.
WARN 2018-02-05 02:50:04.361 [kafka-producer-network-thread | producer-1] org.apache.kafka.clients.NetworkClient - Connection to node -1 could not be established. Broker may not be available.
WARN 2018-02-05 02:50:04.411 [kafka-producer-network-thread | producer-1] org.apache.kafka.clients.NetworkClient - Connection to node -1 could not be established. Broker may not be available.

While this is happening the application is still able to process messages, but I can see lag building up in the consumers, and the processing time for a batch has increased 15-fold.

I am using a single ZooKeeper instance with 2 brokers and 4 application instances. I checked the broker and ZooKeeper status, and as far as I can see they are all running fine. I have also verified connectivity between the application and broker instances using telnet, and it seems intact. The Kafka broker and Streams/client versions are 0.11.0.2.
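
For reference, the telnet check mentioned above amounts to roughly the following. This is only a minimal sketch: the class name `BrokerReachability`, the `canConnect` helper, and the 2-second timeout are made up for illustration, and the addresses are the broker endpoints registered in ZooKeeper (172.31.10.35:9092 and 172.31.14.8:9092).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class BrokerReachability {

    /** Returns true if a plain TCP connection to host:port succeeds within timeoutMs. */
    public static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, unreachable, or timed out: treat all of these as "not reachable".
            return false;
        }
    }

    public static void main(String[] args) {
        // Broker endpoints as registered in ZooKeeper under /brokers/ids/0 and /brokers/ids/1.
        String[] brokers = {"172.31.10.35:9092", "172.31.14.8:9092"};
        for (String broker : brokers) {
            int sep = broker.lastIndexOf(':');
            String host = broker.substring(0, sep);
            int port = Integer.parseInt(broker.substring(sep + 1));
            // 2000 ms is an arbitrary timeout chosen for this sketch.
            System.out.println(broker + " reachable: " + canConnect(host, port, 2000));
        }
    }
}
```

Like telnet, this only shows that the port accepts a TCP connection at the moment of the check; it does not exercise the Kafka protocol or tell me which address the producer is using for node -1.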

The broker status as reported by ZooKeeper is below.

[root@app100 kafka]# echo dump | nc localhost 2181
SessionTracker dump:
Session Sets (3):
0 expire at Mon Feb 05 06:16:39 UTC 2018:
1 expire at Mon Feb 05 06:16:42 UTC 2018:
        0x161562860970001
1 expire at Mon Feb 05 06:16:45 UTC 2018:
        0x161562860970000
ephemeral nodes dump:
Sessions with Ephemerals (2):
0x161562860970000:
        /brokers/ids/0
        /controller
0x161562860970001:
        /brokers/ids/1

[root@app100 kafka]# ./kafka_2.11-0.11.0.2/bin/zookeeper-shell.sh localhost:2181 <<< "get /brokers/ids/0"
Connecting to localhost:2181
Welcome to ZooKeeper!
JLine support is disabled

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},"endpoints":["PLAINTEXT://172.31.10.35:9092"],"jmx_port":55555,"host":"172.31.10.35","timestamp":"1517569007467","port":9092,"version":4}
cZxid = 0x1c
ctime = Fri Feb 02 10:56:47 UTC 2018
mZxid = 0x1c
mtime = Fri Feb 02 10:56:47 UTC 2018
pZxid = 0x1c
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x161562860970000
dataLength = 197
numChildren = 0

[root@app100 kafka]# ./kafka_2.11-0.11.0.2/bin/zookeeper-shell.sh localhost:2181 <<< "get /brokers/ids/1"
Connecting to localhost:2181
Welcome to ZooKeeper!
JLine support is disabled

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},"endpoints":["PLAINTEXT://172.31.14.8:9092"],"jmx_port":55555,"host":"172.31.14.8","timestamp":"1517569016562","port":9092,"version":4}
cZxid = 0x21
ctime = Fri Feb 02 10:56:56 UTC 2018
mZxid = 0x21
mtime = Fri Feb 02 10:56:56 UTC 2018
pZxid = 0x21
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x161562860970001
dataLength = 195
numChildren = 0

Could you please throw some light on what could be going wrong here?

Thanks,
Tony