yomipq opened a new issue, #1873: URL: https://github.com/apache/cassandra-gocql-driver/issues/1873
I'm having trouble connecting to Amazon Keyspaces (for Apache Cassandra). My program on EC2 connects to Amazon Keyspaces without any issues for a while, but after a few days or weeks it loses the connection, and every query fails with the error below.

```
gocql: no hosts available in the pool
```

Go version: 1.23.4
GoCQL version: 1.7.0

I built the program with gocql_debug enabled and got the following logs.

```
2025/03/25 03:20:29 gocql: Session.handleNodeConnected: 172.16.1.14:9142
2025/03/25 03:20:29 gocql: conns of pool after stopped "172.16.1.14": 2
2025/03/25 03:20:29 gocql: Session.handleNodeConnected: 172.16.1.28:9142
2025/03/25 03:20:29 gocql: conns of pool after stopped "172.16.1.28": 2
2025/03/25 03:21:29 Session.ring:[172.16.1.14:UP][172.16.1.28:UP]
...
2025/03/26 15:11:11 gocql: unable to dial "[HostInfo hostname=\"\" connectAddress=\"127.0.0.1\" peer=\"<nil>\" rpc_address=\"127.0.0.1\" broadcast_address=\"127.0.0.1\" preferred_ip=\"<nil>\" connect_addr=\"127.0.0.1\" connect_addr_source=\"connect_address\" port=9142 data_centre=\"ap-northeast-1\" rack=\"ap-northeast-1\" host_id=\"be0f3a14-e107-3fee-a5e5-415c10539abd\" version=\"v3.11.2\" state=UP num_tokens=0]": dial tcp 127.0.0.1:9142: connect: connection refused
2025/03/26 15:11:11 gocql: filling stopped "127.0.0.1": dial tcp 127.0.0.1:9142: connect: connection refused
2025/03/26 15:11:11 gocql: conns of pool after stopped "127.0.0.1": 0
2025/03/26 15:11:11 gocql: Session.handleNodeDown: 127.0.0.1:9142
2025/03/26 15:11:11 gocql: unable to refresh ring: get existing host=[HostInfo hostname="" connectAddress="172.16.1.14" peer="172.16.1.14" rpc_address="172.16.1.14" broadcast_address="<nil>" preferred_ip="172.16.1.14" connect_addr="172.16.1.14" connect_addr_source="connect_address" port=9142 data_centre="ap-northeast-1" rack="ap-northeast-1" host_id="be0f3a14-e107-3fee-a5e5-415c10539abd" version="v3.11.2" state=UP num_tokens=1] from prevHosts: cannot find host
2025/03/26 15:11:29 Session.ring:[127.0.0.1:DOWN][172.16.1.28:UP]
...
2025/03/26 22:43:35 gocql: unable to dial "[HostInfo hostname=\"\" connectAddress=\"127.0.0.1\" peer=\"<nil>\" rpc_address=\"127.0.0.1\" broadcast_address=\"127.0.0.1\" preferred_ip=\"<nil>\" connect_addr=\"127.0.0.1\" connect_addr_source=\"connect_address\" port=9142 data_centre=\"ap-northeast-1\" rack=\"ap-northeast-1\" host_id=\"b666465e-cb85-3efa-b3ab-f6cf139e5a39\" version=\"v3.11.2\" state=UP num_tokens=0]": dial tcp 127.0.0.1:9142: connect: connection refused
2025/03/26 22:43:35 gocql: filling stopped "127.0.0.1": dial tcp 127.0.0.1:9142: connect: connection refused
2025/03/26 22:43:35 gocql: conns of pool after stopped "127.0.0.1": 0
2025/03/26 22:43:35 gocql: Session.handleNodeDown: 127.0.0.1:9142
2025/03/26 22:43:35 gocql: unable to refresh ring: get existing host=[HostInfo hostname="" connectAddress="172.16.1.28" peer="172.16.1.28" rpc_address="172.16.1.28" broadcast_address="<nil>" preferred_ip="172.16.1.28" connect_addr="172.16.1.28" connect_addr_source="connect_address" port=9142 data_centre="ap-northeast-1" rack="ap-northeast-1" host_id="b666465e-cb85-3efa-b3ab-f6cf139e5a39" version="v3.11.2" state=UP num_tokens=1] from prevHosts: cannot find host
2025/03/26 22:44:29 Session.ring:[127.0.0.1:DOWN][127.0.0.1:DOWN]
```

On startup, the session has two hosts, 172.16.1.14 and 172.16.1.28. After a while, the connection to 172.16.1.14 was lost with the error `cannot find host`, and the driver tried to reconnect to 127.0.0.1 instead of 172.16.1.14. Some time later, the other connection was lost with the same error, and the driver again tried to reconnect to 127.0.0.1 instead of 172.16.1.28. As a result, all connections were lost.

So here are my questions:

First, in what situation does the error `cannot find host` occur? Is this an expected error? I read the source code, but I couldn't understand it well.

Second, what makes the driver reconnect to 127.0.0.1 instead of the original address? Is this expected behavior?

If anyone has any ideas, please let me know.
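To make the failure sequence described above easier to spot in long-running logs, here is a minimal sketch that scans `Session.ring:` debug lines and flags a ring that contains a loopback host or has no host UP. It uses only the Go standard library and does not call gocql. The names `ringHost`, `parseRing`, and `ringBroken` are hypothetical helpers, and the regular expression is an assumption based solely on the log lines quoted in this issue (IPv4 addresses, `UP`/`DOWN` states); it is a diagnostic aid, not a fix for the underlying behavior.

```go
package main

import (
	"fmt"
	"net"
	"regexp"
)

// ringHost is one entry from a gocql_debug "Session.ring:" log line.
// This type is hypothetical and not part of gocql.
type ringHost struct {
	Addr string // address exactly as printed in the log
	Up   bool   // true when the logged state is UP
}

// ringEntry matches entries such as "[172.16.1.14:UP]". The pattern is an
// assumption derived from the quoted logs and only handles IPv4 addresses.
var ringEntry = regexp.MustCompile(`\[([0-9.]+):(UP|DOWN)\]`)

// parseRing extracts the host entries from a single "Session.ring:" line.
func parseRing(line string) []ringHost {
	var hosts []ringHost
	for _, m := range ringEntry.FindAllStringSubmatch(line, -1) {
		hosts = append(hosts, ringHost{Addr: m[1], Up: m[2] == "UP"})
	}
	return hosts
}

// ringBroken reports whether any host is a loopback address or no host is UP.
// An empty host list is also reported as broken.
func ringBroken(hosts []ringHost) bool {
	anyUp := false
	for _, h := range hosts {
		if ip := net.ParseIP(h.Addr); ip != nil && ip.IsLoopback() {
			return true
		}
		anyUp = anyUp || h.Up
	}
	return !anyUp
}

func main() {
	// The three ring states that appear in the logs above.
	lines := []string{
		"Session.ring:[172.16.1.14:UP][172.16.1.28:UP]",
		"Session.ring:[127.0.0.1:DOWN][172.16.1.28:UP]",
		"Session.ring:[127.0.0.1:DOWN][127.0.0.1:DOWN]",
	}
	for _, l := range lines {
		fmt.Println(ringBroken(parseRing(l)), l)
	}
}
```

A check like this could run against the process's log output to raise an alert, or trigger a restart, as soon as the first host is replaced by 127.0.0.1, rather than waiting until every query fails.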