Did you try netcat to verify that you can get to the internal port on machine X from machine Y?
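
If netcat is not installed on the box, the same reachability check can be sketched in a few lines of Python. This is only a minimal sketch, not anything Cassandra ships: `can_connect` is a hypothetical helper name, `192.0.2.10` is a placeholder address, and port 7000 is Cassandra's default `storage_port`; whether this cluster still uses the default is an assumption, so check `cassandra.yaml`.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only proves the port accepts
    TCP connections, not that gossip is healthy.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call (placeholder address; 7000 is the default storage_port and is
# an assumption about this cluster's configuration):
#   can_connect("192.0.2.10", 7000)
```

Run it on machine Y against machine X's listen address, then the other way around. A `False` in either direction would point at the network or firewall rather than at Cassandra itself.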

On Fri, Jun 24, 2011 at 8:20 AM, David McNelis <dmcne...@agentisenergy.com> wrote:
> Running on CentOS.
> We had a massive power failure and our UPS wasn't up to 48 hours without
> power...
> In this situation the IP addresses have all stayed the same. I can still
> connect to the "other" node from cli, so I don't think it's an issue where
> the iptables settings weren't saved and started blocking traffic.
> In terms of the log files, the only related lines are:
> INFO [main] 2011-06-24 07:48:44,750 StorageService.java (line 382) Loading persisted ring state
> INFO [main] 2011-06-24 07:48:44,757 StorageService.java (line 418) Starting up server gossip
> When I turn on debugging and restart the non-seed node I get this line:
> DEBUG [WRITE-/192.168.80.XXX] 2011-06-24 08:04:48,798 OutboundTcpConnection.java (line 161) attempting to connect to /192.168.80.XXX
> But no errors after it.
>
> On Fri, Jun 24, 2011 at 7:58 AM, Sasha Dolgy <sdo...@gmail.com> wrote:
>>
>> Normally, no. What you've done is fine. What is the environment?
>>
>> On Amazon EC2 for example, the instance could have crashed, a new one
>> is brought online and has a different internal IP ...
>>
>> In cassandra/logs/system.log, are there any messages on the 2nd
>> node about how it relates to the seed node?
>>
>> On Fri, Jun 24, 2011 at 2:49 PM, David McNelis
>> <dmcne...@agentisenergy.com> wrote:
>> > I am running 0.8.0 on CentOS. I have 2 nodes in my cluster, one is a
>> > seed, the other is autobootstrapped.
>> > After having an unexpected shutdown of both of the physical machines I am
>> > trying to restart the cluster. I first started the seed node, it went
>> > through the normal startup process and finished without error. Once that
>> > was complete I started the second node, again no errors in the log as it
>> > was starting, it started the gossip server, etc.
>> > However when I look at the ring using nodetool, both machines show their
>> > own status as up, then show the other machine as Down with a state of
>> > Normal and a load of ?. I have tried restarting the individual nodes in
>> > different orders, waiting a while after restarting a node, but still the
>> > 'other' node always has a status of "down". nodetool repair [keyspace]
>> > did not make any difference either and nodetool join just told me that
>> > the nodes were already a part of the ring.
>> > I can't imagine this is how it *should* be behaving... is there a piece
>> > I'm missing in terms of getting one node to recognize the other as
>> > being Up?
>
>
>
> --
> David McNelis
> Lead Software Engineer
> Agentis Energy
> www.agentisenergy.com
> o: 630.359.6395
> c: 219.384.5143
> A Smart Grid technology company focused on helping consumers of energy
> control an often under-managed resource.
>
>

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com