I have a multi-DC ring with 6 nodes in each DC. A single node runs some jobs (including Hadoop MapReduce with Pig) every 15 minutes.
Lately this node has been showing high CPU load and memory issues. Ganglia shows high CPU load on the server, and the number of TCP connections on port 9160 stays above 600 all the time. Broken down by destination, there are around 90 connections from this machine to each of the machines in the other DC. For port 7000 the count is around 45.

I would also like to understand whether Cassandra performs its internal reads and writes through Thrift over an RPC connection (port 9160), or over port 7000 as it does for inter-node communication. That might also be a reason for so many connections on 9160.

The machine has 8 cores, 14 GB of RAM and an 8 GB heap. The RPC min and max threads are at their defaults, as are the other RPC-related properties. RF is 3 in each DC, read/write CL is 1, and read repair chance is 0.1. The Cassandra version is 0.8.6.

Regards,
Shubham