Re: tcp CLOSE_WAIT bug

2010-04-25 Thread yangfeng
I encountered the same problem! Hope to get some help. Thanks. 2010/4/22 Ingram Chen > arh! That's right. > > I checked OutboundTcpConnection and it only calls closeSocket() after > something goes wrong. I will log more in OutboundTcpConnection to see what > actually happens. > > Thanks for your help.

Re: tcp CLOSE_WAIT bug

2010-04-21 Thread Ingram Chen
arh! That's right. I checked OutboundTcpConnection and it only calls closeSocket() after something goes wrong. I will log more in OutboundTcpConnection to see what actually happens. Thanks for your help. On Thu, Apr 22, 2010 at 10:03, Jonathan Ellis wrote: > But those connections aren't supposed to

Re: tcp CLOSE_WAIT bug

2010-04-21 Thread Jonathan Ellis
But those connections aren't supposed to ever terminate unless a node dies or is partitioned. So if we "fix" it by adding a socket.close I worry that we're covering up something more important. On Wed, Apr 21, 2010 at 8:53 PM, Ingram Chen wrote: > I agree your point. I patch the code and log mor

Re: tcp CLOSE_WAIT bug

2010-04-21 Thread Ingram Chen
I agree with your point. I patch the code and log more information to find out the real cause. Here is the code snippet I think may be the cause: IncomingTcpConnection: public void run() { while (true) { try { MessagingService.validateMagi

Re: tcp CLOSE_WAIT bug

2010-04-21 Thread Jonathan Ellis
I'd like to get something besides "I'm seeing close wait but i have no idea why" for a bug report, since most people aren't seeing that. On Tue, Apr 20, 2010 at 9:33 AM, Ingram Chen wrote: > I trace IncomingStreamReader source and found that incoming socket comes > from MessagingService$SocketThr

Re: tcp CLOSE_WAIT bug

2010-04-20 Thread Ingram Chen
I traced the IncomingStreamReader source and found that the incoming socket comes from MessagingService$SocketThread, but there is no close() call on either the accepted socket or the socketChannel. Should I file a bug report? On Tue, Apr 20, 2010 at 11:02, Ingram Chen wrote: > this happened after several hour
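
To illustrate the general mechanism under discussion (this is not Cassandra's code): a TCP socket sits in CLOSE_WAIT when the remote peer has closed its side and the local process has not yet called close() on its own end. The sketch below is a minimal, self-contained Java example on the loopback interface. The class name CloseOnEofSketch and the methods drainAndClose and demo are hypothetical names chosen for this illustration; whether closing on end-of-stream is the right behaviour for Cassandra's long-lived internal connections is exactly the open question in this thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration class; not part of Cassandra.
class CloseOnEofSketch {

    // Reads until the peer closes its side (read() returns -1),
    // then closes the local side so it does not linger in CLOSE_WAIT.
    static long drainAndClose(Socket accepted) throws IOException {
        long total = 0;
        // try-with-resources closes the socket even if read() throws.
        try (Socket s = accepted; InputStream in = s.getInputStream()) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
        }
        return total;
    }

    // Opens a loopback connection, sends 3 bytes, closes the client side,
    // and lets the accepting side drain and close.
    static String demo() {
        try (ServerSocket server =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Socket client =
                new Socket(server.getInetAddress(), server.getLocalPort());
            Socket accepted = server.accept();
            client.getOutputStream().write(new byte[] {1, 2, 3});
            // The peer closes here. Until 'accepted' is closed as well,
            // netstat would report the accepting side as CLOSE_WAIT.
            client.close();
            long bytes = drainAndClose(accepted);
            return bytes + ":" + accepted.isClosed();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If the call to drainAndClose were skipped, the accepted socket would stay open on the local side after the peer's close, which is the state netstat shows as CLOSE_WAIT.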

Re: tcp CLOSE_WAIT bug

2010-04-19 Thread Ingram Chen
This happened after several hours of operation, and both nodes were started at the same time (clean start without any data), so it might not be related to bootstrap. In system.log I do not see any logs like "xxx node dead" or any exceptions, and both nodes in the test are alive. They serve reads/writes well, too

Re: tcp CLOSE_WAIT bug

2010-04-19 Thread Jonathan Ellis
Is this after doing a bootstrap or other streaming operation? Or did a node go down? The internal sockets are supposed to remain open, otherwise. On Mon, Apr 19, 2010 at 10:56 AM, Ingram Chen wrote: > Thank your information. > > We do use connection pools with thrift client and ThriftAdress is

Re: tcp CLOSE_WAIT bug

2010-04-19 Thread Ingram Chen
Thanks for your information. We do use connection pools with the thrift client, and ThriftAdress is on port 9160. The problematic connections we found are all on port 7000, which is the internal communication port between nodes. I guess this is related to StreamingService. On Mon, Apr 19, 2010 at 23:46, Brand

Re: tcp CLOSE_WAIT bug

2010-04-19 Thread Brandon Williams
On Mon, Apr 19, 2010 at 10:27 AM, Ingram Chen wrote: > Hi all, > > We have observed several connections between nodes in CLOSE_WAIT after > several hours of operation: > This is symptomatic of not pooling your client connections correctly. Be sure you're using one connection per thread, not

tcp CLOSE_WAIT bug

2010-04-19 Thread Ingram Chen
Hi all, We have observed several connections between nodes in CLOSE_WAIT after several hours of operation: At node 87: netstat -tn | grep 7000 tcp 0 0 :::192.168.2.87:7000 :::192.168.2.88:57625 CLOSE_WAIT tcp 0 0 :::192.168.2.87:7000 :::192.168.2