> 1. is this a nodetool bug?  is there any way to propagate the
> java.io.IOException back to nodetool?
The repair keeps running even if nodetool fails; it is a server-side operation.

> 2. network problems on EC2, I'm shocked!  are there recommended
> network settings for EC2?
Streaming does not set a timeout on the socket. In this case, check the node to see why the pipe broke.
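
Because the repair runs server side, the server log is the place to see how far a session got and where streaming failed. Here is a rough sketch in Python that pulls both out of system.log. It is not part of Cassandra or nodetool. The function name summarize and the three regexes are made up for illustration, and the log layout is assumed from the snippets quoted below (one entry per line in the real file, with the exception on the line after each ERROR entry). Adjust the patterns if your log format differs.

```python
import re
from collections import defaultdict

# Assumed layout: LEVEL [thread] yyyy-mm-dd hh:mm:ss,mmm File.java (line N) message
LINE_RE = re.compile(
    r"^\s*(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]*)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<src>\S+) \(line \d+\)\s+(?P<msg>.*)$"
)
REPAIR_RE = re.compile(r"\[repair #(?P<session>[0-9a-f-]+)\]\s+(?P<rest>.*)")
SYNCED_RE = re.compile(r"(?P<cf>\S+) is fully synced \((?P<left>\d+) remaining")


def summarize(lines):
    """Return (progress, errors) from an iterable of log lines.

    progress: repair session id -> [(timestamp, column family, remaining)]
    errors:   [(timestamp, thread name, first exception line)]
    """
    progress = defaultdict(list)
    errors = []
    pending = None  # (timestamp, thread) of an ERROR entry awaiting its exception line

    for raw in lines:
        line = raw.rstrip("\n")
        m = LINE_RE.match(line)
        if m is None:
            # Not a log entry header: the first such line after a streaming
            # ERROR entry is taken to be the exception. "--" is grep's separator.
            text = line.strip()
            if pending is not None and text and text != "--":
                errors.append((pending[0], pending[1], text))
                pending = None
            continue

        pending = None
        if m["level"] == "ERROR" and m["thread"].startswith("Streaming"):
            pending = (m["ts"], m["thread"])
            continue

        repair = REPAIR_RE.search(m["msg"])
        if repair:
            synced = SYNCED_RE.search(repair["rest"])
            if synced:
                progress[repair["session"]].append(
                    (m["ts"], synced["cf"], int(synced["left"]))
                )

    return dict(progress), errors
```

To use it, pass an open file, for example summarize(open("/var/log/cassandra/system.log.2")). A session whose last "remaining" count is above zero, with streaming errors logged around the same time, is a likely candidate for a hung repair.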

Aaron Morton
Freelance Cassandra Consultant
New Zealand


On 13/03/2013, at 4:28 PM, Dane Miller <d...@optimalsocial.com> wrote:

> On Wed, Mar 13, 2013 at 12:39 PM, Wei Zhu <wz1...@yahoo.com> wrote:
>> My guess is that an exception occurred during the repair and your session
>> was aborted.
>> Here is the code that does the repair:
>> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/AntiEntropyService.java
>> Look for the logger.info calls and compare them with your log file. That
>> should give you a rough idea of the stage at which the repair died.
> Thanks for the link to the source.  That's a little hard to grok, but
> your suggestion to examine the logs more thoroughly was helpful.  I
> was able to determine that repair hung due to connection errors during
> streaming.  I'll include log snippets below, but this leads me to
> other more important questions...
> 1. is this a nodetool bug?  is there any way to propagate the
> java.io.IOException back to nodetool?
> 2. network problems on EC2, I'm shocked!  are there recommended
> network settings for EC2?
> Dane
> Here are the relevant logs showing (A) repair progress, and (B)
> java.io.IOExceptions
> (A) repair progress
> INFO [Thread-5314] 2013-03-11 23:29:28,866 StorageService.java (line
> 2364) Starting repair command #9, repairing 1 ranges for keyspace
> OpsCenter
> INFO [AntiEntropySessions:13] 2013-03-11 23:29:28,867
> AntiEntropyService.java (line 652) [repair
> #84e86020-8aa3-11e2-abb2-17112e360b9a] new session: will sync
> /, / on range
> (0,28356863910078205288614550619314017621] for OpsCenter.[events,
> rollups60, settings, pdps, rollups86400, events_timeline, rollups300,
> rollups7200]
> INFO [Thread-5320] 2013-03-11 23:29:29,198 AntiEntropyService.java
> (line 765) [repair #84e86020-8aa3-11e2-abb2-17112e360b9a] events is
> fully synced (7 remaining column family to sync for this session)
> INFO [AntiEntropyStage:1] 2013-03-11 23:38:02,198
> AntiEntropyService.java (line 765) [repair
> #84e86020-8aa3-11e2-abb2-17112e360b9a] settings is fully synced (6
> remaining column family to sync for this session)
> INFO [AntiEntropyStage:1] 2013-03-11 23:38:02,617
> AntiEntropyService.java (line 765) [repair
> #84e86020-8aa3-11e2-abb2-17112e360b9a] pdps is fully synced (5
> remaining column family to sync for this session)
> INFO [Streaming to /] 2013-03-11 23:38:12,491
> AntiEntropyService.java (line 765) [repair
> #84e86020-8aa3-11e2-abb2-17112e360b9a] rollups86400 is fully synced (4
> remaining column family to sync for this session)
> INFO [Streaming to /] 2013-03-11 23:39:55,886
> AntiEntropyService.java (line 765) [repair
> #84e86020-8aa3-11e2-abb2-17112e360b9a] rollups7200 is fully synced (3
> remaining column family to sync for this session)
> (B) java.io.IOException
> # grep -A1 ERROR /var/log/cassandra/system.log.2
> ERROR [Streaming to /] 2013-03-11 23:38:12,654
> CassandraDaemon.java (line 132) Exception in thread Thread[Streaming
> to /,5,main]
> java.lang.RuntimeException: java.io.IOException: Connection reset by peer
> --
> ERROR [Streaming to /] 2013-03-11 23:38:12,692
> CassandraDaemon.java (line 132) Exception in thread Thread[Streaming
> to /,5,main]
> java.lang.RuntimeException: java.io.IOException: Broken pipe
> --
> ERROR [Streaming to /] 2013-03-11 23:39:55,932
> CassandraDaemon.java (line 132) Exception in thread Thread[Streaming
> to /,5,main]
> java.lang.RuntimeException: java.io.IOException: Broken pipe
