Re: LevelDB compaction and timeouts

2013-01-09 Thread Matthew Von-Maszewski
FYI: my theory of the moment (until LOG files arrive) is that maybe a couple of the machines are using the operating system swap file during the list_keys operation. That would explain everything. But maybe you have already ruled that out? Matthew On Jan 8, 2013, at 2:53 PM, Parnell Spring

Re: LevelDB compaction and timeouts

2013-01-09 Thread Matthew Von-Maszewski
Parnell, I confirmed with the Basho team that "list_keys" is a read-only process. Yes, some read operations would initiate compactions in Riak 1.1, but you have 1.2.1. I therefore suspect that there is a secondary issue. Would you mind gathering the LOG files from one of the machines that y

Re: LevelDB compaction and timeouts

2013-01-08 Thread Matthew Von-Maszewski
Parnell, Would appreciate some configuration info: - what version of Riak are you running? - would you copy/paste the eleveldb section of your app.config? - how many vnodes and physical servers are you running? - what is the hardware? cpu, memory, disk arrays - are you seeing the word "waiting" i

LevelDB compaction and timeouts

2013-01-07 Thread Parnell Springmeyer
I've had a few situations arise where one or two nodes (all it needs is one node) will begin a heavy compaction cycle (determined by using gstat + looking at leveldb LOG files) and ALL queries put through the cluster (it doesn't matter which node) return a timeout. I can fix this situation by kill