FYI: my theory of the moment (until LOG files arrive) is that maybe a couple
of the machines are using the operating system swap file during the list_keys
operation. That would explain everything. But maybe you have already ruled
that out?
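
One quick way to test the swap theory on a Linux node is to compare the swap counters in /proc/meminfo before and during a list_keys run. The following is only a minimal sketch, assuming a Linux host; the helper names parse_meminfo and swap_used_kb are hypothetical and not part of Riak or leveldb.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: value} dict.

    Most values are in kilobytes; a few fields (e.g. HugePages_Total)
    are plain counts and are stored as-is.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def swap_used_kb(info):
    """Return swap currently in use (SwapTotal - SwapFree), in kB."""
    return info.get("SwapTotal", 0) - info.get("SwapFree", 0)


if __name__ == "__main__":
    # Assumption: running on Linux, where /proc/meminfo exists.
    with open("/proc/meminfo") as f:
        used = swap_used_kb(parse_meminfo(f.read()))
    print(f"swap in use: {used} kB")
```

A single non-zero reading does not prove anything by itself; what would support the theory is the number climbing on a node while list_keys is running.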
Matthew
On Jan 8, 2013, at 2:53 PM, Parnell Spring
Parnell,
I confirmed with the Basho team that "list_keys" is a read-only process. Yes,
some read operations would initiate compactions in Riak 1.1, but you have
1.2.1. I therefore suspect that there is a secondary issue.
Would you mind gathering the LOG files from one of the machines that y
Parnell,
Would appreciate some configuration info:
- what version of Riak are you running?
- would you copy/paste the eleveldb section of your app.config?
- how many vnodes and physical servers are you running?
- what is the hardware? CPU, memory, disk arrays
- are you seeing the word "waiting" i
I've had a few situations arise where one or two nodes (all it needs is
one node) will begin a heavy compaction cycle (determined by using gstat
+ looking at leveldb LOG files) and ALL queries put through the cluster
(it doesn't matter which node) return a timeout.
I can fix this situation by kill