> Rereading through everything again I am starting to wonder if the page cache
> is being affected by compaction. We have been heavily loading data for weeks
> and compaction is basically running non-stop. The manual compaction should
> be done some time tomorrow, so when totally caught up I will try again.

If your 65 ms measurements were taken as an average while compaction/repair was running, that would be a very likely candidate for the root cause, especially if your compaction is disk bound or close to it (rather than CPU bound).

What concerned me was that it sounded like you were seeing 65 ms read latencies with *no* other activity going on. But was compaction/repair still running at that point?

And yes, definitely time it again when there is no active compaction/repair going on.

> What
> changes can be hoped for in 1470 or 1882 in terms of isolating compactions
> (or writes) affects on read requests?

Speaking only for myself and my own expectations (not making any official statements for Cassandra):

Assuming large data sets, with disk I/O and cache effectiveness as the primary concerns, the negative impact of background bulk I/O falls into two categories:

(1) Direct impact on latency resulting from the I/O being done at any given moment.

(2) Indirect impact resulting from eviction of hot data from the page cache.

1470 is part of reducing (2). It sounds like 1470 itself will be closed once fadvise is working, but there is more to be done to reach the final goal of mitigating (2). Various options are discussed in 1470 itself; I guess the latest is the fadvise+mincore plan, provided that it pans out.

It is worth noting, though, that short of a user-level page cache, the effect of (2) will likely never be completely eliminated. Even with fadvise+mincore there are other concerns, such as blowing away recency information and defeating the LRU (or similar) behavior of the OS page cache.

1882 is about controlling (1), and it is considerably easier to get something "good enough" working for 1882 than for 1470.
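
As a rough illustration of what a "good enough" approach to (1) might look like, here is a minimal Python sketch that throttles a bulk copy to a fixed byte rate. It is not the 1882 patch and not Cassandra code; `ByteRateLimiter` and `throttled_copy` are hypothetical names invented for this example, and the design (sleep whenever the copy gets ahead of its byte budget) is just one simple assumption about how such a limit could work.

```python
import time


class ByteRateLimiter:
    """Hypothetical helper: caps average throughput by sleeping when
    the caller has consumed more bytes than the budget allows so far."""

    def __init__(self, bytes_per_sec, clock=time.monotonic, sleep=time.sleep):
        if bytes_per_sec <= 0:
            raise ValueError("bytes_per_sec must be positive")
        self.bytes_per_sec = bytes_per_sec
        self._clock = clock
        self._sleep = sleep
        self._start = clock()
        self._consumed = 0

    def acquire(self, nbytes):
        """Account for nbytes of I/O; return the seconds slept (0.0 if none)."""
        self._consumed += nbytes
        # Earliest point in time at which this much I/O is within budget.
        # Simplification: budget accrues from construction time, so a long
        # idle period permits a burst afterwards.
        earliest = self._start + self._consumed / self.bytes_per_sec
        delay = earliest - self._clock()
        if delay > 0:
            self._sleep(delay)
            return delay
        return 0.0


def throttled_copy(src, dst, limiter, chunk_size=64 * 1024):
    """Copy src to dst in chunks, charging each chunk to the limiter.
    Stands in for a background bulk job such as compaction."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return total
        limiter.acquire(len(chunk))
        dst.write(chunk)
        total += len(chunk)
```

With a limit of, say, 16 MB/s, a background job written this way can never take more than that share of the disk on average, which leaves headroom for reads. The sketch only limits the rate of I/O issued; it does nothing about page cache eviction, which is the separate problem (2).
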
Although the general problem of I/O scheduling is certainly a difficult one, given the specific use case in Cassandra and the low-hanging fruit to be picked, I expect 1882, even in its simplest form, to help significantly with (1). But this only matters if (1) is your problem: if you are already CPU bound enough that I/O is effectively rate limited in practice anyway, 1882 will make no difference at all.

--
/ Peter Schuller