I am going to respond to multiple questions in one email to keep the thread insanity down:
On Mon, Oct 25, 2010 at 12:39 AM, David Dabbs <dmda...@gmail.com> wrote:
> Sorry, Eric, I’m not following you. You’ve set the JVM’s processor
> affinity so it only runs on one of the processors?

My understanding is that Linux will launch a given process on one "node"
(a physical processor, in this case) or the other, and then attempt to
allocate memory for that process only from that node. If no free memory
is available on that node, it will assign memory from the other node.
The process scheduler will also try to keep the process running on that
node. My knowledge here is very limited, and in fact most of what I know
comes from this article:

http://jcole.us/blog/archives/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/

(There is a small libnuma sketch of this behavior at the end of this mail.)

On Mon, Oct 25, 2010 at 8:25 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> If I'm reading properly, it looks like you used Linux software RAID on
> top of the SSD devices. Can you talk about this? I would think that
> even with a simple RAID this would drive your CPU high. But it seems
> you may not have other options since SSD RAID cards probably do not
> exist.

Yes, we are running Linux kernel RAID (md, not LVM). This is mostly
because our first batch of machines had the SSDs hooked directly to the
onboard Intel ICH10 SATA controller rather than to any add-in RAID card.
We are only doing RAID 0 here, so I would not expect it to take any CPU
to speak of: the kernel is just doing a mod operation (or something
similarly simple) to figure out which disk each block goes on (sketched
at the end of this mail), and with RAID 0 there is no parity
calculation. Even if there were more work to be done, there are 8 cores
(16 virtual processors when you consider hyperthreading) for that work
to be scheduled on, and we don't seem to be CPU bound. That being said,
we really should try out the LSI 2008's RAID 0 capability, but we have
not had a chance yet.

On Mon, Oct 25, 2010 at 9:07 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
> On Mon, Oct 25, 2010 at 10:25 AM, Edward Capriolo <edlinuxg...@gmail.com>
> wrote:
> >> 2. We gave up on using Cassandra's row cache as loading any reasonable
> >> amount of data into the cache would take days/weeks with our tiny row
> >> size. We instead are using file system cache.
>
> I don't follow the reasoning there. Row cache or fs cache, it will be
> hot after reading it once; the difference is that reading the cached
> data is much faster from the row cache.

Yeah, I would have thought the same. Benjamin Black actually recommended
we go this route: with our dataset (we have huge numbers of super-tiny
rows), it would take weeks of running for the row cache to become useful.
(Rough numbers on that at the end of this mail.)

-Eric
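P.S. A couple of these points are easier to show than to tell, so some
sketches follow. First, the NUMA behavior, as a minimal libnuma program.
This is illustrative only: the node choice and buffer size are arbitrary,
and our JVMs do nothing like this explicitly; the kernel's default
local-allocation policy does roughly the same thing for ordinary
allocations.

/* numa_sketch.c: allocate memory on the node we are running on.
 * Build: gcc numa_sketch.c -o numa_sketch -lnuma
 * (needs the libnuma development headers installed) */
#include <numa.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "no NUMA support on this box\n");
        return EXIT_FAILURE;
    }

    int node = 0;

    /* Keep this thread on node 0's CPUs, much as the scheduler tries
     * to do on its own once a process has started there. */
    numa_run_on_node(node);

    /* Ask for pages from node 0 specifically. The kernel's default
     * policy does roughly this for a plain malloc(): allocate from
     * the local node while it has free pages, then fall back to the
     * other node. */
    size_t len = 64UL * 1024 * 1024;
    void *buf = numa_alloc_onnode(len, node);
    if (buf == NULL) {
        fprintf(stderr, "allocation on node %d failed\n", node);
        return EXIT_FAILURE;
    }

    printf("allocated %zu MiB on node %d (system has %d nodes)\n",
           len >> 20, node, numa_max_node() + 1);

    numa_free(buf, len);
    return EXIT_SUCCESS;
}

If I remember right, the article's workaround for one big process like
mysqld is the opposite of node-local allocation: run it under
numactl --interleave=all so allocations spread evenly across the nodes
instead of filling one node up and swapping.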
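Second, the RAID 0 arithmetic I was hand-waving at: no parity, just an
integer divide and a mod to pick the member disk. The chunk size and disk
count below are made-up example values, not our actual md configuration.

/* raid0_map.c: back-of-envelope RAID 0 striping (illustrative only). */
#include <stdio.h>
#include <stdint.h>

enum {
    NUM_DISKS     = 4,   /* hypothetical 4-way stripe                  */
    CHUNK_SECTORS = 128  /* hypothetical 64 KiB chunks (512 B sectors) */
};

/* Map a logical array sector to (member disk, sector on that disk). */
static void map_sector(uint64_t lba, int *disk, uint64_t *off)
{
    uint64_t chunk  = lba / CHUNK_SECTORS;  /* which stripe chunk  */
    uint64_t within = lba % CHUNK_SECTORS;  /* offset inside chunk */

    *disk = (int)(chunk % NUM_DISKS);       /* the "mod operator"  */
    *off  = (chunk / NUM_DISKS) * CHUNK_SECTORS + within;
}

int main(void)
{
    for (uint64_t lba = 0; lba <= 1000; lba += 250) {
        int disk;
        uint64_t off;

        map_sector(lba, &disk, &off);
        printf("array sector %4llu -> disk %d, sector %4llu\n",
               (unsigned long long)lba, disk, (unsigned long long)off);
    }
    return 0;
}

A few integer operations per request is why I would not expect md RAID 0
to show up in a CPU profile. RAID 5/6 would be a different story, since
the kernel computes parity on every write.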
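Finally, rough numbers on the row cache (illustrative, not our actual
figures): the row cache fills one row per cache-miss read, so if you
imagine ten billion tiny rows being touched at 5,000 uncached reads per
second, a full pass is 10^10 / 5,000 = 2,000,000 seconds, about 23 days.
That is where "weeks" comes from. I think the point of leaning on the
file system cache instead is that it warms a page at a time, so one miss
pulls in all the neighboring tiny rows on that page rather than a single
row.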