If you start with DEBUG logging (or just enable logging for SSTableReader) you will get some more information on what's taking time at startup.
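For example, a minimal sketch of enabling it for just that class, assuming the stock conf/log4j-server.properties that ships with 1.0.x and that SSTableReader lives in the org.apache.cassandra.io.sstable package (check both against your install):

```properties
# Assumed location: conf/log4j-server.properties
# Raise only SSTableReader to DEBUG; the root logger level stays as it is.
log4j.logger.org.apache.cassandra.io.sstable.SSTableReader=DEBUG
```

Restart the node after the change so the startup sequence is logged at the new level.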
If you want to dig a little further take a look at the iostat and cpu load. During startup a thread is created for each core on the machine and used to open a file. I've wondered if this could overload the IO on machines that report 16 cores. You'll see messages like this INFO [SSTableBatchOpen:1] where the number is the thread number.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 16/04/2012, at 5:13 AM, Derek Barnes wrote:

> Hi,
>
> I have 2 column families with approx 50 GB of compressed data (~150GB
> uncompressed). The data resides in a keyspace replicated 2-way, hosted by a
> 2-node Cassandra cluster (v1.0.8), both with 74GB RAM and 16 cores. Key
> caches are set to 1.0.
>
> I'm noticing that it can take upwards of 15+ minutes for the node to start up
> (i.e. before it becomes responsive to thrift clients). During this time, the
> logs suggest the system is blocked opening the data files.
>
> Is this expected behaviour? Are there any best practices for reducing node
> startup time?
>
> Thanks in advance!