There's some discussion of large data sets here: http://wiki.apache.org/cassandra/LargeDataSetConsiderations
When creating large rows you also need to be aware of in_memory_compaction_limit_in_mb (see the yaml), and that all columns for a row are stored on the same node. So if you store one file in one row you may not get the best load distribution.

I've heard 10MB mentioned before as a reasonable max for a row if you have no natural partitions. That said, CFS in Brisk puts each block in its own row and uses columns for the sub blocks (there's a rough sketch of that layout after the quoted message below). The default settings for CFS are:

<!-- 64 MB default -->
<property>
  <name>fs.local.block.size</name>
  <value>67108864</value>
</property>

<!-- 2 MB SubBlock Size -->
<property>
  <name>fs.local.subblock.size</name>
  <value>2097152</value>
</property>

Hope that helps.

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 24/09/2011, at 9:27 PM, Radim Kolar wrote:

> On 24.9.2011 0:05, Jonathan Ellis wrote:
>> Really large messages are not encouraged because they will fragment
>> your heap quickly. Other than that, no.
> What is the recommended chunk size for storing multi-gigabyte files in Cassandra?
> Is 64MB okay, or is it too large?
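If it helps to see the shape of that block/sub-block layout, below is a rough sketch in Python using pycassa against a Thrift-era cluster. It is not CFS's actual schema; the keyspace, column family, and row key format are made up for illustration, and it assumes a pre-created column family with a UTF8/Ascii comparator and BytesType values.

    # Sketch of the layout described above: one row per 64MB block of a file,
    # with the block split into 2MB sub-block columns. Illustrative only.
    import pycassa
    from pycassa.pool import ConnectionPool
    from pycassa.columnfamily import ColumnFamily

    BLOCK_SIZE = 64 * 1024 * 1024      # 67108864, matches fs.local.block.size
    SUB_BLOCK_SIZE = 2 * 1024 * 1024   # 2097152, matches fs.local.subblock.size

    # Hypothetical keyspace and column family names.
    pool = ConnectionPool('Files', ['localhost:9160'])
    blocks = ColumnFamily(pool, 'FileBlocks')

    def store_file(file_id, path):
        with open(path, 'rb') as f:
            block_index = 0
            while True:
                block = f.read(BLOCK_SIZE)
                if not block:
                    break
                # One row per block, so a multi-gigabyte file spreads across
                # the cluster instead of landing on a single node.
                row_key = '%s:%d' % (file_id, block_index)
                columns = {}
                for offset in range(0, len(block), SUB_BLOCK_SIZE):
                    # Column name = sub-block offset within the block,
                    # column value = the sub-block bytes.
                    columns['%010d' % offset] = block[offset:offset + SUB_BLOCK_SIZE]
                blocks.insert(row_key, columns)
                block_index += 1

With 2MB sub-blocks each insert and read stays well under the in_memory_compaction_limit_in_mb threshold mentioned above, while the 64MB block rows keep the number of rows per file manageable.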