We store objects that are a couple of tens of KB, sometimes 100 KB, and we store quite a few of these per row, sometimes hundreds of thousands.
One problem we encountered early was that these rows would become so big that C* couldn't compact them in memory and had to fall back to the slow two-pass compactions where it spills partially compacted rows to disk. We attacked that in two ways. First we increased in_memory_compaction_limit_in_mb from 64 to 128, and although that helped a little we quickly realized it didn't have much effect, because most of the time was taken up by really huge rows many times larger than that.

We ended up implementing a simple sharding scheme where each logical row is actually 36 physical rows that each contain 1/36 of the column range: on writes we take the first character of the column key and append it to the row key, and on reads we read all 36 rows. We use 36 because there are 36 letters and digits in the ASCII alphanumerics (26 letters plus 10 digits), and our column keys happen to distribute over them quite nicely. (A rough sketch of the shard mapping is appended below the quoted thread.)

Cassandra works well with semi-large objects, and it works well with wide rows, but you have to be careful about the combination, where rows get larger than 64 MB.

T#

On Mon, Jul 8, 2013 at 8:13 PM, S Ahmed <sahmed1...@gmail.com> wrote:
> Hi Peter,
>
> Can you describe your environment, # of documents and what kind of usage
> pattern you have?
>
> On Mon, Jul 8, 2013 at 2:06 PM, Peter Lin <wool...@gmail.com> wrote:
>
>> I regularly store word and pdf docs in cassandra without any issues.
>>
>> On Mon, Jul 8, 2013 at 1:46 PM, S Ahmed <sahmed1...@gmail.com> wrote:
>>
>>> I'm guessing that most people use cassandra to store relatively smaller
>>> payloads like 1-5kb in size.
>>>
>>> Is there anyone using it to store say 100kb (1/10 of a megabyte) and if
>>> so, was there any tweaking or gotchas that you ran into?
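P.S. For anyone wanting to try the same trick, here is a rough sketch of what the shard mapping looks like. It is an illustration only, not our actual code: the class and method names (RowSharding, shardRowKey, allShardRowKeys) and the ":" separator are made up for the example.

    import java.util.ArrayList;
    import java.util.List;

    public class RowSharding {

        // 36 shard suffixes: the 10 digits and 26 lower-case letters.
        private static final String SHARD_CHARS =
            "0123456789abcdefghijklmnopqrstuvwxyz";

        // Write path: append the first character of the column key to the
        // logical row key, so each physical row holds roughly 1/36 of the
        // columns and stays well under in_memory_compaction_limit_in_mb.
        public static String shardRowKey(String rowKey, String columnKey) {
            char shard = Character.toLowerCase(columnKey.charAt(0));
            return rowKey + ":" + shard;  // e.g. "doc-index" -> "doc-index:k"
        }

        // Read path: fan out over all 36 physical rows that together make
        // up one logical row, then merge the results client-side.
        public static List<String> allShardRowKeys(String rowKey) {
            List<String> keys = new ArrayList<>(SHARD_CHARS.length());
            for (char c : SHARD_CHARS.toCharArray()) {
                keys.add(rowKey + ":" + c);
            }
            return keys;
        }
    }

The fixed 36-way split only works because our column keys start with an ASCII letter or digit and distribute evenly over them; with a different key distribution you would pick a different shard function, e.g. a hash of the column key modulo N.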