On Mon, Jul 21, 2014 at 5:45 PM, Marcelo Elias Del Valle <marc...@s1mbi0se.com.br> wrote:
> Although several sstables (disk fragments) may have the same row key,
> inside a single sstable row keys and column keys are indexed, right?
> Otherwise, doing a GET in Cassandra would take some time.
> From the M/R perspective, I was referring to the mem table, as I am trying
> to compare the time to insert in Cassandra against the time of sorting in
> hadoop.

I was confused because, unless you are using the new "in-memory" columnfamilies, which I believe are only available in DSE, there is no way to ensure that any given row stays in a memtable.

It is rare to see a view of a memtable that cares only about its own properties and not the closely related properties of SSTables. Yours is one of them, though, and I see now why your question makes sense: you only care about the memtable for how quickly it sorts.

But if you are relying on memtables only to sort writes, that seems like a pretty heavyweight reason to use Cassandra? I'm certainly not an expert in this area of Cassandra... but Cassandra, as a datastore with immutable data files, is not typically a good choice for short-lived intermediate result sets... are you planning to use DSE?

=Rob