Has anyone done performance tests on reading SSTables directly vs. M/R?  I did 
a quick test on an LCS column family: I ran sstable2json on 23 SSTables and 
took the average time per table, which was 7 seconds with output sent to 
/dev/null (to make it faster) and 16 seconds with output sent to stdout.  
Extrapolated to all of the SSTables, this works out to an estimate of 12.5 
hours, or up to 27 hours based on the stdout timing.  I suspect the map/reduce 
time may be much worse, since there are not as many repeated rows in LCS?
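
For reference, the extrapolation above is just the average per-table time 
multiplied by the number of SSTables, assuming a serial scan.  A rough sketch 
follows; the function names are made up for illustration, and the total 
SSTable count is not a measured figure but is back-computed from the hour 
estimates:

```python
def estimate_hours(seconds_per_sstable, sstable_count):
    """Extrapolate total hours for reading every SSTable one after another."""
    return seconds_per_sstable * sstable_count / 3600.0


def implied_sstable_count(total_hours, seconds_per_sstable):
    """Back-compute the SSTable count implied by an hour estimate (assumption)."""
    return total_hours * 3600.0 / seconds_per_sstable


# Counts implied by the two estimates quoted above (not measured values).
print(implied_sstable_count(12.5, 7))   # from the /dev/null timing
print(implied_sstable_count(27, 16))    # from the stdout timing
```

The two estimates imply slightly different SSTable counts, so the hour figures 
should be read as rounded ballpark numbers.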

I.e., I am wondering whether I should just read from the SSTables directly 
instead of using map/reduce.  I am about to dig around in the M/R and 
sstable2json code to see what each is doing specifically.

Thanks,
Dean
