SequenceFile compared to RCFile:
  * More widely deployed.
  * Available from MapReduce and Pig.
  * Doesn't compress as small (in RCFile, all of each column's values are stored together).
  * Uncompresses and deserializes all of the columns, even if you are only reading a few.

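To make the last two points concrete, here is a rough toy sketch in Python. It does not use the real SequenceFile or RCFile formats; it only imitates the two layouts with zlib from the standard library. The sample table, the column values, and every function name are made up for illustration. It shows that reading one column from the row-oriented blob means inflating and parsing every record, while the column-grouped layout only has to inflate that one column's chunk. It also prints the compressed size of each layout; how much the column-grouped layout saves (if anything) depends on the data.

```python
import random
import zlib


def make_rows(n=2000, seed=0):
    """Hypothetical sample table: a random id plus two low-cardinality columns."""
    rng = random.Random(seed)
    statuses = ["ok", "error", "timeout"]
    users = ["alice", "bob", "carol", "dave"]
    return [
        (f"{rng.randrange(10**8):08d}", rng.choice(statuses), rng.choice(users))
        for _ in range(n)
    ]


def write_row_oriented(rows):
    """Serialize whole records one after another, then compress them as one unit."""
    raw = "\n".join(",".join(r) for r in rows).encode()
    return zlib.compress(raw)


def write_column_grouped(rows):
    """Store each column's values together and compress each column separately."""
    return [zlib.compress("\n".join(col).encode()) for col in zip(*rows)]


def read_column_row_oriented(blob, index):
    """Inflate and split every record, even though only one field is wanted."""
    raw = zlib.decompress(blob)
    values = [line.split(",")[index] for line in raw.decode().split("\n")]
    return values, len(raw)


def read_column_grouped(chunks, index):
    """Inflate only the chunk that holds the requested column."""
    raw = zlib.decompress(chunks[index])
    return raw.decode().split("\n"), len(raw)


rows = make_rows()
row_blob = write_row_oriented(rows)
col_chunks = write_column_grouped(rows)

# Read just the "status" column (index 1) from each layout.
row_values, row_bytes = read_column_row_oriented(row_blob, 1)
col_values, col_bytes = read_column_grouped(col_chunks, 1)

print("compressed size, row-oriented:  ", len(row_blob))
print("compressed size, column-grouped:", sum(len(c) for c in col_chunks))
print("bytes inflated to read one column:", row_bytes, "vs", col_bytes)
```

Both reads return the same values; the difference is in how many bytes had to be inflated and parsed to get them.
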
In either case, for long-term storage you should seriously consider the default codec (DefaultCodec, which is zlib-based), since it provides much tighter compression, at the cost of more CPU time to compress.

-- Owen