We use 2MB chunks in CFS, our HDFS-compatible filesystem built on Cassandra: http://www.datastax.com/dev/blog/cassandra-file-system-design
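
The idea is just to split large blobs client-side before they ever become column values. A minimal sketch of that chunking step (hypothetical illustration only, not the actual CFS code; the 2MB constant matches what CFS uses, everything else is made up for the example):

// Hypothetical sketch: split a file into 2MB chunks so each chunk can be
// stored as a separate column value instead of one oversized blob.
import java.io.*;

public class ChunkSplitter {
    static final int CHUNK_SIZE = 2 * 1024 * 1024; // 2MB, matching CFS

    public static void main(String[] args) throws IOException {
        try (InputStream in = new FileInputStream(args[0])) {
            byte[] buf = new byte[CHUNK_SIZE];
            int chunkIndex = 0;
            int read;
            while ((read = readFully(in, buf)) > 0) {
                // In a real store each chunk would be written under a key like
                // (file path, chunk index); here we just report the sizes.
                System.out.printf("chunk %d: %d bytes%n", chunkIndex++, read);
            }
        }
    }

    // Read up to buf.length bytes, looping because InputStream.read may
    // return short counts before end-of-stream.
    static int readFully(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) break;
            total += n;
        }
        return total;
    }
}

Reassembly on read is then just concatenating chunks in index order.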
On Mon, Apr 2, 2012 at 4:23 AM, Franc Carter <franc.car...@sirca.org.au> wrote:
>
> Hi,
>
> We are in the early stages of thinking about a project that needs to store
> data that will be accessed by Hadoop. One of the concerns we have is around
> the latency of HDFS, as our use case is not to read all the data, and
> hence we will need custom RecordReaders etc.
>
> I've seen a couple of comments that you shouldn't put large chunks into a
> value - however 'large' is not well defined for the range of people using
> these solutions ;-)
>
> Does anyone have a rough rule of thumb for how big a single value can be
> before we are outside sanity?
>
> thanks
>
> --
> Franc Carter | Systems architect | Sirca Ltd
> franc.car...@sirca.org.au | www.sirca.org.au
> Tel: +61 2 9236 9118
> Level 9, 80 Clarence St, Sydney NSW 2000
> PO Box H58, Australia Square, Sydney NSW 1215

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com