On Wed, Apr 4, 2012 at 8:56 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
> We use 2MB chunks for our CFS implementation of HDFS:
> http://www.datastax.com/dev/blog/cassandra-file-system-design

thanks

> On Mon, Apr 2, 2012 at 4:23 AM, Franc Carter <franc.car...@sirca.org.au>
> wrote:
> >
> > Hi,
> >
> > We are in the early stages of thinking about a project that needs to
> > store data that will be accessed by Hadoop. One of the concerns we have
> > is around the latency of HDFS, as our use case is not for reading all
> > the data and hence we will need custom RecordReaders etc.
> >
> > I've seen a couple of comments that you shouldn't put large chunks into
> > a value - however 'large' is not well defined for the range of people
> > using these solutions ;-)
> >
> > Does anyone have a rough rule of thumb for how big a single value can
> > be before we are outside sanity?
> >
> > thanks
> >
> > --
> > Franc Carter | Systems architect | Sirca Ltd
> > franc.car...@sirca.org.au | www.sirca.org.au
> > Tel: +61 2 9236 9118
> > Level 9, 80 Clarence St, Sydney NSW 2000
> > PO Box H58, Australia Square, Sydney NSW 1215
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com

--
*Franc Carter* | Systems architect | Sirca Ltd
franc.car...@sirca.org.au | www.sirca.org.au
Tel: +61 2 9236 9118
Level 9, 80 Clarence St, Sydney NSW 2000
PO Box H58, Australia Square, Sydney NSW 1215
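
[For readers landing on this thread: the advice above is to avoid storing one huge blob as a single value and instead split it into fixed-size pieces, e.g. the 2MB chunks CFS uses. A minimal sketch of that splitting step in Python - the function name and sizes are illustrative, not from CFS itself:]

```python
CHUNK_SIZE = 2 * 1024 * 1024  # 2MB, the chunk size Jonathan mentions for CFS

def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Split a blob into fixed-size chunks; the last chunk may be smaller.

    Each chunk would then be written as a separate value (e.g. one column
    per chunk) rather than storing the whole blob in a single value.
    """
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

# A 5MB payload splits into two full 2MB chunks plus a 1MB remainder.
blob = b"x" * (5 * 1024 * 1024)
chunks = split_into_chunks(blob)
print(len(chunks))       # 3
print(len(chunks[-1]))   # 1048576
```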