On Wed, Apr 4, 2012 at 8:56 AM, Jonathan Ellis wrote:
> We use 2MB chunks for our CFS implementation of HDFS:
> http://www.datastax.com/dev/blog/cassandra-file-system-design
>
thanks
>
> On Mon, Apr 2, 2012 at 4:23 AM, Franc Carter wrote:
> >
> > Hi,
> >
> > We are in the early stages of thinking about a project that needs to store
> > data that will be accessed by Hadoop.
We use 2MB chunks for our CFS implementation of HDFS:
http://www.datastax.com/dev/blog/cassandra-file-system-design
On Mon, Apr 2, 2012 at 4:23 AM, Franc Carter wrote:
>
> Hi,
>
> We are in the early stages of thinking about a project that needs to store
> data that will be accessed by Hadoop. One of the concerns we have is around
> the latency of HDFS, as our use case is not for reading all the data, and
> hence we will need custom RecordReaders etc.
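For anyone curious what the chunked layout Jonathan describes looks like in practice, here is a minimal sketch of splitting a file into fixed-size chunks keyed by (path, chunk index). The key scheme and function names are illustrative, not CFS's actual schema; a plain dict stands in for the Cassandra column family.

```python
# Sketch of chunked file storage: split a file into fixed-size chunks
# keyed by (path, chunk index), the way a file system layered on a
# key-value store might. Names are hypothetical, not CFS's real schema.
CHUNK_SIZE = 2 * 1024 * 1024  # 2MB, the chunk size CFS uses

def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Yield (chunk_index, chunk_bytes) pairs for the file contents."""
    for offset in range(0, len(data), chunk_size):
        yield offset // chunk_size, data[offset:offset + chunk_size]

def store_file(store: dict, path: str, data: bytes) -> None:
    """Write each chunk under a (path, index) key, plus a chunk count."""
    count = 0
    for index, chunk in split_into_chunks(data):
        store[(path, index)] = chunk
        count += 1
    store[(path, "nchunks")] = count

def read_file(store: dict, path: str) -> bytes:
    """Reassemble a file by reading its chunks back in order."""
    n = store[(path, "nchunks")]
    return b"".join(store[(path, i)] for i in range(n))
```

A 5MB file would land as three rows (2MB + 2MB + 1MB), and a Hadoop split can then be served by fetching only the chunks it covers rather than the whole file.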
On Tue, Apr 3, 2012 at 4:18 AM, Ben Coverston wrote:

This is a difficult question to answer for a variety of reasons, but I'll
give it a try; maybe it will be helpful, maybe not.

The most obvious problem with this is that Thrift is buffer based, not
streaming. That means that whatever the size of your chunk, it needs to
be received, deserialized, and held in memory in its entirety before it
can be used.
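The memory cost Ben describes can be sketched abstractly: with a buffer-based transport, peak memory per request scales with the value size, while a streaming-style reader caps it at a fixed window. The sketch below uses hypothetical `fetch`/`fetch_range` callables, not the actual Thrift API.

```python
# Illustrates the buffer-based vs. streaming distinction. A buffer-based
# client (like Thrift) materializes the whole value before the caller sees
# any of it; a range-based reader bounds peak memory at a fixed window.
# fetch/fetch_range are stand-ins, not a real client API.

def read_buffered(fetch, key):
    """Buffer-based: the entire value is received and held at once."""
    value = fetch(key)          # peak memory ~= len(value)
    return value

def read_streaming(fetch_range, key, total_size, window=64 * 1024):
    """Range-based: process the value a fixed-size window at a time."""
    offset = 0
    while offset < total_size:
        chunk = fetch_range(key, offset, min(window, total_size - offset))
        yield chunk             # peak memory ~= window
        offset += len(chunk)
```

This is one reason to keep chunks modest (e.g. CFS's 2MB): each Thrift round trip then buffers at most one chunk on the server and the client, instead of the whole file.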
Hi,

We are in the early stages of thinking about a project that needs to store
data that will be accessed by Hadoop. One of the concerns we have is around
the latency of HDFS, as our use case is not for reading all the data, and
hence we will need custom RecordReaders etc.

I've seen a couple of