super
familiar with Sailfish, but from what I remember from a while ago, it is the
modified version of KFS that actually does the sorting. The maps output data
to "chunks" (aka blocks), and each chunk is sorted once it is full.
When the sorting is finished for a chunk the re
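As a rough illustration of the chunk behavior described in this message: the sketch below is not Sailfish or KFS code. The class name, the record-count capacity, and the in-memory lists are all hypothetical, chosen only to show "append until the chunk is full, then sort it".

```python
class ChunkedAppender:
    """Hypothetical in-memory model: records are appended to a chunk,
    and a chunk is sorted by key once it is full."""

    def __init__(self, chunk_capacity):
        if chunk_capacity < 1:
            raise ValueError("chunk_capacity must be at least 1")
        # Assumption: capacity is counted in records, not bytes.
        self.chunk_capacity = chunk_capacity
        self.open_chunk = []     # chunk still accepting appends (unsorted)
        self.sorted_chunks = []  # full chunks, each sorted by key

    def append(self, key, value):
        """Append one (key, value) record; sort and seal the chunk when full."""
        self.open_chunk.append((key, value))
        if len(self.open_chunk) == self.chunk_capacity:
            self.sorted_chunks.append(
                sorted(self.open_chunk, key=lambda record: record[0])
            )
            self.open_chunk = []


# Four map-output records with a capacity of three: the first chunk fills
# and is sorted, while the fourth record waits in the open chunk.
store = ChunkedAppender(chunk_capacity=3)
for key, value in [("d", 1), ("a", 2), ("c", 3), ("b", 4)]:
    store.append(key, value)
```

Sorting only when a chunk fills keeps appends cheap; what happens after a chunk is sorted is not covered here, since the message is cut off at that point.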
Hey Sriram,
We discussed this before, but for the benefit of the wider audience: :)
It seems like the requirements imposed on KFS by Sailfish are in most
ways much simpler than the requirements of a full distributed
filesystem. The one thing we need is atomic record append -- but we
don't
Srivas,
Sailfish builds upon record append (a feature not present in HDFS).
The software that is currently released is based on Hadoop-0.20.2. You use
the Sailfish version of Hadoop-0.20.2, KFS for the intermediate data, and
then HDFS (or KFS) for storing the job/input. Since the changes
Sriram, Sailfish depends on append. I just noticed that HDFS has append
disabled. How does one use this with Hadoop?
On Wed, May 9, 2012 at 9:00 AM, Otis Gospodnetic wrote:
> Hi Sriram,
>
> >> The I-file concept could possibly be implemented here in a fairly self
> contained w
d to seeing this! :)
Otis
--
Performance Monitoring for Solr / ElasticSearch / HBase -
http://sematext.com/spm
>
> From: Sriram Rao
>To: common-dev@hadoop.apache.org
>Sent: Tuesday, May 8, 2012 6:48 PM
>Subject: Re: Sailfish
>
>Dear Andy,
>
ts of things.
Sriram
On May 8, 2012, at 10:32 AM, Sriram Rao wrote:
> Hi,
>
> I'd like to announce the release of a new open source project, Sailfish.
>
> http://code.google.com/p/sailfish/
>
> Sailfish tries to improve Hadoop-performance, particularly for large-jobs
> whi
How do you propose I, practically, migrate to something like Sailfish
> without a major capital expenditure and/or downtime and/or data loss?
Well, we are not asking for KFS to replace HDFS. One path you could
take is to experiment with Sailfish---use KFS just for the
intermediate data and HDFS fo
May 8, 2012, at 10:32 AM, Sriram Rao wrote:
> Hi,
>
> I'd like to announce the release of a new open source project, Sailfish.
>
> http://code.google.com/p/sailfish/
>
> Sailfish tries to improve Hadoop-performance, particularly for large-jobs
> which process TB
MapReduce workflow.
How do you propose I, practically, migrate to something like Sailfish
without a major capital expenditure and/or downtime and/or data loss?
However, can the Sailfish I-files implementation be plugged in as an
alternate Shuffle implementation in MRv2 (see MAPREDUCE-3060 and
MAPR
Me 2, let me know as well.
> Date: Tue, 8 May 2012 10:35:34 -0700
> From: mrra...@yahoo.com
> Subject: Re: Project announcement: Sailfish (also, looking for colloborators)
> To: common-dev@hadoop.apache.org
>
> Hi Sriram,
>
>I'm interested in getting
> From: Sriram Rao
> To: common-dev@hadoop.apache.org
> Sent: Tuesday, May 8, 2012 10:32 AM
> Subject: Project announcement: Sailfish (also, looking for colloborators)
>
> Hi,
>
> I'd like to announce the release of a new open source project, Sailfish.
>
> ht
Hi Sriram,
I'm interested in getting involved. Let me know in what capacity I can
get involved.
Thanks,
Raghu
From: Sriram Rao
To: common-dev@hadoop.apache.org
Sent: Tuesday, May 8, 2012 10:32 AM
Subject: Project announcement: Sailfish
Hi,
I'd like to announce the release of a new open source project, Sailfish.
http://code.google.com/p/sailfish/
Sailfish tries to improve Hadoop-performance, particularly for large-jobs
which process TB's of data and run for hours. In building Sailfish, we
modify how map-output is h