[jira] Created: (HADOOP-6130) ArrayIndexOutOfBoundsException is thrown by KeyFieldBasedPartitioner

2009-07-07 Thread Suman Sehgal (JIRA)
ArrayIndexOutOfBoundsException is thrown by KeyFieldBasedPartitioner Key: HADOOP-6130 URL: https://issues.apache.org/jira/browse/HADOOP-6130 Project: Hadoop Common Issue Ty
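The report above does not include a stack trace, but a plausible trigger for an ArrayIndexOutOfBoundsException in a key-field-based partitioner is a record whose key contains fewer fields than the configured field range. A minimal Python sketch of the idea (hypothetical names and logic, not Hadoop's actual KeyFieldBasedPartitioner implementation), with the bounds guard that avoids the error:

```python
# Hypothetical sketch of key-field-based partitioning: hash a
# configured range of tab-separated key fields to pick a reducer.
def partition(key, start_field, end_field, num_partitions):
    fields = key.split("\t")
    # Guard: a key may have fewer fields than the configured range;
    # without this clamp, indexing past the end raises an error (the
    # analogue of the reported ArrayIndexOutOfBoundsException).
    end = min(end_field, len(fields))
    selected = "\t".join(fields[i] for i in range(start_field, end))
    return hash(selected) % num_partitions
```

With the clamp, a short key such as `"x"` partitions on whatever fields it does have instead of crashing.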

Re: [VOTE] Back-port TFile to Hadoop 0.20

2009-07-07 Thread Devaraj Das
+1 On 7/8/09 12:25 AM, "Hong Tang" wrote: I have talked with a few folks in the community who are interested in using TFile (HADOOP-3315) in their projects that are currently dependent on Hadoop 0.20, and it would significantly simplify the release process as well as their lives if we could bac

Re: Need help understanding the source

2009-07-07 Thread jason hadoop
When you have 0 reduces, the map outputs themselves are moved to the output directory for you. It is also straightforward to open your own file and write to it directly instead of using the output collector. On Tue, Jul 7, 2009 at 10:14 AM, Todd Lipcon wrote: > On Tue, Jul 7, 2009 at 1:13 AM,
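Jason's point can be illustrated with a toy simulation: when a job has zero reduce tasks there is no shuffle or sort, and each map task's output goes straight into its own part file in the output directory. A Python sketch (the function and names are mine, not Hadoop APIs):

```python
def run_map_only_job(splits, map_fn):
    # With zero reduces there is no shuffle/sort barrier: each map
    # task writes its records directly to its own output "file"
    # (part-00000, part-00001, ...), mirroring what Hadoop does when
    # the number of reduce tasks is 0.
    out = {}
    for i, split in enumerate(splits):
        out[f"part-{i:05d}"] = [map_fn(rec) for rec in split]
    return out
```

For example, two input splits produce two part files, in map-task order: `run_map_only_job([[1, 2], [3]], lambda x: x * 10)` yields `{"part-00000": [10, 20], "part-00001": [30]}`.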

Re: [VOTE] Back-port TFile to Hadoop 0.20

2009-07-07 Thread Chris Douglas
+1

Re: [VOTE] Back-port TFile to Hadoop 0.20

2009-07-07 Thread Mahadev Konar
+1 mahadev On 7/7/09 12:18 PM, "Milind Bhandarkar" wrote: > +1. > > > On 7/7/09 11:55 AM, "Hong Tang" wrote: > >> I have talked with a few folks in the community who are interested in >> using TFile (HADOOP-3315) in their projects that are currently >> dependent on Hadoop 0.20, and it woul

Re: [VOTE] Back-port TFile to Hadoop 0.20

2009-07-07 Thread Amr Awadallah
+1 Zheng Shao wrote: +1 -Original Message- From: Arun C Murthy [mailto:a...@yahoo-inc.com] Sent: Tuesday, July 07, 2009 1:30 PM To: common-dev@hadoop.apache.org Subject: Re: [VOTE] Back-port TFile to Hadoop 0.20 On Jul 7, 2009, at 11:55 AM, Hong Tang wrote: I have talked with a

Re: [VOTE] Back-port TFile to Hadoop 0.20

2009-07-07 Thread Owen O'Malley
On Tue, Jul 7, 2009 at 1:39 PM, Dhruba Borthakur wrote: > I think we are trying to change an existing Apache-Hadoop process. The > current process specifically says that a released branch cannot have new > features checked into it. > > This vote seems to be proposing that "If a new feature does n

Re: [VOTE] Back-port TFile to Hadoop 0.20

2009-07-07 Thread Dhruba Borthakur
I think we are trying to change an existing Apache-Hadoop process. The current process specifically says that a released branch cannot have new features checked into it. This vote seems to be proposing that "If a new feature does not change any existing code (other than build.xml), then it is ok t

RE: [VOTE] Back-port TFile to Hadoop 0.20

2009-07-07 Thread Zheng Shao
+1 -Original Message- From: Arun C Murthy [mailto:a...@yahoo-inc.com] Sent: Tuesday, July 07, 2009 1:30 PM To: common-dev@hadoop.apache.org Subject: Re: [VOTE] Back-port TFile to Hadoop 0.20 On Jul 7, 2009, at 11:55 AM, Hong Tang wrote: > I have talked with a few folks in the community

Re: [VOTE] Back-port TFile to Hadoop 0.20

2009-07-07 Thread Arun C Murthy
On Jul 7, 2009, at 11:55 AM, Hong Tang wrote: I have talked with a few folks in the community who are interested in using TFile (HADOOP-3315) in their projects that are currently dependent on Hadoop 0.20, and it would significantly simplify the release process as well as their lives if we

Re: [VOTE] Back-port TFile to Hadoop 0.20

2009-07-07 Thread Owen O'Malley
On Jul 7, 2009, at 11:55 AM, Hong Tang wrote: I have talked with a few folks in the community who are interested in using TFile (HADOOP-3315) in their projects that are currently dependent on Hadoop 0.20, and it would significantly simplify the release process as well as their lives if we

Re: [VOTE] Back-port TFile to Hadoop 0.20

2009-07-07 Thread Matei Zaharia
+1 On Jul 7, 2009, at 11:55 AM, Hong Tang wrote: I have talked with a few folks in the community who are interested in using TFile (HADOOP-3315) in their projects that are currently dependent on Hadoop 0.20, and it would significantly simplify the release process as well as their lives if

[jira] Created: (HADOOP-6129) MapFile doesn't work with serializables other than Writables

2009-07-07 Thread Justin Patterson (JIRA)
MapFile doesn't work with serializables other than Writables Key: HADOOP-6129 URL: https://issues.apache.org/jira/browse/HADOOP-6129 Project: Hadoop Common Issue Type: Improvement

Re: [VOTE] Back-port TFile to Hadoop 0.20

2009-07-07 Thread Milind Bhandarkar
+1. On 7/7/09 11:55 AM, "Hong Tang" wrote: > I have talked with a few folks in the community who are interested in > using TFile (HADOOP-3315) in their projects that are currently > dependent on Hadoop 0.20, and it would significantly simplify the > release process as well as their lives if we

[VOTE] Back-port TFile to Hadoop 0.20

2009-07-07 Thread Hong Tang
I have talked with a few folks in the community who are interested in using TFile (HADOOP-3315) in their projects that are currently dependent on Hadoop 0.20, and it would significantly simplify the release process as well as their lives if we could back port TFile to Hadoop 0.20 (instead o

[jira] Resolved: (HADOOP-5976) create script to provide classpath for external tools

2009-07-07 Thread Tsz Wo (Nicholas), SZE (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE resolved HADOOP-5976. Resolution: Fixed Fix Version/s: 0.21.0 Release Note: Add a new

Re: Need help understanding the source

2009-07-07 Thread Todd Lipcon
On Tue, Jul 7, 2009 at 1:13 AM, jason hadoop wrote: > > > The other alternative you may try is simply to write your map outputs to > HDFS [ie: setNumReduces(0)], and have a consumer pick up the map outputs as > they appear. If the life of the files is short and you can withstand data > loss, you m

[jira] Created: (HADOOP-6128) Serializer and Deserializer should extend java.io.Closeable

2009-07-07 Thread Tom White (JIRA)
Serializer and Deserializer should extend java.io.Closeable --- Key: HADOOP-6128 URL: https://issues.apache.org/jira/browse/HADOOP-6128 Project: Hadoop Common Issue Type: Improvement

Re: Need help understanding the source

2009-07-07 Thread jason hadoop
If your constraints are loose enough, you could consider using the chain mapping that became available in 0.19, and have multiple mappers for your jobs. The extra mappers only receive the output of the prior map in the chain, and if I remember correctly, the combiner is run at the end of the chain of
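The chain-of-mappers behavior described here amounts to flat-map composition: each mapper consumes only the previous mapper's output and may emit zero or more records per input record. A language-agnostic Python sketch of the idea (not the ChainMapper API itself):

```python
def chain_mappers(mappers, records):
    # Each mapper in the chain consumes only the previous mapper's
    # output; the first consumes the input split. A mapper returns a
    # list because it may emit zero, one, or many records per input.
    for map_fn in mappers:
        records = [out for rec in records for out in map_fn(rec)]
    return records
```

For example, chaining an upper-casing mapper with a duplicating mapper: `chain_mappers([lambda r: [r.upper()], lambda r: [r, r]], ["a", "b"])` returns `["A", "A", "B", "B"]`.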

Re: Need help understanding the source

2009-07-07 Thread Amr Awadallah
To add to Todd/Ted's wise words, the Hadoop (and MapReduce) architects didn't impose this limitation just for fun; it is central to making Hadoop as reliable as it is. If the reducer starts processing mapper output immediately and a specific mapper fails, then the reducer would have to
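The reliability argument can be made concrete with a toy word count: because the shuffle groups values by key only after every map task has finished, a failed map task can simply be re-executed, and no reducer state has to be rolled back. A Python simulation (all names are mine, not Hadoop APIs):

```python
from collections import defaultdict

def run_job(splits, map_fn, reduce_fn):
    # Map phase: every task runs to completion (and can be re-run on
    # failure) before any reducer sees a byte of its output.
    map_outputs = [[map_fn(rec) for rec in split] for split in splits]
    # Simulated failure recovery: re-running map task 0 reproduces
    # identical output, so nothing downstream has to be undone.
    map_outputs[0] = [map_fn(rec) for rec in splits[0]]
    # Shuffle/sort begins only after this barrier.
    groups = defaultdict(list)
    for part in map_outputs:
        for key, value in part:
            groups[key].append(value)
    return {k: reduce_fn(vs) for k, vs in groups.items()}
```

Running `run_job([["a", "b"], ["a"]], lambda w: (w, 1), sum)` gives `{"a": 2, "b": 1}` whether or not a map task had to be retried, which is exactly the property a reducer that consumed partial output eagerly would lose.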