
Re: Hadoop Compression - Current Status

2010-07-20 Thread Jeff Hammerbacher

Hey Steve,

> Owen, can you elaborate a little on the effort for the ASL friendly codec
> that you mentioned?

See the work on FastLZ at https://issues.apache.org/jira/browse/HADOOP-6349.

Regards,
Jeff

RE: Hadoop Compression - Current Status

2010-07-16 Thread Stephen Watt

Date: 07/14/2010 10:30 AM Subject: RE: Hadoop Compression - Current Status Sorry for the delay in responding back... Yes, that's kind of my point. You gain some efficiency, however... currently you have an expense of losing your parallelism which really gives you more bang for your b

Re: Hadoop Compression - Current Status

2010-07-14 Thread Owen O'Malley

On Jul 12, 2010, at 11:28 AM, Stephen Watt wrote:

> 1) It appears most folks are using LZO. Given that it is GPL, are you not worried about it virally infecting your project?

The lzo bindings are not part of Hadoop and therefore can't infect Hadoop. They are a separate project (hadoop-gpl-

RE: Hadoop Compression - Current Status

2010-07-14 Thread Segel, Mike

patrickange...@gmail.com [mailto:patrickange...@gmail.com] On Behalf Of Patrick Angeles Sent: Monday, July 12, 2010 2:13 PM To: common-dev@hadoop.apache.org Subject: Re: Hadoop Compression - Current Status Also, fwiw, the use of codecs and SequenceFiles are somewhat orthogonal. You'll have to

Re: Hadoop Compression - Current Status

2010-07-12 Thread Greg Roelofs

Stephen Watt wrote:

> Please let me know if any of assertions are incorrect. I'm going to be
> adding any feedback to the Hadoop Wiki. It seems well documented that the
> LZO Codec is the most performant codec
> (http://blog.oskarsson.nu/2009/03/hadoop-feat-lzo-save-disk-space-and.html)

Spee

Re: Hadoop Compression - Current Status

2010-07-12 Thread Patrick Angeles

Also, fwiw, the use of codecs and the use of SequenceFiles are somewhat orthogonal. You'll have to compress the SequenceFile with a codec, be it gzip, bz2 or lzo. SequenceFiles do get you splittability, which you won't get with just Gzip (until we get MAPREDUCE-491) or the hadoop-lzo InputFormats. cheers, -
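
To make the splittability point concrete, here is a rough sketch in plain Python (not Hadoop code, and not how SequenceFile is actually laid out). It assumes a toy layout in which fixed-size blocks are each compressed as an independent gzip stream, and contrasts that with a single gzip stream over the whole input. The helper names (`compress_whole`, `compress_blocks`, `can_decode_from`) and the 14000-byte block size are made up for illustration.

```python
import gzip
import zlib


def compress_whole(data):
    """Compress the entire input as one gzip stream (the 'just Gzip' case)."""
    return gzip.compress(data)


def compress_blocks(data, block_size):
    """Compress fixed-size blocks independently.

    Hypothetical layout for illustration only: each block is its own gzip
    stream, so a reader can start at any block boundary.
    """
    return [
        gzip.compress(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]


def can_decode_from(blob, offset):
    """Return True if a gzip stream can be decoded starting at `offset`."""
    try:
        gzip.decompress(blob[offset:])
        return True
    except (OSError, EOFError, zlib.error):
        # No gzip header at this offset, or the stream is corrupt/truncated.
        return False


if __name__ == "__main__":
    # 5000 records of 14 bytes each (70000 bytes total).
    data = b"".join(b"record-%06d\n" % i for i in range(5000))

    whole = compress_whole(data)
    blocks = compress_blocks(data, 14000)

    # A worker handed the second half of the single stream cannot start there.
    print("decode whole stream from the middle:",
          can_decode_from(whole, len(whole) // 2))

    # A worker handed block 2 can decode it without reading blocks 0 and 1.
    print("block 2 decodes on its own:",
          gzip.decompress(blocks[2]) == data[28000:42000])
```

The trade-off in this sketch is that smaller blocks give more split points but usually a worse compression ratio, since each block starts with an empty dictionary.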

RE: Hadoop Compression - Current Status

2010-07-12 Thread Segel, Mike

How can you say zip files are 'best codecs' to use? Call me silly but I seem to recall that if you're using a zip'd file for input you can't really use a file splitter? (Going from memory, which isn't the best thing to do...) -Mike -Original Message- From: Stephen Watt [mailto:sw...@us

