Re: Hive File Sizes, Merging, and Splits

2012-09-26 Thread Ruslan Al-Fakikh

tasks is BETTER. > > > > From: John Omernik [j...@omernik.com] > Sent: Tuesday, September 25, 2012 7:11 PM > To: user@hive.apache.org > Subject: Re: Hive File Sizes, Merging, and Splits > > Isn't there an overhead associated with each map task? Based on that, m

RE: Hive File Sizes, Merging, and Splits

2012-09-25 Thread Connell, Chuck

But remember that you are running on parallel machines. Depending on the hardware configuration, more map tasks is BETTER. From: John Omernik [j...@omernik.com] Sent: Tuesday, September 25, 2012 7:11 PM To: user@hive.apache.org Subject: Re: Hive File Sizes

Re: Hive File Sizes, Merging, and Splits

2012-09-25 Thread John Omernik

Isn't there an overhead associated with each map task? Based on that, my hypothesis is that if I pay attention to my data, merge up small files after load, and ensure split sizes are close to file sizes, I can keep the number of map tasks to an absolute minimum. On Tue, Sep 25, 2012 at 2:35 PM, Con
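
To put rough numbers on that hypothesis, here is a minimal sketch of how the map task count could change when small files are merged. It assumes each file is split independently into chunks of at most `split_size` bytes and ignores any combining input format or split slop, either of which would change the count. The function name, file sizes, and split size are all hypothetical and chosen only for illustration.

```python
import math

MB = 1024 * 1024


def estimate_map_tasks(file_sizes, split_size):
    """Rough map task count, assuming every non-empty file is split
    on its own into chunks of at most split_size bytes (an assumption,
    not how every input format behaves)."""
    return sum(math.ceil(size / split_size) for size in file_sizes if size > 0)


# Hypothetical data: the same 512 MB stored two different ways.
small_files = [8 * MB] * 64      # 64 small files of 8 MB each
merged_files = [256 * MB] * 2    # merged after load into two 256 MB files
split_size = 256 * MB            # split size chosen close to the merged file size

tasks_before = estimate_map_tasks(small_files, split_size)
tasks_after = estimate_map_tasks(merged_files, split_size)

print("map tasks before merge:", tasks_before)
print("map tasks after merge:", tasks_after)
```

Under these assumptions the estimate drops from 64 tasks to 2. Whether fewer tasks is actually faster still depends on how much parallelism the cluster offers, which is the point raised elsewhere in the thread.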

RE: Hive File Sizes, Merging, and Splits

2012-09-25 Thread Connell, Chuck

Why do you think the current generated code is inefficient? From: John Omernik [mailto:j...@omernik.com] Sent: Tuesday, September 25, 2012 2:57 PM To: user@hive.apache.org Subject: Hive File Sizes, Merging, and Splits I am really struggling trying to make heads or tails out of how to optimize t
