From: Wojciech Langiewicz [mailto:wlangiew...@gmail.com]
Sent: Wednesday, December 07, 2011 8:15 PM
To: user@hive.apache.org
Subject: Re: Hive query taking too much time
Hi,
In this case it's much easier and faster to merge all files using this
command:
cat *.csv > output.csv
hive
ame bytes. What do you suggest?
>
> Kind Regards,
> Keshav C Savant
>
>
> -Original Message-
> From: Wojciech Langiewicz [mailto:wlangiew...@gmail.com]
> Sent: Wednesday, December 07, 2011 8:15 PM
> To: user@hive.apache.org
> Subject: Re: Hive query taking too much ti
taking too much time
Hi,
In this case it's much easier and faster to merge all files using this
command:
cat *.csv > output.csv
hive -e "load data local inpath 'output.csv' into table $table"
On 07.12.2011 07:00, Vikas Srivastava wrote:
> hey if u having the same col
14,271,688
Thanks a lot for your help.
Kind Regards,
Keshav C Savant
From: Paul Mackles [mailto:pmack...@adobe.com]
Sent: Tuesday, December 06, 2011 8:14 PM
To: user@hive.apache.org
Subject: RE: Hive query taking too much time
How much time is it spending in the map/reduce phases
t my Blog for answers to commonly asked questions.
From: Vikas Srivastava
To: user@hive.apache.org
Sent: Tuesday, December 6, 2011 10:00 PM
Subject: Re: Hive query taking too much time
hey if u having the same col of all the files then you can easily merg
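Hey, if all the files have the same columns, you can easily merge them with a
shell script:
# target Hive table (no "$" on the left-hand side of an assignment)
table=yourtable
# start from an empty output file so a re-run does not append to old data
rm -f new_file.csv
for file in *.csv
do
cat "$file" >> new_file.csv
done
# load the merged file, not the last input file
hive -e "load data local inpath 'new_file.csv' into table $table"
It will merge all the files into a single file, then you can upload it in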
Hi Paul,
I am having the same problem. Do you know any efficient way of merging the
files?
-Mohit
On Tue, Dec 6, 2011 at 8:14 PM, Paul Mackles wrote:
> How much time is it spending in the map/reduce phases, respectively? The
> large number of files could be creating a lot of mappers which creat
How much time is it spending in the map/reduce phases, respectively? The large
number of files could be creating a lot of mappers which create a lot of
overhead. What happens if you merge the 2624 files into a smaller number, like
24 or 48? That should speed up the mapper phase significantly.
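One possible way to merge many small files into a fixed number of larger ones is sketched below. It is only an illustration, not something posted in the thread: the function name merge_csv, the part_N.csv output names, and the directory layout are all assumptions, and it presumes the CSV files have no header row and each end with a newline.

```shell
# Hypothetical helper: merge every *.csv in a source directory into at most
# N output files by appending input files round-robin.
# Usage: merge_csv SRC_DIR OUT_DIR N
merge_csv() {
  src_dir=$1
  out_dir=$2
  buckets=$3

  mkdir -p "$out_dir" || return 1
  # Remove output of a previous run so files are not appended twice.
  rm -f "$out_dir"/part_*.csv

  i=0
  for f in "$src_dir"/*.csv; do
    # Skip the literal pattern when the glob matches nothing.
    [ -e "$f" ] || continue
    cat "$f" >> "$out_dir/part_$((i % buckets)).csv"
    i=$((i + 1))
  done
  echo "merged $i files into at most $buckets files"
}
```

For example, merge_csv input merged 24 would leave up to 24 files in merged/, each of which could then be loaded with the same hive -e "load data local inpath ..." command used elsewhere in this thread. Keeping the output directory separate from the input directory avoids picking up the merged files on a later run.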
Fr
Hi,
In your case the total file size isn't the main factor that reduces
performance; the number of files is.
To test this, try merging those 2000+ files into one (or a few) big files,
then upload them to HDFS and test Hive performance (it should definitely
be better). If this works you should think about mergin