RE: Extremely Slow Data Loading with 40k+ Partitions

2015-04-16 Thread Tianqi Tong
with 40k+ Partitions How many reducers are you using? Daniel On 16 Apr 2015, at 00:55, Tianqi Tong <tt...@brightedge.com> wrote: Hi, I'm loading data to a Parquet table with dynamic partitions. I have 40k+ partitions, and I have skipped the partition stats computation step. Somehow it's

Extremely Slow Data Loading with 40k+ Partitions

2015-04-15 Thread Tianqi Tong
Hi, I'm loading data to a Parquet table with dynamic partitions. I have 40k+ partitions, and I have skipped the partition stats computation step. Somehow it's still extremely slow loading data into partitions (800MB/h). Do you have any hints on the possible reason and solution? Thank

RE: [Hive] Slow Loading Data Process with Parquet over 30k Partitions

2015-04-14 Thread Tianqi Tong
d is still not done yet, with partition stats. Thanks Tianqi Tong From: Slava Markeyev [mailto:slava.marke...@upsight.com] Sent: Monday, April 13, 2015 11:00 PM To: user@hive.apache.org Cc: Sergio Pena Subject: Re: [Hive] Slow Loading Data Process with Parquet over 30k Partitions This is someth

[Hive] Slow Loading Data Process with Parquet over 30k Partitions

2015-04-09 Thread Tianqi Tong
arquet.compression=SNAPPY; SET hive.exec.dynamic.partition.mode=nonstrict; SET hive.exec.max.dynamic.partitions=50; SET hive.exec.max.dynamic.partitions.pernode=5; SET hive.exec.max.created.files=100; Thank you very much! Tianqi Tong