Hi Ted,

It's actually just one partition being created, which is what makes it so
weird.

Thanks,
Shaun


On 6 June 2013 18:36, Ted Xu <t...@gopivotal.com> wrote:

> Hi Shaun,
>
> Too many partitions in dynamic partitioning may slow down the mapreduce
> job. Can you estimate how many partitions will be generated after insert?
>
>
> On Thu, Jun 6, 2013 at 4:24 PM, Shaun Clowes <sclo...@atlassian.com> wrote:
>
>> Hi All,
>>
>> Does anyone know what performance impact dynamic partitions should be
>> expected to have?
>>
>> I have a table that is partitioned by a string in the form 'YYYY-MM'.
>> When I insert into this table (from an external table that is just an S3
>> bucket containing gzipped logs) using dynamic partitioning, I get very slow
>> performance, with each node in the cluster unable to process more than 2MB
>> per second. When I run the exact same query with static partition values, I
>> get about 30-40MB/s on each node.
>>
>> I've never seen this type of problem with our internal cluster running
>> Hive 0.7.1 (CDH3u4), but it happens every time in EMR.
>>
>> Thanks,
>> Shaun
>>
>
>
>
> --
> Regards,
> Ted Xu
>
