Hi Ted,

It's actually just one partition being created, which is what makes it so weird.

Thanks,
Shaun

On 6 June 2013 18:36, Ted Xu <t...@gopivotal.com> wrote:

> Hi Shaun,
>
> Too many partitions in dynamic partitioning may slow down the MapReduce
> job. Can you estimate how many partitions will be generated after the insert?
>
> On Thu, Jun 6, 2013 at 4:24 PM, Shaun Clowes <sclo...@atlassian.com> wrote:
>
>> Hi All,
>>
>> Does anyone know what performance impact dynamic partitions should be
>> expected to have?
>>
>> I have a table that is partitioned by a string in the form 'YYYY-MM'.
>> When I insert into this table (from an external table that is just an S3
>> bucket containing gzipped logs) using dynamic partitioning, I get very slow
>> performance, with each node in the cluster unable to process more than 2MB
>> per second. When I run the exact same query with static partition values, I
>> get about 30-40MB/s on each node.
>>
>> I've never seen this type of problem with our internal cluster running
>> Hive 0.7.1 (CDH3u4), but it happens every time in EMR.
>>
>> Thanks,
>> Shaun
>
> --
> Regards,
> Ted Xu