Re: New to hive.

2013-06-19 Thread Shaun Clowes
I'd recommend taking a look at Hadoop, The Definitive Guide. It's a good book and will explain what you're looking for. Cheers, Shaun On 20 June 2013 08:54, Bharati wrote: > Hi Folks, > > I am new to Hive and need information, tutorials, etc. that you can point > to. I have installed Hive to wor

Re: Extremely slow throughput with dynamic partitions using Hive 0.8.1 in Amazon Elastic Mapreduce

2013-06-17 Thread Shaun Clowes
> report is forced to enable. IMHO, fatal error report should not depend on > the "job progress" switch. You can file a JIRA ticket on this one. > > > On Fri, Jun 7, 2013 at 1:55 PM, Shaun Clowes wrote: > >> Hi Ted, All, >> >> Unfortunately profiling turn

Re: Extremely slow throughput with dynamic partitions using Hive 0.8.1 in Amazon Elastic Mapreduce

2013-06-06 Thread Shaun Clowes
do a > profiling<http://hadoop.apache.org/docs/stable/mapred_tutorial.html#Profiling>, > see if there is hot spot. > > > On Thu, Jun 6, 2013 at 4:38 PM, Shaun Clowes wrote: > >> Hi Ted, >> >> It's actually just one partition being created which is

Re: Extremely slow throughput with dynamic partitions using Hive 0.8.1 in Amazon Elastic Mapreduce

2013-06-06 Thread Shaun Clowes
will be generated after insert? > > > On Thu, Jun 6, 2013 at 4:24 PM, Shaun Clowes wrote: > >> Hi All, >> >> Does anyone know the performance impact the dynamic partitions should be >> expected to have? >> >> I have a table that is partitioned by a string

Extremely slow throughput with dynamic partitions using Hive 0.8.1 in Amazon Elastic Mapreduce

2013-06-06 Thread Shaun Clowes
Hi All, Does anyone know the performance impact the dynamic partitions should be expected to have? I have a table that is partitioned by a string in the form '-MM'. When I insert into this table (from an external table that is just an S3 bucket containing gzipped logs) using dynamic partitio

UDFs and Thread Safety?

2013-03-10 Thread Shaun Clowes
Hi All, Could anyone describe what the required thread safety for a UDF is? I understand that one is instantiated for each use of the function in an expression, but can there be multiple threads executing the methods of a single UDF object at once? Thanks, Shaun