subject:"amazon elastic mapreduce"

Re: Extremely slow throughput with dynamic partitions using Hive 0.8.1 in Amazon Elastic Mapreduce

2013-06-17 Thread Shaun Clowes

Thanks for following up, Ted. I couldn't work out why progress tracking was being forced on for dynamic partition inserts, so thanks for your helpful explanation. I'll raise a JIRA issue regarding the problem. Do you have any ideas for an alternate approach? I could have a go at implementing a fix

Re: Extremely slow throughput with dynamic partitions using Hive 0.8.1 in Amazon Elastic Mapreduce

2013-06-17 Thread Ted Xu

Hi Shaun, Your findings are valid. Hive uses Hadoop job counters to report fatal errors, so the client can kill the MapReduce job before it completes. With regard to your case, because Hive wants to kill the MapReduce job when there are too many partitions using Dynamic Partitioning, counters repor

Re: Extremely slow throughput with dynamic partitions using Hive 0.8.1 in Amazon Elastic Mapreduce

2013-06-06 Thread Shaun Clowes

Hi Ted, All, Unfortunately profiling turns out to be extremely slow, so it's not very fruitful for determining what's going on here. On the other hand I seem to have traced this problem down to the "hive.task.progress" configuration variable. When this is set to true (as it is automatically when

Re: Extremely slow throughput with dynamic partitions using Hive 0.8.1 in Amazon Elastic Mapreduce

2013-06-06 Thread Ted Xu

Hi Shaun, This is weird. I'm not sure whether some other factor (e.g., a very complex UDF?) caused this issue, but it would be best if you could profile the job and see whether there is a hot spot. On Thu, Jun 6, 2013 at 4:38 PM, Sh

Re: Extremely slow throughput with dynamic partitions using Hive 0.8.1 in Amazon Elastic Mapreduce

2013-06-06 Thread Shaun Clowes

Hi Ted, It's actually just one partition being created which is what makes it so weird. Thanks, Shaun On 6 June 2013 18:36, Ted Xu wrote: > Hi Shaun, > > Too many partitions in dynamic partitioning may slow down the mapreduce > job. Can you estimate how many partitions will be generated after

Re: Extremely slow throughput with dynamic partitions using Hive 0.8.1 in Amazon Elastic Mapreduce

2013-06-06 Thread Ted Xu

Hi Shaun, Too many partitions in dynamic partitioning may slow down the mapreduce job. Can you estimate how many partitions will be generated after insert? On Thu, Jun 6, 2013 at 4:24 PM, Shaun Clowes wrote: > Hi All, > > Does anyone know the performance impact the dynamic partitions should be

Extremely slow throughput with dynamic partitions using Hive 0.8.1 in Amazon Elastic Mapreduce

2013-06-06 Thread Shaun Clowes

Hi All, Does anyone know what performance impact dynamic partitions should be expected to have? I have a table that is partitioned by a string in the form '-MM'. When I insert into this table (from an external table that is just an S3 bucket containing gzipped logs) using dynamic partitio
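
For readers unfamiliar with the feature, here is a minimal sketch of the kind of dynamic partition insert the question describes. The table and column names (raw_logs, logs_by_month, log_month and the selected columns) are hypothetical and do not come from the thread. The two SET properties are standard Hive settings, shown here only as an assumption about how such a session might be configured.

```sql
-- Hypothetical table and column names; not taken from the original thread.
-- Enable dynamic partitioning for this session.
SET hive.exec.dynamic.partition=true;
-- nonstrict mode allows every partition column to be resolved dynamically.
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The dynamic partition column must come last in the SELECT list;
-- Hive routes each row to the partition named by that column's value.
INSERT OVERWRITE TABLE logs_by_month PARTITION (log_month)
SELECT request_time, url, status, log_month
FROM raw_logs;
```

Hive matches the partition column by position, not by name, so the last selected expression is what determines the target partition.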

Re: How to create/add Amazon Elastic Mapreduce Instances in VPC ?

2012-05-04 Thread Pedro Figueiredo

On 4 May 2012, at 14:10, Bhavesh Shah wrote: > Hello all, > I have Elastic Mapreduce instance. While executing hive job flow I needed > Subnet ID to access the VPC. > Is there any way to add/create the Amazon Elastic Mapreduce Instance in that > VPC? If you're using th

How to create/add Amazon Elastic Mapreduce Instances in VPC ?

2012-05-04 Thread Bhavesh Shah

Hello all, I have an Elastic Mapreduce instance. While executing a Hive job flow I needed a Subnet ID to access the VPC. Is there any way to add/create the Amazon Elastic Mapreduce instance in that VPC? -- Regards, Bhavesh Shah

Related to speed of execution of Job in Amazon Elastic Mapreduce

2012-05-03 Thread Bhavesh Shah

performance is very poor on my single local machine (it takes about 3 hours to execute completely). I want to reduce that time as much as possible. For that we have decided to use Amazon Elastic Mapreduce. Currently I am using 3 m1.large instances and still I have the same performance as on my local

Re: amazon elastic mapreduce

2011-12-11 Thread Aniket Mokashi

Hi, You have a couple of options to save your intermediate state: 1. If your metastore is HA, you can save your state in the metastore (e.g., alter table TBLPROPERTIES ("job.state", "DoneTill:121122")). 2. You can periodically save your state on EMR-local drives and upload it to S3. You can use any cust
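
Option 1 above might look like the following in HiveQL. This is a sketch only: the table name job_stats is hypothetical, and the property key and value are the ones from the message, rewritten in Hive's 'key' = 'value' form.

```sql
-- Hypothetical table name; the property key and value follow the message above.
ALTER TABLE job_stats SET TBLPROPERTIES ('job.state' = 'DoneTill:121122');

-- Table parameters, including job.state, are listed in the extended description.
DESCRIBE EXTENDED job_stats;
```

A restarted job could read the stored value back and resume from that point, which is why the message suggests this only when the metastore itself is highly available.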

amazon elastic mapreduce

2011-12-11 Thread Cam Bazz

Hello All, So I had a single-node pseudo cluster that has been calculating some statistics for me, running for a year. Finally it grew into more than a do-it-at-home task. So I have my data uploaded to S3, and I have configured everything so that I can load my tables, and load the partitions, and the data is

12 matches
