Re: Issue uploading data to S3 with Hive

2012-10-01 Thread Florin Diaconeasa
Hello, I've met this issue several times before. The problem, from what I saw, is that Hive isn't actually aware of the underlying storage system (which is rather normal), as Hadoop should handle that. Also, Hadoop might get a 404 from Amazon WS (I guess in order for them to throttle) and simply st

Re: HIVE and S3 via EMR?

2012-05-29 Thread Florin Diaconeasa
Try using the ALTER TABLE ADD PARTITION syntax. On May 29, 2012, at 11:20 PM, Russell Jurney wrote: > How do I load data from S3 into Hive using Amazon EMR? I've booted a small > cluster, and I want to load a 3-column TSV file from Pig into a table like > this: > > create table from_to (from_

Re: Job Scheduling in Hadoop-Hive

2012-05-28 Thread Florin Diaconeasa
Spring Batch with a basic tasklet for querying the Hive DB should be of help. :) On 26.05.2012, at 17:48, Ronak Bhatt wrote: Hello - For those users whose setup is somewhat production, what do you use for job scheduling and dependency management? thanks, ronak

Re:

2011-11-24 Thread Florin Diaconeasa
> > If I set a property in the hive-site.xml, and wenn exetute "SET > > ;" in the shell I'm getting the property value that is > > in hive-default.xml file. > > > > 2011/11/23 shashwat shriparv : > >> What should add to add in hive-site.xml, where

Re: load data from s3 to hive

2011-11-23 Thread Florin Diaconeasa
Hello, First of all, Hadoop needs to use S3 as its primary file system. So inside the Hadoop configuration file core-site.xml you need to set fs.default.name to a value of the following form: s3n://your-bucket-name After this, the way I've done it in Hive 0.6, and I assume it still works: alter table my_table a

Re: Profiling Hive / Metrics

2011-11-23 Thread Florin Diaconeasa
Hello, Unfortunately there are no Hive metrics. Hadoop has metrics, but I'm not sure they would help you, as they are on a much lower level. Florin On 16 November 2011 21:43, john smith wrote: > Hey devs, > > My Hive reducers are running for too long. I want to profile Hive and > collect metrics

Re:

2011-11-23 Thread Florin Diaconeasa
Hi, How did you come to the conclusion that the hive-site.xml config is not taken into consideration? Florin On 22 November 2011 21:06, shashwat shriparv wrote: > I have these versions of Hive: 0.7.1, HBase: 0.90.4 and > Hadoop: 0.20.203.0rc1. > I have configured Hive, Hadoop and HBase, separat

Re: High number of input files problems

2011-11-03 Thread Florin Diaconeasa
ing. > > *From:* Florin Diaconeasa [mailto:florin.diacone...@gmail.com] > *Sent:* Monday, October 31, 2011 6:37 AM > *To:* user@hive.apache.org > *Subject:* High number of input files problems > > Hello, >

High number of input files problems

2011-10-31 Thread Florin Diaconeasa
Hello, Lately our user base has increased so the input files have increased considerably in size and number. One of our processing steps is doing a query of the form found at the end of the email. My problem is that apparently, sometimes, the processing misses some of the input files (for the 2nd

Re: Hive Dynamic Partions - How to avoid overwrite

2011-10-03 Thread Florin Diaconeasa
I would recommend doing the following SELECT: INSERT OVERWRITE INTO TABLE SELECT * FROM ( SELECT x,y,z FROM UNION ALL SELECT * FROM ) allTables; Obviously, there are rules that come with UNION ALL, such as needing to name (using aliases if necessary) all the columns of each SELECT. More on this

Re: Alter table Set Locations for all partitions

2011-08-20 Thread Florin Diaconeasa
Hello, Where do you keep your metadata? If it's a regular RDBMS, you could update the tables directly. The location is in the partitions table inside your metadata database. Florin On Aug 20, 2011, at 3:52 AM, Aggarwal, Vaibhav wrote: > You could also specify fully qualified hdfs path in the c

Re: java.lang.IllegalStateException when getTable

2011-07-24 Thread Florin Diaconeasa
Hi, Where do you store the metadata? Inside an RDBMS, like MySQL? I get this as well after a certain amount of time, because Hive tries to keep the same connection to MySQL that it had 24h ago (we run the cluster once per day). On Jul 22, 2011, at 5:55 PM, Hello World wrote: > when I run hive

Re: jets3t 0.7.4

2011-07-22 Thread Florin Diaconeasa
> hive 0.7.0+27.1-2~maverick-cdh3 and hadoop 0.20.2+923.21-1 > > -- > Wouter de Bie > Developer Business Intelligence, Spotify > wou...@spotify.com > +46 72 018 0777 > > On Thursday, July 21, 2011 at 9:05 PM, Florin Diaconeasa wrote: > > What hive version are you

Re: jets3t 0.7.4

2011-07-21 Thread Florin Diaconeasa
What Hive version are you using? On Jul 21, 2011, at 1:10 PM, Wouter de Bie wrote: > Hi guys, > > I'm just trying to upgrade to jets3t 0.7.4 from 0.6.1, because the > connection pool gets depleted after 20 requests. Now, I'm getting the > following stack trace when trying to access s3. Does a

Re: End User Clients for Hive?

2011-03-24 Thread Florin Diaconeasa
Hi, Sorry to burst in like this, but are there any plans for supporting prepared statements? Because it's a shame not having them. While it's true that they're not generally used (I presume), the Pentaho Report Designer could use them, for example :) -----