Re: Any idea how to find out how long map reduce jobs are running?

2014-05-20 Thread Brad Ruderman
Wrap the sqoop job in a python subprocess. Run that process on a separate thread. On Tue, May 20, 2014 at 9:57 AM, wrote: > There is no specific map reduce program here. Actually sqoop jobs run > and internally that executes a map-reduce job. > > > > *From:* Brad Ruder
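
A minimal sketch of the suggestion above, assuming the Sqoop job is started from a Python script: the command runs in a subprocess, a worker thread times it, and a callback fires once if the job outlives a deadline. The names `run_with_deadline`, `run_in_background`, and `on_overrun` are hypothetical helpers for illustration, not part of Sqoop or Hadoop.

```python
import subprocess
import threading
import time


def run_with_deadline(cmd, deadline_seconds, on_overrun, poll_interval=0.1):
    """Run cmd in a subprocess and time it.

    Calls on_overrun(elapsed_seconds) once if the process is still
    running after deadline_seconds. Returns (return_code, elapsed_seconds).
    """
    start = time.monotonic()
    proc = subprocess.Popen(cmd)
    overrun_reported = False
    # Poll until the process exits, checking the deadline on each pass.
    while proc.poll() is None:
        elapsed = time.monotonic() - start
        if not overrun_reported and elapsed > deadline_seconds:
            on_overrun(elapsed)
            overrun_reported = True
        time.sleep(poll_interval)
    return proc.returncode, time.monotonic() - start


def run_in_background(cmd, deadline_seconds, on_overrun):
    """Start run_with_deadline on a separate thread.

    Returns (thread, result); result is filled in when the thread finishes.
    """
    result = {}

    def worker():
        code, elapsed = run_with_deadline(cmd, deadline_seconds, on_overrun)
        result["returncode"] = code
        result["elapsed"] = elapsed

    thread = threading.Thread(target=worker)
    thread.start()
    return thread, result


# Example (hypothetical command line; substitute the real sqoop invocation):
# thread, result = run_in_background(["sqoop", "import"], 3600, print)
# thread.join()
```

The callback only reports the overrun; whether to kill the job or just send an alert is left to the caller.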

Re: Any idea how to find out how long map reduce jobs are running?

2014-05-20 Thread Brad Ruderman
track how long a map-reduce job runs. If the > map reduce job runs after a specific time, certain actions should be taken. > > > > Have you ever created one? > > > > Thanks, > > Shouvanik > > > > *From:* Brad Ruderman [mailto:bruder...@radiumone.com] > *S

Re: Any idea how to find out how long map reduce jobs are running?

2014-05-20 Thread Brad Ruderman
Check jobtracker. Or you could do: mapred job -list Thanks, Brad On Tue, May 20, 2014 at 9:35 AM, wrote: > Hi, > > > > I need help in finding how long a map-reduce job runs. Any help is highly > appreciated. > > > > Thanks, > > Shouvanik > > -- > > This message i

Re: UDFs in Beeline w/ Hive 0.13.0

2014-04-30 Thread Brad Ruderman
Post your exact code. I think you might be using a reserved word perhaps? Thanks, Brad On Wed, Apr 30, 2014 at 11:32 AM, Bryan Jeffrey wrote: > Hello. > > We have a number of UDFs that were working under Hive 0.12.0. After > upgrade to Hive 13 we are seeing errors executing queries with UDFs.

Re: Problem adding jar using pyhs2

2014-04-28 Thread Brad Ruderman
that when "add jar file.jar" is run through pyhs2, the full > command gets passed to AddResourceProcessor.run(), yet > AddResourceProcessor.run() is written such that it only expects "jar > file.jar" to get passed to it. That's how it appears to work when > "

Re: Problem adding jar using pyhs2

2014-04-26 Thread Brad Ruderman
An easy solution would be to add the jar to the classpath or auxlibs therefore every instance of hive already has the jar and you just need to create the temporary function. Else you can put the JAR in HDFS and reference the add jar using the hdfs scheme. Example: import pyhs2 with pyhs2.connect

Re: Error in sasl_client_start (-4) SASL(-4): no mechanism available: No worthy mechs found

2014-04-25 Thread Brad Ruderman
I am guessing you are missing the plain kerb plugin. Try doing a: yum install cyrus-sasl-plain What auth are you using on HS2? Thanks, Brad On Fri, Apr 25, 2014 at 9:11 PM, Manish Maheshwari wrote: > Hi, > > I am using pyhs2 with HortonWorks Hadoop Image and am stuck at - > > >>> import pyhs2

Re: move hive tables from one cluster to another cluster

2014-02-28 Thread Brad Ruderman
Hi, just a warning about export/import between different Hive versions; see this bug: https://issues.apache.org/jira/browse/HIVE-5318 Thanks, Brad On Fri, Feb 28, 2014 at 1:55 PM, Edward Capriolo wrote: > Hive also has export import utilities. > > > https://cwiki.apache.org/confluence/disp

Re: hiveserver2 crashes now and then

2014-02-18 Thread Brad Ruderman
Hi Shouvanik- Can you send the hive server 2 logs? Also might want to reach out to the Accenture Tech lab's Data and Platforms group for client support as they should have in-depth experience with Hive Configuration. Mike Wendt would be a good contact. Thanks, Brad On Tue, Feb 18, 2014 at 3:18 P

Calling Community to Test Feature of pyhs2 on Kerberized Hive

2014-02-13 Thread Brad Ruderman
Hope all is well. I recently released a python wrapper around thrift for connecting to Hive Server 2. One of the big functionalities I was looking to implement was kerberos authentication support. Charith-qubit was gracious enough to modify the code and add the support. He has created a pull reque

Re: How can I just find out the physical location of a partitioned table in Hive

2014-02-06 Thread Brad Ruderman
desc extended "table name" Thanks, Brad On Thu, Feb 6, 2014 at 10:23 AM, Raj Hadoop wrote: > Hi, > > How can I just find out the physical location of a partitioned table in > Hive. > > Show partitions > > gives me just the partition column info. > > I want the location of the hdfs directory

Re: Hive Server 2 Python Client Drivers

2014-02-03 Thread Brad Ruderman
Client<https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-PythonClient> > > > Thanks for your contribution. > > -- Lefty > > > On Tue, Oct 29, 2013 at 12:55 AM, Lefty Leverenz > wrote: > >> When it's ready,

Re: Hive ODBC Error

2013-12-04 Thread Brad Ruderman
I have had much better luck with the Cloudera driver, especially since you are using the cloudera dist. Can you send the logs from /var/log/hive/hive-server2.out and hive-server2.log? Thanks! On Wed, Dec 4, 2013 at 8:26 AM, Joseph D Antoni wrote: > To all, > > I'm trying to connect Tableau to

Re: how to find number of elements in an array in Hive

2013-12-02 Thread Brad Ruderman
Check out size https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF Thanks, Brad On Mon, Dec 2, 2013 at 5:05 PM, Raj Hadoop wrote: > hi, > > how to find number of elements in an array in Hive table? > > thanks, > Raj > >

Re: External Partition Table

2013-10-31 Thread Brad Ruderman
But just wanted to know whether this is something > normal or not at all a normal thing. > > Thanks, > Raj > > > On Thursday, October 31, 2013 6:39 PM, Brad Ruderman < > bruder...@radiumone.com> wrote: > Wow that question won't be answerable. It all depend

Re: External Partition Table

2013-10-31 Thread Brad Ruderman
Wow that question won't be answerable. It all depends on the amount of data per partition and the queries you are going to be executing on it, as well as the structure of the data. In general in hive (depending on your cluster size) you need to balance the number of files with the size, smaller num

Re: request Hive wiki write access

2013-10-28 Thread Brad Ruderman
3rd as well. I would like to add information about hs2 client libraries (ruby,node,python). bradruder...@gmail.com Thanks, Brad On Mon, Oct 28, 2013 at 5:55 PM, Mikhail Antonov wrote: > Could you please also add me? olorinb...@gmail.com > > I wanted to add details about LDAP integration > > -M

Re: Hive Server 2 Python Client Drivers

2013-10-23 Thread Brad Ruderman
pache.org/confluence/display/Hive/HiveServer2+Clients > > and save some other poor saps from re-inventing the wheel. > > > > > > > > On Wed, Oct 23, 2013 at 2:42 PM, Brad Ruderman wrote: > >> Hi All- >> I have struggled for awhile with a simple and strai

Hive Server 2 Python Client Drivers

2013-10-23 Thread Brad Ruderman
Hi All- I have struggled for a while to find a simple and straightforward driver that I can use to connect to Hive Server 2 in much the same way as a MySQL driver in Python. I know there are a few ways, like using Thrift or ODBC, but all require a significant amount of installation. I decided to create

Re: how to make async call to hive

2013-09-29 Thread Brad Ruderman
Typically it would be your application that opens the process off the main thread. Hue (Beeswax specifically) does this and you can see the code here: https://github.com/cloudera/hue/tree/master/apps/beeswax Thx On Sun, Sep 29, 2013 at 5:15 PM, kentkong_work wrote: > ** > hi all, > just wonder if th

Re: Export/Import Table in Hive NPE

2013-09-19 Thread Brad Ruderman
Hi All- I have opened up a ticket for this issue: https://issues.apache.org/jira/browse/HIVE-5318 Can anyone repro to confirm it's a bug with Hive and not with a configuration within my instance? Thanks, Brad On Tue, Sep 17, 2013 at 2:22 PM, Brad Ruderman wrote: > Hi All- > I am try

Re: Export/Import Table in Hive NPE

2013-09-18 Thread Brad Ruderman
more 13/09/18 13:14:02 INFO ql.Driver: 13/09/18 13:14:02 INFO ql.Driver: 13/09/18 13:14:02 INFO ql.Driver: 13/09/18 13:14:02 INFO ql.Driver: 13/09/18 13:14:02 INFO ql.Driver: Thanks, Brad On Tue, Sep 17, 2013 at 2:22 PM, Brad Ruderman wrote: > Hi All- > I am trying to export a table i

Export/Import Table in Hive NPE

2013-09-17 Thread Brad Ruderman
Hi All- I am trying to export a table in Hive 0.9, then import it into Hive 0.10 staging. Essentially moving data from a production import to staging. I used the EXPORT table command, however when I try to import the table back into staging I receive the following (pulled from the hive.log file).

Re: Mappers per job per user

2013-08-30 Thread Brad Ruderman
er for scheduling the >> jobs rather than the default FIFO scheduler. >> >> Regards >> Ravi Magham >> >> >> On Fri, Aug 30, 2013 at 10:07 PM, Brad Ruderman >> wrote: >> >>> Hi All- >>> I was hoping to gather some insight in h

Mappers per job per user

2013-08-30 Thread Brad Ruderman
Hi All- I was hoping to gather some insight into how the hadoop (and/or hive) job scheduler distributes mappers per user. I am running into an issue where I see that hadoop (and/or hive) is evenly distributing mappers per user instead of per job. For example: -We have 1000 mapper capacity -10 Jobs a

Re: Multiple Insert with Where Clauses

2013-07-30 Thread Brad Ruderman
Hive doesn't support inserting a few records into a table. You will need to write a query to union your select and then insert. If you can partition, then you can insert a whole partition at a time instead of the table. Thanks, Brad On Tue, Jul 30, 2013 at 9:04 PM, Sha Liu wrote: > Yes for the

Re: Multiple Insert with Where Clauses

2013-07-30 Thread Brad Ruderman
Have you simply tried INSERT OVERWRITE TABLE destination SELECT col1, col2, col3 FROM source WHERE col4 = 'abc' Thanks! On Tue, Jul 30, 2013 at 8:25 PM, Sha Liu wrote: > Hi Hive Gurus, > > When using the Hive extension of multiple inserts, can we add Where > clauses for each Select statement

Re: Best Performance on Large Scale Join

2013-07-29 Thread Brad Ruderman
y chance you can partition data? are there any columns you have > on which you can create buckets? > > I have done joins having 10 billion records in one table but other table > was significantly smaller. and I had a 1000 node cluster at my disposal > > > > > > On Mon, J

Best Performance on Large Scale Join

2013-07-29 Thread Brad Ruderman
Hi All- I have 2 tables: CREATE TABLE users ( a bigint, b int ) CREATE TABLE products ( a bigint, c int ) Each table has about 8 billion records (roughly 2k files total mappers). I want to know the most performant way to do the following query: SELECT u.b, p.c, coun

Re: how to know a hive query failed

2013-07-16 Thread Brad Ruderman
one): if hive_exception is not None: hive_exception(stdout,stderr) else: if stderr != '': stderr = stderr.lower() if stderr.find('error') > -1 or stderr.find('failed') >-1: if stderr.find('log4j:error could not find value for key log4j.appender.fa') =

Re: how to know a hive query failed

2013-07-16 Thread Brad Ruderman
You need to stream and read the stderr and stdout for text messages alerting there is an error. In python this is what I use: On Tue, Jul 16, 2013 at 7:42 PM, kentkong_work wrote: > ** > hi, > I use a shell script to run hive query in background, like this > hive -e "select uname, age fr
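
A minimal sketch of the stderr check described in this thread, not the author's original code (which is truncated in this listing). It assumes the query is launched as a subprocess and treats the words "error" or "failed" on stderr as a failure, while skipping the log4j configuration notice mentioned in the follow-up message. The names `stderr_indicates_failure`, `run_and_check`, and `LOG4J_NOISE` are hypothetical.

```python
import subprocess

# Words on stderr that are treated as a sign of failure.
ERROR_MARKERS = ("error", "failed")

# Lines containing these lowercase patterns are ignored (log4j setup noise).
LOG4J_NOISE = ("log4j:error could not find value for key",)


def stderr_indicates_failure(stderr_text, ignore=()):
    """Return True if any non-ignored stderr line contains an error marker."""
    for line in stderr_text.lower().splitlines():
        if any(pattern in line for pattern in ignore):
            continue
        if any(marker in line for marker in ERROR_MARKERS):
            return True
    return False


def run_and_check(cmd):
    """Run cmd, capture its output, and return (failed, stdout, stderr)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    failed = proc.returncode != 0 or stderr_indicates_failure(
        proc.stderr, ignore=LOG4J_NOISE
    )
    return failed, proc.stdout, proc.stderr


# Example (the query text is a placeholder):
# failed, out, err = run_and_check(["hive", "-e", "select 1"])
```

Checking the return code as well as the text is an addition here; text matching alone can miss failures that print nothing recognizable.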