RE: Multiple Insert with Where Clauses

2013-07-30 Thread Sha Liu
Doesn't INSERT INTO do what you said? I'm not sure I understand "inserting a few records into a table". Anyway, the problem here seems different to me. In my case the WHERE clauses of the multiple inserts seem to have no effect, yet Hive doesn't complain about them. -Sha

Re: Multiple Insert with Where Clauses

2013-07-30 Thread Brad Ruderman
Hive doesn't support inserting a few records into a table. You will need to write a query that unions your selects and then insert the result. If you can partition, then you can insert a whole partition at a time instead of the whole table. Thanks, Brad
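A sketch of the union workaround Brad describes, for Hive versions without INSERT ... VALUES. The target table, column names, and the one-row helper table (one_row_table) are made up for illustration:

INSERT INTO TABLE target_table
SELECT id, name FROM (
  SELECT 1 AS id, 'abc' AS name FROM one_row_table
  UNION ALL
  SELECT 2 AS id, 'xyz' AS name FROM one_row_table
) t;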

RE: Multiple Insert with Where Clauses

2013-07-30 Thread Sha Liu
Yes, for the example you gave, it works. It even works when there is a single insert under the FROM clause, but when there are multiple inserts, the WHERE clauses no longer seem to be effective.

Re: Multiple Insert with Where Clauses

2013-07-30 Thread Brad Ruderman
Have you simply tried:
INSERT OVERWRITE TABLE destination
SELECT col1, col2, col3
FROM source
WHERE col4 = 'abc'
Thanks!

Multiple Insert with Where Clauses

2013-07-30 Thread Sha Liu
Hi Hive Gurus, When using the Hive extension of multiple inserts, can we add WHERE clauses for each SELECT statement, like the following?
FROM ...
INSERT OVERWRITE TABLE ... SELECT col1, col2, col3 WHERE col4='abc'
INSERT OVERWRITE TABLE ... SELECT col1, col4, col2 WHERE col3='xyz'
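For reference, a fully written-out form of the multi-insert being asked about, with each INSERT branch carrying its own WHERE clause. The source and target table names and columns are illustrative, not from the thread:

FROM source_table s
INSERT OVERWRITE TABLE target_abc
  SELECT s.col1, s.col2, s.col3 WHERE s.col4 = 'abc'
INSERT OVERWRITE TABLE target_xyz
  SELECT s.col1, s.col4, s.col2 WHERE s.col3 = 'xyz';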

Re: UDFs with package names

2013-07-30 Thread Edward Capriolo
It might be a better idea to use your own package com.mystuff.x. You might be running into an issue where Java is not finding the file because it assumes the relation between package and jar is 1 to 1. You might also be compiling it incorrectly: if your package is com.mystuff, that class file should be in a directory com/mystuff inside the jar.
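A minimal sketch of registering such a packaged UDF; the jar path, class name, and function name below are made up:

-- Inside the jar, the compiled class must sit at a path mirroring its package,
-- e.g. com/mystuff/MyLower.class for "package com.mystuff;".
ADD JAR /path/to/my-udfs.jar;
CREATE TEMPORARY FUNCTION my_lower AS 'com.mystuff.MyLower';
SELECT my_lower(name) FROM some_table;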

Re: Hive Join with distinct rows

2013-07-30 Thread Sunita Arvind
Thanks for sharing your experience, Marcin. Sunita

UDFs with package names

2013-07-30 Thread Michael Malak
Thus far, I've been able to create Hive UDFs, but now I need to define them within a Java package (as opposed to the "default" Java package I had been using). Once I do that, I'm no longer able to load them into Hive. First off, this works: add jar /usr/lib/hive/lib/hive-contrib-0.1

Review Request (wikidoc): LZO Compression in Hive

2013-07-30 Thread Sanjay Subramanian
Hi, I met with Lefty this afternoon and she was kind enough to spend time adding my documentation to the site, since I still don't have editing privileges :-) Please review the new wikidoc about LZO compression in the Hive language manual. If anything is unclear or needs more information, you can email

Re: Write access for the wiki

2013-07-30 Thread Ashutosh Chauhan
Done. Added you as a contributor. Happy documenting! Ashutosh

Re: Write access for the wiki

2013-07-30 Thread Mark Wagner
Yes, I created it right before emailing the list: https://cwiki.apache.org/confluence/display/~mwagner

Re: Write access for the wiki

2013-07-30 Thread Ashutosh Chauhan
Is that your cwiki id? I am not seeing it there. Remember, the cwiki account is separate from the JIRA account. Ashutosh

Re: Write access for the wiki

2013-07-30 Thread Mark Wagner
My id is mwagner. Thanks!

Re: Write access for the wiki

2013-07-30 Thread Ashutosh Chauhan
Mark, do you have an account on the Hive cwiki? What's your id? Thanks, Ashutosh

Write access for the wiki

2013-07-30 Thread Mark Wagner
Hi all, Would someone with the right permissions grant me write access to the Hive wiki? I'd like to update some information on the Avro SerDe. Thanks, Mark

Select statements return null

2013-07-30 Thread Sunita Arvind
Hi, I have written a script which generates JSON files, loads them into a dictionary, adds a few attributes, and uploads the modified files to HDFS. After the files are generated, if I perform a select * from ...; on the table which points to this location, I get "null, null" as the result. I also

Re: Prevent users from killing each other's jobs

2013-07-30 Thread Vinod Kumar Vavilapalli
That is correct. Seems like something else is happening. One thing to check is whether all your users, or more importantly their group, are added to the cluster-admin ACL (mapreduce.cluster.administrators). You should look at the mapreduce audit logs (which by default go into the JobTracker logs; search for "Audit").

Re: Prevent users from killing each other's jobs

2013-07-30 Thread pandees waran
Hi Mikhail, Could you please explain how we can track all the kill requests for a job? Is there any feature available in the Hadoop stack for this? Or do we need to track this at the OS layer by capturing the signals? Thanks, Pandeesh

Re: Prevent users from killing each other's jobs

2013-07-30 Thread Mikhail Antonov
In addition to using job ACLs you could have a more brutal scheme: track all requests to kill jobs, and if any request comes from a user who shouldn't be trying to kill that particular job, ssh from the script to his client machine and forcibly reboot it :)

Re: Prevent users from killing each other's jobs

2013-07-30 Thread Murat Odabasi
I'm not sure how I should do that. The documentation says "A job submitter can specify access control lists for viewing or modifying a job via the configuration properties mapreduce.job.acl-view-job and mapreduce.job.acl-modify-job respectively. By default, nobody is given access in these properties."

Re: Prevent users from killing each other's jobs

2013-07-30 Thread Edward Capriolo
Honestly, tell your users to stop being jerks. People know that if they kill my query there is going to be hell to pay :)

Re: Prevent users from killing each other's jobs

2013-07-30 Thread Vinod Kumar Vavilapalli
You need to set up Job ACLs. See http://hadoop.apache.org/docs/stable/mapred_tutorial.html#Job+Authorization. It is a per-job configuration, and you can provide defaults. If the job owner wishes to give others access, he/she can do so. Thanks, +Vinod Kumar Vavilapalli, Hortonworks Inc.
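A sketch of how those per-job ACLs might be supplied from a Hive session, assuming ACLs are already enabled cluster-wide (mapreduce.cluster.acls.enabled=true). The user and group names are placeholders, and the value format is "comma-separated users<space>comma-separated groups":

SET mapreduce.job.acl-modify-job=alice,bob analysts;
SET mapreduce.job.acl-view-job=*;
-- Queries run after these SETs launch MapReduce jobs carrying the ACLs,
-- so only the listed users/groups (plus the owner and cluster admins) can kill them.
SELECT COUNT(*) FROM some_table;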

Prevent users from killing each other's jobs

2013-07-30 Thread Murat Odabasi
Hi there, I am trying to introduce some sort of security to prevent different people using the cluster from interfering with each other's jobs. Following the instructions at http://hadoop.apache.org/docs/stable/cluster_setup.html and https://www.inkling.com/read/hadoop-definitive-guide-tom-white-

RE: Hive Join with distinct rows

2013-07-30 Thread Marcin Mejran
I've used a rank UDF for this previously: distribute and sort by the column, then select all rows where rank=1. That should work with a join, but I never tried it. It'd be an issue if the join outputs a lot of records that are then all dropped, since that'd slow down the query. I've actually forked
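A sketch of this dedup pattern written with the windowing functions built into Hive 0.11+ rather than a custom rank UDF; on older versions the same effect needs the DISTRIBUTE BY / SORT BY plus rank-UDF approach Marcin describes. Table and column names are made up:

-- Keep one row per key, choosing the latest record by an ordering column.
SELECT t.key_col, t.val_col
FROM (
  SELECT key_col, val_col,
         ROW_NUMBER() OVER (PARTITION BY key_col ORDER BY updated_at DESC) AS rn
  FROM source_table
) t
WHERE t.rn = 1;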

Hive Join with distinct rows

2013-07-30 Thread Sunita Arvind
Hi Praveen / All, I also have a requirement similar to the one explained (by Praveen) below: distinct rows on a single column with corresponding data from other columns. http://mail-archives.apache.org/mod_mbox/hive-user/201211.mbox/%3ccahmb8ta+r0h5a+armutookhkp8fxctown68qoz6lkfmwbrk...@mail.gmai

Re: Hive Metastore Server 0.9 Connection Reset and Connection Timeout errors

2013-07-30 Thread Nitin Pawar
The mentioned flow is called when you have the unsecure mode of the thrift metastore client-server connection, so one way to avoid this is to use a secure connection.
public boolean process(final TProtocol in, final TProtocol out) throws TException {
    setIpAddress(in);
    ...
@Override
protected void setIpAddress
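For context, a sketch of the hive-site.xml settings that put the metastore on the secure (SASL/Kerberos) path instead of the unsecure one above; this assumes a Kerberized cluster, and the keytab path and principal are placeholders:

<property>
  <name>hive.metastore.sasl.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hive.metastore.kerberos.keytab.file</name>
  <value>/etc/security/keytabs/hive.service.keytab</value>
</property>
<property>
  <name>hive.metastore.kerberos.principal</name>
  <value>hive/_HOST@EXAMPLE.COM</value>
</property>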

Re: PL/SQL to HiveQL translation

2013-07-30 Thread Jérôme Verdier
Hi, Thanks for this link, it was very helpful :-) I have another question: I have some HiveQL scripts which are stored in .hql files. What is the best way to execute these scripts from a Java/JDBC program? Thanks.
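One possible approach (a sketch, not the only way): read the .hql file, split it into statements, and run each through the Hive JDBC driver. The driver class shown is the HiveServer2 one (org.apache.hive.jdbc.HiveDriver); older HiveServer1 setups use org.apache.hadoop.hive.jdbc.HiveDriver with a jdbc:hive:// URL. The host, port, and file path are placeholders, and the semicolon split is naive (it ignores semicolons inside string literals):

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HqlRunner {
    public static void main(String[] args) throws Exception {
        // Read the whole .hql file into a single string.
        String script = new String(
                Files.readAllBytes(Paths.get("/path/to/script.hql")),
                StandardCharsets.UTF_8);

        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:hive2://localhost:10000/default", "user", "");
             Statement stmt = conn.createStatement()) {
            // Execute each non-empty statement in order.
            for (String sql : script.split(";")) {
                if (!sql.trim().isEmpty()) {
                    stmt.execute(sql.trim());
                }
            }
        }
    }
}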