Hive User Group Meeting

2014-07-07 Thread Xuefu Zhang
Dear Hive users, Hive community is considering a user group meeting during Hadoop World that will be held in New York October 15-17th. To make this happen, your support is essential. First, I'm wondering if any user, especially those in New York area would be willing to host the meetup. Secondly,

Re: hiveserver2 0.12 and 0.13 incompatible?

2014-07-10 Thread Xuefu Zhang
Yeah. It's expected that a 0.13 client is not able to talk to the older server. However, the other direction is fine. That is, an old 0.12 client should be able to talk to a 0.13 server. --Xuefu On Thu, Jul 10, 2014 at 3:09 PM, Edward Capriolo wrote: > 2014-07-10 22:00:03 ERROR HiveConnection:425 - Error op

Re: beeline client

2014-07-11 Thread Xuefu Zhang
Chaudra, The difference you saw between Hive CLI and Beeline might indicate a bug. However, before drawing such a conclusion, could you give an example of your queries? Are the jobs you expect to run in parallel for a single query? Please note that your script file is executed line by line in either c

Re: Hive job scheduling

2014-07-11 Thread Xuefu Zhang
Or you can just run CRON tasks in your OS. On Thu, Jul 10, 2014 at 4:55 PM, moon soo Lee wrote: > for simpler use, Zeppelin (http://zeppelin-project.org) runs hive query > with web based editor, and it's got cron tab style scheduler. > > Best, > moon > > > On Fri, Jul 11, 2014 at 8:52 AM, Marti

Re: beeline client

2014-07-13 Thread Xuefu Zhang
ive –f script1.hql & > > Hive –f script2.hql & > > Hive –f script3.hql & > > Hive –f script4.hql & > > Hive –f script5.hql & > > > > > > *From:* Xuefu Zhang [mailto:xzh...@cloudera.com] > *Sent:* Friday, July

Re: Hive User Group Meeting

2014-07-25 Thread Xuefu Zhang
ingly. Many thanks to about.com for hosting the event. Also thanks go to Edward Capriolo and his company, HuffPost, as they also offered to host. Once we have a list of talks, I shall update you again. Thanks and have a nice weekend! --Xuefu On Mon, Jul 7, 2014 at 6:01 PM, Xuefu Zhang wrote: >

Re: Hive User Group Meeting

2014-08-26 Thread Xuefu Zhang
plan the meeting accordingly. Currently, we still have a few talk slots open. Please let me know if you're interested to give a talk. Regards, Xuefu On Mon, Jul 7, 2014 at 6:01 PM, Xuefu Zhang wrote: > Dear Hive users, > > Hive community is considering a user group meeting

Re: Hive Mapred local task distribution

2014-09-06 Thread Xuefu Zhang
By "same host", don't you mean your HiveServer2 host? One solution is to have multiple HiveServer2 instances and do load balance among them. --Xuefu On Fri, Sep 5, 2014 at 11:37 PM, Abhilash L L wrote: > Hello, > >We are using Hive 0.11 connecting to it via Hive Thrift server 2. > >A l

Re: Parquet Binary Column Support

2014-09-06 Thread Xuefu Zhang
I don't think there is any issue holding it back. The only issue is resources. We welcome effort from the community to move it forward. I'm willing to coach/review it. --Xuefu On Sat, Sep 6, 2014 at 8:18 AM, John Omernik wrote: > Greetings all - > > We really want to look into the Parquet file

Re: Hive Mapred local task distribution

2014-09-06 Thread Xuefu Zhang
this email are confidential. Please contact the > Sender if you have received this email in error. > > > > On Sat, Sep 6, 2014 at 7:53 PM, Xuefu Zhang wrote: > >> By "same host", don't you mean your HiveServer2 host? One solution is to >> have multiple HiveS

Re: Hive User Group Meeting

2014-09-10 Thread Xuefu Zhang
Hi all, I'm very excited as we are just about one month away from the meetup. Here is a list of talks that will be delivered in the coming Hive user group meeting. 1. Julian Hyde, cost-based optimization, Optiq, and materialized views 2. Xuefu Zhang, Hive on Spark 3. George Chow, Updates on

Re: [ANNOUNCE] New Hive Committer - Eugene Koifman

2014-09-12 Thread Xuefu Zhang
Congratulations, Eugene! --Xuefu On Fri, Sep 12, 2014 at 4:08 PM, Gunther Hagleitner < ghagleit...@hortonworks.com> wrote: > Congrats Eugene! > > Cheers, > Gunther. > > On Fri, Sep 12, 2014 at 3:47 PM, Prasanth Jayachandran < > pjayachand...@hortonworks.com> wrote: > >> Congrats Eugene! >> >> T

Re: Editing hive wiki?

2014-09-19 Thread Xuefu Zhang
Done for tispratik. BTW, @Lefty, I think the sun sets in west coast. :) On Fri, Sep 19, 2014 at 2:12 PM, pratik khadloya wrote: > Sorry, forgot to mention it previously. It is "tispratik". > > Thanks, > Pratik > > On Fri, Sep 19, 2014 at 2:02 PM, Lefty Leverenz > wrote: > >> What's your Conflu

Re: Editing hive wiki?

2014-09-19 Thread Xuefu Zhang
ct the Infra Team. > > Can this be enabled? > > ~Pratik > > On Fri, Sep 19, 2014 at 2:28 PM, Xuefu Zhang wrote: > >> Done for tispratik. >> >> BTW, @Lefty, I think the sun sets in west coast. :) >> >> On Fri, Sep 19, 2014 at 2:12 PM, pratik khadlo

Re: how to read array enclosed within square brackets

2014-09-22 Thread Xuefu Zhang
Hive doesn't know it needs to skip your square brackets, so your numbers are really "[1", "2", and "3]". "[1" and "3]" cannot be parsed as numbers, so they become null. I think you could interpret the second column as [1, 2, 3] of type string. Then you can remove the brackets, and use a UDF (write your own if the
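The behavior described in this reply can be illustrated with a minimal Python sketch. It is not Hive code and not the UDF the reply refers to: the function names are hypothetical, and Python's int() merely stands in for Hive's numeric parsing. It shows why splitting the raw text on commas leaves the first and last tokens unparseable, and how stripping the brackets first avoids that.

```python
def to_int_or_none(token):
    """Parse one token as an integer; return None when it is not a number."""
    try:
        return int(token.strip())
    except ValueError:
        return None


def naive_split(raw):
    """Split on commas without touching the brackets (mimics the failing case)."""
    return [to_int_or_none(token) for token in raw.split(",")]


def strip_brackets_then_split(raw):
    """Treat the column as a string, drop the enclosing brackets, then split."""
    inner = raw.strip()
    if inner.startswith("[") and inner.endswith("]"):
        inner = inner[1:-1]
    if not inner.strip():
        return []  # "[]" holds no elements
    return naive_split(inner)


print(naive_split("[1, 2, 3]"))                # first and last tokens keep a bracket
print(strip_brackets_then_split("[1, 2, 3]"))  # brackets removed before parsing
```

In Hive itself the equivalent steps would be done with string functions or a custom UDF, as the reply suggests; the sketch only demonstrates the order of operations.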

Hive User Group Meeting, Oct. 15th, New York

2014-10-02 Thread Xuefu Zhang
* Hive on Spark, by Xuefu Zhang * Cost-Based Optimization, Optiq, and Materialized Views, by Julian Hyde * Updates on Hive Thrift Protocol, by George Chow * What's new in Apache Sentry, by Prasad Mujumdar * Insert, update, and delete in Hive, by Owen O'Malley * Overcoming data wareh

Re: Variable set via query

2014-10-10 Thread Xuefu Zhang
This is more like macro than variable. Refer to https://github.com/myui/hivemall/blob/master/scripts/ddl/define-macros.hive for some examples/usage. --Xuefu On Fri, Oct 10, 2014 at 6:21 PM, Martin, Nick wrote: > Hi all, > > Wondering of its possible for me to set a variable via a query. Somethi

Re: Hive User Group Meeting, Oct. 15th, New York

2014-10-13 Thread Xuefu Zhang
36 AM, Xuefu Zhang wrote: > Dear Hive users/developers, > > The next Hive user group meeting is around the corner. It will be held on > Oct. 15th, from 6:30pm to 9:00pm at 1500 Broadway, 6th floor, New York, NY > 10036. This will be a great opportunity for vast users and developers

Update: Hive user group meeting tonight

2014-10-15 Thread Xuefu Zhang
Hi all, Quick update, you should be able to attend as long as you have an ID with you. (Sorry about the confusion.) For those willing to dial in, the info is on the meetup page. http://www.meetup.com/Hive-User-Group-Meeting/events/202007872/ Regards, Xuefu

Appreciation: Hive user group meeting NY

2014-10-15 Thread Xuefu Zhang
Hi all, About half an hour ago, the Hive user group meetup in NY concluded. It had 7 talks, and a lot of content was covered. On behalf of the Hive community, I'd like to thank those who gave talks or participated in the meetup. In particular, many thanks to about.com for hosting the event. I look forwar

Re: [ANNOUNCE] New Hive PMC Member - Alan Gates

2014-10-27 Thread Xuefu Zhang
Congratulations, Alan! On Mon, Oct 27, 2014 at 3:43 PM, Matthew McCline wrote: > Congratulations! > > On Mon, Oct 27, 2014 at 3:38 PM, Carl Steinbach wrote: > > > I am pleased to announce that Alan Gates has been elected to the Hive > > Project Management Committee. Please join me in congratula

Re: How to run TestCliDriver Unit Test

2014-10-27 Thread Xuefu Zhang
You need to run the command in the itests directory. --Xuefu On Mon, Oct 27, 2014 at 8:41 PM, Gordon Wang wrote: > + hive user group. :) > > -- Forwarded message -- > From: Gordon Wang > Date: Tue, Oct 28, 2014 at 11:22 AM > Subject: How to run TestCliDriver Unit Test > To: d...@hi

Re: maven build failure for hive with spark

2014-11-26 Thread Xuefu Zhang
Thanks for your interest. Please remove org/apache/spark in your local maven repo and try again. This may fix your problem. --Xuefu On Wed, Nov 26, 2014 at 1:03 AM, Somnath Pandeya < somnath_pand...@infosys.com> wrote: > Hi > > I am trying to build hive with spark engine but getting maven build

Re: can't get smallint field from hive on spark

2014-11-27 Thread Xuefu Zhang
Could you provide a test case to demonstrate the problem? Also the whole exception trace should be helpful. Thanks, Xuefu On Wed, Nov 26, 2014 at 11:20 PM, 诺铁 wrote: > we are currently using hive on spark, when reading a small int field, it > reports error: > Cannot get field 'i16Val' because u

Re: maven build failure for hive with spark

2014-11-27 Thread Xuefu Zhang
:570) > > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

Re: maven build failure for hive with spark

2014-11-28 Thread Xuefu Zhang
dead letters encountered. This logging can be turned > off or adjusted with configuration settings 'akka.log-dead-letters' and > 'akka.log-dead-letters-during-shutdown'. > > > > And query is failing > > Please let me know what can be issue. > > > > Thanks

Re: Job aborted due to stage failure

2014-12-01 Thread Xuefu Zhang
It seems that the wrong class, HiveInputFormat, is loaded. The stack trace is way off from the current Hive code. You need to build Spark 1.2 and copy the spark-assembly jar to Hive's lib directory, and that's it. --Xuefu On Mon, Dec 1, 2014 at 6:22 PM, yuemeng1 wrote: > hi,i built a hive on spark package and

Re: java.lang.NoClassDefFoundError: org/apache/spark/SparkJobInfo

2014-12-01 Thread Xuefu Zhang
I think your Spark cluster needs to be a build from the latest Spark-1.2 branch. You need to build it yourself. --Xuefu On Mon, Dec 1, 2014 at 7:59 PM, yuemeng1 wrote: > i get a spark-1.1.0-bin-hadoop2.4 from( > http://ec2-50-18-79-139.us-west-1.compute.amazonaws.com/data/) and > replace the Spark

Re: Job aborted due to stage failure

2014-12-01 Thread Xuefu Zhang
t's > spark-assembly-1.2.0-SNAPSHOT-hadoop2.4.0.jar produce error like that > > > On 2014/12/2 11:03, Xuefu Zhang wrote: > > It seems that wrong class, HiveInputFormat, is loaded. The stacktrace is > way off the current Hive code. You need to build Spark 1.2 and copy >

Re: Job aborted due to stage failure

2014-12-02 Thread Xuefu Zhang
or.java:603) > at java.lang.Thread.run(Thread.java:722) > > Driver stacktrace: > > > > i think my spark clusters did't had any problem,but why always give me > such error > > > > > > > > > > > > > > > > > > > >

Re: Job aborted due to stage failure

2014-12-02 Thread Xuefu Zhang
r; > select distinct st.sno,sname from student st join score sc > on(st.sno=sc.sno) where sc.cno IN(11,12,13) and st.sage > 28;(work in mr) > 4) > studdent.txt file > 1,rsh,27,female > 2,kupo,28,male > 3,astin,29,female > 4,beike,30,male > 5,aili,31,famle > > score.txt

Re: Job aborted due to stage failure

2014-12-03 Thread Xuefu Zhang
,first to use it ,then feel good or bad? > and if u need,i can add something to start document > > > thanks > yuemeng > > > > > > > On 2014/12/3 11:03, Xuefu Zhang wrote: > > When you build Spark, remove -Phive as well as -Pyarn. When you run hive > q

Re: spark worker nodes getting disassociated while running hive on spark

2015-01-05 Thread Xuefu Zhang
Hi Somnath, The error seems to have nothing to do with Hive. I haven't seen this problem, but I'm wondering if your cluster has any configuration issue, especially the timeout values for network communications. The default values worked fine for us. If the problem persists, please provide detailed i

Re: Hive create table line terminated by '\n'

2015-01-13 Thread Xuefu Zhang
Consider using a data format other than TEXT, such as sequence file. On Mon, Jan 12, 2015 at 10:54 PM, 王鹏飞 wrote: > Thank you,maybe i didn't express my question explicitly.I know the hive > create table clause,and there exists FIELDS TERMINATED BY etc. > For example,if i use FIELDS TERMINATED BY ' ,

Re: Set variable via query

2015-01-13 Thread Xuefu Zhang
select * from someothertable where dt IN (select max(dt) from sometable); On Tue, Jan 13, 2015 at 4:39 PM, Martin, Nick wrote: > Hi all, > > I'm looking to set a variable in Hive and use the resulting value in a > subsequent query. Something like: > > set startdt='select max(dt) from sometabl
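The subquery suggested in this reply can be tried out with a small runnable sketch. The assumptions are loud ones: SQLite (through Python's sqlite3 module) stands in for Hive, the table and column names come from the reply, and the `val` column and the sample rows are invented for illustration. It only shows that the IN subquery selects the rows carrying the latest `dt`, with no variable needed.

```python
import sqlite3

# In-memory SQLite database standing in for Hive (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (dt TEXT)")
conn.execute("CREATE TABLE someothertable (dt TEXT, val INTEGER)")

# Hypothetical sample rows.
conn.executemany("INSERT INTO sometable VALUES (?)",
                 [("2015-01-11",), ("2015-01-13",)])
conn.executemany("INSERT INTO someothertable VALUES (?, ?)",
                 [("2015-01-11", 1), ("2015-01-13", 2), ("2015-01-13", 3)])

# The query from the reply, with an ORDER BY added so the output is stable.
latest_rows = conn.execute(
    "SELECT * FROM someothertable "
    "WHERE dt IN (SELECT max(dt) FROM sometable) "
    "ORDER BY val"
).fetchall()

print(latest_rows)
```

Whether a given Hive release accepts an IN subquery in the WHERE clause depends on its version, so this is a sketch of the query shape rather than a guarantee about any particular installation.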

Re: how to determine the memory usage of select,join, in hive on spark?

2015-01-24 Thread Xuefu Zhang
Hi, Since you have only one worker, you should be able to use jmap to get a dump of the worker process. In Hive, you can configure the memory usage for join. As to the slowness and hive GC you observed, I'm thinking this might have to do with your query. Could you share it? Thanks, Xuefu On Thu

Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran

2015-01-28 Thread Xuefu Zhang
Congratulations to all! --Xuefu On Wed, Jan 28, 2015 at 1:15 PM, Carl Steinbach wrote: > I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen > O'Malley and Prasanth Jayachandran have been elected to the Hive Project > Management Committee. Please join me in congratulating th

Re: Union all with a field 'hard coded'

2015-01-30 Thread Xuefu Zhang
Use column alias: INSERT OVERWRITE TABLE all_dictionaries_ext SELECT name, id, category FROM dictionary UNION ALL SELECT NAME, ID, "CAMPAIGN" as category FROM md_campaigns On Fri, Jan 30, 2015 at 1:41 PM, Philippe Kernévez wrote: > Hi all, > > I would like to do union all with a fiel
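The column-alias approach in this reply can be checked with a short runnable sketch. As an assumption for illustration, SQLite (via Python's sqlite3 module) stands in for Hive; the table and column names come from the reply, while the sample rows are invented. The INSERT OVERWRITE part is left out because it is Hive-specific, and the literal is written with single quotes because SQLite treats double-quoted text as an identifier first.

```python
import sqlite3

# In-memory SQLite database standing in for Hive (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dictionary (name TEXT, id INTEGER, category TEXT)")
conn.execute("CREATE TABLE md_campaigns (name TEXT, id INTEGER)")

# Hypothetical sample rows.
conn.execute("INSERT INTO dictionary VALUES ('alpha', 1, 'GENERIC')")
conn.execute("INSERT INTO md_campaigns VALUES ('spring_sale', 7)")

# The hard-coded field gets an alias so both branches expose the same columns.
cursor = conn.execute(
    "SELECT name, id, category FROM dictionary "
    "UNION ALL "
    "SELECT name, id, 'CAMPAIGN' AS category FROM md_campaigns"
)
column_names = [description[0] for description in cursor.description]
union_rows = sorted(cursor.fetchall())

print(column_names)
print(union_rows)
```

The point of the alias is that every branch of the UNION ALL produces a column named `category`, which is what the reply recommends for the constant value.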

Re: Union all with a field 'hard coded'

2015-02-02 Thread Xuefu Zhang
ther query clauses? > > -- Lefty > > On Sun, Feb 1, 2015 at 11:27 AM, Philippe Kernévez > wrote: > >> Perfect. >> >> Thank you Xuefu. >> >> Philippe >> >> On Fri, Jan 30, 2015 at 11:32 PM, Xuefu Zhang >> wrote: >> >>

Re: Does Hive 1.0.0 still support commandline

2015-02-09 Thread Xuefu Zhang
There should be no confusion. While in 1.0 you can still use HiveCLI, you don't have the HiveCLI + HiveServer1 option. You will not be able to connect to HiveServer2 with HiveCLI. Thus, the clarification is: You can only use HiveCLI as a standalone application in 1.0. --Xuefu On Mon, Feb 9, 2015 at 9:17 AM

Re: Union all with a field 'hard coded'

2015-02-20 Thread Xuefu Zhang
> >> This is a part of standard SQL syntax, isn't it? >> >> On Mon, Feb 2, 2015 at 2:22 PM, Xuefu Zhang wrote: >> >>> Yes, I think it would be great if this can be documented. >>> >>> --Xuefu >>> >>> On Sun, Feb 1,

Re: Union all with a field 'hard coded'

2015-02-21 Thread Xuefu Zhang
anualUnion-ColumnAliasesforUNIONALL> > . > > Please review it one more time. > > -- Lefty > > On Fri, Feb 20, 2015 at 7:06 AM, Xuefu Zhang wrote: > >> Hi Lefty, >> >> The description seems good to me. I just slightly modified it so that it >> sounds mor

Re: Union all with a field 'hard coded'

2015-02-21 Thread Xuefu Zhang
> it. (Tech writing by successive approximation.) > > Thanks again. > > -- Lefty > > On Sat, Feb 21, 2015 at 6:27 AM, Xuefu Zhang wrote: > >> I haven't tried union distinct, but I assume the same rule applies. >> >> Thanks for putting it together. It look

Re: Bucket map join - reducers role

2015-02-27 Thread Xuefu Zhang
Could you post your query and "explain your_query" result? On Fri, Feb 27, 2015 at 5:32 AM, murali parimi < muralikrishna.par...@icloud.com> wrote: > Hello team, > > I have two tables A and B. A has 360Million rows with one column K. B has > around two billion rows with multiple columns includin

Re: error: Failed to create spark client. for hive on spark

2015-03-02 Thread Xuefu Zhang
Could you check your hive.log and spark.log for more detailed error message? Quick check though, do you have spark-assembly.jar in your hive lib folder? Thanks, Xuefu On Mon, Mar 2, 2015 at 5:14 AM, scwf wrote: > Hi all, > anyone met this error: HiveException(Failed to create spark client.) >

Re: Where does hive do sampling in order by ?

2015-03-02 Thread Xuefu Zhang
There is no sampling for ORDER BY in Hive. Hive uses a single reducer for ORDER BY (if you're talking about the MR execution engine). Hive on Spark is different in this respect, though. Thanks, Xuefu On Mon, Mar 2, 2015 at 2:17 AM, Jeff Zhang wrote: > Order by usually invoke 2 steps (sampling job and re

Re: error: Failed to create spark client. for hive on spark

2015-03-02 Thread Xuefu Zhang
pl.( > SparkClientImpl.java:96) > ... 26 more > Caused by: java.util.concurrent.TimeoutException: Timed out waiting for > client connection. > at org.apache.hive.spark.client.rpc.RpcServer$2.run(RpcServer. > java:134) > at io.netty.util.concurrent.PromiseTask$RunnableAdapter. > c

Re: [ANNOUNCE] Apache Hive 1.1.0 Released

2015-03-09 Thread Xuefu Zhang
Great job, guys! This is a major release with significant new features and improvements. Thanks to everyone who contributed to make this happen. Thanks, Xuefu On Sun, Mar 8, 2015 at 10:40 PM, Brock Noland wrote: > The Apache Hive team is proud to announce the release of Apache > Hive v

Re: Does any one know how to deploy a custom UDAF jar file in SparkSQL

2015-03-10 Thread Xuefu Zhang
This question seems more suitable to Spark community. FYI, this is Hive user list. On Tue, Mar 10, 2015 at 5:46 AM, shahab wrote: > Hi, > > Does any one know how to deploy a custom UDAF jar file in SparkSQL? Where > should i put the jar file so SparkSQL can pick it up and make it accessible > fo

Re: [hive building error] can't download pentaho-aggdesigner-algorithm-5.1.5-jhyde.pom

2015-03-10 Thread Xuefu Zhang
The AWS instance is down. We are working to restore it. Thanks, Xuefu On Tue, Mar 10, 2015 at 12:17 AM, wangzhenhua (G) wrote: > Hi, all, > > When I build hive source using Maven, it gets stuck in: > "Downloading: > http://ec2-50-18-79-139.us-west-1.compute.amazonaws.com/data/spark_2.10-1.3-r

Re: Hive on Spark

2015-03-13 Thread Xuefu Zhang
You need to copy the spark-assembly.jar to your hive/lib. Also, you can check hive.log to get more messages. On Fri, Mar 13, 2015 at 4:51 AM, Amith sha wrote: > Hi all, > > > Recently i have configured Spark 1.2.0 and my environment is hadoop > 2.6.0 hive 1.1.0 Here i have tried hive on Spark w

Re: Hive on Spark

2015-03-16 Thread Xuefu Zhang
482638193 end=1426482732205 duration=94012 > from=org.apache.hadoop.hive.ql.Driver> > 2015-03-16 10:42:12,205 INFO [main]: log.PerfLogger > (PerfLogger.java:PerfLogBegin(121)) - from=org.apache.hadoop.hive.ql.Driver> > 2015-03-16 10:42:12,544 INFO [main]: log.PerfLogger > (Perf

Re: [ANNOUNCE] New Hive Committers - Jimmy Xiang, Matt McCline, and Sergio Pena

2015-03-23 Thread Xuefu Zhang
Congratulations to all! --Xuefu On Mon, Mar 23, 2015 at 11:08 AM, Carl Steinbach wrote: > The Apache Hive PMC has voted to make Jimmy Xiang, Matt McCline, and > Sergio Pena committers on the Apache Hive Project. > > Please join me in congratulating Jimmy, Matt, and Sergio. > > Thanks. > > - Car

Re: merge small orc files

2015-04-20 Thread Xuefu Zhang
Also check hive.merge.size.per.task and hive.merge.smallfiles.avgsize. On Mon, Apr 20, 2015 at 8:29 AM, patcharee wrote: > Hi, > > How to set the configuration hive-site.xml to automatically merge small > orc file (output from mapreduce job) in hive 0.14 ? > > This is my current configuration> >
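As a minimal sketch of how the two properties named in this reply might be written into hive-site.xml, the Python snippet below renders them as `<property>` entries. The helper name is hypothetical and the byte values are illustrative placeholders, not tuning recommendations; only the two property names come from the reply.

```python
import xml.etree.ElementTree as ET


def render_hive_site(properties):
    """Render a dict of settings as a hive-site.xml style <configuration> string."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Property names are from the reply; the values are illustrative assumptions.
merge_settings = {
    "hive.merge.size.per.task": 256000000,
    "hive.merge.smallfiles.avgsize": 16000000,
}

xml_text = render_hive_site(merge_settings)
print(xml_text)
```

The same settings could also be applied per session with `set` commands at the Hive prompt instead of editing the site file.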

Re: Table Lock Manager: ZooKeeper cluster

2015-04-20 Thread Xuefu Zhang
I'm not a ZooKeeper expert, but ZooKeeper is supposed to be lightweight, high-performance, and fast to respond. Unless your ZooKeeper is already overloaded, I don't see why you would need a separate ZooKeeper cluster just for Hive. There are a few ZooKeeper usages in Hive, the add

Re: Too many connections from hive to zookeeper

2015-04-29 Thread Xuefu Zhang
This is a known issue and has been fixed in later releases. --Xuefu On Wed, Apr 29, 2015 at 7:44 PM, Shady Xu wrote: > Recently I found in the zookeeper log that there were too many client > connections and it was hive that was establishing more and more connections. > > I modified the max clie

Re: Repeated Hive start-up issues

2015-05-15 Thread Xuefu Zhang
Your namenode is in safe mode, as the exception shows. You need to verify/fix that before trying Hive. Secondly, "!=" may not work as expected. Try "<>" or other simpler query first. --Xuefu On Fri, May 15, 2015 at 6:17 AM, Anand Murali wrote: > Hi All: > > I have installed Hadoop-2.6, Hive 1.

Re: Hive on Spark VS Spark SQL

2015-05-20 Thread Xuefu Zhang
I have been working on Hive on Spark, and know a little about SparkSQL. Here are a few factors to be considered: 1. SparkSQL is similar to Shark (discontinued) in that it clones Hive's front end (parser and semantic analyzer) and metastore, and injects in between a layer where Hive's operator tre

Re: Hive on Spark VS Spark SQL

2015-05-22 Thread Xuefu Zhang
're not as mature as Hive. > What it depends on Hive for is Metastore, CliDriver, DDL parser, etc. > > Cheolsoo > > On Wed, May 20, 2015 at 10:45 AM, Xuefu Zhang wrote: > >> I have been working on Hive on Spark, and know a little about SparkSQL. >> Here ar

Re: Pointing SparkSQL to existing Hive Metadata with data file locations in HDFS

2015-05-27 Thread Xuefu Zhang
I'm afraid you're at the wrong community. You might have a better chance to get an answer in Spark community. Thanks, Xuefu On Wed, May 27, 2015 at 5:44 PM, Sanjay Subramanian < sanjaysubraman...@yahoo.com> wrote: > hey guys > > On the Hive/Hadoop ecosystem we have using Cloudera distribution CD

Hosting Hive User Group Meeting During Hadoop World NY

2015-06-10 Thread Xuefu Zhang
Dear Hive users, Hive community is considering a user group meeting during Hadoop World that will be held in New York at the end of September. To make this happen, your support is essential. First, I'm wondering if any user in New York area would be willing to host the meetup. Secondly, I'm solici

Re: Error using UNION ALL operator on tables of different storage format !!!

2015-06-18 Thread Xuefu Zhang
Sounds like a bug. However, could you reproduce with the latest Hive code? --Xuefu On Thu, Jun 18, 2015 at 8:56 PM, @Sanjiv Singh wrote: > Hi All > > I was trying to combine records of two tables using UNION ALL. > One table testTableText is on TEXT format and another table testTableORC > is on

Re: EXPORTing multiple partitions

2015-06-25 Thread Xuefu Zhang
Hi Brian, If you think that is useful, please feel free to create a JIRA requesting for it. Thanks, Xuefu On Thu, Jun 25, 2015 at 10:36 AM, Brian Jeltema < brian.jelt...@digitalenvoy.net> wrote: > Answering my own question: > > create table foo_copy like foo; > insert into foo_copy partitio

Re: Error: java.lang.RuntimeException: org.apache.hive.com/esotericsoftware.kryo.KryoException: Encountered unregistered class ID: 380

2015-07-16 Thread Xuefu Zhang
Same as https://issues.apache.org/jira/browse/HIVE-11269? On Thu, Jul 16, 2015 at 7:25 AM, Anupam sinha wrote: > Hi Guys, > > I am writing the simple hive query,Receiving the following error > intermittently. This error > presents itself for 30min-2hr then goes away. > > Appreciate your help to

Re: Obtain user identity in UDF

2015-07-27 Thread Xuefu Zhang
There is a UDF, current_user, which returns a value that can be passed to your UDF as an input, right? On Mon, Jul 27, 2015 at 1:13 PM, Adeel Qureshi wrote: > Is there a way to obtain user authentication information in a UDF like > kerberos username that they have logged in with to execute a hive q

Re: Computation timeout

2015-07-29 Thread Xuefu Zhang
Have you tried hive.server2.idle.operation.timeout? --Xuefu On Wed, Jul 29, 2015 at 5:52 AM, Loïc Chanel wrote: > Hi all, > > As I'm trying to build a secured and multi-tenant Hadoop cluster with > Hive, I am desperately trying to set a timeout to Hive requests. > My idea is that some users can

Re: Computation timeout

2015-07-29 Thread Xuefu Zhang
I thought the idea of infinite operation was not very >> compatible with the "idle" word (as the operation will not stop running), >> but I'll try :-) >> Thanks for the idea, >> >> >> Loïc >> >> Loïc CHANEL >> Engineering

Re: Computation timeout

2015-07-29 Thread Xuefu Zhang
your solution :) >> >> Thanks, >> >> >> Loïc >> >> Loïc CHANEL >> Engineering student at TELECOM Nancy >> Trainee at Worldline - Villeurbanne >> >> 2015-07-29 16:14 GMT+02:00 Xuefu Zhang : >> >>> Okay. To confirm, you set it to

Re: Request write access to the Hive wiki

2015-08-10 Thread Xuefu Zhang
Done! On Mon, Aug 10, 2015 at 1:05 AM, Xu, Cheng A wrote: > Hi, > > I’d like to have write access to the Hive wiki. My Confluence username is > cheng.a...@intel.com with Full Name “Ferdinand Xu”. Please help me deal > with it. Thank you! > > > > Regards, > > Ferdinand Xu > > >

Re: Request write access to the Hive wiki

2015-08-10 Thread Xuefu Zhang
s access > too? :) > > Thanks, > > On Mon, Aug 10, 2015 at 2:37 PM, Xuefu Zhang wrote: > >> Done! >> >> On Mon, Aug 10, 2015 at 1:05 AM, Xu, Cheng A >> wrote: >> >>> Hi, >>> >>> I’d like to have write access to the Hive wiki.

Re: HIVE:1.2, Query taking huge time

2015-08-20 Thread Xuefu Zhang
Please check out HIVE-11502. For your PoC, you can simply work around it by using other data types instead of double. On Thu, Aug 20, 2015 at 2:08 AM, Nishant Aggarwal wrote: > Thanks for the reply Noam. I have already tried the later point of > dividing the query. But the challenge comes during the jo

Re: Hive on Spark

2015-08-31 Thread Xuefu Zhang
What you described isn't part of the functionality of Hive on Spark. Rather, Spark is used here as a general-purpose engine similar to MR but without intermediate stages. It's batch-oriented. Keeping 100T of data in memory is hardly beneficial unless you know that that dataset is going to be used

Hive User Group Meeting Singapore

2015-08-31 Thread Xuefu Zhang
Dear Hive users, Hive community is considering a user group meeting during Hadoop World that will be held in Singapore [1] Dec 1-3, 2015. As I understand, this will be the first time that this meeting ever happens in Asia Pacific even though there is a large user base in that region. As another g

Re: Hive on Spark on Mesos

2015-09-09 Thread Xuefu Zhang
Mesos isn't supported for Hive on Spark. We have never attempted to run against it. --Xuefu On Wed, Sep 9, 2015 at 6:12 AM, John Omernik wrote: > In the docs for Hive on Spark, it appears to have instructions only for > Yarn. Will there be instructions or the ability to run hive on spark with

Re: [ANNOUNCE] New Hive PMC Chair - Ashutosh Chauhan

2015-09-16 Thread Xuefu Zhang
Congratulations, Ashutosh!. Well-deserved. Thanks to Carl also for the hard work in the past few years! --Xuefu On Wed, Sep 16, 2015 at 12:39 PM, Carl Steinbach wrote: > I am very happy to announce that Ashutosh Chauhan is taking over as the > new VP of the Apache Hive project. Ashutosh has be

Re: hive on spark query error

2015-09-25 Thread Xuefu Zhang
What's the value of spark.master in your case? The error specifically says something wrong with it. --Xuefu On Fri, Sep 25, 2015 at 9:18 AM, Garry Chen wrote: > Hi All, > > I am following > https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started? > To s

Re: Alias vs Assignment

2015-10-08 Thread Xuefu Zhang
It looks to me that this adds only syntactic sugar, which doesn't provide much additional value. On the contrary, it might even bring confusion to non-SQL-Server users. As you have already noted, it's not ISO standard. Writing queries this way actually makes them less portable. Personally I'd discour

Re: regarding hiveserver2 DeRegisterWatcher

2015-10-12 Thread Xuefu Zhang
Can you articulate further why HiveServer2 is not working in such an event? What's current behavior and what's expected from an end user's standpoint? Thanks, Xuefu On Mon, Oct 12, 2015 at 6:52 AM, Wangwenli wrote: > > now hiveserver2 has multiple instance register to zookeeper, if zookeeper >

Re: Hive and Spark on Windows

2015-10-19 Thread Xuefu Zhang
Hi Andres, We haven't tested Hive on Spark on Windows. However, if you can get Hive and Spark to work on Windows, I'd assume that the configuration is no different from on Linux. Let us know if you encounter any specific problems. Thanks, Xuefu On Mon, Oct 19, 2015 at 5:13 PM, Andrés Ivaldi wrot

Re: Hive and Spark on Windows

2015-10-20 Thread Xuefu Zhang
> Does Hive needs hadoop always? or there are some configuration missing? > > Thanks > > On Mon, Oct 19, 2015 at 11:31 PM, Xuefu Zhang wrote: > >> Hi Andres, >> >> We haven't tested Hive on Spark on Windows. However, if you can get Hive >> and Spark t

Re: Hive and Spark on Windows

2015-10-20 Thread Xuefu Zhang
at 11:46 AM, Xuefu Zhang wrote: > >> Yes. You need HADOOP_HOME, which tells Hive how to connect to HDFS and >> get its dependent libraries there. >> >> On Tue, Oct 20, 2015 at 7:36 AM, Andrés Ivaldi >> wrote: >> >>> I've already installed cyg

Re: Hive on Spark

2015-10-23 Thread Xuefu Zhang
quick answers: 1. you can pretty much set any spark configuration at hive using set command. 2. no. you have to make the call. On Thu, Oct 22, 2015 at 10:32 PM, Jone Zhang wrote: > 1.How can i set Storage Level when i use Hive on Spark? > 2.Do Spark have any intention of dynamically determine

Re: Hive on Spark

2015-10-23 Thread Xuefu Zhang
operties file in spark, > Spark provided "def persist(newLevel: StorageLevel)" > api only... > > 2015-10-23 19:03 GMT+08:00 Xuefu Zhang : > >> quick answers: >> 1. you can pretty much set any spark configuration at hive using set >> command. >> 2.

Re: Hive on Spark

2015-10-23 Thread Xuefu Zhang
er the limited > resources. > "15/10/23 17:37:13 Reporter WARN > org.apache.spark.deploy.yarn.YarnAllocator>> Container killed by YARN for > exceeding memory limits. 7.6 GB of 7.5 GB physical memory used. Consider > boosting spark.yarn.executor.memoryOverhead." >

Re: Hive on Spark NPE at org.apache.hadoop.hive.ql.io.HiveInputFormat

2015-11-02 Thread Xuefu Zhang
That msg could be just noise. On the other hand, there is NPE, which might be the problem you're having. Have you tried your query with MapReduce? On Sun, Nov 1, 2015 at 5:32 PM, Jagat Singh wrote: > One interesting message here , *No plan file found: * > > 15/11/01 23:55:36 INFO exec.Utilities:

Re: Hive on Spark NPE at org.apache.hadoop.hive.ql.io.HiveInputFormat

2015-11-03 Thread Xuefu Zhang
Singh wrote: > This is the virtual machine from Hortonworks. > > The query is this > > select count(*) from sample_07; > > It should run fine with MR. > > I am trying to run on Spark. > > > > > > > On Tue, Nov 3, 2015 at 4:39 PM, Xuefu Zhang wrote:

Re: Do you have more suggestions on when to use Hive on MapReduce or Hive on Spark?

2015-11-04 Thread Xuefu Zhang
Hi Jone, Thanks for trying Hive on Spark. I don't know about your cluster, so I cannot comment too much on your configurations. We do have a "Getting Started" guide [1] which you may refer to. (We are currently updating the document.) Your executor size (cores/memory) seems rather small and not al

Re: troubleshooting: "unread block data' error

2015-11-19 Thread Xuefu Zhang
Are you able to run queries that are not touching HBase? This problem was seen before but has been fixed. On Tue, Nov 17, 2015 at 3:37 AM, Sofia wrote: > Hello, > > I have configured Hive to work Spark. > > I have been trying to run a query on a Hive table managing an HBase table > (created via HBaseSto

Re: starting spark-shell throws /tmp/hive on HDFS should be writable error

2015-11-20 Thread Xuefu Zhang
This seems to belong on the Spark user list. I don't see any relevance to Hive except the directory containing the word "hive". --Xuefu On Fri, Nov 20, 2015 at 1:13 PM, Mich Talebzadeh wrote: > Hi, > > > > Has this been resolved. I don’t think this has anything to do with > /tmp/hive directory permissi

Re: Building Spark to use for Hive on Spark

2015-11-22 Thread Xuefu Zhang
Hive on Spark (HoS) is supposed to work with any version of Hive (1.1+) and a version of Spark w/o Hive. Thus, to make HoS work reliably and also simplify matters, I think it still makes sense to require that the spark-assembly jar not contain Hive jars. Otherwise, you have to make sure that your Hive version match

Re: Upgrading from Hive 0.14.0 to Hive 1.2.1

2015-11-24 Thread Xuefu Zhang
This upgrade should be no different from other upgrade. You can use Hive's schema tool to upgrade your existing metadata. Thanks, Xuefu On Tue, Nov 24, 2015 at 10:05 AM, Mich Talebzadeh wrote: > Hi, > > > > I would like to upgrade to Hive 1.2.1 as I understand one cannot deploy > Spark executio

Re: [ANNOUNCE] New PMC Member : John Pullokkaran

2015-11-24 Thread Xuefu Zhang
Congratulations, John! --Xuefu On Tue, Nov 24, 2015 at 3:01 PM, Prasanth J wrote: > Congratulations and Welcome John! > > Thanks > Prasanth > > On Nov 24, 2015, at 4:59 PM, Ashutosh Chauhan > wrote: > > On behalf of the Hive PMC I am delighted to announce John Pullokkaran is > joining Hive PMC

Re: Write access request to the Hive wiki

2015-11-25 Thread Xuefu Zhang
Hi Aihua, I just granted you write access to the Hive wiki. Let me know if the problem remains. Thanks, Xuefu On Wed, Nov 25, 2015 at 10:50 AM, Aihua Xu wrote: > I'd like to request write access to the Hive wiki to update some of the > docs. > > My Confluence user name is aihuaxu. > > Thanks! > Ai

Re: hive1.2.1 on spark connection time out

2015-11-25 Thread Xuefu Zhang
There are usually a few more messages before this but after "spark-submit" in hive.log. Do you have spark.home set? On Sun, Nov 22, 2015 at 10:17 PM, zhangjp wrote: > > I'm using hive1.2.1 . I want to run hive on spark model,but there is some > issues. > have been set spark.master=yarn-client; > spa

Answers to recent questions on Hive on Spark

2015-11-27 Thread Xuefu Zhang
Hi there, There seems to be increasing interest in Hive on Spark from Hive users. I understand that there have been a few questions or problems reported and I can see some frustration sometimes. It's impossible for the Hive on Spark team to respond to every inquiry even though we wish we could. Howeve

Re: Answers to recent questions on Hive on Spark

2015-11-27 Thread Xuefu Zhang
*Sybase ASE 15 Gold Medal Award 2008* > > A Winning Strategy: Running the most Critical Financial Data on ASE 15 > > > http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.pdf > > Author of the books* "A Practitioner’s Guide to Upgrading to Sybase ASE

Re: 答复: Answers to recent questions on Hive on Spark

2015-11-27 Thread Xuefu Zhang
, Wangwenli wrote: > Hi xuefu , > > > > thanks for the information. > > One simple question, *any plan when the hive on spark can be used in > production environment?* > > > > Regards > > wenli > > > > *发件人:* Xuefu Zhang [mailto:xzh...@

Re: Java heap space occured when the amount of data is very large with the same key on join sql

2015-11-28 Thread Xuefu Zhang
How much data are you dealing with, and how skewed is it? The code comes from Spark as far as I can see. To overcome the problem, you have a few things to try: 1. Increase executor memory. 2. Try Hive's skew join. 3. Rewrite your query. Thanks, Xuefu On Sat, Nov 28, 2015 at 12:37 AM, Jone Zhang wr

Re: Answers to recent questions on Hive on Spark

2015-11-28 Thread Xuefu Zhang
> parameter hive.spark.client.server.address > > > > Now I don’t seem to be able to set it up in hive-site.xml or as a set > parameter in hive prompt itself! > > > > Any hint would be appreciated or any work around? > > > > Regards, > > > >
