Re: Re: Does Hive 0.14.0 support registering a permanent function while the hive thrift server is running

2015-12-03 Thread Todd
I have figured out that hive supports this. On 2015-12-04 09:58:48, "Todd" wrote: Could someone help answer my question? Thanks. At 2015-12-03 19:12:29, "Todd" wrote: Hi, I am using Hive 0.14.0, and have the hive thrift server running. During its running, I would use “create function” to add

Re: Why are there two different stages for the same query when I use Hive on Spark

2015-12-03 Thread Jone Zhang
Thanks for your warning. The first query is a mapjoin and the second query is a reducejoin. The data format is all textInputFormat. I'll go learn more about mapjoin with hive on spark anyway, but why is stage 1 of the first query in the attachment so slow? Explain first query: hive (u_wsd)> explai
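For context, here is a minimal sketch of the properties that usually decide whether Hive plans a map join or a reduce-side join (big_t and small_t are hypothetical names, and the threshold value is illustrative):

    -- Let Hive convert eligible joins to map joins automatically.
    set hive.auto.convert.join=true;
    -- Threshold (bytes) on the combined size of the small tables;
    -- under this, the join is planned as a map join.
    set hive.auto.convert.join.noconditionaltask.size=10000000;
    -- EXPLAIN should then show a map-join operator instead of a shuffle join:
    explain
    select b.k, s.v
    from big_t b join small_t s on b.k = s.k;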

Cannot drop a table after creating an index and then renaming it to a different database

2015-12-03 Thread Toby Allsopp
Hi, a sequence of commands should make things clearer. I'm using the Hortonworks Sandbox VM with HDP 2.3. Connected to: Apache Hive (version 1.2.1.2.3.0.0-2557) Driver: Hive JDBC (version 1.2.1.2.3.0.0-2557) 0: jdbc:hive2://localhost:1> *create database db1;* No rows affected (0.997 seconds) 0
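The snippet is cut off, but the sequence appears to run along these lines; the following is a hedged reconstruction (the table and index names are illustrative, not from the original message):

    create database db1;
    create table db1.t1 (id int);
    create index idx1 on table db1.t1 (id) as 'COMPACT' with deferred rebuild;
    create database db2;
    alter table db1.t1 rename to db2.t1;
    drop table db2.t1;  -- reportedly this is the step that fails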

Re: Does Hive 0.14.0 support registering a permanent function while the hive thrift server is running

2015-12-03 Thread Todd
Could someone help answer my question? Thanks. At 2015-12-03 19:12:29, "Todd" wrote: Hi, I am using Hive 0.14.0, and have the hive thrift server running. During its running, I would use “create function” to add a permanent function. Does hive support this **without restarting** the hive thrift

Hive Support for Unicode languages

2015-12-03 Thread mahender bigdata
Hi Team, Does hive support Unicode encodings like UTF-8, UTF-16 and UTF-32? I would like to see different languages supported in hive tables. Is there any serde which can display Japanese and Chinese characters exactly rather than showing symbols on the Hive console? -Mahender
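If it helps, recent Hive versions let LazySimpleSerDe read text files in a named encoding; a minimal sketch (the table name and charset are illustrative, and correct display also depends on the console terminal's own encoding):

    create table jp_text (line string)
    row format serde 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
    with serdeproperties ('serialization.encoding'='SJIS');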

Storing the Hive Query Results into Variable

2015-12-03 Thread mahender bigdata
Hi, Is there an option available to store hive results into a variable, like select @i= count(*) from HiveTable? Or storing table results into a variable and making use of it at a later stage of the query. I tried using an HQL CTE but the scope of a CTE is limited to the next select only. Is there a way to intermediate
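Hive itself has no T-SQL-style session variables; a common workaround, sketched here with an illustrative table name, is to capture the result in the shell and feed it back as a substitution variable:

    # Capture the scalar result in the shell...
    cnt=$(hive -S -e "select count(*) from HiveTable")
    # ...then pass it into the next query via --hivevar (note the single
    # quotes so the shell does not expand ${hivevar:i} itself).
    hive --hivevar i="$cnt" -e 'select * from HiveTable limit ${hivevar:i}'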

RE: Any clue on this error, Exception in thread "main" java.lang.NoSuchFieldError: SPARK_RPC_CLIENT_CONNECT_TIMEOUT

2015-12-03 Thread Mich Talebzadeh
Thanks, I downloaded the one suggested. Unfortunately I get the following error when I try start-master.sh: hduser@rhes564::/home/hduser> start-master.sh starting org.apache.spark.deploy.master.Master, logging to /usr/lib/spark/sbin/../logs/spark-hduser-org.apache.spark.deploy.master.Master-1

Re: Any clue on this error, Exception in thread "main" java.lang.NoSuchFieldError: SPARK_RPC_CLIENT_CONNECT_TIMEOUT

2015-12-03 Thread Marcelo Vanzin
I spoke to Xuefu (Hive dev) and mentioned that this isn't really how it should be done. In the meantime, if you can, you should use a Spark package that does not include Hive classes. There used to be an explicit one for that, but I can't find it. For now, the tarball that says "pre-built

RE: Any clue on this error, Exception in thread "main" java.lang.NoSuchFieldError: SPARK_RPC_CLIENT_CONNECT_TIMEOUT

2015-12-03 Thread Mich Talebzadeh
Just noticed that the hive shell in 1.2.1 makes a reference to SPARK_HOME if it finds it:

    # add Spark assembly jar to the classpath
    if [[ -n "$SPARK_HOME" ]]
    then
      sparkAssemblyPath=`ls ${SPARK_HOME}/lib/spark-assembly-*.jar`
      CLASSPATH="${CLASSPATH}:${sparkAssemblyPath}"
    fi

Is thi

RE: query execution

2015-12-03 Thread Shirley Cohen
Thanks Sergey, this is exactly what I needed to know! Shirley From: Sergey Shelukhin [mailto:ser...@hortonworks.com] Sent: Thursday, December 03, 2015 1:28 PM To: user@hive.apache.org Subject: COMMERCIAL:Re: query execution If you are using Tez, you can set hive.tez.exec.print.summary=true; in C

RE: how to get counts as a byproduct of a query

2015-12-03 Thread Ryan Harris
I'm not sure that is going to give you what you are expecting since the counts will probably be based on the result of the join, not on the original tables. Try it and see, but remain skeptical until you validate the results. From: Frank Luo [mailto:j...@merkleinc.com] Sent: Thursday, December

RE: Any clue on this error, Exception in thread "main" java.lang.NoSuchFieldError: SPARK_RPC_CLIENT_CONNECT_TIMEOUT

2015-12-03 Thread Mich Talebzadeh
Hi, This is my CLASSPATH, which I have simplified, running with Hive 1.2.1 and a generic build of Spark 1.3:

    unset CLASSPATH
    CLASSPATH=$HADOOP_HOME/share/hadoop/common/hadoop-common-2.6.0-tests.jar:$HADOOP_HOME/share/hadoop/common/hadoop-common-2.6.0.jar:hadoop-nfs-2.6.0.jar:$HIVE_HOME/lib:${SPAR

Re: query execution

2015-12-03 Thread Sergey Shelukhin
If you are using Tez, you can set hive.tez.exec.print.summary=true; in CLI to see the breakdown. From: Shirley Cohen <shirley.co...@rackspace.com> Reply-To: "user@hive.apache.org" <user@hive.apache.org> Date: Thursday, December 3, 2015 at 08:06 To: "us
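In a session, Sergey's suggestion would look roughly like this (the table name is illustrative):

    set hive.execution.engine=tez;
    set hive.tez.exec.print.summary=true;
    -- After the query completes, a per-stage timing breakdown is printed.
    select count(*) from some_table;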

RE: Any clue on this error, Exception in thread "main" java.lang.NoSuchFieldError: SPARK_RPC_CLIENT_CONNECT_TIMEOUT

2015-12-03 Thread Mich Talebzadeh
Hi Marcelo. So this is the approach I am going to take: use Spark 1.3 pre-built; use Hive 1.2.1; do not copy anything over from the Spark 1.3 libraries into the Hive libraries; use Hadoop 2.6. There is no need to mess around with the libraries. I will try to unset my CLASSPATH and reset again and tr

Re: Any clue on this error, Exception in thread "main" java.lang.NoSuchFieldError: SPARK_RPC_CLIENT_CONNECT_TIMEOUT

2015-12-03 Thread Marcelo Vanzin
On Thu, Dec 3, 2015 at 10:32 AM, Mich Talebzadeh wrote: > hduser@rhes564::/usr/lib/spark/logs> hive --version > SLF4J: Found binding in > [jar:file:/usr/lib/spark/lib/spark-assembly-1.3.0-hadoop2.4.0.jar!/org/slf4j/impl/StaticLoggerBinder.class] As I suggested before, you have Spark's assembly i

RE: Any clue on this error, Exception in thread "main" java.lang.NoSuchFieldError: SPARK_RPC_CLIENT_CONNECT_TIMEOUT

2015-12-03 Thread Mich Talebzadeh
Hi, This is my stack for now: 1. Spark version 1.3, 2. Hive version 1.2.1, 3. Hadoop version 2.6. So I am using hive version 1.2.1: hduser@rhes564::/usr/lib/spark/logs> hive --version SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/u

Re: Any clue on this error, Exception in thread "main" java.lang.NoSuchFieldError: SPARK_RPC_CLIENT_CONNECT_TIMEOUT

2015-12-03 Thread Furcy Pin
The field SPARK_RPC_CLIENT_CONNECT_TIMEOUT seems to have been added to Hive in the 1.1.0 release https://github.com/apache/hive/blob/release-1.1.0/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java Are you using an older version of Hive somewhere? On Thu, Dec 3, 2015 at 7:15 PM, Mich Tal

RE: Any clue on this error, Exception in thread "main" java.lang.NoSuchFieldError: SPARK_RPC_CLIENT_CONNECT_TIMEOUT

2015-12-03 Thread Mich Talebzadeh
Thanks, I tried all of them :( I am trying to make Hive use Spark, and apparently Hive can use version 1.3 of Spark as its execution engine. Frankly I don't know why this is not working!

Re: Any clue on this error, Exception in thread "main" java.lang.NoSuchFieldError: SPARK_RPC_CLIENT_CONNECT_TIMEOUT

2015-12-03 Thread Marcelo Vanzin
(bcc: user@spark, since this is Hive code.) You're probably including unneeded Spark jars in Hive's classpath somehow: either the whole assembly or spark-hive, both of which contain Hive classes, in this case old versions that conflict with the version of Hive you're running. On
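A quick way to check for this is to see which jars on the classpath bundle their own copy of HiveConf, since an old copy inside a Spark assembly would shadow Hive 1.2.1's and produce exactly this NoSuchFieldError. A sketch, with illustrative paths:

    for j in /usr/lib/spark/lib/*.jar $HIVE_HOME/lib/*.jar; do
      if unzip -l "$j" 2>/dev/null | grep -q 'org/apache/hadoop/hive/conf/HiveConf.class'; then
        echo "$j"
      fi
    done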

Re: Any clue on this error, Exception in thread "main" java.lang.NoSuchFieldError: SPARK_RPC_CLIENT_CONNECT_TIMEOUT

2015-12-03 Thread Furcy Pin
Maybe you compiled and are running against different versions of Spark? On Thu, Dec 3, 2015 at 6:54 PM, Mich Talebzadeh wrote: > Trying to run Hive on Spark 1.3 engine, I get > > > > conf hive.spark.client.channel.log.level=null --conf > hive.spark.client.rpc.max.size=52428800 --conf > hive.spark.client.

Any clue on this error, Exception in thread "main" java.lang.NoSuchFieldError: SPARK_RPC_CLIENT_CONNECT_TIMEOUT

2015-12-03 Thread Mich Talebzadeh
Trying to run Hive on Spark 1.3 engine, I get conf hive.spark.client.channel.log.level=null --conf hive.spark.client.rpc.max.size=52428800 --conf hive.spark.client.rpc.threads=8 --conf hive.spark.client.secret.bits=256 15/12/03 17:53:18 [stderr-redir-1]: INFO client.SparkClientImpl: Spark asse

RE: how to get counts as a byproduct of a query

2015-12-03 Thread Frank Luo
Ryan, Thanks for your reply. Your previous response gave me some hints. I think the below will scan the tables just once: from table_A a join table_B b on a.X = b.X insert into table table_C select a.X, a.Y, b.Z insert overwrite table count_A select count(a.X) insert overwrite table count_B select count
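Laid out readably, the multi-insert in the snippet looks like the sketch below. The final select is cut off in the archive, so count(b.X) is an assumption, and Ryan's caveat still applies: both counts are computed over the join output, not the base tables.

    from table_A a
    join table_B b on a.X = b.X
    insert into table table_C
      select a.X, a.Y, b.Z
    insert overwrite table count_A
      select count(a.X)
    insert overwrite table count_B
      select count(b.X);  -- assumed; the original message is truncated here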

Re: How to register a permanent function while the hive thrift server is running

2015-12-03 Thread Alan Gates
No restart of the thrift service should be required. Alan. Todd December 3, 2015 at 3:12 Hi, I am using Hive 0.14.0, and have the hive thrift server running. During its running, I would use “create function” to add a permanent function. Does hive support this **without rest
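For reference, the statement Todd is asking about would look roughly like this (the function, class and jar names are illustrative):

    -- Register a permanent function from a running session; per Alan,
    -- no restart of the thrift server should be needed.
    create function mydb.my_upper
      as 'com.example.udf.MyUpper'
      using jar 'hdfs:///udfs/my-udfs.jar';
    -- Subsequent connections can then call it:
    select mydb.my_upper(name) from some_table;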

Hive jdbc 1.x not working with spark 1.5.1 in standalone mode

2015-12-03 Thread reena upadhyay
I am trying to execute a hive query using spark 1.5.1 in standalone mode and the hive 1.2.0 jdbc version. Here is my piece of code: private static final String HIVE_DRIVER = "org.apache.hive.jdbc.HiveDriver"; private static final String HIVE_CONNECTION_URL = "jdbc:hive2://localhost:1/idw"; private
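Cleaned up, the pattern in the snippet looks roughly like the sketch below. The port in the original is elided by the archive, so 10000 below is just the usual HiveServer2 default (an assumption); "idw" and the constant names are from the snippet, and the query/table are illustrative.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveJdbcExample {
        private static final String HIVE_DRIVER = "org.apache.hive.jdbc.HiveDriver";
        // Port 10000 is an assumption; the archive elides the real value.
        private static final String HIVE_CONNECTION_URL = "jdbc:hive2://localhost:10000/idw";

        public static void main(String[] args) throws Exception {
            Class.forName(HIVE_DRIVER);
            try (Connection conn = DriverManager.getConnection(HIVE_CONNECTION_URL, "", "");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("select count(*) from some_table")) {
                while (rs.next()) {
                    System.out.println(rs.getLong(1));
                }
            }
        }
    }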

query execution

2015-12-03 Thread Shirley Cohen
Hi, I want to characterize the overhead for each step of a Hive query. The explain output doesn't give me the actual execution times, so how would I find those out? Thanks in advance, Shirley

Re: Handling LZO files

2015-12-03 Thread Jörn Franke
Your Hive version is too old. You may also want to use another execution engine. I think your problem might then be related to external tables, for which the parameters you set probably do not apply. I once had the same problem, but I needed to change the block size on the Hadoop level (hdfs-site.

Re: Why there are two different stages on the same query when i use hive on spark.

2015-12-03 Thread Xuefu Zhang
Can you also attach the explain query result? What's your data format? --Xuefu On Thu, Dec 3, 2015 at 12:09 AM, Jone Zhang wrote: > Hive1.2.1 on Spark1.4.1 > > *The first query is:* > set mapred.reduce.tasks=100; > use u_wsd; > insert overwrite table t_sd_ucm_cominfo_incremental partition (ds=20151

Re: Building spark 1.3 from source code to work with Hive 1.2.1

2015-12-03 Thread Xuefu Zhang
Mich, To start your Spark standalone cluster, you can just download the tarball from the Spark repo site. In other words, you don't need to start your cluster using your own build. You only need to copy spark-assembly.jar to Hive's /lib directory and that's it. I guess you have been confused by this, which I
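In shell terms, Xuefu's suggestion amounts to something like this sketch (the paths are illustrative):

    # Copy the Spark assembly into Hive's lib directory; no custom Spark build needed.
    cp $SPARK_HOME/lib/spark-assembly-*.jar $HIVE_HOME/lib/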

Re: Handling LZO files

2015-12-03 Thread Harsha HN
Hi Franke, It's a 100+ node cluster. Roughly 2TB of memory and 1000+ vCores were available when I ran my job, so infrastructure is not a problem here. The Hive version is 0.13. As for ORC or Parquet, that would require us to load 5 years of LZO data in ORC or Parquet format. Though it might be performance efficient,

Re: Handling LZO files

2015-12-03 Thread Jörn Franke
How many nodes, cores and memory do you have? What hive version? Do you have the opportunity to use Tez as an execution engine? Usually I use external tables only for reading the data and inserting it into a table in ORC or Parquet format for doing analytics. This is much more performant than jso
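The read-once-then-convert pattern Jörn describes is, roughly (table names illustrative):

    -- Read through the external table once and persist into ORC for analytics.
    create table events_orc stored as orc
    as select * from events_ext_json;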

Handling LZO files

2015-12-03 Thread Harsha HN
Hi, We have LZO compressed JSON files in our HDFS locations. I am creating an "External" table on the data in HDFS for the purpose of analytics. There are 3 LZO compressed part files of sizes 229.16 MB, 705.79 MB and 157.61 MB respectively, along with their index files. When I run a count(*) query on t
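For what it's worth, an external table over indexed LZO text usually needs an input format that honors the .index files, so the larger parts can be split across mappers instead of each being read by one task. A hedged sketch (names and paths are illustrative, and it requires the hadoop-lzo libraries on the cluster):

    create external table events_ext_json (line string)
    stored as
      inputformat 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
      outputformat 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
    location '/data/events/';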

How to register a permanent function while the hive thrift server is running

2015-12-03 Thread Todd
Hi, I am using Hive 0.14.0, and have the hive thrift server running. During its running, I would use “create function” to add a permanent function. Does hive support this **without restarting** the hive thrift server, that is, after creating the function, I will be able to use the function when I conne

Building spark 1.3 from source code to work with Hive 1.2.1

2015-12-03 Thread Mich Talebzadeh
Hi, I have seen mails stating that users have managed to build Spark 1.3 to work with Hive. I tried Spark 1.5.2 but had no luck. I downloaded the Spark 1.3 source code (spark-1.3.0.tar) and built it as follows: ./make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-pr
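For comparison, the Hive-on-Spark getting-started docs of that era describe a "without hive" build along these lines; treat the exact profile list as an assumption and check the wiki for your Spark version:

    # Assumed profile set; verify against the Hive-on-Spark wiki for Spark 1.3.
    ./make-distribution.sh --name "hadoop2-without-hive" --tgz \
      "-Pyarn,hadoop-provided,hadoop-2.4"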

Why are there two different stages for the same query when I use Hive on Spark

2015-12-03 Thread Jone Zhang
Hive 1.2.1 on Spark 1.4.1. The first query is: set mapred.reduce.tasks=100; use u_wsd; insert overwrite table t_sd_ucm_cominfo_incremental partition (ds=20151202) select t1.uin,t1.clientip from (select uin,clientip from t_sd_ucm_cominfo_FinalResult where ds=20151202) t1 left outer join (select uin,