Persistent (and possibly asynchronous) Hive access from within Scala

2015-08-06 Thread Stephen Bly
What library should I use if I want to make persistent connections from within Scala/Java? I’m working on a web service that sends Hive queries to our HiveServer (we are about to upgrade to Hive 1.1 with Hive Server 2). Right now I’m using the Hive Driver for JDBC but that does not have the capa
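One common way to get non-blocking behaviour from a blocking JDBC driver is to run each call on a dedicated thread pool and hand back a future. The sketch below illustrates that general pattern only; it is not a Hive API and not necessarily what the thread settled on. The class name `AsyncRunner` and its methods are hypothetical, and it is written in Java so that it can be called from Scala as well.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper (not part of Hive): runs blocking calls, such as
// JDBC queries, on a fixed-size pool so the caller's thread is not blocked.
final class AsyncRunner {
    private final ExecutorService pool;

    AsyncRunner(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    // Wraps a blocking call in a CompletableFuture. Checked exceptions
    // (for example SQLException) surface as the cause of a CompletionException.
    <T> CompletableFuture<T> submit(Callable<T> blockingCall) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return blockingCall.call();
            } catch (Exception e) {
                throw new CompletionException(e);
            }
        }, pool);
    }

    // Stops accepting new work; already submitted calls still finish.
    void shutdown() {
        pool.shutdown();
    }
}
```

A caller would pass a lambda that borrows a connection (ideally from a connection pool, so that connections stay open between requests), runs the query, and maps the `ResultSet` to a plain value before returning. HiveServer2 JDBC URLs use the `jdbc:hive2://` scheme; the host, port, and database are deployment-specific.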

Re: NPE error during file sink stage when inserting into bucketed table

2015-08-06 Thread Jason Dere
Got a stack trace? Might help someone identify the issue. From: Muni Chada Sent: Thursday, August 06, 2015 4:12 PM To: user@hive.apache.org Subject: NPE error during file sink stage when inserting into bucketed table Hi, We are on hive 0.14 and running into NPE

Re: UDTF fails with java.lang.ClassCastException

2015-08-06 Thread Jason Dere
Would have to look at the UDTF to know for sure what is going on - it might not be using the object inspectors properly here. Is it using the ObjectInspectors that were passed in during initialize(), or is it creating a WritableStringObjectInspector and assuming this will work with the value ob

UDTF fails with java.lang.ClassCastException

2015-08-06 Thread Jim Green
Hi Team, One UDTF fails in Hive 1.0 with the stack trace below: Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazy.LazyString cannot be cast to org.apache.hadoop.io.Text at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveJavaOb

NPE error during file sink stage when inserting into bucketed table

2015-08-06 Thread Muni Chada
Hi, We are on Hive 0.14 and running into an NPE during the file sink stage when inserting into a bucketed table. Please advise on any workaround or patch info. Thanks, Muni

Re: Optimizing HiveServer2 data transfer rate

2015-08-06 Thread Alexander Zarei
Also, during profiling with the htop tool, I see that there is a main thread and many child threads. Only one child thread is using close to 80% CPU time, while the main thread, combining all threads, uses around 104% of the CPU time (multiple cores). I wonder if there is a wa

Optimizing HiveServer2 data transfer rate

2015-08-06 Thread Alexander Zarei
Hi, I am doing performance testing on the Hive ODBC driver and so far I have concluded that HiveServer2 is the bottleneck in my data pipe. I wonder if you could direct me to links and tips on how to optimize HiveServer2 so it transfers a large amount of data faster. I went ahead and profiled HiveSe

Is it worth storing in ORC for one time read. And can replace hive with HBase?

2015-08-06 Thread venkatesh b
Hi, I have two questions. Column sizes: in our Hive tables the size of each record is ordinary (around 20 columns, a mix of int columns and string columns of length 50 characters; no very long columns are present). FIRST: In our project we use Hive. We get new data daily. We need to proces

Is it worth of using ORC format in my case. Can I replace hive with HBase.

2015-08-06 Thread venkatesh b
Hi, I have two questions. FIRST: In our project we use Hive. We get new data daily. We need to process this new data only once and then send the processed data to an RDBMS. The processing mainly uses many complex queries with joins, WHERE conditions, and grouping functions. There are ma

Re: Hive on Tez much slower than MR

2015-08-06 Thread William Slacum
Hey Jörn, thanks for the response! Unfortunately I'm kinda stuck on the version I'm on. We do plan on moving to ORC at some point. I need to dig more into the implementation of how vectorized execution works. The documentation ( https://cwiki.apache.org/confluence/display/Hive/Vectorized+Query+Execu

Re: Perl-Hive connection

2015-08-06 Thread David Morel
You probably forgot to load (use) the module before calling new(). On 6 August 2015 8:49 AM, "siva kumar" wrote: > Hi David, > I have tried the link you have posted. But I'm stuck > with this error message below > > Can't locate object method "new" via package "Thrift::API::H

Using org.apache.hadoop.hive.ql.Driver to run join sql with hbase external table throw ClassNotFoundException

2015-08-06 Thread 杜宇軒
Hi all, I am using CDH-5.2.1 without Kerberos, hbase-0.98.6-cdh5.2.1, hive-0.13.1-cdh5.2.1. I use hive-cli to create a Hive external table: create external table table1(rowkey String, cfcol1 String) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.

Re: Hive on Tez much slower than MR

2015-08-06 Thread Jörn Franke
Always use the newest version of Hive. You should use ORC or Parquet wherever possible. If you use ORC then you should explicitly enable storage indexes and insert your data sorted (e.g., for the query below you would sort on x). Additionally you should enable statistics. Compression may bring addit