Spark SQL is not returning records for HIVE transactional tables on HDP

2016-03-12 Thread @Sanjiv Singh
Hi All, I am facing this issue on HDP setup on which COMPACTION is required only once for transactional tables to fetch records with Spark SQL. On the other hand, Apache setup doesn't required compaction even once. May be something got triggered on meta-store after compaction, Spark SQL start rec

Re: Spark SQL is not returning records for HIVE transactional tables on HDP

2016-03-12 Thread Mich Talebzadeh
This is an interesting one as it appears that a hive transactional table 1. Hive version 2 2. Hive on Spark engine 1.3.1 3. Spark 1.5.2 hive> create table default.foo(id int) clustered by (id) into 2 buckets STORED AS ORC TBLPROPERTIES ('transactional'='true'); hive> insert into defaul

Re: Spark SQL is not returning records for HIVE transactional tables on HDP

2016-03-12 Thread Mich Talebzadeh
Hi, Thanks for the input. I use Hive 2 and still have this issue. 1. Hive version 2 2. Hive on Spark engine 1.3.1 3. Spark 1.5.2 I have added Hive user group to this as well. So hopefully we may get some resolution. HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/pr

Query question.

2016-03-12 Thread ws
I need to re-write something like this to Hive: ###select  x.A, sum(0.5 * (select (count(*) from Table_X tx where tx.blah = 0.75)))from Table_X x### Thanks in advance for any pointer to it.

Re: Query question.

2016-03-12 Thread Mich Talebzadeh
That select (count(*) from Table_X tx where tx.blah = 0.75 will return a single value y and 0.5 * y will be a single value.. Where does sum come into it. It is normally applied to GROUP BY statements. Can you provide the original SQL if you have (Oracle, Sybase whatever)? HTH Dr Mich Talebzade

How to rename a hive table without changing location?

2016-03-12 Thread Rex X
Based on the Hive doc below: Rename Table *ALTER TABLE table_name RENAME TO new_table_name;* This statement lets you change the name of a table to a different name. *As of version 0.6, a rename on a managed table moves its HDFS location as well. (Older Hive versions just renamed the table in t

Re: How to rename a hive table without changing location?

2016-03-12 Thread Mich Talebzadeh
I do not see much point in renaming a table from A to B but still looking at files A for this table. What is the purpose of renaming the table but having the same file system? hive> *create table a (col1 int);*hive> *show create table a;*CREATE TABLE `a`( `col1` int) ROW FORMAT SERDE 'org.apac

how to create an array from two columns?

2016-03-12 Thread Rex X
How to make the following work? 1. combine columns A and B to make one array as a new column AB. Both column A and B are string types. select string_columnA, string_columnB, *array(string_columnA, string_columnB) *as AB from Table1; 2. append columnA to an existing array-type column B select

Re: How to rename a hive table without changing location?

2016-03-12 Thread Rex X
Hi Mich, I am doing this, because I need to update an existing big hive table, which can be stored in any arbitrary customized location on hdfs. But when we do Alter Table Rename, Hive will automatically move the files to the subdirectory of the corresponding database, /user/hive/warehouse/test.db

Re: How to rename a hive table without changing location?

2016-03-12 Thread Marcin Tustin
I you wish to keep it in its current location consider creating an external table. On Saturday, March 12, 2016, Rex X wrote: > Hi Mich, > > I am doing this, because I need to update an existing big hive table, > which can be stored in any arbitrary customized location on hdfs. But when > we do A

Re: how to create an array from two columns?

2016-03-12 Thread Chandeep Singh
If you only want the array while you’re querying table1 your example should work. If you want to add AB to the table you’ll probably need to create a new table by selecting everything you need from table1. hive> select * from table1 limit 1; OK temp1 temp2 temp3 hive> select f1, array(f2, f

Re: how to create an array from two columns?

2016-03-12 Thread Rex X
Thank you, Chandeep. Yes, my first problem solved. How about the second one? Is there any way to append an element to an existing array? On Sat, Mar 12, 2016 at 5:10 PM, Chandeep Singh wrote: > If you only want the array while you’re querying table1 your example > should work. If you want to a

Re: how to create an array from two columns?

2016-03-12 Thread Rex X
For the first question, is there any way to use "set" instead of an "array" to dedupe all elements? "select array(1,1)" will return "[1,1]", not "[1]". On Sat, Mar 12, 2016 at 5:26 PM, Rex X wrote: > Thank you, Chandeep. Yes, my first problem solved. > How about the second one? Is there any w

Re: how to create an array from two columns?

2016-03-12 Thread Chandeep Singh
Since data is stored in HDFS you have very limited scope to directly append. As a workaround you could get the contents of the original array by their index and then create a new array. This would only make sense if you know the number of elements in your array and it doesn’t change across rows

Re: how to create an array from two columns?

2016-03-12 Thread Chandeep Singh
Writing your own UDF is always an option :) > On Mar 13, 2016, at 1:46 AM, Chandeep Singh wrote: > > Since data is stored in HDFS you have very limited scope to directly append. > > As a workaround you could get the contents of the original array by their > index and then create a new array.

Re: how to create an array from two columns?

2016-03-12 Thread Chandeep Singh
Not sure if there is a function for that but I wrote a UDF to do so - https://github.com/chandeepsingh/Hive-UDFs hive> ADD JAR hive-udfs-1.0-uber.jar; Added [hive-udfs-1.0-uber.jar] to class path Added resources: [hive-udfs-1.0-uber.jar] hive> CREATE