Re: OutOfMemoryError after loading lots of dynamic partitions

2020-01-08 Thread Suresh Kumar Sethuramaswamy
Thanks for the query and the Hive options. It looks like the JVM heap space for the Hive CLI is running out of memory, as described in the EMR documentation: https://aws.amazon.com/premiumsupport/knowledge-center/emr-hive-outofmemoryerror-heap-space/ On Wed, Jan 8, 2020 at 11:38 AM Patrick Duin wrote: > The q

set hive.optimize.insert.dest.volume

2020-01-08 Thread Jon Morisi
Hi group, I came across this post (https://randyzwitch.com/hive-five-hard-won-lessons/) which recommended the following for performance reasons with a CTAS statement: set hive.optimize.insert.dest.volume = true; Given that the post is 6 years old, and my system reports, "hive.optimize.insert.d

Re: OutOfMemoryError after loading lots of dynamic partitions

2020-01-08 Thread Patrick Duin
The query is rather large and won't tell you much (it's generated). It comes down to this: WITH gold AS ( select * from table1), delta AS (select * from table2) INSERT OVERWRITE TABLE my_db.temp__v1_2019_12_03_182627 PARTITION (`c_date`,`c_hour`,`c_b`,`c_p`) SELECT * FROM gold UNION DISTINCT

Re: OutOfMemoryError after loading lots of dynamic partitions

2020-01-08 Thread Suresh Kumar Sethuramaswamy
Could you please post your insert query snippet along with the SET statements? On Wed, Jan 8, 2020 at 11:17 AM Patrick Duin wrote: > Hi, > I got a query that's producing about 3000 partitions which we load > dynamically (On Hive 2.3.5). > At the end of this query (running on M/R which runs fine

OutOfMemoryError after loading lots of dynamic partitions

2020-01-08 Thread Patrick Duin
Hi, I have a query that produces about 3000 partitions, which we load dynamically (on Hive 2.3.5). At the end of this query (running on M/R, which runs fine) the M/R job is finished and we see this on the Hive CLI: Loading data to table my_db.temp__v1_2019_12_03_182627 partition (c_date=null, c_ho

Apache Iceberg integration

2020-01-08 Thread Elliot West
Hello, We're considering working on an integration of Apache Iceberg with Hive, initially so that the latest snapshot of Iceberg tables can be queried via Hive, but later to allow the writing of data using the Iceberg table format. I wanted to first check for the existence and status of any simil

Re: HIVE-2.4 release plans

2020-01-08 Thread Oleksiy S
Thanks for answering. It would be nice to have Hive-2.4.0, but the versioning is up to you. Waiting for new Hive! On Fri, Jan 3, 2020 at 11:31 AM Mass Dosage wrote: > +1 for this, or for a Hive 2.3.7 release. We are blocked from releasing > some of our projects which use Hive 2.3.x on Java >8 d