Hi,
We've been investigating some high DB load on our HMS server (version 2.3.9
on MySQL 5.7, Aurora 2.11.2). This seems to be due to MySQL spending time in
'Creating sort index' (filesort) on queries against the COLUMNS_V2 table.
After some digging we think we see the same thing as this ticket/PR tries
to solve: https://issues.a
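For context, a hypothetical way to confirm this from the MySQL side (column
names are from the standard metastore schema; the exact query DataNucleus
issues may differ):

  EXPLAIN SELECT COLUMN_NAME, TYPE_NAME, INTEGER_IDX
  FROM COLUMNS_V2
  WHERE CD_ID = 12345  -- hypothetical CD_ID
  ORDER BY INTEGER_IDX;

'Using filesort' in the EXPLAIN output (and 'Creating sort index' in the
processlist) would match what we are seeing, assuming the primary key is
(CD_ID, COLUMN_NAME) rather than (CD_ID, INTEGER_IDX).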
Hi Yussuf,
A Hive user here having the same issues.
I think the interface method just follows the same code path as an ALTER
TABLE query would.
My current thinking is that this safeguard was probably more useful in the
olden days of CSV files. With the more modern file formats like Avro, ORC
and
In some of the tools we use to interact with the metastore we've moved
away from long-running clients altogether; the Thrift protocol is best
served by just creating a new client per request. Try creating a new
client every time. They are fast to make.
The metastore clients are also not th
Hi
I'm struggling to override the
'hive.metastore.disallow.incompatible.col.type.changes' conf.
I've got a table (Parquet format) which needs some columns renamed/dropped
and structs changed. The Hive CLI doesn't have the option to drop columns,
so I'm going the Hive Thrift API route but keep getting the exce
That's only for the session: if you start up the Hive CLI, set some
params, and do a query, they will be used. Close the client and start up a
new Hive CLI session and you're back to defaults. To make it permanent
you'll have to change them in the hive-site.xml.
Hope that helps,
Patrick.
On Wed, 3 Jun.
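For example, in a CLI session (a sketch; table and column names are made
up):

  set hive.metastore.disallow.incompatible.col.type.changes=false;
  ALTER TABLE my_table CHANGE COLUMN old_col new_col string;  -- hypothetical rename

The set only lives for that session, as described above.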
Hi,
I've got a question: is doing a UNION DISTINCT on a column with a 'map'
type fully supported?
We've seen in Spark that it is not and Spark throws an exception.
Hive seems fine, but we were wondering if anyone ever had any issues with
this (we're on Hive 2.3.x).
Any pointers on where in the code
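For reference, a minimal sketch of the kind of query in question
(hypothetical tables t1 and t2, each with a column m of type
map<string,string>):

  SELECT id, m FROM t1
  UNION
  SELECT id, m FROM t2;

In Hive 2.x, UNION is UNION DISTINCT, so the engine has to compare rows
for equality, including the map column.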
hive-outofmemoryerror-heap-space/
>
> On Wed, Jan 8, 2020 at 11:38 AM Patrick Duin wrote:
>
>> The query is rather large; it won't tell you much (it's generated).
>>
>> It comes down to this:
>> WITH gold AS ( select * f
<rock...@gmail.com>:
> Could you please post your insert query snippet along with the SET
> statements ?
>
> On Wed, Jan 8, 2020 at 11:17 AM Patrick Duin wrote:
>
>> Hi,
>> I got a query that's producing about 3000 partitions which we load
Hi,
I got a query that's producing about 3000 partitions which we load
dynamically (on Hive 2.3.5).
At the end of this query (running on M/R, which runs fine) the M/R job
finishes and we see this on the Hive CLI:
Loading data to table my_db.temp__v1_2019_12_03_182627 partition
(c_date=null, c_ho
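If it helps, a sketch of the dynamic-partition settings such a load
typically needs (values are illustrative, not our exact config):

  set hive.exec.dynamic.partition=true;
  set hive.exec.dynamic.partition.mode=nonstrict;
  set hive.exec.max.dynamic.partitions=5000;        -- must cover the ~3000 partitions
  set hive.exec.max.dynamic.partitions.pernode=1000;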
set hive.map.aggr=false;
Worked for me. Slow and steady wins the race :)
Many thanks all!
Patrick
On Tue, 12 Mar 2019 at 03:23, Gopal Vijayaraghavan wrote:
>
> > I'll try the simplest query I can reduce it to with loads of memory and
> see if that gets anywhere. Other pointers are much appre
> regards
> Dev
>
>
> On Mon, Mar 11, 2019 at 9:21 PM Patrick Duin wrote:
>
>> Very good question. Yes, that does give the same problem.
>>
>> On Mon, 11 Mar 2019 at 16:28, Devopam Mittra wrote:
>>
>>> Can you please try doing SELECT DISTINCT *
Very good question. Yes, that does give the same problem.
On Mon, 11 Mar 2019 at 16:28, Devopam Mittra wrote:
> Can you please try doing SELECT DISTINCT * FROM DELTA into a physical
> table first ?
> regards
> Dev
>
>
> On Mon, Mar 11, 2019 at 7:59 PM Patrick Duin wrot
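For concreteness, a sketch of the suggestion (hypothetical table name;
materialize the deduplicated data first, then run the insert against it):

  CREATE TABLE delta_dedup AS
  SELECT DISTINCT * FROM table2 WHERE local_date='2019-01-01';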
Hi,
I'm running into an OOM issue trying to do a UNION ALL on a bunch of Avro
files.
The query is something like this:
with gold as ( select * from table1 where local_date='2019-01-01'),
delta as ( select * from table2 where local_date='2019-01-01')
insert overwrite table table3 PARTITION ('local_date'
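A reconstructed sketch of the full statement shape (the real generated
query is much larger, and the union-all branch is assumed from the
description above):

  with gold as (select * from table1 where local_date='2019-01-01'),
       delta as (select * from table2 where local_date='2019-01-01')
  insert overwrite table table3 partition (local_date)
  select * from (
    select * from gold
    union all
    select * from delta
  ) unioned;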
this benefits anyone else.
On Mon, 24 Sep 2018 at 18:22, Patrick Duin wrote:
> Hi all,
>
> I got a query doing an insert overwrite like this:
>
> WITH tbl1 AS (
> SELECT
>col0, col1, local_date, local_hour
> FROM tbl1
> WHERE
> ),
> tbl2
Hi all,
I got a query doing an insert overwrite like this:
WITH tbl1 AS (
SELECT
col0, col1, local_date, local_hour
FROM tbl1
WHERE
),
tbl2 AS (
SELECT col0, col1, local_date, local_hour
FROM tbl2
WHERE
)
INSERT OVERWRITE TABLE
tbl3
PARTITION (local_date, local_hour)
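A guess at how the truncated statement continues (a sketch only; the key
point is that with dynamic partitioning the partition columns must be the
last columns of the SELECT, in partition-spec order):

  SELECT col0, col1, local_date, local_hour
  FROM tbl1
  UNION ALL
  SELECT col0, col1, local_date, local_hour
  FROM tbl2;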
ta and
> you are creating a table with snappy compression, you need to use
> "CREATE TABLE new_compressed_table AS SELECT * FROM un_compressed_table"
> in order to actually compress the data
>
> Regards,
> Tanvi Thacker
>
> On Fri, Aug 10, 2018 at 6:30 AM Patrick
Hi,
I got some Hive tables in Parquet format and I am trying to find out how
best to enable compression.
Done a bit of searching and the information is a bit scattered, but I
found I can use this Hive property to enable compression. It needs to be
set before doing an insert.
set parquet.compressio
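The property and its use, for anyone searching later (SNAPPY as an example
codec; table names are made up):

  set parquet.compression=SNAPPY;
  INSERT OVERWRITE TABLE my_parquet_table
  SELECT * FROM my_source_table;  -- files written by this insert are Snappy-compressed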
Replying to myself as I found my issue: I hadn't updated the schema of my
partitions correctly, I'd only updated the table schema. The error went
away when I updated my partitions. All data, old and newly landed, was
queryable.
On Thu, 26 Jul 2018 at 11:22, Patrick Duin wrote
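In case it helps others, the fix amounts to something like this
(hypothetical names; CASCADE pushes the column change down to existing
partition metadata):

  ALTER TABLE my_table CHANGE COLUMN my_col my_col array<string> CASCADE;

Without CASCADE (or an explicit ALTER TABLE ... PARTITION (...) CHANGE
COLUMN per partition) only the table-level schema is updated.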
I'm encountering errors in Hive 2.3.2 when reading sets of Parquet files,
where the schema has evolved.
The error I'm seeing is :
Failed with exception java.io.IOException:java.lang.RuntimeException: Hive
internal error: conversion of string to array not supported yet.
My schema has a top-level co
Hi,
We've just open-sourced a library that we have been using internally at
Hotels.com for unit testing applications that use the Hive metastore
service. It's called BeeJU and is a set of JUnit rules that spin up (and
tear down) a Hive Metastore client using an in-memory database. If you
write any
Hi,
I've noticed the same thing; we set the table parameter as well to make
sure the table is External:
replica.putToParameters("EXTERNAL", "TRUE");
Not sure if the tableType is actually used anywhere; we set it anyway, as
well as the table parameter, just to be sure when using the Metastore API.
No
inline..
>
>
> On Mar 1, 2016, at 8:41 AM, Patrick Duin wrote:
>
> Hi Prasanth,
>
> Thanks for this. I tried out the configuration and I wanted to share some
> numbers with you.
>
> My test setup is a Cascading job that reads in 240 files (ranging from
> 1.5GB to
Would be good to know if other users have similar experiences.
Again thanks for your help.
Kind regards,
Patrick.
2016-02-29 6:38 GMT+00:00 Prasanth Jayachandran <
pjayachand...@hortonworks.com>:
> Hi Patrick
>
> Please find answers inline
>
> On Feb 26, 2016, at 9:36 AM
s will be read for split pruning.
>
> The default strategy does it automatically (choosing when to read
> footers and when not to). It is configurable as well.
>
> >
> > Thanks
> > Prasanth
> >
> >> On Feb 25, 2016, at 7:08 AM, Patrick Duin wrote
Hi,
We've recently moved one of our datasets to ORC and we use Cascading and
Hive to read this data. We've had problems reading the data via Cascading
because of the generation of splits.
We read in a large number of files (thousands) and they are about 1GB each.
We found that the split calculati
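One knob that is relevant here, for anyone hitting the same thing (a
sketch; strategy names as per the Hive docs):

  -- BI:     generate splits from file sizes only, without reading ORC footers
  -- ETL:    read footers and split on stripe boundaries
  -- HYBRID: pick one of the above automatically (the default)
  set hive.exec.orc.split.strategy=BI;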
wn from default 256KB. You can do that by setting
> orc.compress.size tblproperties.
>
> On Sep 24, 2015, at 3:27 AM, Patrick Duin wrote:
>
> Thanks for the reply,
> My first thought was out of memory as well but the illegal argument
> exception happens before; it is a separate entr
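Per the suggestion above, lowering the compression buffer via table
properties would look something like this (illustrative value, half the
256KB default):

  ALTER TABLE my_orc_table SET TBLPROPERTIES ('orc.compress.size'='131072');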
jayachand...@hortonworks.com>:
> Looks like you are running out of memory. Try increasing the heap
> memory or reducing the stripe size. How many columns are you writing? Any
> idea how many record writers are open per map task?
>
> - Prasanth
>
> On Sep 22, 2015, at 4:32 AM,
Hi all,
I am struggling to understand a stack trace I am getting while trying to
write an ORC file:
I am using hive-0.13.0/hadoop-2.4.0.
2015-09-21 09:15:44,603 INFO [main] org.apache.hadoop.mapred.MapTask:
Ignoring exception during close for
org.apache.hadoop.mapred.MapTask$NewDirectOutputColle
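Following the advice above, a sketch of the knobs involved (illustrative
values; smaller stripes mean less buffered memory per open writer):

  set hive.exec.orc.default.stripe.size=67108864;  -- 64MB, down from the 0.13 default of 256MB
  set mapreduce.map.java.opts=-Xmx4096m;           -- or give the map task more heap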