MERGE performances issue

2018-05-06 Thread Nicolas Paris
Hi, does anyone have any positive feedback on the Hive MERGE statement? There is some information in [1] and [2]. From my experience, merging a source table of 300M rows and 100 columns into a target of 1.5B is 100 times slower than doing an UPDATE and an INSERT. It is also slower than a third approach c

Re: MERGE performances issue

2018-05-09 Thread Nicolas Paris
2018-05-07 23:26 GMT+02:00 Gopal Vijayaraghavan : > > Then I am wondering if the merge statement is impracticable because > > of bad use of myself or because this feature is just not mature enough. > > Since you haven't mentioned a Hive version here, I'm going to assume > you're some variant of Hi

MOB support : Insert to hbase from Hive

2018-05-17 Thread Nicolas Paris

Re: MOB support : Insert to hbase from Hive

2018-05-17 Thread Nicolas Paris
My question is: does Hive allow inserting into MOB fields? Apparently not. I mean creating a Hive table pointing to an HBase table with a MOB field. 2018-05-17 14:43 GMT+02:00 Nicolas Paris : > >

Re: Cannot INSERT OVERWRITE on clustered table with > 8 buckets

2018-07-14 Thread Nicolas Paris
Hi Gopal Can you try running with (& see what your query read-perf looks like) > https://gist.github.com/t3rmin4t0r/087b61f79514673c307bb9a88327a4db > > CREATE TABLE IF NOT EXISTS passaggi1718 > ( > ... > ) > PARTITIONED BY (DATAPASSAGGIO string) > CLUSTERED BY (ORAPASSAGGIO) INTO

Re: Announce: MR3 0.3, and performance comparison with Hive-LLAP, Presto, Spark, Hive on Tez

2018-09-07 Thread Nicolas Paris
On Thu, Aug 16, 2018 at 10:55:19PM +0900, Sungwoo Park wrote: > The article compare the following six systems: Great article, as usual. It would have been great to also compare concurrent queries. In particular, I guess Presto performs best on that point. That metric is major since such technol

[feature request] auto-increment field in Hive

2018-09-15 Thread Nicolas Paris
Hi, Hive does not provide auto-increment columns (i.e. sequences). Is there any chance that feature will be provided in the future? This is currently one of the biggest limitations of Hive data warehousing as a replacement for an RDBMS. Thanks, -- nicolas

Re: [feature request] auto-increment field in Hive

2018-09-16 Thread Nicolas Paris
On Sat, Sep 15, 2018 at 09:38:01PM +, Vineet Garg wrote: > Not exactly sequence but an ability to generate unique numbers (with > limitation) is under development: > https://issues.apache.org/jira/browse/HIVE-20536 Unique numbers are sufficient for a sequence. However, the limitation looks hug

Re: [feature request] auto-increment field in Hive

2018-09-16 Thread Nicolas Paris
On Sat, Sep 15, 2018 at 09:19:12PM -0700, Gopal Vijayaraghavan wrote: > Since we added a sequence + locking in Hive ACID, there's a Surrogate > Key prototype (for Hive 3.0) Great. I did not mention I needed an ACID compliant sequence. > This is not an auto_increment key, but the numbering is for

CSV Serde Quote All

2018-11-12 Thread Nicolas Paris
Hi, the 'org.apache.hadoop.hive.serde2.OpenCSVSerde' is a simple and fast way to handle CSV tables in Hive. However, its behavior is to QUOTE ALL columns of the table, while quoting is only needed for VARCHAR/STRING columns that contain a separator. The problem I am facing is full quoted CSV are
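The difference between quoting every column and quoting only the fields that need it can be illustrated outside Hive. The sketch below uses Python's standard `csv` module, not the OpenCSVSerde itself; the helper name `render` and the sample row are made up for illustration.

```python
import csv
import io


def render(rows, quoting):
    """Serialize rows to CSV text with the given quoting policy (hypothetical helper)."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=quoting, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


# One sample row: an integer, a plain string, and a string containing the separator.
rows = [[1, "plain", "has,comma"]]

# QUOTE_ALL wraps every field in quotes, including the numeric one.
quote_all = render(rows, csv.QUOTE_ALL)

# QUOTE_MINIMAL only quotes fields that contain the separator, a quote or a newline.
quote_minimal = render(rows, csv.QUOTE_MINIMAL)

print(quote_all, end="")
print(quote_minimal, end="")
```

With full quoting, a downstream reader can no longer tell numeric columns from string columns by the presence of quotes, which is one reason minimal quoting is often preferred for exchange files.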

Re: Read Hive ACID tables in Spark or Pig

2019-03-09 Thread Nicolas Paris
Hi, > The issue is that outside readers don't understand which records in > the delta files are valid and which are not. Theoretically all this > is possible, as outside clients could get the valid transaction list > from the metastore and then read the files, but no one has done this > work. I g

Re: Read Hive ACID tables in Spark or Pig

2019-03-09 Thread Nicolas Paris
/issues.apache.org/jira/ > browse/HIVE-14035 This has design documents. I don't guarantee the > implementation completely matches the design, but you can at least get an idea > of the intent and follow the JIRA stream from there to see what was > implemented. > > Alan. >

UDF timestamp columns

2020-01-22 Thread Nicolas Paris
Hi, I cannot find a way to implement a Hive UDF that deals with the timestamp type. I tried both java.sql.Timestamp and org.apache.hadoop.hive.common.type.Timestamp without success. Is there any guidance? Thanks, -- nicolas

Re: UDF timestamp columns

2020-01-28 Thread Nicolas Paris
:51PM +, Shawn Weeks wrote: > Depending on what version of Hive you are looking for TimestampWritable or > one of it's related classes. > > Thanks > Shawn > > On 1/22/20, 6:51 AM, "Nicolas Paris" wrote: > > Hi > > I cannot

metastore bug when hive update spark table ?

2022-01-06 Thread Nicolas Paris
Hi there. I also posted this problem on the Spark list. I am not sure whether this is a Spark or a Hive metastore problem, or whether some metastore tuning configuration could serve as a workaround. Spark can't see Hive schema updates, partly because it stores the schema in a weird way in the Hive metastore. 1. FROM

Unsubscribe

2022-02-08 Thread Nicolas Paris
Sorry for the inconvenience, but I already sent 4 mails to user-unsubscr...@hive.apache.org and never received any confirmation mail. Something might be wrong on my side, although I just unsubscribed successfully from other Apache mailing lists.