Hi Peter,

Just a quick update on my issue: the problem is fixed. It was caused by a rowid reset in my application. The default batch size of my VectorizedRowBatch (ORC) is 1024, and when that batch was reset, the rowid was reset along with it. It now works as expected.
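For reference, here is a minimal sketch of the fix: the rowid counter has to live outside the batch so that batch.reset() cannot touch it. It uses the plain ORC core Java API; the class name, schema, output path and the 5000-row loop are made up for illustration, this is not our actual Flink code.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
    import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
    import org.apache.orc.OrcFile;
    import org.apache.orc.TypeDescription;
    import org.apache.orc.Writer;

    public class RowIdAcrossBatches {
        public static void main(String[] args) throws Exception {
            TypeDescription schema =
                TypeDescription.fromString("struct<rowId:bigint,cle:bigint>");
            Writer writer = OrcFile.createWriter(
                new Path("/tmp/bucket_00000_example"),
                OrcFile.writerOptions(new Configuration()).setSchema(schema));
            // createRowBatch() defaults to a maximum of 1024 rows per batch.
            VectorizedRowBatch batch = schema.createRowBatch();
            LongColumnVector rowIdCol = (LongColumnVector) batch.cols[0];
            LongColumnVector cleCol = (LongColumnVector) batch.cols[1];

            long rowId = 0; // lives outside the batch, so it survives batch.reset()
            for (long cle = 0; cle < 5000; cle++) {
                int row = batch.size++;
                rowIdCol.vector[row] = rowId++; // keeps increasing across batches
                cleCol.vector[row] = cle;
                if (batch.size == batch.getMaxSize()) {
                    writer.addRowBatch(batch);
                    batch.reset(); // clears the batch; rowId keeps its value
                }
            }
            if (batch.size != 0) {
                writer.addRowBatch(batch);
            }
            writer.close();
        }
    }

Our bug was the equivalent of resetting the rowId counter together with the batch, so the ids started again from 0 every 1024 rows.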
Thanks,
David

On Thu, Feb 6, 2020 at 12:12, David Morin <morin.david....@gmail.com> wrote:
> OK, Peter, no problem. Thanks, I'll keep you posted.
>
> On 2020/02/06 09:42:39, Peter Vary <pv...@cloudera.com> wrote:
> > Hi David,
> >
> > I am more familiar with ACID v2 :(
> > What I would do is run an update operation with your version of Hive and see how it handles this case.
> >
> > It would be nice to hear back from you if you find something.
> >
> > Thanks,
> > Peter
> >
> > On Feb 5, 2020, at 16:55, David Morin <morin.david....@gmail.com> wrote:
> > >
> > > Hello,
> > >
> > > Thanks. In fact I use HDP 2.6.5, i.e. an older ORC version, with transactionid for example and the update flag.
> > > Sorry, with the row__id it would have been easier.
> > >
> > > So, here is the content of the ORC files (with hive --orcfiledump):
> > > hive --orcfiledump hdfs://XXXX/delta_0198994_0198994_0000/bucket_00000
> > > {"operation":0,"originalTransaction":198994,"bucket":0,"rowId":14,"currentTransaction":198994,"row":{...}}
> > >
> > > hive --orcfiledump hdfs://XXXX/delta_0199073_0199073_0000/bucket_00000
> > > {"operation":1,"originalTransaction":198994,"bucket":0,"rowId":14,"currentTransaction":199073,"row":{...}}
> > >
> > > => When I execute a SELECT statement on the PK, I see 2 records: the link between them is not made.
> > > Still, the metadata looks good.
> > > The only difference I see is that I mix operations in the ORC files.
> > > When I perform a MERGE statement, for example, 2 directories are created:
> > > hdfs://XXXX/delta_0199073_0199073_0000
> > > hdfs://XXXX/delta_0199073_0199073_0002
> > > The first one contains updates (operation:1) and the second one inserts (operation:0).
> > >
> > > And here are the row__id values:
> > >
> > > (related file: hdfs://XXXX/delta_0198994_0198994_0000/bucket_00000)
> > > row__id {"transactionid":198994,"bucketid":0,"rowid":14}
> > > ts 1580751316822642886 (5h35)
> > > id 764925
> > >
> > > (related file: hdfs://XXXX/delta_0199073_0199073_0000/bucket_00000)
> > > row__id {"transactionid":198994,"bucketid":0,"rowid":14}
> > > ts 1580757191368914630 (7h13)
> > > id 764925
> > >
> > > => And I get one duplicated value if I execute a query based on the id (primary key).
> > >
> > > So I don't understand, because transactionid = originalTransaction and all the metadata seems to be good.
> > > Probably a problem in the sort, but I follow the rule that rows are ordered by (originalTransaction, bucketId, rowId) ascending and by currentTransaction descending. It works pretty well, except for some tables with a lot of updates.
> > > The only thing I can see at the moment is that I mix different types of operations in one bucket; the MERGE query, for example, creates different directories (one per operation).
> > >
> > > David
> > >
> > > On 2020/02/05 12:22:28, Peter Vary <pv...@cloudera.com> wrote:
> > >> Hi David,
> > >>
> > >> There is no tombstone for the updated record.
> > >> In ACID v2 there is no update operation for rows, only insert and delete, so an update is handled as a delete of the (old) row plus an insert of the (new, independent) row.
> > >> The deletes are stored in the delete delta directories, and those files do not have to contain the {row} struct at the end.
> > >>
> > >> Hope this helps,
> > >> Peter
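As a toy model of the delete-plus-insert behaviour Peter describes (my own illustration, not Hive's reader; the class and record names are made up, the keys reuse transaction ids from the dumps further down in this thread, and the payloads are invented):

    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.Set;

    public class AcidV2UpdateModel {
        // (originalTransaction, bucket, rowId) identifies one row version.
        record RowKey(long originalTransaction, int bucket, long rowId) {}

        public static void main(String[] args) {
            // Insert events: the original row and its re-inserted (updated) version.
            Map<RowKey, String> inserts = new LinkedHashMap<>();
            inserts.put(new RowKey(11365, 3, 0), "cle=5218, old version");
            inserts.put(new RowKey(11369, 3, 0), "cle=5218, new version");

            // The delete delta produced by the update masks the old version only.
            Set<RowKey> deletes = Set.of(new RowKey(11365, 3, 0));

            // Merge-on-read: a row version is visible unless a delete masks its key.
            inserts.forEach((key, row) -> {
                if (!deletes.contains(key)) {
                    System.out.println(key + " -> " + row); // only the new version survives
                }
            });
        }
    }

If the writer emits a delete key that does not match the inserted version exactly, the old row is never masked and a SELECT sees both versions, which is the duplicate symptom discussed below.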
> > >>> On Feb 5, 2020, at 09:39, David Morin <morin.david....@gmail.com> wrote:
> > >>>
> > >>> Hi,
> > >>>
> > >>> It works pretty well, but... still, problems sometimes occur.
> > >>> Do we have to separate operations?
> > >>>
> > >>> Here is the content of the ORC files:
> > >>> hive --orcfiledump hdfs://XXXX/delta_0198994_0198994_0000/bucket_00000
> > >>> {"operation":0,"originalTransaction":198994,"bucket":0,"rowId":14,"currentTransaction":198994,"row":{...}}
> > >>>
> > >>> hive --orcfiledump hdfs://XXXX/delta_0199073_0199073_0000/bucket_00000
> > >>> {"operation":1,"originalTransaction":198994,"bucket":0,"rowId":14,"currentTransaction":199073,"row":{...}}
> > >>>
> > >>> => When I execute a SELECT statement on the PK, I see 2 records: the link between them is not made.
> > >>> Still, the metadata looks good.
> > >>> The only difference I see is that I mix operations in the ORC files.
> > >>> When I perform a MERGE statement, for example, 2 directories are created:
> > >>> hdfs://XXXX/delta_0199073_0199073_0000
> > >>> hdfs://XXXX/delta_0199073_0199073_0002
> > >>> The first one contains updates (operation:1) and the second one inserts (operation:0).
> > >>>
> > >>> Thanks for your help,
> > >>> David
> > >>>
> > >>> On 2019/12/01 16:57:08, David Morin <morin.david....@gmail.com> wrote:
> > >>>> Hi Peter,
> > >>>>
> > >>>> At the moment I have a pipeline based on Flink that writes ORC files. These ORC files can be read from Hive thanks to external tables, and then a MERGE statement (triggered by Oozie) pushes this data into tables managed by Hive (transactional tables => ORC). The Hive version is 2.1 because this is the one provided by HDP 2.6.5.
> > >>>> We've developed a system that writes Hive delta files for the managed tables directly from Flink.
> > >>>> The current streaming APIs for Hive 2 are not suitable for our needs, and we cannot use the new Hive 3 streaming API yet. This system uses the Flink state to store the Hive metadata (originalTransaction, bucket, rowId, ...).
> > >>>> Thanks for your reply, because yes, when files are ordered by originalTransaction, bucket, rowId, it works! I just have to use 1 transaction instead of 2 at the moment and it will be OK.
> > >>>>
> > >>>> Thanks,
> > >>>> David
> > >>>>
> > >>>> On 2019/11/29 11:18:05, Peter Vary <pv...@cloudera.com> wrote:
> > >>>>> Hi David,
> > >>>>>
> > >>>>> Not entirely sure what you are doing here :), my guess is that you are trying to write ACID tables outside of Hive. Am I right? What is the exact use case? There might be better solutions out there than writing the files by hand.
> > >>>>>
> > >>>>> As for your question below: yes, the files should be ordered by the (originalTransaction, bucket, rowId) triple, otherwise you will get wrong results.
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Peter
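Written as a comparator, the ordering rule from this exchange looks as follows: ascending on (originalTransaction, bucket, rowId), and descending on currentTransaction so the newest event for a row comes first. This is a sketch of the rule as stated above, not Hive's code; the class and record names are made up, and the sample events are taken from the dumps earlier in the thread.

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;

    public class AcidEventOrder {
        record AcidEvent(long originalTransaction, int bucket, long rowId,
                         long currentTransaction, int operation) {}

        static final Comparator<AcidEvent> ACID_ORDER =
            Comparator.comparingLong(AcidEvent::originalTransaction)
                      .thenComparingInt(AcidEvent::bucket)
                      .thenComparingLong(AcidEvent::rowId)
                      .thenComparing(Comparator
                          .comparingLong(AcidEvent::currentTransaction)
                          .reversed());

        public static void main(String[] args) {
            // The two events for rowId 14 from the dumps above.
            List<AcidEvent> events = new ArrayList<>(List.of(
                new AcidEvent(198994, 0, 14, 198994, 0),   // original insert
                new AcidEvent(198994, 0, 14, 199073, 1))); // later update
            events.sort(ACID_ORDER);
            // The update (currentTransaction=199073) sorts before the insert,
            // so the newest event for the row is seen first.
            events.forEach(System.out::println);
        }
    }

Note that the delete delta in the dumps below (delta_0011368) has rowId 1 before rowId 0, i.e. it violates this ordering.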
> > >>>>>> On Nov 19, 2019, at 13:30, David Morin <morin.david....@gmail.com> wrote:
> > >>>>>>
> > >>>>>> Here are more details about the ORC content and the duplicate rows we get:
> > >>>>>>
> > >>>>>> /delta_0011365_0011365_0000/bucket_00003
> > >>>>>>
> > >>>>>> {"operation":0,"originalTransaction":11365,"bucket":3,"rowId":0,"currentTransaction":11365,"row":{"TS":1574156027915254212,"cle":5218,...}}
> > >>>>>> {"operation":0,"originalTransaction":11365,"bucket":3,"rowId":1,"currentTransaction":11365,"row":{"TS":1574156027915075038,"cle":5216,...}}
> > >>>>>>
> > >>>>>> /delta_0011368_0011368_0000/bucket_00003
> > >>>>>>
> > >>>>>> {"operation":2,"originalTransaction":11365,"bucket":3,"rowId":1,"currentTransaction":11368,"row":null}
> > >>>>>> {"operation":2,"originalTransaction":11365,"bucket":3,"rowId":0,"currentTransaction":11368,"row":null}
> > >>>>>>
> > >>>>>> /delta_0011369_0011369_0000/bucket_00003
> > >>>>>>
> > >>>>>> {"operation":0,"originalTransaction":11369,"bucket":3,"rowId":1,"currentTransaction":11369,"row":{"TS":1574157407855174144,"cle":5216,...}}
> > >>>>>> {"operation":0,"originalTransaction":11369,"bucket":3,"rowId":0,"currentTransaction":11369,"row":{"TS":1574157407855265906,"cle":5218,...}}
> > >>>>>>
> > >>>>>> +-------------------------------------------------+-------+--+
> > >>>>>> |                     row__id                     |  cle  |
> > >>>>>> +-------------------------------------------------+-------+--+
> > >>>>>> | {"transactionid":11367,"bucketid":0,"rowid":0}  | 5209  |
> > >>>>>> | {"transactionid":11369,"bucketid":0,"rowid":0}  | 5211  |
> > >>>>>> | {"transactionid":11369,"bucketid":1,"rowid":0}  | 5210  |
> > >>>>>> | {"transactionid":11369,"bucketid":2,"rowid":0}  | 5214  |
> > >>>>>> | {"transactionid":11369,"bucketid":2,"rowid":1}  | 5215  |
> > >>>>>> | {"transactionid":11365,"bucketid":3,"rowid":0}  | 5218  |
> > >>>>>> | {"transactionid":11365,"bucketid":3,"rowid":1}  | 5216  |
> > >>>>>> | {"transactionid":11369,"bucketid":3,"rowid":1}  | 5216  |
> > >>>>>> | {"transactionid":11369,"bucketid":3,"rowid":0}  | 5218  |
> > >>>>>> | {"transactionid":11369,"bucketid":4,"rowid":0}  | 5217  |
> > >>>>>> | {"transactionid":11369,"bucketid":4,"rowid":1}  | 5213  |
> > >>>>>> | {"transactionid":11369,"bucketid":7,"rowid":0}  | 5212  |
> > >>>>>> +-------------------------------------------------+-------+--+
> > >>>>>>
> > >>>>>> As you can see, we have duplicate rows for "cle" values 5216 and 5218.
> > >>>>>> Do we have to keep the rowids ordered? That is the only difference I have noticed, based on some tests with beeline.
> > >>>>>>
> > >>>>>> Thanks
> > >>>>>>
> > >>>>>> On Tue, Nov 19, 2019 at 00:18, David Morin <morin.david....@gmail.com> wrote:
> > >>>>>> Hello,
> > >>>>>>
> > >>>>>> I'm trying to understand the purpose of the rowid column inside ORC delta files:
> > >>>>>> {"transactionid":11359,"bucketid":5,"rowid":0}
> > >>>>>> ORC view: {"operation":0,"originalTransaction":11359,"bucket":5,"rowId":0,"currentTransaction":11359,"row":...}
> > >>>>>> I use HDP 2.6 => Hive 2.
> > >>>>>>
> > >>>>>> If I want to be idempotent with INSERT / DELETE / INSERT, do we have to keep the same rowid?
> > >>>>>> It seems that when the rowid is changed during the second INSERT, I get a duplicate row.
> > >>>>>> I could create a new rowid for the new transaction during the second INSERT, but that seems to generate duplicate records.
> > >>>>>>
> > >>>>>> Regards,
> > >>>>>> David