ta-access/content/understanding-administering-compactions.html
>
> Also, if everything else fails, you can still issue the ALTER TABLE command
> periodically using crontab. Running an extra compaction will not hurt that
> much.
>
> Thanks,
> Peter
>
> On Jun 2, 2020, at 14:2
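As an illustration of that fallback, a minimal sketch of the kind of statement that could be scheduled (the database and table names below are placeholders, not from this thread):

    -- Hypothetical example: request a major compaction on one table.
    -- This could be run periodically, e.g. from cron via
    --   beeline -u <jdbc-url> -e "ALTER TABLE mydb.mytable COMPACT 'major';"
    ALTER TABLE mydb.mytable COMPACT 'major';

    -- Check the state of queued, running and past compactions
    SHOW COMPACTIONS;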
On Jun 2, 2020, at 12:57, Peter Vary wrote:
> Hi David,
>
> You do not really need to run compaction every time.
> Is it possible to wait for the compaction to start automatically next time?
>
> Thanks,
> Peter
>
> On Jun 2, 2020, at 12:51, David Morin wrote:
>
> Th
s looks very confusing when looking at
> the logs."
>
> Thanks,
> Peter
>
> On Jun 2, 2020, at 11:44, David Morin wrote:
>
> I don't get it.
> The transaction id in the error message "No delta files or original files
> found to compact in hdfs://... wi
paction for the current database/table
On 2020/06/01 20:13:08, David Morin wrote:
> Hi,
>
> I have a compaction issue on my cluster. When I force a compaction (major) on
> one table I get this error in Metastore logs:
>
> 2020-06-01 19:49:35,512 ERROR [-78]: compactor.Com
Hi,
I have a compaction issue on my cluster. When I force a compaction (major) on
one table I get this error in Metastore logs:
2020-06-01 19:49:35,512 ERROR [-78]: compactor.CompactorMR
(CompactorMR.java:run(264)) - No delta files or original files found to compact
in hdfs://...hive/wareh
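When this error shows up, it can help to look at what the compactor has queued for the table and at the actual directory layout; a small sketch, with placeholder names and paths:

    -- Hypothetical checks around a failed forced compaction
    SHOW COMPACTIONS;   -- look for the table in initiated/working/failed state

    -- From the Hive CLI, the table directory can be listed to see whether any
    -- delta_* or base_* directories actually exist (path is a placeholder):
    -- dfs -ls hdfs://namenode/apps/hive/warehouse/mydb.db/mytable;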
On Thu, Feb 6, 2020 at 12:12, David Morin wrote:
> ok, Peter
> No problem. Thx
> I'll keep you in touch
>
> On 2020/02/06 09:42:39, Peter Vary wrote:
> > Hi David,
> >
> > I'm more familiar with ACID v2 :(
> > What I would do is to run an update oper
be nice to hear back from you if you found something.
>
> Thanks,
> Peter
>
> > On Feb 5, 2020, at 16:55, David Morin wrote:
> >
> > Hello,
> >
> > Thanks.
> > In fact I use HDP 2.6.5 and a previous Orc version with transactionid for
> > e
he rows. Only insert and delete. So update
> is handled as delete (old) row, insert (new/independent) row.
> The delete is stored in the delete delta directories, and the file does not
> have to contain the {row} struct at the end.
>
> Hope this helps,
> Peter
>
> > On
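A tiny sketch of how that plays out on an ACID v2 table (table and column names here are made up for illustration):

    -- Hypothetical full-ACID table (Hive 3 / ACID v2)
    CREATE TABLE t (id INT, val STRING)
    STORED AS ORC
    TBLPROPERTIES ('transactional'='true');

    -- The UPDATE below is executed internally as delete(old row) + insert(new row):
    -- the delete event is written under a delete_delta_<writeId>_<writeId>/ directory
    -- and carries no {row} payload, while the new row goes into a regular
    -- delta_<writeId>_<writeId>/ directory.
    UPDATE t SET val = 'new' WHERE id = 1;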
73_0199073_
hdfs:///delta_0199073_0199073_0002
And the first one contains updates (operation:1) and the second one, inserts
(operation:0)
Thanks for your help
David
On 2019/12/01 16:57:08, David Morin wrote:
> Hi Peter,
>
> At the moment I have a pipeline based on Flink to wri
Hi,
When major compactions have been performed on Hive tables based on the Orc
format, are the Orc stripes rewritten? I know that the records themselves are
not updated (some of them may be dropped, but none are modified), but
concerning stripe size, do major compactions have an impact on it?
For e
your question below: Yes, the files should be ordered by:
> originalTransaction, bucket, rowId triple, otherwise you will get wrong
> results.
>
> Thanks,
> Peter
>
> > On Nov 19, 2019, at 13:30, David Morin wrote:
> >
> > here after more d
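One way to eyeball that key from SQL on an existing ACID table (the table name is a placeholder; the ordering requirement itself applies to the rows as they are laid out inside each delta file):

    -- Hypothetical query over the virtual ROW__ID struct (Hive 2 field names)
    SELECT ROW__ID.transactionid, ROW__ID.bucketid, ROW__ID.rowid
    FROM mytable
    ORDER BY ROW__ID.transactionid, ROW__ID.bucketid, ROW__ID.rowid;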
tid":3,"rowid":0} | *5218* |
| {"transactionid":11365,"bucketid":3,"rowid":1} | *5216* |
| {"transactionid":11369,"bucketid":3,"rowid":1} | *5216* |
| {"transactionid":11369,"bucketid":
Hello,
I'm trying to understand the purpose of the rowid column inside ORC delta
file
{"transactionid":11359,"bucketid":5,"*rowid*":0}
Orc view: {"operation":0,"originalTransaction":11359,"bucket":5,"*rowId*":0,"currentTransaction":11359,"row":...}
I use HDP 2.6 => Hive 2
If I want to be idempot
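For reference, that same triple can be selected directly as the virtual ROW__ID column; a minimal sketch with a placeholder table name:

    -- Hypothetical look at the ACID key next to the data; rowid is the
    -- position of the row among the rows written by the same transaction
    -- into the same bucket, so the triple uniquely identifies a row version.
    SELECT ROW__ID, t.* FROM mytable t LIMIT 10;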
>
> Alan.
>
> On Mon, Sep 9, 2019 at 10:55 AM David Morin
> wrote:
>
>> Thanks Alan,
>>
>> When you say "you just can't have two simultaneous deletes in the same
>> partition", simultaneous means for the same transaction ?
>> If a create 2 &q
is changes in Hive 3, where update and delete also take shared locks and
> a first committer wins strategy is employed instead.
>
> Alan.
>
> On Mon, Sep 9, 2019 at 8:29 AM David Morin
> wrote:
>
>> Hello,
>>
>> I use in production HDP 2.6.5 with Hive 2.1.0
>
Hello,
I use in production HDP 2.6.5 with Hive 2.1.0
We use transactional tables and we try to ingest data in a streaming way
(despite the fact that we still use Hive 2).
I've read some docs but I would like some clarification concerning the use of
locks with transactional tables.
Do we have to use l
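For what it is worth, the lock and transaction state these tables involve can be inspected from a Hive session; a minimal sketch (the table name is a placeholder):

    -- Hypothetical checks: current locks on a table and open transactions
    SHOW LOCKS mytable EXTENDED;
    SHOW TRANSACTIONS;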
elong to hive.
Weird, isn't it?
So this is a workaround, but a bit of a crappy one.
But I'm open to any more suitable solution.
On Mon, Aug 26, 2019 at 08:51, David Morin wrote:
> Sorry, the same link in english:
> http://www.adaltas.com/en/2019/07/25/hive-3-features-tips-tricks/
>
Sorry, the same link in english:
http://www.adaltas.com/en/2019/07/25/hive-3-features-tips-tricks/
On Mon, Aug 26, 2019 at 08:35, David Morin wrote:
> Here is a link related to Hive 3:
> http://www.adaltas.com/fr/2019/07/25/hive-3-fonctionnalites-conseils-astuces/
> The author sug
Aug 2019 at 07:51, David Morin wrote:
> Hello,
> I've been trying "ALTER TABLE (table_name) COMPACT 'MAJOR'" on my Hive 2
> environment, but it always fails (HDP 2.6.5 precisely). It seems that the
> merged base file is created but the delta is not delet
Hello,
I've been trying "ALTER TABLE (table_name) COMPACT 'MAJOR'" on my Hive 2
environment, but it always fails (HDP 2.6.5 precisely). It seems that the
merged base file is created but the delta is not deleted.
I found that it was because the HiveMetastore Client can't connect to the
metastore bec
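When the base file appears but the deltas stay behind, the step that removes them is the Cleaner thread of the metastore compactor, so it is worth confirming that the compactor threads are actually enabled there; a sketch of the settings usually involved (checking them from a client session only shows the client-side values, the authoritative ones live in the metastore configuration):

    -- Hypothetical check of compactor-related settings from a Hive session
    SET hive.compactor.initiator.on;
    SET hive.compactor.worker.threads;

    -- The compaction history also shows whether the cleaning step ran
    SHOW COMPACTIONS;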
>
> Alan.
>
> On Tue, Mar 12, 2019 at 12:24 PM David Morin
> wrote:
>
>> Thanks Alan.
>> Yes, the problem in fact was that this streaming API does not handle
>> update and delete.
>> I've used native Orc files and the next step I've planned to do
s designed for this case, though it only handles insert (not update),
> so if you need updates you'd have to do the merge as you are currently
> doing.
>
> Alan.
>
> On Mon, Mar 11, 2019 at 2:09 PM David Morin
> wrote:
>
>> Hello,
>>
>> I've just
Hello,
I've just implemented a pipeline based on Apache Flink to synchronize
data between MySQL and Hive (transactional + bucketed) on an HDP
cluster. Flink jobs run on Yarn.
I've used Orc files but without ACID properties.
Then, we've created external tables on these hdfs directories that contai
Hi,
I've just implemented a pipeline to synchronize data between MySQL and Hive
(transactional + bucketed) on an HDP cluster.
I've used Orc files but without ACID properties.
Then, we've created external tables on these hdfs directories that contain
these delta Orc files.
Then, MERGE INTO queries
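The general shape of those statements, with every table and column name below being a placeholder rather than the actual schema:

    -- Hypothetical merge from the external staging table into the ACID table
    MERGE INTO target_acid AS t
    USING staging_ext AS s
    ON t.id = s.id
    WHEN MATCHED AND s.op = 'D' THEN DELETE
    WHEN MATCHED THEN UPDATE SET val = s.val
    WHEN NOT MATCHED THEN INSERT VALUES (s.id, s.val);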
Hello,
I get an error when I try to read my Orc files from Hive (external
table), from Pig, or with hive --orcfiledump ..
These files are generated with Flink using the Orc Java API with vectorized
columns.
If I create these files locally (/tmp/...), push them to HDFS, then I can
read the content of
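For context, the external table sitting on top of those Flink-written files is declared roughly like this (path, table and column names are placeholders):

    -- Hypothetical external table over the ORC files written by Flink
    CREATE EXTERNAL TABLE staging_ext (id INT, val STRING)
    STORED AS ORC
    LOCATION 'hdfs:///data/staging/orc';

    -- The files themselves can also be inspected with: hive --orcfiledump <hdfs-path>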