But this *minOpenTxn* value isn't from from delta I want to compact. *minOpenTxn* can point on transaction in partition *A *while in partition *B *there's deltas ready for compaction. If *minOpenTxn* is less than txnIds in partition *B *deltas, compaction won't happen. So open transaction in partition *A *blocks compaction in partition *B*. That's seems wrong to me.
On Thu, Jul 28, 2016 at 7:06 PM, Alan Gates <alanfga...@gmail.com> wrote: > Hive is doing the right thing there, as it cannot compact the deltas into > a base file while there are still open transactions in the delta. Storm > should be committing on some frequency even if it doesn’t have enough data > to commit. > > Alan. > > > On Jul 28, 2016, at 05:36, Igor Kuzmenko <f1she...@gmail.com> wrote: > > > > I made some research on that issue. > > The problem is in ValidCompactorTxnList::isTxnRangeValid method. > > > > Here's code: > > @Override > > public RangeResponse isTxnRangeValid(long minTxnId, long maxTxnId) { > > if (highWatermark < minTxnId) { > > return RangeResponse.NONE; > > } else if (minOpenTxn < 0) { > > return highWatermark >= maxTxnId ? RangeResponse.ALL : > RangeResponse.NONE; > > } else { > > return minOpenTxn > maxTxnId ? RangeResponse.ALL : > RangeResponse.NONE; > > } > > } > > > > In my case this method returned RangeResponce.NONE for most of delta > files. With this value delta file doesn't include in compaction. > > > > Last 'else' bock compare minOpenTxn to maxTxnId and if maxTxnId bigger > return RangeResponce.NONE, thats a problem for me, because of using Storm > Hive Bolt. Hive Bolt gets transaction and maintain it open with heartbeat > until there's data to commit. > > > > So if i get transaction and maintain it open all compactions will stop. > Is it incorrect Hive behavior, or Storm should close transaction? > > > > > > > > > > On Wed, Jul 27, 2016 at 8:46 PM, Igor Kuzmenko <f1she...@gmail.com> > wrote: > > Thanks for reply, Alan. My guess with Storm was wrong. Today I get same > behavior with running Storm topology. > > Anyway, I'd like to know, how can I check that transaction batch was > closed correctly? > > > > On Wed, Jul 27, 2016 at 8:09 PM, Alan Gates <alanfga...@gmail.com> > wrote: > > I don’t know the details of how the storm application that streams into > Hive works, but this sounds like the transaction batches weren’t getting > closed. Compaction can’t happen until those batches are closed. Do you > know how you had storm configured? Also, you might ask separately on the > storm list to see if people have seen this issue before. > > > > Alan. > > > > > On Jul 27, 2016, at 03:31, Igor Kuzmenko <f1she...@gmail.com> wrote: > > > > > > One more thing. I'm using Apache Storm to stream data in Hive. And > when I turned off Storm topology compactions started to work properly. > > > > > > On Tue, Jul 26, 2016 at 6:28 PM, Igor Kuzmenko <f1she...@gmail.com> > wrote: > > > I'm using Hive 1.2.1 transactional table. Inserting data in it via > Hive Streaming API. After some time i expect compaction to start but it > didn't happen: > > > > > > Here's part of log, which shows that compactor initiator thread > doesn't see any delta files: > > > 2016-07-26 18:06:52,459 INFO [Thread-8]: compactor.Initiator > (Initiator.java:run(89)) - Checking to see if we should compact > default.data_aaa.dt=20160726 > > > 2016-07-26 18:06:52,496 DEBUG [Thread-8]: io.AcidUtils > (AcidUtils.java:getAcidState(432)) - in directory hdfs:// > sorm-master01.msk.mts.ru:8020/apps/hive/warehouse/data_aaa/dt=20160726 > base = null deltas = 0 > > > 2016-07-26 18:06:52,496 DEBUG [Thread-8]: compactor.Initiator > (Initiator.java:determineCompactionType(271)) - delta size: 0 base size: 0 > threshold: 0.1 will major compact: false > > > > > > But in that directory there's actually 23 files: > > > > > > hadoop fs -ls /apps/hive/warehouse/data_aaa/dt=20160726 > > > Found 23 items > > > -rw-r--r-- 3 storm hdfs 4 2016-07-26 17:20 > /apps/hive/warehouse/data_aaa/dt=20160726/_orc_acid_version > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:22 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71741256_71741355 > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:23 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71762456_71762555 > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:25 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71787756_71787855 > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:26 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71795756_71795855 > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:27 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71804656_71804755 > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:29 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71828856_71828955 > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:30 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71846656_71846755 > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:32 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71850756_71850855 > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:33 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71867356_71867455 > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:34 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71891556_71891655 > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:36 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71904856_71904955 > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:37 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71907256_71907355 > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:39 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71918756_71918855 > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:40 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71947556_71947655 > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:41 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71960656_71960755 > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:43 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71963156_71963255 > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:44 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71964556_71964655 > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:46 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71987156_71987255 > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:47 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_72015756_72015855 > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:48 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_72021356_72021455 > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:50 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_72048756_72048855 > > > drwxrwxrwx - storm hdfs 0 2016-07-26 17:50 > /apps/hive/warehouse/data_aaa/dt=20160726/delta_72070856_72070955 > > > > > > Full log here. > > > > > > What could go wrong? > > > > > > > > > > >