I think Storm has some timeout parameter that will close the transaction
if there are no events for a certain amount of time.
How many transactions do you use per transaction batch?  Perhaps making
the batches smaller will make them close sooner.
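
For reference, here is a minimal sketch of where those knobs usually live
when writing from Storm (assuming the storm-hive module's HiveOptions /
HiveBolt API; package names vary by Storm version, and the table, columns
and values below are made up for illustration):

import org.apache.storm.hive.bolt.HiveBolt;
import org.apache.storm.hive.bolt.mapper.DelimitedRecordHiveMapper;
import org.apache.storm.hive.common.HiveOptions;
import org.apache.storm.tuple.Fields;

public class HiveBoltConfigSketch {
    public static HiveBolt buildHiveBolt() {
        // Map tuple fields to table columns (names are hypothetical).
        DelimitedRecordHiveMapper mapper = new DelimitedRecordHiveMapper()
                .withColumnFields(new Fields("id", "msg"))
                .withPartitionFields(new Fields("dt"));

        HiveOptions options = new HiveOptions(
                "thrift://metastore-host:9083", "default", "data_aaa", mapper)
                .withTxnsPerBatch(2)         // fewer txns per batch -> batches close sooner
                .withBatchSize(1000)         // records written before a txn is committed
                .withIdleTimeout(10)         // seconds of inactivity before idle writers are closed
                .withHeartBeatInterval(240); // keeps still-open txns alive between events

        return new HiveBolt(options);
    }
}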

Eugene


On 7/28/16, 3:59 PM, "Alan Gates" <alanfga...@gmail.com> wrote:

>But until those transactions are closed you don't know that they won't
>write to partition B.  After they write to A they may choose to write to
>B and then commit.  The compactor can not make any assumptions about what
>sessions with open transactions will do in the future.
>
>Alan.
>
>> On Jul 28, 2016, at 09:19, Igor Kuzmenko <f1she...@gmail.com> wrote:
>> 
>> But this minOpenTxn value isn't from the delta I want to compact.
>> minOpenTxn can point to a transaction in partition A while partition B
>> has deltas ready for compaction. If minOpenTxn is less than the txnIds
>> in the partition B deltas, compaction won't happen. So an open
>> transaction in partition A blocks compaction in partition B. That seems
>> wrong to me.
>> 
>> On Thu, Jul 28, 2016 at 7:06 PM, Alan Gates <alanfga...@gmail.com>
>>wrote:
>> Hive is doing the right thing there, as it cannot compact the deltas
>> into a base file while there are still open transactions in the delta.
>> Storm should be committing on some frequency even if it doesn't have
>> enough data to commit.
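
(For readers unfamiliar with the streaming side, committing and closing a
batch looks roughly like the sketch below. This is a hedged example
against the Hive 1.2 hcatalog-streaming API; the metastore URI, table,
columns and record contents are made up.)

import java.util.Arrays;
import org.apache.hive.hcatalog.streaming.DelimitedInputWriter;
import org.apache.hive.hcatalog.streaming.HiveEndPoint;
import org.apache.hive.hcatalog.streaming.StreamingConnection;
import org.apache.hive.hcatalog.streaming.TransactionBatch;

public class StreamingBatchSketch {
    public static void main(String[] args) throws Exception {
        // Endpoint values are illustrative only.
        HiveEndPoint endPoint = new HiveEndPoint(
                "thrift://metastore-host:9083", "default", "data_aaa",
                Arrays.asList("20160726"));
        StreamingConnection conn = endPoint.newConnection(true);
        DelimitedInputWriter writer =
                new DelimitedInputWriter(new String[]{"id", "msg"}, ",", endPoint);

        TransactionBatch batch = conn.fetchTransactionBatch(10, writer);
        try {
            while (batch.remainingTransactions() > 0) {
                batch.beginNextTransaction();
                batch.write("1,hello".getBytes("UTF-8"));
                batch.commit();   // commit even if the txn carried little data
            }
        } finally {
            batch.close();        // closing the batch is what unblocks compaction
            conn.close();
        }
    }
}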
>> 
>> Alan.
>> 
>> > On Jul 28, 2016, at 05:36, Igor Kuzmenko <f1she...@gmail.com> wrote:
>> >
>> > I made some research on that issue.
>> > The problem is in ValidCompactorTxnList::isTxnRangeValid method.
>> >
>> > Here's code:
>> > @Override
>> > public RangeResponse isTxnRangeValid(long minTxnId, long maxTxnId) {
>> >   if (highWatermark < minTxnId) {
>> >     return RangeResponse.NONE;
>> >   } else if (minOpenTxn < 0) {
>> >     return highWatermark >= maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
>> >   } else {
>> >     return minOpenTxn > maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
>> >   }
>> > }
>> >
>> > In my case this method returned RangeResponse.NONE for most of the
>> > delta files, and a delta file with that value is not included in the
>> > compaction.
>> >
>> > The last 'else' block compares minOpenTxn to maxTxnId and, if maxTxnId
>> > is bigger, returns RangeResponse.NONE. That's a problem for me because
>> > I'm using the Storm Hive Bolt. The Hive Bolt gets a transaction and
>> > keeps it open with heartbeats until there is data to commit.
>> >
>> > So if I get a transaction and keep it open, all compactions will stop.
>> > Is this incorrect Hive behavior, or should Storm close the transaction?
>> >
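
(A quick standalone illustration of the decision logic quoted above, with
made-up transaction ids. It re-implements the same checks rather than
calling Hive classes, so the class name and numbers are purely
illustrative.)

public class CompactorRangeCheckIllustration {

    // Same decision logic as the quoted isTxnRangeValid, re-implemented standalone.
    static String check(long highWatermark, long minOpenTxn, long minTxnId, long maxTxnId) {
        if (highWatermark < minTxnId) {
            return "NONE";
        } else if (minOpenTxn < 0) {
            return highWatermark >= maxTxnId ? "ALL" : "NONE";
        } else {
            return minOpenTxn > maxTxnId ? "ALL" : "NONE";
        }
    }

    public static void main(String[] args) {
        long highWatermark = 72100000L;  // made-up ids
        long minOpenTxn    = 71700000L;  // txn held open by the bolt, e.g. in partition A

        // A delta in partition B whose txn range lies above minOpenTxn:
        System.out.println(check(highWatermark, minOpenTxn, 71741256L, 71741355L)); // NONE -> skipped
        // The same delta once no transactions are open (minOpenTxn < 0):
        System.out.println(check(highWatermark, -1L, 71741256L, 71741355L));        // ALL -> compactable
    }
}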
>> >
>> >
>> >
>> > On Wed, Jul 27, 2016 at 8:46 PM, Igor Kuzmenko <f1she...@gmail.com>
>>wrote:
>> > Thanks for the reply, Alan. My guess about Storm was wrong: today I
>> > got the same behavior with the Storm topology running.
>> > Anyway, I'd like to know: how can I check that a transaction batch
>> > was closed correctly?
>> >
>> > On Wed, Jul 27, 2016 at 8:09 PM, Alan Gates <alanfga...@gmail.com>
>>wrote:
>> > I don't know the details of how the storm application that streams
>> > into Hive works, but this sounds like the transaction batches weren't
>> > getting closed.  Compaction can't happen until those batches are
>> > closed.  Do you know how you had storm configured?  Also, you might
>> > ask separately on the storm list to see if people have seen this
>> > issue before.
>> >
>> > Alan.
>> >
>> > > On Jul 27, 2016, at 03:31, Igor Kuzmenko <f1she...@gmail.com> wrote:
>> > >
>> > > One more thing: I'm using Apache Storm to stream data into Hive,
>> > > and when I turned off the Storm topology, compactions started to
>> > > work properly.
>> > >
>> > > On Tue, Jul 26, 2016 at 6:28 PM, Igor Kuzmenko <f1she...@gmail.com>
>>wrote:
>> > > I'm using a Hive 1.2.1 transactional table and inserting data into
>> > > it via the Hive Streaming API. After some time I expected compaction
>> > > to start, but it didn't happen:
>> > >
>> > > Here's the part of the log which shows that the compactor initiator
>> > > thread doesn't see any delta files:
>> > > 2016-07-26 18:06:52,459 INFO  [Thread-8]: compactor.Initiator
>>(Initiator.java:run(89)) - Checking to see if we should compact
>>default.data_aaa.dt=20160726
>> > > 2016-07-26 18:06:52,496 DEBUG [Thread-8]: io.AcidUtils
>>(AcidUtils.java:getAcidState(432)) - in directory
>>hdfs://sorm-master01.msk.mts.ru:8020/apps/hive/warehouse/data_aaa/dt=2016
>>0726 base = null deltas = 0
>> > > 2016-07-26 18:06:52,496 DEBUG [Thread-8]: compactor.Initiator
>>(Initiator.java:determineCompactionType(271)) - delta size: 0 base size:
>>0 threshold: 0.1 will major compact: false
>> > >
>> > > But in that directory there's actually 23 files:
>> > >
>> > > hadoop fs -ls /apps/hive/warehouse/data_aaa/dt=20160726
>> > > Found 23 items
>> > > -rw-r--r--   3 storm hdfs          4 2016-07-26 17:20
>>/apps/hive/warehouse/data_aaa/dt=20160726/_orc_acid_version
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:22
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_71741256_71741355
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:23
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_71762456_71762555
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:25
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_71787756_71787855
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:26
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_71795756_71795855
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:27
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_71804656_71804755
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:29
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_71828856_71828955
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:30
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_71846656_71846755
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:32
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_71850756_71850855
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:33
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_71867356_71867455
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:34
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_71891556_71891655
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:36
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_71904856_71904955
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:37
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_71907256_71907355
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:39
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_71918756_71918855
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:40
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_71947556_71947655
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:41
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_71960656_71960755
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:43
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_71963156_71963255
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:44
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_71964556_71964655
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:46
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_71987156_71987255
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:47
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_72015756_72015855
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:48
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_72021356_72021455
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:50
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_72048756_72048855
>> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:50
>>/apps/hive/warehouse/data_aaa/dt=20160726/delta_72070856_72070955
>> > >
>> > > Full log here.
>> > >
>> > > What could go wrong?
>> > >
>> >
>> >
>> >
>> 
>> 
>
>
