Quick update:

After each compaction, the bucket files under the base directory contain
the latest data. However, I expected all the delta files (and
directories) to be gone, since they should have been merged into the base
directory. Otherwise we will accumulate too many small files on HDFS,
which is a problem. Is my understanding of this feature correct?
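
To make the expectation concrete, here is a minimal Python sketch (not
part of Hive; the helper name obsolete_deltas and the sample
delta_0000022_0000022 entry are made up for illustration). It assumes
that, as I understand it, a delta directory delta_X_Y is redundant once a
base_N directory exists with Y <= N, and that removing such directories
is a separate cleanup step after the compaction itself.

```python
import re

# Hypothetical helper, not a Hive API: classify ACID directory names
# relative to the newest base directory.
BASE_RE = re.compile(r"^base_(\d+)$")
DELTA_RE = re.compile(r"^delta_(\d+)_(\d+)$")


def obsolete_deltas(dir_names):
    """Return the delta directories fully covered by the newest base."""
    bases = [int(m.group(1)) for m in map(BASE_RE.match, dir_names) if m]
    if not bases:
        return []  # no major compaction yet, so nothing is covered
    newest_base = max(bases)
    covered = []
    for name in dir_names:
        m = DELTA_RE.match(name)
        # Assumption: a delta is covered when its upper transaction id
        # does not exceed the transaction id of the newest base.
        if m and int(m.group(2)) <= newest_base:
            covered.append(name)
    return covered


# Names taken from the listing quoted below, plus one hypothetical
# newer delta that a later insert would create.
dirs = [
    "base_0000021",
    "delta_0000003_0000003",
    "delta_0000017_0000019",
    "delta_0000021_0000021",
    "delta_0000022_0000022",  # hypothetical, not in the listing
]
print(obsolete_deltas(dirs))
```

Under that assumption, every delta directory in the listing quoted below
is covered by base_0000021, which is why I expected them to disappear.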

On Wed, Nov 16, 2016 at 5:24 PM, Manoj Murumkar <manoj.murum...@gmail.com>
wrote:

> Hi,
>
> We are trying to use the transaction feature in Hive. I created the
> following table (SHOW CREATE TABLE output):
>
> CREATE TABLE `txntest.txntab3`(
>   `id` int,
>   `name` string)
> CLUSTERED BY (
>   id)
> INTO 2 BUCKETS
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
> LOCATION
>   'hdfs://or1010051016175.corp.adobe.com:8020/user/hive/warehouse/txntest.db/txntab3'
> TBLPROPERTIES (
>   'COLUMN_STATS_ACCURATE'='true',
>   'numFiles'='22',
>   'numRows'='90000',
>   'rawDataSize'='0',
>   'totalSize'='3564019',
>   'transactional'='true',
>   'transient_lastDdlTime'='1479329198')
>
> I inserted 90000 rows into it over multiple iterations, which created 22
> files (see numFiles above). I have run multiple compactions (major and
> minor), but nothing seems to happen on HDFS. What am I missing?
>
> I have the following configuration:
>
> Metastore:
>
> hive.compactor.initiator.on = true;
> hive.compactor.worker.threads = 2;
>
> Client:
>
> hive.support.concurrency = true;
> hive.exec.dynamic.partition.mode = nonstrict;
> hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
>
> Can someone point me in the right direction? The compaction process did
> run (verified via SHOW COMPACTIONS), and I can see that a base directory
> was created on HDFS. I was expecting all the delta directories to be gone
> after a major compaction runs.
>
> drwxrwxrwt   - admin    hive          0 2016-11-16 20:47 /user/hive/warehouse/txntest.db/txntab3/base_0000021
> -rw-r--r--   3 admin    hive     227916 2016-11-16 20:47 /user/hive/warehouse/txntest.db/txntab3/base_0000021/bucket_00000
> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:33 /user/hive/warehouse/txntest.db/txntab3/delta_0000003_0000003
> -rw-r--r--   3 nex37045 hive        640 2016-11-16 01:33 /user/hive/warehouse/txntest.db/txntab3/delta_0000003_0000003/bucket_00000
> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:33 /user/hive/warehouse/txntest.db/txntab3/delta_0000004_0000004
> -rw-r--r--   3 nex37045 hive        640 2016-11-16 01:33 /user/hive/warehouse/txntest.db/txntab3/delta_0000004_0000004/bucket_00000
> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:34 /user/hive/warehouse/txntest.db/txntab3/delta_0000005_0000005
> -rw-r--r--   3 nex37045 hive        640 2016-11-16 01:33 /user/hive/warehouse/txntest.db/txntab3/delta_0000005_0000005/bucket_00000
> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:34 /user/hive/warehouse/txntest.db/txntab3/delta_0000006_0000006
> -rw-r--r--   3 nex37045 hive        640 2016-11-16 01:34 /user/hive/warehouse/txntest.db/txntab3/delta_0000006_0000006/bucket_00000
> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:34 /user/hive/warehouse/txntest.db/txntab3/delta_0000007_0000007
> -rw-r--r--   3 nex37045 hive        636 2016-11-16 01:34 /user/hive/warehouse/txntest.db/txntab3/delta_0000007_0000007/bucket_00000
> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:36 /user/hive/warehouse/txntest.db/txntab3/delta_0000008_0000008
> -rw-r--r--   3 nex37045 hive        640 2016-11-16 01:36 /user/hive/warehouse/txntest.db/txntab3/delta_0000008_0000008/bucket_00000
> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:36 /user/hive/warehouse/txntest.db/txntab3/delta_0000009_0000009
> -rw-r--r--   3 nex37045 hive        640 2016-11-16 01:36 /user/hive/warehouse/txntest.db/txntab3/delta_0000009_0000009/bucket_00000
> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:37 /user/hive/warehouse/txntest.db/txntab3/delta_0000010_0000010
> -rw-r--r--   3 nex37045 hive        640 2016-11-16 01:37 /user/hive/warehouse/txntest.db/txntab3/delta_0000010_0000010/bucket_00000
> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:37 /user/hive/warehouse/txntest.db/txntab3/delta_0000011_0000011
> -rw-r--r--   3 nex37045 hive        644 2016-11-16 01:37 /user/hive/warehouse/txntest.db/txntab3/delta_0000011_0000011/bucket_00000
> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:37 /user/hive/warehouse/txntest.db/txntab3/delta_0000012_0000012
> -rw-r--r--   3 nex37045 hive        644 2016-11-16 01:37 /user/hive/warehouse/txntest.db/txntab3/delta_0000012_0000012/bucket_00000
> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:44 /user/hive/warehouse/txntest.db/txntab3/delta_0000013_0000013
> -rw-r--r--   3 nex37045 hive        644 2016-11-16 01:44 /user/hive/warehouse/txntest.db/txntab3/delta_0000013_0000013/bucket_00000
> drwxr-xr-x   - nex37045 hive          0 2016-11-16 01:45 /user/hive/warehouse/txntest.db/txntab3/delta_0000014_0000014
> -rw-r--r--   3 nex37045 hive        644 2016-11-16 01:45 /user/hive/warehouse/txntest.db/txntab3/delta_0000014_0000014/bucket_00000
> drwxr-xr-x   - nex37045 hive          0 2016-11-16 02:02 /user/hive/warehouse/txntest.db/txntab3/delta_0000015_0000015
> -rw-r--r--   3 nex37045 hive        644 2016-11-16 02:02 /user/hive/warehouse/txntest.db/txntab3/delta_0000015_0000015/bucket_00000
> drwxrwxrwt   - admin    hive          0 2016-11-16 02:03 /user/hive/warehouse/txntest.db/txntab3/delta_0000015_0000016
> -rw-r--r--   3 admin    hive        531 2016-11-16 02:03 /user/hive/warehouse/txntest.db/txntab3/delta_0000015_0000016/bucket_00000
> drwxr-xr-x   - nex37045 hive          0 2016-11-16 02:03 /user/hive/warehouse/txntest.db/txntab3/delta_0000016_0000016
> -rw-r--r--   3 nex37045 hive        644 2016-11-16 02:03 /user/hive/warehouse/txntest.db/txntab3/delta_0000016_0000016/bucket_00000
> drwxr-xr-x   - nex37045 hive          0 2016-11-16 20:37 /user/hive/warehouse/txntest.db/txntab3/delta_0000017_0000017
> -rw-r--r--   3 nex37045 hive     156395 2016-11-16 20:37 /user/hive/warehouse/txntest.db/txntab3/delta_0000017_0000017/bucket_00000
> drwxrwxrwt   - admin    hive          0 2016-11-16 20:40 /user/hive/warehouse/txntest.db/txntab3/delta_0000017_0000019
> -rw-r--r--   3 admin    hive    2598250 2016-11-16 20:40 /user/hive/warehouse/txntest.db/txntab3/delta_0000017_0000019/bucket_00000
> drwxr-xr-x   - nex37045 hive          0 2016-11-16 20:39 /user/hive/warehouse/txntest.db/txntab3/delta_0000018_0000018
> -rw-r--r--   3 nex37045 hive       4737 2016-11-16 20:39 /user/hive/warehouse/txntest.db/txntab3/delta_0000018_0000018/bucket_00000
> drwxr-xr-x   - nex37045 hive          0 2016-11-16 20:39 /user/hive/warehouse/txntest.db/txntab3/delta_0000019_0000019
> -rw-r--r--   3 nex37045 hive     192658 2016-11-16 20:39 /user/hive/warehouse/txntest.db/txntab3/delta_0000019_0000019/bucket_00000
> drwxr-xr-x   - nex37045 hive          0 2016-11-16 20:45 /user/hive/warehouse/txntest.db/txntab3/delta_0000020_0000020
> -rw-r--r--   3 nex37045 hive     192835 2016-11-16 20:45 /user/hive/warehouse/txntest.db/txntab3/delta_0000020_0000020/bucket_00000
> drwxr-xr-x   - nex37045 hive          0 2016-11-16 20:46 /user/hive/warehouse/txntest.db/txntab3/delta_0000021_0000021
> -rw-r--r--   3 nex37045 hive     201206 2016-11-16 20:46 /user/hive/warehouse/txntest.db/txntab3/delta_0000021_0000021/bucket_00000
>
> Thanks,
> Manoj
