Coreixwumo opened a new issue #4221:
URL: https://github.com/apache/hudi/issues/4221


   Hi,
   
   I am running into some problems when using a MOR table with Flink; a COW 
table with the same schema works fine. Here is my CREATE TABLE SQL:
   
   create table hudi.corgi_PayOrder_logic_mor(
   `id` STRING,
   `status` STRING,
   `version` STRING,
   `payerId` STRING, 
   `payeeId` STRING,
   `eventType` STRING,
   `executeTime` BIGINT,
   `touchTime` BIGINT,
   `created` STRING,
   `etl_update_time` STRING,
   `visit_date` string
   )PARTITIONED BY (`visit_date`)
   with (
   'connector' = 'hudi'
   ,'is_generic' = 'true'
   ,'path' = 'xxx'
   ,'hoodie.datasource.write.recordkey.field' = 'id'
   ,'hoodie.datasource.write.partitionpath.field' = 'visit_date'
   ,'write.precombine.field' = 'etl_update_time'
   ,'write.tasks' = '20'
   ,'table.type' = 'MERGE_ON_READ'
   ,'index.global.enabled' = 'false'
   ,'compaction.schedule.enabled' = 'true'
   ,'compaction.async.enabled' = 'false'
   ,'compaction.trigger.strategy' = 'time_elapsed'
   ,'compaction.delta_seconds' = '120'
   ,'write.rate.limit' = '1000'
   ,'hive_sync.enable' = 'true'
   ,'hive_sync.db' = 'hudi'
   ,'hive_sync.table' = 'corgi_PayOrder_mor'
   ,'hive_sync.username' = 'data'
   ,'hive_sync.file_format' = 'PARQUET'
   ,'hive_sync.support_timestamp' = 'true'
   ,'hive_sync.use_jdbc' = 'true'
   ,'hive_sync.jdbc_url' = 'jdbc:hive2://xxx'
   ,'hive_sync.metastore.uris' = 'thrift://x1,thrift://x2'
   ,'hoodie.datasource.hive_sync.partition_extractor_class' = 'hudi.DatePartitionExtractor'
   ,'hoodie.datasource.hive_style_partition' = 'true'
   ,'hive_sync.partition_fields' = 'visit_date'
   ,'hive_sync.auto_create_database' = 'true'
   ,'hive_sync.skip_ro_suffix' = 'false'
   ,'hive_sync.support_timestamp' = 'false'
   ,'read.tasks' = '20'
   ,'read.streaming.enabled' = 'true'
   );
   
   
   First:
        the MOR table's data file size looks right (similar to the COW table), 
but when I run a Spark SQL query such as
   'select count(1) from hudi.corgi_payorder_mor_rt', a serious amount of data 
is missing.
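   A quick way I use to narrow this down is to count both views that hive sync 
registers. This is only a diagnostic sketch: the table names below assume the 
default _ro/_rt suffixes that hive sync produces for a MERGE_ON_READ table 
named per the 'hive_sync.table' setting above.

   ```sql
   -- reads base parquet files only
   select count(1) from hudi.corgi_payorder_mor_ro;
   -- merges base files with the delta log files on read
   select count(1) from hudi.corgi_payorder_mor_rt;
   ```

   If the _ro count is also low, the base files themselves are incomplete; if 
only the _rt count is low, the log-file merge on read is the likely culprit.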
   
   Second:
             the generation of the deltacommit.requested files under .hoodie is 
irregular and does not match the configured compaction.trigger.strategy; often 
no deltacommit.requested file is generated at all. I do not know if this is 
expected.
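   For reference, if I understand the timeline layout correctly, each 
successful Flink checkpoint should leave a delta commit on the timeline, while 
compaction.trigger.strategy only controls when a separate compaction plan is 
scheduled. An illustrative .hoodie listing (timestamps are made up) would then 
look like:

   ```
   20211205101500.deltacommit.requested
   20211205101500.deltacommit.inflight
   20211205101500.deltacommit
   20211205102000.compaction.requested   <- scheduled per compaction.trigger.strategy
   ```

   So the deltacommit instants should track the checkpoint interval, and only 
the compaction.requested instants should track the time_elapsed strategy.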
   
        Thanks very much.
   
   
   

