[ https://issues.apache.org/jira/browse/HUDI-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-3806:
-----------------------------
    Sprint: Hudi-Sprint-Apr-05, Hudi-Sprint-Apr-12, Hudi-Sprint-Apr-19  (was: Hudi-Sprint-Apr-05, Hudi-Sprint-Apr-12)

> Improve HoodieBloomIndex using bloom_filter and col_stats in MDT
> ----------------------------------------------------------------
>
>                 Key: HUDI-3806
>                 URL: https://issues.apache.org/jira/browse/HUDI-3806
>             Project: Apache Hudi
>          Issue Type: Improvement
>            Reporter: Ethan Guo
>            Assignee: Ethan Guo
>            Priority: Blocker
>             Fix For: 0.12.0
>
>
> For a Deltastreamer job doing bulk inserts of 10GB batches, the job is stuck
> at the stage where HoodieBloomIndex reads bloom filters from the metadata
> table, taking more than 2 hours.  When the bloom filter is disabled in the
> metadata table, each commit takes 10-20 minutes.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)