[ https://issues.apache.org/jira/browse/HIVE-27309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Denys Kuzmenko resolved HIVE-27309.
-----------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

> Large number of partitions and small files causes OOM in query coordinator
> --------------------------------------------------------------------------
>
>                 Key: HIVE-27309
>                 URL: https://issues.apache.org/jira/browse/HIVE-27309
>             Project: Hive
>          Issue Type: Improvement
>          Components: Iceberg integration
>            Reporter: Rajesh Balamohan
>            Assignee: Dmitriy Fingerman
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>
> When a large number of nested partitions (with small files) are read, the AM
> bails out with OOM.
> {noformat}
> CREATE EXTERNAL TABLE `store_sales_delete_6`(
>   `ss_sold_time_sk` int,
>   `ss_item_sk` int,
>   `ss_customer_sk` int,
>   `ss_cdemo_sk` int,
>   `ss_hdemo_sk` int,
>   `ss_addr_sk` int,
>   `ss_store_sk` int,
>   `ss_promo_sk` int,
>   `ss_ticket_number` bigint,
>   `ss_quantity` int,
>   `ss_wholesale_cost` decimal(7,2),
>   `ss_list_price` decimal(7,2),
>   `ss_sales_price` decimal(7,2),
>   `ss_ext_discount_amt` decimal(7,2),
>   `ss_ext_sales_price` decimal(7,2),
>   `ss_ext_wholesale_cost` decimal(7,2),
>   `ss_ext_list_price` decimal(7,2),
>   `ss_ext_tax` decimal(7,2),
>   `ss_coupon_amt` decimal(7,2),
>   `ss_net_paid` decimal(7,2),
>   `ss_net_paid_inc_tax` decimal(7,2),
>   `ss_net_profit` decimal(7,2),
>   `ss_sold_date_sk` int)
> PARTITIONED BY SPEC (
>   ss_store_sk, ss_promo_sk, ss_sold_date_sk)
> STORED BY iceberg
> LOCATION 's3a://blah/blah/tablespace/external/hive/blah.db/store_sales_delete_6';
>
> alter table store_sales_delete_6 set tblproperties('format'='iceberg/parquet');
> alter table store_sales_delete_6 set tblproperties('format-version'='2');
>
> insert into store_sales_delete_6 select * from tpcds_1000_update.ssv limit 100000;
>
> select count(*) from store_sales_delete_6;
> {noformat}
> Now the select count query throws an OOM in the query AM. The query generates
> 100,000 splits, which are grouped together into 41 splits, but streaming
> these and sending them as events throws an OOM.
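
For a rough sense of scale, the two figures quoted in the description can be turned into a per-group count. This is only a back-of-the-envelope sketch in Python: the helper name `splits_per_group` is hypothetical, and it assumes that grouping merely bundles the original splits, so each grouped split still carries an entry for every original split it contains. The issue itself does not state how the grouped splits are serialized.

```python
import math

# Figures quoted in the issue description.
TOTAL_SPLITS = 100_000   # splits generated before grouping
GROUPED_SPLITS = 41      # splits remaining after grouping


def splits_per_group(total, groups):
    """Return (average, pigeonhole lower bound for the largest group)."""
    average = total / groups
    # At least one group must hold ceil(total / groups) original splits.
    largest_at_least = math.ceil(total / groups)
    return average, largest_at_least


avg, largest = splits_per_group(TOTAL_SPLITS, GROUPED_SPLITS)
print(f"average original splits per grouped split: {avg:.0f}")
print(f"largest grouped split holds at least: {largest}")
```

Under that assumption, grouping cuts the number of splits from 100,000 to 41 but not the total number of file entries that have to be streamed, which is consistent with the OOM appearing when the grouped splits are sent as events.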