[ https://issues.apache.org/jira/browse/HIVE-16311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15944685#comment-15944685 ]
Colin Ma commented on HIVE-16311:
---------------------------------

The initial patch is uploaded. In a simple test that evaluates 12345.67/123.45 with FastHiveDecimalImpl.fastDivide 500000 times, the run takes 1s without the patch vs. 0.1s with the patch.

I also tested the patch with q06 of TPCx-BB, which has the following divide expression:
{code}
sum( case when (d_year = 2001) THEN (((ws_ext_list_price-ws_ext_wholesale_cost-ws_ext_discount_amt)+ws_ext_sales_price)/2) ELSE 0 END) first_year_total
{code}
The cluster includes 6 nodes with 128G memory per node, Intel(R) Xeon(R) E5-2680 CPUs, and a 1G network. With the 1T data scale and Spark as the execution engine, the results are as follows:
|| ||without patch||with patch||improvement||
|disable vectorization|214s|164s|23.36%|
|enable vectorization|252s|125s|50.4%|

> Improve the performance for FastHiveDecimalImpl.fastDivide
> ----------------------------------------------------------
>
>                 Key: HIVE-16311
>                 URL: https://issues.apache.org/jira/browse/HIVE-16311
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 2.2.0
>            Reporter: Colin Ma
>            Assignee: Colin Ma
>             Fix For: 2.2.0
>
>         Attachments: HIVE-16311.001.patch
>
>
> FastHiveDecimalImpl.fastDivide performs poorly when evaluating an
> expression such as 12345.67/123.45.
> Two points can be improved:
> 1. Don't always use HiveDecimal.MAX_SCALE as the scale when calling
> BigDecimal.divide.
> 2. Get the precision of a BigInteger in a fast way where possible.
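
As a rough illustration of the two points in the issue description, here is a minimal Java sketch. It is not the code from HIVE-16311.001.patch: the class name, `fastPrecision`, `chooseScale`, `divide`, the scale policy, and the constant values (38, assumed to match HiveDecimal.MAX_PRECISION and HiveDecimal.MAX_SCALE) are all assumptions made for illustration.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;

// Hypothetical sketch; none of these names come from the Hive code base.
final class DecimalDivideSketch {
    // Assumed to mirror HiveDecimal.MAX_PRECISION and HiveDecimal.MAX_SCALE.
    static final int MAX_PRECISION = 38;
    static final int MAX_SCALE = 38;

    // Point 2: count decimal digits with long arithmetic when the value fits
    // in a long, and fall back to the slower string conversion otherwise.
    static int fastPrecision(BigInteger unscaled) {
        BigInteger abs = unscaled.abs();
        if (abs.bitLength() <= 63) {
            long v = abs.longValue();
            int digits = 1; // zero is treated as one digit, like BigDecimal.precision()
            while (v >= 10) {
                v /= 10;
                digits++;
            }
            return digits;
        }
        return abs.toString().length();
    }

    // Point 1: pick a scale from the operands instead of always using MAX_SCALE.
    // Assumed policy: reserve room for an upper bound on the integer digits of
    // the exact quotient and give the remaining digits to the fraction.
    static int chooseScale(BigDecimal a, BigDecimal b) {
        int intDigitsA = fastPrecision(a.unscaledValue()) - a.scale();
        int intDigitsB = fastPrecision(b.unscaledValue()) - b.scale();
        int quotientIntDigits = Math.max(0, intDigitsA - intDigitsB + 1);
        return Math.max(0, Math.min(MAX_SCALE, MAX_PRECISION - quotientIntDigits));
    }

    // Divides with the chosen scale; HALF_UP rounding is an assumption here.
    // BigDecimal.divide throws ArithmeticException when b is zero.
    static BigDecimal divide(BigDecimal a, BigDecimal b) {
        return a.divide(b, chooseScale(a, b), RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("12345.67");
        BigDecimal b = new BigDecimal("123.45");
        System.out.println("scale = " + chooseScale(a, b));
        System.out.println("quotient = " + divide(a, b));
    }
}
```

For 12345.67/123.45 this assumed policy picks a scale of 35 instead of 38. The actual patch may choose the scale differently, so the timings reported above should not be attributed to this sketch.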