[ https://issues.apache.org/jira/browse/HIVE-25335?focusedWorklogId=774386&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774386 ]

ASF GitHub Bot logged work on HIVE-25335:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/May/22 07:28
            Start Date: 25/May/22 07:28
    Worklog Time Spent: 10m
      Work Description: zabetak commented on PR #3292:
URL: https://github.com/apache/hive/pull/3292#issuecomment-1136889593

   @zhengchenyu I suspect that the error you see on Jenkins has to do with the fact that there are a lot of errors in the tests. If you run locally and use `-Dtest.output.overwrite`, you will not see any errors, because you are automatically updating the "reference files". If you want to see all the errors locally, you must remove this parameter. Having said that, if you commit all the changes in the reference files, the tests will most likely pass and the Jenkins pipeline may run fine.


Issue Time Tracking
-------------------

    Worklog Id:     (was: 774386)
    Time Spent: 3h 20m  (was: 3h 10m)

> Unreasonable setting reduce number, when join big size table(but small row count) and small size table
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-25335
>                 URL: https://issues.apache.org/jira/browse/HIVE-25335
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: zhengchenyu
>            Assignee: zhengchenyu
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-25335.001.patch
>
>          Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> I found an application that runs slowly in our cluster because each reducer
> has to process a very large number of bytes, yet the job has only two reducers.
> While debugging, I found the reason. In this SQL, one table is large in size
> (about 30G) but has few rows (about 3.5M), while the other table is small in
> size (about 100M) but has more rows (about 3.6M). So JoinStatsRule.process uses
> only the 100M to estimate the number of reducers, but in fact we need to
> process 30G bytes.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)