[ https://issues.apache.org/jira/browse/HIVE-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HIVE-1672:
------------------------------------------

    Attachment: patch-1672.txt

I looked at Shrikrishna's query and the task logs. The mapper was spending time in processMapLocalWork() without reporting progress. The attached patch fixes the problem.

> Complex Hive queries fails with Task timeouts when trying to do a table scan
> ----------------------------------------------------------------------------
>
>                 Key: HIVE-1672
>                 URL: https://issues.apache.org/jira/browse/HIVE-1672
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Shrikrishna Lawande
>         Attachments: patch-1672.txt
>
>
> Executing a join query where one of the tables is a fact table fails
> during the table scan of the fact table. This usually happens when one of the
> tasks is scanning a large number of rows (say, 200 thousand rows in my case)
> and the task fails to respond within the timeout window.
> The workaround is to set a very large timeout for the task. I managed to run
> the query by setting the timeout to 0 (infinite).
> To reproduce:
> Run a join query with a couple of tables, one of which is a fact table. In my
> environment, the fact table has 40 TB of data with more than a billion rows.
> Most of the map tasks process over 200 thousand rows.
> A few of the tasks take more than 30 min to respond and fail, since the
> default task timeout is 10 min.
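
For readers unfamiliar with the failure mode described above: a task that works through a long loop without signalling the framework looks hung and is killed once the task timeout expires. The sketch below shows the general remedy of reporting progress periodically from inside the loop. It is not the contents of patch-1672.txt. The names ProgressSketch, ProgressReporter, processLocalWork and REPORT_EVERY_ROWS are hypothetical and chosen for illustration only; in a real Hadoop task the analogous callback would presumably be Reporter.progress().

```java
import java.util.Iterator;

/**
 * Illustrative sketch only: periodic progress reporting inside a long
 * row-processing loop. All names here are hypothetical.
 */
final class ProgressSketch {

    /** Hypothetical stand-in for the framework's progress callback. */
    interface ProgressReporter {
        void progress();
    }

    /** Assumed reporting interval; a real task would tune this value. */
    static final int REPORT_EVERY_ROWS = 1000;

    /**
     * Consumes every row and signals progress once per REPORT_EVERY_ROWS
     * rows, plus once at the end, so the caller sees regular activity.
     * Returns the number of rows consumed.
     */
    static long processLocalWork(Iterator<String> rows, ProgressReporter reporter) {
        long processed = 0;
        while (rows.hasNext()) {
            rows.next(); // placeholder for the real per-row work
            processed++;
            if (processed % REPORT_EVERY_ROWS == 0) {
                reporter.progress(); // tell the framework the task is alive
            }
        }
        reporter.progress(); // final signal after the loop completes
        return processed;
    }
}
```

Reporting by row count keeps the sketch deterministic. If the cost per row varies widely, reporting on elapsed time instead may be the safer choice, because a fixed row interval can still exceed the timeout when individual rows are slow.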