[ https://issues.apache.org/jira/browse/HIVE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801070#comment-13801070 ]
Shreepadma Venugopalan commented on HIVE-4957:
----------------------------------------------

Thanks, Brock!

> Restrict number of bit vectors, to prevent out of Java heap memory
> ------------------------------------------------------------------
>
>                 Key: HIVE-4957
>                 URL: https://issues.apache.org/jira/browse/HIVE-4957
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.11.0
>            Reporter: Brock Noland
>            Assignee: Shreepadma Venugopalan
>             Fix For: 0.13.0
>
>         Attachments: HIVE-4957.1.patch, HIVE-4957.2.patch
>
>
> Normally, increasing the number of bit vectors increases calculation accuracy.
> For example,
> {noformat}
> select compute_stats(a, 40) from test_hive;
> {noformat}
> generally gives better accuracy than
> {noformat}
> select compute_stats(a, 16) from test_hive;
> {noformat}
> But a larger number of bit vectors also makes the query run slower. Once the
> number of bit vectors exceeds 50, adding more no longer improves accuracy, but
> it still increases memory usage, and it crashes Hive if the number is too large.
> Hive currently does not prevent a user from passing a ridiculously large number
> of bit vectors in a 'compute_stats' query.
> One example:
> {noformat}
> select compute_stats(a, 999999999) from column_eight_types;
> {noformat}
> crashes Hive.
> {noformat}
> 2012-12-20 23:21:52,247 Stage-1 map = 0%, reduce = 0%
> 2012-12-20 23:22:11,315 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.29 sec
> MapReduce Total cumulative CPU time: 290 msec
> Ended Job = job_1354923204155_0777 with errors
> Error during job, obtaining debugging information...
> Job Tracking URL: http://cs-10-20-81-171.cloud.cloudera.com:8088/proxy/application_1354923204155_0777/
> Examining task ID: task_1354923204155_0777_m_000000 (and more) from job job_1354923204155_0777
> Task with the most failures(4):
> -----
> Task ID:
>   task_1354923204155_0777_m_000000
> URL:
>   http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1354923204155_0777&tipid=task_1354923204155_0777_m_000000
> -----
> Diagnostic Messages for this Task:
> Error: Java heap space
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.1#6144)
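
For illustration only, the kind of guard the issue title asks for could look roughly like the sketch below. It is not the code in HIVE-4957.1.patch or HIVE-4957.2.patch, and it is written in Python rather than Hive's Java. The function name validate_num_bit_vectors and the cap MAX_BIT_VECTORS = 1024 are assumptions made for this example; the description above only says that accuracy stops improving past roughly 50 bit vectors.

```python
# Hypothetical cap chosen for illustration; not taken from the Hive patch.
MAX_BIT_VECTORS = 1024


def validate_num_bit_vectors(requested: int, limit: int = MAX_BIT_VECTORS) -> int:
    """Return `requested` if it is a usable bit-vector count, else raise.

    Rejecting the value up front avoids allocating the bit vectors at all,
    so an oversized argument fails fast instead of exhausting the heap.
    """
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(requested, bool) or not isinstance(requested, int):
        raise TypeError("number of bit vectors must be an integer")
    if requested < 1:
        raise ValueError("number of bit vectors must be at least 1")
    if requested > limit:
        raise ValueError(
            f"number of bit vectors {requested} exceeds the maximum of {limit}"
        )
    return requested


if __name__ == "__main__":
    # The values used in the issue description.
    for n in (16, 40, 999999999):
        try:
            print(n, "accepted as", validate_num_bit_vectors(n))
        except ValueError as err:
            print(n, "rejected:", err)
```

With a check like this in front of the aggregation, the compute_stats(a, 999999999) query from the description would be refused with an error message before any map task starts, rather than failing four times with "Java heap space".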