[ 
https://issues.apache.org/jira/browse/HIVE-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai updated HIVE-5546:
---------------------------

    Description: 
{code}
2013-10-15 10:49:49,386 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: 
included column ids = 
2013-10-15 10:49:49,386 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: 
included columns names = 
2013-10-15 10:49:49,386 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: 
No ORC pushdown predicate
2013-10-15 10:49:49,834 INFO 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file 
hdfs://localhost:54310/user/hive/warehouse/web_sales_orc/000000_0
2013-10-15 10:49:49,834 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 1
2013-10-15 10:49:49,840 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 100
2013-10-15 10:49:49,968 INFO org.apache.hadoop.mapred.TaskLogsTruncater: 
Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-10-15 10:49:49,994 INFO org.apache.hadoop.io.nativeio.NativeIO: 
Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
2013-10-15 10:49:49,994 INFO org.apache.hadoop.io.nativeio.NativeIO: Got 
UserName yhuai for UID 1000 from the native implementation
2013-10-15 10:49:49,996 FATAL org.apache.hadoop.mapred.Child: Error running 
child : java.lang.OutOfMemoryError: Java heap space
        at 
org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:949)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:428)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
{code}

If includedColumnIds is an empty list, we do not need to read any column. But, 
in 
{code}
if (ColumnProjectionUtils.isReadAllColumns(conf) ||
      includedStr == null || includedStr.trim().length() == 0) {
      return null;
    } 
{code}

  was:
{code}
2013-10-15 10:49:49,386 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: 
included column ids = 
2013-10-15 10:49:49,386 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: 
included columns names = 
2013-10-15 10:49:49,386 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: 
No ORC pushdown predicate
2013-10-15 10:49:49,834 INFO 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file 
hdfs://localhost:54310/user/hive/warehouse/web_sales_orc/000000_0
2013-10-15 10:49:49,834 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 1
2013-10-15 10:49:49,840 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 100
2013-10-15 10:49:49,968 INFO org.apache.hadoop.mapred.TaskLogsTruncater: 
Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-10-15 10:49:49,994 INFO org.apache.hadoop.io.nativeio.NativeIO: 
Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
2013-10-15 10:49:49,994 INFO org.apache.hadoop.io.nativeio.NativeIO: Got 
UserName yhuai for UID 1000 from the native implementation
2013-10-15 10:49:49,996 FATAL org.apache.hadoop.mapred.Child: Error running 
child : java.lang.OutOfMemoryError: Java heap space
        at 
org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:949)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:428)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
{code}

If includedColumnIds is an empty list, we do not need to read any column


> A change in ORCInputFormat made by HIVE-4113 was reverted by HIVE-5391
> ----------------------------------------------------------------------
>
>                 Key: HIVE-5546
>                 URL: https://issues.apache.org/jira/browse/HIVE-5546
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Yin Huai
>            Assignee: Yin Huai
>
> {code}
> 2013-10-15 10:49:49,386 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: 
> included column ids = 
> 2013-10-15 10:49:49,386 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: 
> included columns names = 
> 2013-10-15 10:49:49,386 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: 
> No ORC pushdown predicate
> 2013-10-15 10:49:49,834 INFO 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file 
> hdfs://localhost:54310/user/hive/warehouse/web_sales_orc/000000_0
> 2013-10-15 10:49:49,834 INFO org.apache.hadoop.mapred.MapTask: 
> numReduceTasks: 1
> 2013-10-15 10:49:49,840 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 
> 100
> 2013-10-15 10:49:49,968 INFO org.apache.hadoop.mapred.TaskLogsTruncater: 
> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2013-10-15 10:49:49,994 INFO org.apache.hadoop.io.nativeio.NativeIO: 
> Initialized cache for UID to User mapping with a cache timeout of 14400 
> seconds.
> 2013-10-15 10:49:49,994 INFO org.apache.hadoop.io.nativeio.NativeIO: Got 
> UserName yhuai for UID 1000 from the native implementation
> 2013-10-15 10:49:49,996 FATAL org.apache.hadoop.mapred.Child: Error running 
> child : java.lang.OutOfMemoryError: Java heap space
>       at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:949)
>       at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:428)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>       at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
>       at org.apache.hadoop.mapred.Child.main(Child.java:249)
> {code}
> If includedColumnIds is an empty list, we do not need to read any column. 
> But, in 
> {code}
> if (ColumnProjectionUtils.isReadAllColumns(conf) ||
>       includedStr == null || includedStr.trim().length() == 0) {
>       return null;
>     } 
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)

Reply via email to