[ 
https://issues.apache.org/jira/browse/SPARK-17131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15592000#comment-15592000
 ] 

Aleksander Eskilson commented on SPARK-17131:
---------------------------------------------

Yeah, that makes sense. So far, the issue I documented and this one seem to be 
the only JIRAs that exhibit specifically the Constant Pool limit error. 
I'm trying to dig deeper into it to see if it really marks its own class of 
error, but given that SPARK-17702 didn't resolve the error case I posted (even 
though it splits up sections of large generated code), I do suspect they are 
closely related but ultimately different issues. I think the splitExpressions 
technique that was used in SPARK-17702, and that also appears to be 
employed in SPARK-16845, could be useful for the range of different classes that 
can generate too many lines of code. Seeing the issues linked together is 
definitely useful.

To that end, I'll leave mine resolved as a duplicate of SPARK-16845 for now 
until I can make use of the patch it develops, so we can see more conclusively 
if they're related issues, or truly duplicates. And I'll link the two "0xFFFF" 
issues together as related.
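
For anyone following along, here is a rough sketch of the general idea behind 
splitting large generated code into smaller helper methods. It is only an 
illustration, not Spark's actual splitExpressions implementation: the object 
SplitSketch, the method split, the apply_N helper names, and the parameter 
maxPerMethod are all hypothetical. Note that chunking into methods of the same 
class limits the size of each method, while the constant pool is counted per 
class file, which may be why this kind of splitting alone does not remove the 
0xFFFF error.

```scala
// Hypothetical sketch; names and structure are assumptions, not Spark's code.
object SplitSketch {
  /** Groups generated Java statements into helper methods holding at most
    * `maxPerMethod` statements each. Returns the helper method definitions
    * and the calls that invoke them in order. */
  def split(statements: Seq[String], maxPerMethod: Int): (Seq[String], Seq[String]) = {
    require(maxPerMethod > 0, "maxPerMethod must be positive")
    val chunks = statements.grouped(maxPerMethod).toSeq
    val methods = chunks.zipWithIndex.map { case (chunk, i) =>
      // Indent each statement by two spaces inside the helper method body.
      val body = chunk.mkString("  ", "\n  ", "")
      s"private void apply_$i(InternalRow row) {\n$body\n}"
    }
    val calls = chunks.indices.map(i => s"apply_$i(row);")
    (methods, calls)
  }
}
```

For example, SplitSketch.split(stmts, 2) on five statements would produce three 
helper methods and three calls to them, all of which would still live in one 
generated class.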

> Code generation fails when running SQL expressions against a wide dataset 
> (thousands of columns)
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-17131
>                 URL: https://issues.apache.org/jira/browse/SPARK-17131
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Iaroslav Zeigerman
>         Attachments: 
> _SPARK_17131__add_a_test_case_with_1000_column_DF_where_describe___fails.patch
>
>
> When reading a CSV file that contains 1776 columns, Spark and Janino fail to 
> generate the code, with the message:
> {noformat}
> Constant pool has grown past JVM limit of 0xFFFF
> {noformat}
> When running a common select with all columns it's fine:
> {code}
>       val allCols = df.columns.map(c => col(c).as(c + "_alias"))
>       val newDf = df.select(allCols: _*)
>       newDf.show()
> {code}
> But when I invoke the describe method:
> {code}
> newDf.describe(allCols: _*)
> {code}
> it fails with the following stack trace:
> {noformat}
>       at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:889)
>       at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:941)
>       at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:938)
>       at org.spark_project.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
>       at org.spark_project.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
>       ... 30 more
> Caused by: org.codehaus.janino.JaninoRuntimeException: Constant pool has grown past JVM limit of 0xFFFF
>       at org.codehaus.janino.util.ClassFile.addToConstantPool(ClassFile.java:402)
>       at org.codehaus.janino.util.ClassFile.addConstantIntegerInfo(ClassFile.java:300)
>       at org.codehaus.janino.UnitCompiler.addConstantIntegerInfo(UnitCompiler.java:10307)
>       at org.codehaus.janino.UnitCompiler.pushConstant(UnitCompiler.java:8868)
>       at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:4346)
>       at org.codehaus.janino.UnitCompiler.access$7100(UnitCompiler.java:185)
>       at org.codehaus.janino.UnitCompiler$10.visitIntegerLiteral(UnitCompiler.java:3265)
>       at org.codehaus.janino.Java$IntegerLiteral.accept(Java.java:4321)
>       at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:3290)
>       at org.codehaus.janino.UnitCompiler.fakeCompile(UnitCompiler.java:2605)
>       at org.codehaus.janino.UnitCompiler.compileGetValue(UnitCompiler.java:4362)
>       at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:3975)
>       at org.codehaus.janino.UnitCompiler.access$6900(UnitCompiler.java:185)
>       at org.codehaus.janino.UnitCompiler$10.visitMethodInvocation(UnitCompiler.java:3263)
>       at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:3974)
>       at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:3290)
>       at org.codehaus.janino.UnitCompiler.compileGetValue(UnitCompiler.java:4368)
>       at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2662)
>       at org.codehaus.janino.UnitCompiler.access$4400(UnitCompiler.java:185)
>       at org.codehaus.janino.UnitCompiler$7.visitMethodInvocation(UnitCompiler.java:2627)
>       at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:3974)
>       at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:2654)
>       at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:1643)
> ....
> {noformat}


