Justin Miller created SPARK-17936:
-------------------------------------
Summary: "CodeGenerator - failed to compile:
org.codehaus.janino.JaninoRuntimeException: Code of" method Error
Key: SPARK-17936
URL: https://issues.apache.org/jira/browse/SPARK-17936
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 2.0.1
Reporter: Justin Miller
Greetings. I'm migrating a project from Spark 1.6.2 to 2.0.1. The project uses
Spark Streaming to convert Thrift structs coming from Kafka into Parquet files
stored in S3. This conversion works fine in 1.6.2, but it fails in 2.0.1, and I
think there may be a bug. The stack trace is below.
org.codehaus.janino.JaninoRuntimeException: Code of method
"(Lorg/apache/spark/sql/catalyst/expressions/GeneratedClass;[Ljava/lang/Object;)V"
of class
"org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection"
grows beyond 64 KB
    at org.codehaus.janino.CodeContext.makeSpace(CodeContext.java:941)
    at org.codehaus.janino.CodeContext.write(CodeContext.java:854)
    at org.codehaus.janino.UnitCompiler.writeShort(UnitCompiler.java:10242)
    at org.codehaus.janino.UnitCompiler.writeLdc(UnitCompiler.java:9058)
Also, later on:
07:35:30.191 ERROR o.a.s.u.SparkUncaughtExceptionHandler - Uncaught exception
in thread Thread[Executor task launch worker-6,5,run-main-group-0]
java.lang.OutOfMemoryError: Java heap space
I've seen similar issues posted, but those were always on the query side. My
hunch is that this one happens at write time, since the error occurs after
batchDuration has elapsed. Here's the write snippet.
stream
  .flatMap {
    case Success(row) =>
      thriftParseSuccess += 1
      Some(row)
    case Failure(ex) =>
      thriftParseErrors += 1
      logger.error("Error during deserialization: ", ex)
      None
  }.foreachRDD { rdd =>
    val sqlContext = SQLContext.getOrCreate(rdd.context)
    transformer(sqlContext.createDataFrame(rdd, converter.schema))
      .coalesce(coalesceSize)
      .write
      .mode(Append)
      .partitionBy(partitioning: _*)
      .parquet(parquetPath)
  }
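In case it helps to look at the filtering step in isolation, here is a minimal
standalone sketch of the same Success/Failure pattern on a plain Seq, with no
Spark involved. The names ParseFilterSketch, parse and keepSuccesses are
hypothetical and not part of the project; parse is only a stand-in for the
Thrift deserialization, and the local counters stand in for
thriftParseSuccess and thriftParseErrors.

```scala
import scala.util.{Failure, Success, Try}

object ParseFilterSketch {
  // Hypothetical stand-in for Thrift deserialization: parse a string as an Int.
  def parse(raw: String): Try[Int] = Try(raw.trim.toInt)

  // Keeps successfully parsed values, drops failures, and counts both.
  // Returns (kept values, success count, error count).
  def keepSuccesses(inputs: Seq[String]): (Seq[Int], Int, Int) = {
    var successes = 0
    var errors = 0
    val kept = inputs.map(parse).flatMap {
      case Success(value) =>
        successes += 1
        Some(value)
      case Failure(_) =>
        errors += 1
        None
    }
    (kept, successes, errors)
  }

  def main(args: Array[String]): Unit = {
    val (kept, ok, bad) = keepSuccesses(Seq("1", "two", " 3 "))
    println(s"kept=$kept ok=$ok bad=$bad")
  }
}
```

This part behaves the same way on both Spark versions as far as I can tell,
which is part of why I suspect the write path rather than the parsing step.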
Please let me know if you can be of assistance and if there's anything I can do
to help.
Best,
Justin