[ 
https://issues.apache.org/jira/browse/SPARK-21287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072474#comment-16072474
 ] 

Maciej Bryński edited comment on SPARK-21287 at 7/3/17 1:59 PM:
----------------------------------------------------------------

Quote from the MySQL Connector/J implementation notes:
{quote}
By default, ResultSets are completely retrieved and stored in memory. In most 
cases this is the most efficient way to operate and, due to the design of the 
MySQL network protocol, is easier to implement. If you are working with 
ResultSets that have a large number of rows or large values and cannot allocate 
heap space in your JVM for the memory required, you can tell the driver to 
stream the results back one row at a time.
{quote}
https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-implementation-notes.html
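
For reference, a minimal sketch of what "stream the results back one row at a time" looks like in plain JDBC, following the Connector/J notes linked above: a forward-only, read-only statement with the fetch size set to Integer.MIN_VALUE. The class and method names (StreamingRead, streamingStatement, countRows) are made up for illustration, and the caller is assumed to supply an open MySQL connection.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper, not part of Spark or Connector/J.
final class StreamingRead {

    // Connector/J treats a forward-only, read-only statement with
    // fetch size Integer.MIN_VALUE as a request to stream rows one
    // at a time instead of buffering the whole ResultSet.
    static Statement streamingStatement(Connection conn) throws SQLException {
        Statement stmt = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY,
                ResultSet.CONCUR_READ_ONLY);
        stmt.setFetchSize(Integer.MIN_VALUE);
        return stmt;
    }

    // Example consumer: walks the result set row by row, so only the
    // current row needs to be held in memory by the driver.
    static long countRows(Connection conn, String sql) throws SQLException {
        try (Statement stmt = streamingStatement(conn);
             ResultSet rs = stmt.executeQuery(sql)) {
            long rows = 0;
            while (rs.next()) {
                rows++;
            }
            return rows;
        }
    }
}
```

This is the driver-level setting that the fetchsize option in the issue is trying to reach through Spark.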



> Cannot use Int.MIN_VALUE as Spark SQL fetchsize
> -----------------------------------------------
>
>                 Key: SPARK-21287
>                 URL: https://issues.apache.org/jira/browse/SPARK-21287
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.1
>            Reporter: Maciej Bryński
>
> The MySQL JDBC driver can stream a ResultSet instead of storing it entirely in memory.
> This is enabled by setting fetchSize to Int.MIN_VALUE (-2147483648).
> Unfortunately, Spark rejects this value as invalid:
> {code}
> java.lang.IllegalArgumentException: requirement failed: Invalid value `-2147483648` for parameter `fetchsize`. The minimum value is 0. When the value is 0, the JDBC driver ignores the value and does the estimates.
>       at scala.Predef$.require(Predef.scala:224)
>       at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:105)
>       at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:34)
>       at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:32)
>       at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:330)
>       at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
>       at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:125)
>       at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:166)
>       at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:206)
>       at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
>       at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
>       at py4j.Gateway.invoke(Gateway.java:280)
>       at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
>       at py4j.commands.CallCommand.execute(CallCommand.java:79)
>       at py4j.GatewayConnection.run(GatewayConnection.java:214)
>       at java.lang.Thread.run(Thread.java:748)
> {code}
> https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-implementation-notes.html
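
To make the failure concrete, here is a rough sketch of the kind of check the error message above implies: any negative fetchsize is rejected, so Integer.MIN_VALUE never reaches the driver. This is reconstructed only from the exception text, not copied from JDBCOptions.scala, and the names FetchSizeCheck and requireValidFetchSize are hypothetical.

```java
// Hypothetical reconstruction of the validation, based only on the
// exception message in the stack trace above.
final class FetchSizeCheck {

    // Parses the option value and rejects anything below 0, which is
    // why the MySQL streaming sentinel (-2147483648) is refused.
    static int requireValidFetchSize(String raw) {
        int fetchSize = Integer.parseInt(raw);
        if (fetchSize < 0) {
            throw new IllegalArgumentException(
                "requirement failed: Invalid value `" + raw
                + "` for parameter `fetchsize`. The minimum value is 0. "
                + "When the value is 0, the JDBC driver ignores the value "
                + "and does the estimates.");
        }
        return fetchSize;
    }
}
```

With a check of this shape, 0 and positive values pass through unchanged, while the one negative value that has a special meaning for MySQL is treated the same as any other invalid input.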


