Sifan Huang created SPARK-43267:
-----------------------------------

             Summary: Support creating a DataFrame from a Postgres table that contains a user-defined array column
                 Key: SPARK-43267
                 URL: https://issues.apache.org/jira/browse/SPARK-43267
             Project: Spark
          Issue Type: New Feature
          Components: SQL
    Affects Versions: 3.3.2, 2.4.0
            Reporter: Sifan Huang


Spark SQL currently does not support creating a DataFrame from a Postgres table that contains a user-defined array column. However, such a type was allowed before the Postgres JDBC commit (https://github.com/pgjdbc/pgjdbc/commit/375cb3795c3330f9434cee9353f0791b86125914): the previous behavior was to handle a user-defined array column as a String.

Given:
 * Postgres table with a user-defined array column
 * Function: DataFrameReader.jdbc - 
https://spark.apache.org/docs/2.4.0/api/java/org/apache/spark/sql/DataFrameReader.html#jdbc-java.lang.String-java.lang.String-java.util.Properties-

Results:
 * The exception “java.sql.SQLException: Unsupported type ARRAY” is thrown (a reproduction sketch follows below)
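A minimal reproduction sketch, for reference only: the connection URL, credentials, table name, and column types are hypothetical, and the table is assumed to contain a column whose type is an array of a user-defined Postgres type (for example an enum array).
{code:java}
import java.util.Properties

// Hypothetical connection details; "spark" is an existing SparkSession
// (e.g. in spark-shell).
val props = new Properties()
props.setProperty("user", "postgres")
props.setProperty("password", "secret")

// "my_table" is assumed to have a column whose type is an array of a
// user-defined Postgres type. Resolving the schema currently fails with:
// java.sql.SQLException: Unsupported type ARRAY
val df = spark.read.jdbc("jdbc:postgresql://localhost:5432/testdb", "my_table", props)
{code}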

Expectation after the change:
 * The function call succeeds
 * The user-defined array column is converted to a string in the Spark DataFrame (illustrated below)
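An illustration of the expected outcome after the change, with hypothetical column names:
{code:java}
// Expected after the change: the read above succeeds and the user-defined
// array column (here hypothetically named "tags") is mapped to a string.
df.printSchema()
// root
//  |-- id: integer (nullable = true)
//  |-- tags: string (nullable = true)
{code}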

Suggested fix:
 * Update the “getCatalystType” function in “PostgresDialect” so that the ARRAY branch falls back to StringType:
{code:java}
val catalystType = toCatalystType(typeName.drop(1), size, scale).map(ArrayType(_))
if (catalystType.isEmpty) Some(StringType) else catalystType
{code}
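For readers outside the dialect code, a self-contained sketch of the same fallback logic; the helper below is a stand-in for PostgresDialect’s toCatalystType and only mimics its behavior of returning None for element types it cannot map.
{code:java}
import org.apache.spark.sql.types._

// Stand-in for PostgresDialect.toCatalystType: maps a Postgres element type
// name to a Catalyst type, returning None for types it cannot handle
// (e.g. a user-defined type). Mappings here are illustrative only.
def toCatalystTypeStub(elementTypeName: String): Option[DataType] = elementTypeName match {
  case "int4" => Some(IntegerType)
  case "text" => Some(StringType)
  case _      => None  // user-defined types land here
}

def arrayColumnType(typeName: String): Option[DataType] = {
  // Postgres array type names start with an underscore, e.g. "_int4",
  // so drop(1) yields the element type name.
  val catalystType = toCatalystTypeStub(typeName.drop(1)).map(ArrayType(_))
  // Suggested fix: fall back to StringType instead of returning None,
  // which is what later surfaces as "Unsupported type ARRAY".
  if (catalystType.isEmpty) Some(StringType) else catalystType
}

arrayColumnType("_int4")     // Some(ArrayType(IntegerType, true))
arrayColumnType("_my_enum")  // Some(StringType) after the fix (previously None)
{code}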


