Sifan Huang created SPARK-43267:
-----------------------------------

             Summary: Support creating data frame from a Postgres table that contains user-defined array column
                 Key: SPARK-43267
                 URL: https://issues.apache.org/jira/browse/SPARK-43267
             Project: Spark
          Issue Type: New Feature
          Components: SQL
    Affects Versions: 3.3.2, 2.4.0
            Reporter: Sifan Huang
Spark SQL currently does not support creating a DataFrame from a Postgres table that contains a user-defined array column. Such tables could be read before the Postgres JDBC driver commit https://github.com/pgjdbc/pgjdbc/commit/375cb3795c3330f9434cee9353f0791b86125914; the previous behavior was to read a user-defined array column as a String.

Given:
 * a Postgres table with a user-defined array column
 * the function DataFrameReader.jdbc - https://spark.apache.org/docs/2.4.0/api/java/org/apache/spark/sql/DataFrameReader.html#jdbc-java.lang.String-java.lang.String-java.util.Properties-

Result:
 * the exception “java.sql.SQLException: Unsupported type ARRAY” is thrown

Expected behavior after the change:
 * the function call succeeds
 * the user-defined array column is read as a string in the Spark DataFrame

Suggested fix:
 * update the array branch of the “getCatalystType” function in “PostgresDialect” to fall back to StringType when the element type cannot be mapped (a user-side workaround along the same lines is sketched below):
{code:scala}
val catalystType = toCatalystType(typeName.drop(1), size, scale).map(ArrayType(_))
if (catalystType.isEmpty) Some(StringType) else catalystType
{code}
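Until the dialect is patched, roughly the same behavior can be approximated from user code by registering a custom JdbcDialect that maps the affected array columns to StringType. The sketch below is a minimal illustration, not the proposed Spark change: the object name StringFallbackPostgresDialect, the typeName "_my_udt", and the connection details in the trailing comment are placeholders to be replaced with the actual user-defined type and database.

{code:scala}
import java.sql.Types

import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}
import org.apache.spark.sql.types.{DataType, MetadataBuilder, StringType}

// Placeholder names: StringFallbackPostgresDialect and "_my_udt" are illustrative only.
object StringFallbackPostgresDialect extends JdbcDialect {

  // Postgres reports an array of a user-defined type "my_udt" with JDBC typeName "_my_udt".
  private val userDefinedArrayTypeNames = Set("_my_udt")

  // Claim Postgres JDBC URLs; registered dialects are consulted before the built-in
  // PostgresDialect, which still handles everything this dialect declines.
  override def canHandle(url: String): Boolean =
    url.toLowerCase.startsWith("jdbc:postgresql")

  // Map the user-defined array columns to StringType so the read no longer fails with
  // "java.sql.SQLException: Unsupported type ARRAY"; return None for all other columns
  // to defer to the built-in mappings.
  override def getCatalystType(
      sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] = {
    if (sqlType == Types.ARRAY && userDefinedArrayTypeNames.contains(typeName)) {
      Some(StringType)
    } else {
      None
    }
  }
}

JdbcDialects.registerDialect(StringFallbackPostgresDialect)

// After registration, DataFrameReader.jdbc can read the table and the user-defined
// array column arrives as a string column (URL, table, and properties are placeholders):
// val df = spark.read.jdbc("jdbc:postgresql://host:5432/db", "my_table", connectionProperties)
{code}

This keeps the workaround scoped to the known user-defined array types, whereas the suggested PostgresDialect fix would apply the String fallback to any array whose element type cannot be mapped.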