[ https://issues.apache.org/jira/browse/SPARK-51830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Madhukar updated SPARK-51830:
-----------------------------
    Description: 
The partition column is of int datatype, and a partition in the legacy table was created as
partition_name=partition_value (i.e. with a value that does not parse as an int).
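
For illustration, the assumed setup is roughly the following (db.table, the column names and the DDL are placeholders inferred from this report, not the reporter's actual objects):

// Hypothetical setup sketch: a Hive-style table whose partition column is declared as INT.
spark.sql("CREATE TABLE db.table (id STRING) PARTITIONED BY (partition_name INT) STORED AS PARQUET")
// Legacy data left a directory such as .../db.db/table/partition_name=partition_value on disk,
// i.e. the directory name carries a value that does not parse as an INT.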

Then, when we perform the following operations:

import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.functions.lit

val df = spark.sql("select * from db.table")
val modifiedDf = df.withColumn("partition_name", lit(2))  // replace the partition column with the literal int 2
modifiedDf.show(false)
modifiedDf.write.mode(SaveMode.Append).insertInto("db.table")

 
it throws an error, even with spark.sql.sources.validatePartitionColumns=false:
java.lang.NumberFormatException: For input string: "partition_value"
  at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:67)
  at java.base/java.lang.Integer.parseInt(Integer.java:668)
  at java.base/java.lang.Integer.parseInt(Integer.java:786)
  at org.apache.spark.sql.execution.datasources.PartitioningUtils$.castPartValueToDesiredType(PartitioningUtils.scala:535)
  at org.apache.spark.sql.execution.datasources.PartitioningUtils$.removeLeadingZerosFromNumberTypePartition(PartitioningUtils.scala:362)
  at org.apache.spark.sql.execution.datasources.PartitioningUtils$.$anonfun$getPathFragment$1(PartitioningUtils.scala:355)
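
The trace suggests the raw value taken from the legacy partition directory is being cast to the declared INT type while Spark builds the partition path fragment for the insert. A minimal sketch of that failure mode (simplified illustration only, not the actual PartitioningUtils code):

// Simplified illustration: the raw string from the legacy directory name cannot be parsed as an INT.
val rawPartitionValue = "partition_value"          // value taken from the legacy directory name
val casted = Integer.parseInt(rawPartitionValue)   // throws java.lang.NumberFormatException, as in the trace
// Note: spark.conf.set("spark.sql.sources.validatePartitionColumns", "false") was already tried (see above);
// presumably it does not help because that flag appears to govern read-time partition-value validation,
// not this write-side path-fragment construction.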

  was:
In the stage UI, all of the tasks' statuses show as SUCCESS, but the stage is still marked as active.


> Spark SQL: Error handling for partition datatype conversion
> -----------------------------------------------------------
>
>                 Key: SPARK-51830
>                 URL: https://issues.apache.org/jira/browse/SPARK-51830
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, SQL
>    Affects Versions: 3.4.1, 3.5.0, 4.0.0
>            Reporter: Madhukar
>            Priority: Critical
>
> The partition column is of int datatype, and a partition in the legacy table was created as
> partition_name=partition_value (i.e. with a value that does not parse as an int).
> Then, when we perform the following operations:
> import org.apache.spark.sql.SaveMode
> import org.apache.spark.sql.functions.lit
> val df = spark.sql("select * from db.table")
> val modifiedDf = df.withColumn("partition_name", lit(2))  // replace the partition column with the literal int 2
> modifiedDf.show(false)
> modifiedDf.write.mode(SaveMode.Append).insertInto("db.table")
>  
> it throws an error, even with spark.sql.sources.validatePartitionColumns=false:
> java.lang.NumberFormatException: For input string: "partition_value"
>   at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:67)
>   at java.base/java.lang.Integer.parseInt(Integer.java:668)
>   at java.base/java.lang.Integer.parseInt(Integer.java:786)
>   at org.apache.spark.sql.execution.datasources.PartitioningUtils$.castPartValueToDesiredType(PartitioningUtils.scala:535)
>   at org.apache.spark.sql.execution.datasources.PartitioningUtils$.removeLeadingZerosFromNumberTypePartition(PartitioningUtils.scala:362)
>   at org.apache.spark.sql.execution.datasources.PartitioningUtils$.$anonfun$getPathFragment$1(PartitioningUtils.scala:355)


