I found the reason why it did not work: when returning the Spark data type, I was calling new StringType(). After changing it to DataTypes.StringType, the cast worked.
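A likely explanation, sketched below with hypothetical stand-in classes rather than Spark's own: the sketch assumes that Spark's cast check compares the target type against its singleton StringType instance, so a freshly constructed instance is not recognised as a string type. The names StringTypeLike and canCastTo are made up for illustration and are not part of the Spark API.

```java
public class SingletonTypeDemo {

    // Hypothetical stand-in for a data type whose canonical instance is a
    // singleton, in the way DataTypes.StringType is the shared instance in Spark.
    public static final class StringTypeLike {
        public static final StringTypeLike INSTANCE = new StringTypeLike();

        // The constructor is reachable here, mirroring the fact that
        // "new StringType()" compiled in the Java code from this thread.
        public StringTypeLike() {}
    }

    // Hypothetical cast check that only recognises the singleton instance.
    // This is an assumption about how the real check behaves, not Spark code.
    public static boolean canCastTo(Object targetType) {
        return targetType == StringTypeLike.INSTANCE;
    }

    public static void main(String[] args) {
        // Using the shared instance is accepted.
        System.out.println(canCastTo(StringTypeLike.INSTANCE));   // prints true

        // A new instance has the same class but a different identity,
        // so the check rejects it.
        System.out.println(canCastTo(new StringTypeLike()));      // prints false
    }
}
```

In the actual code the corresponding change is the one described above: return DataTypes.StringType from getType() instead of constructing a new StringType.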
Regards,
Rico

> On 17 Feb 2022, at 14:13, Gourav Sengupta <[email protected]> wrote:
>
> Hi,
>
> can you please post a screenshot of the exact CAST statement that you are
> using? Did you use the SQL method I mentioned earlier?
>
> Regards,
> Gourav Sengupta
>
>> On Thu, Feb 17, 2022 at 12:17 PM Rico Bergmann <[email protected]> wrote:
>> Hi!
>>
>> Casting another int column that is not a partition column fails with the
>> same error.
>>
>> The schema before the cast (column names are anonymized):
>>
>> root
>>  |-- valueObject: struct (nullable = true)
>>  |    |-- value1: string (nullable = true)
>>  |    |-- value2: string (nullable = true)
>>  |    |-- value3: timestamp (nullable = true)
>>  |    |-- value4: string (nullable = true)
>>  |-- partitionColumn2: string (nullable = true)
>>  |-- partitionColumn3: timestamp (nullable = true)
>>  |-- partitionColumn1: integer (nullable = true)
>>
>> I wanted to cast partitionColumn1 to String, which gives me the described
>> error.
>>
>> Best,
>> Rico
>>
>>> On 17 Feb 2022, at 09:56, ayan guha <[email protected]> wrote:
>>>
>>> Can you try to cast any other Int field which is NOT a partition column?
>>>
>>>> On Thu, 17 Feb 2022 at 7:34 pm, Gourav Sengupta
>>>> <[email protected]> wrote:
>>>> Hi,
>>>>
>>>> This appears interesting; casting INT to STRING has never been an issue
>>>> for me.
>>>>
>>>> Can you help us with the output of df.printSchema()?
>>>>
>>>> I prefer to use SQL, and the method I use for casting is:
>>>> CAST(<<column name>> AS STRING) <<alias>>.
>>>>
>>>> Regards,
>>>> Gourav
>>>>
>>>>> On Thu, Feb 17, 2022 at 6:02 AM Rico Bergmann <[email protected]>
>>>>> wrote:
>>>>> Here is the code snippet:
>>>>>
>>>>> var df = session.read().parquet(basepath);
>>>>> for (Column partition : partitionColumnsList) {
>>>>>     df = df.withColumn(partition.getName(),
>>>>>         df.col(partition.getName()).cast(partition.getType()));
>>>>> }
>>>>>
>>>>> Column is a class containing schema information, for example the
>>>>> name of the column and the data type of the column.
>>>>>
>>>>> Best, Rico.
>>>>>
>>>>>> On 17 Feb 2022, at 03:17, Morven Huang <[email protected]> wrote:
>>>>>>
>>>>>> Hi Rico, do you have a code snippet? I have no problem casting int to
>>>>>> string.
>>>>>>
>>>>>>> On 17 Feb 2022, at 12:26 AM, Rico Bergmann <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi!
>>>>>>>
>>>>>>> I am reading a partitioned DataFrame into Spark using automatic type
>>>>>>> inference for the partition columns. For one partition column the data
>>>>>>> contains an integer, therefore Spark uses IntegerType for this column.
>>>>>>> In general this is supposed to be a StringType column, so I tried to
>>>>>>> cast this column to StringType. But this fails with AnalysisException
>>>>>>> “cannot cast int to string”.
>>>>>>>
>>>>>>> Is this a bug? Or is it really not allowed to cast an int to a string?
>>>>>>>
>>>>>>> I’m using Spark 3.1.1.
>>>>>>>
>>>>>>> Best regards
>>>>>>>
>>>>>>> Rico.
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe e-mail: [email protected]
>>>
>>> --
>>> Best Regards,
>>> Ayan Guha
