Hi Rico,

Using SQL saves a lot of time, effort, and budget over the long term. But I
guess there are certain joys in solving self-induced complexities.

Thanks for sharing your findings.

Regards,
Gourav Sengupta

On Fri, Feb 18, 2022 at 7:26 AM Rico Bergmann <i...@ricobergmann.de> wrote:

> I found the reason why it did not work:
>
> When returning the Spark data type, I was calling new StringType(). After
> changing it to DataTypes.StringType, it worked.
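
A note on why this can happen, as a plain-Java sketch rather than Spark's actual code: if a cast check recognizes a type by comparing against one shared instance, a freshly constructed object of the same class is not recognized. The thread does not confirm that this is what Spark 3.1.1 does internally, and the names below (TypeTag, canCastTo) are hypothetical stand-ins.

```java
// Hypothetical model of a "singleton type" pitfall; this is NOT Spark code.
public class SingletonTypeDemo {

    // Stand-in for a data type class that exposes one shared instance
    // but whose constructor is still reachable by callers.
    static class TypeTag {
        static final TypeTag STRING = new TypeTag("string");
        final String name;
        TypeTag(String name) { this.name = name; }
    }

    // Stand-in for a cast check that only recognizes the shared instance.
    static boolean canCastTo(TypeTag target) {
        return target == TypeTag.STRING; // identity comparison, not name comparison
    }

    public static void main(String[] args) {
        TypeTag fresh = new TypeTag("string");          // analogous to new StringType()
        System.out.println(canCastTo(fresh));           // false: same name, different object
        System.out.println(canCastTo(TypeTag.STRING));  // true: the shared instance
    }
}
```

In this model the safe habit is the one Rico arrived at: take the instance the library hands out (DataTypes.StringType in his case) instead of constructing one.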
>
> Greets,
> Rico.
>
> On 17.02.2022 at 14:13, Gourav Sengupta <gourav.sengu...@gmail.com> wrote:
>
> Hi,
>
> Can you please post a screenshot of the exact CAST statement that you are
> using? Did you use the SQL method I mentioned earlier?
>
> Regards,
> Gourav Sengupta
>
> On Thu, Feb 17, 2022 at 12:17 PM Rico Bergmann <i...@ricobergmann.de>
> wrote:
>
>> Hi!
>>
>> Casting another int column that is not a partition column fails with the
>> same error.
>>
>> The schema before the cast (column names are anonymized):
>>
>> root
>> |-- valueObject: struct (nullable = true)
>> |    |-- value1: string (nullable = true)
>> |    |-- value2: string (nullable = true)
>> |    |-- value3: timestamp (nullable = true)
>> |    |-- value4: string (nullable = true)
>> |-- partitionColumn2: string (nullable = true)
>> |-- partitionColumn3: timestamp (nullable = true)
>> |-- partitionColumn1: integer (nullable = true)
>>
>>
>> I wanted to cast partitionColumn1 to String, which gives me the described
>> error.
>>
>>
>> Best,
>>
>> Rico
>>
>>
>>
>> On 17.02.2022 at 09:56, ayan guha <guha.a...@gmail.com> wrote:
>>
>> Can you try to cast any other Int field which is NOT a partition column?
>>
>> On Thu, 17 Feb 2022 at 7:34 pm, Gourav Sengupta <
>> gourav.sengu...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> This is interesting; casting INT to STRING has never been an issue for me.
>>>
>>> Can you share the output of df.printSchema()?
>>>
>>> I prefer to use SQL, and the method I use for casting is:
>>> CAST(<<column name>> AS STRING) <<alias>>.
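
For readers who want to try the SQL route from Java, here is a minimal sketch that only builds the expression text. The helper name castToStringExpr and the backtick quoting are my own assumptions, not something from this thread; the resulting string is meant to be handed to something like Dataset.selectExpr, which is not exercised here.

```java
public class CastExprDemo {

    // Builds "CAST(`col` AS STRING) AS `col`" for a given column name.
    // Assumes the column name itself contains no backticks.
    static String castToStringExpr(String column) {
        String quoted = "`" + column + "`";
        return "CAST(" + quoted + " AS STRING) AS " + quoted;
    }

    public static void main(String[] args) {
        System.out.println(castToStringExpr("partitionColumn1"));
        // prints: CAST(`partitionColumn1` AS STRING) AS `partitionColumn1`
    }
}
```

Because the target type is written as SQL text, this route never constructs a Spark DataType object on the caller's side.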
>>>
>>> Regards,
>>> Gourav
>>>
>>> On Thu, Feb 17, 2022 at 6:02 AM Rico Bergmann <i...@ricobergmann.de>
>>> wrote:
>>>
>>>> Here is the code snippet:
>>>>
>>>> var df = session.read().parquet(basepath);
>>>> for(Column partition : partitionColumnsList){
>>>>   df = df.withColumn(partition.getName(),
>>>>       df.col(partition.getName()).cast(partition.getType()));
>>>> }
>>>>
>>>> Column is a class containing schema information, such as the name and
>>>> the data type of the column.
>>>>
>>>> Best, Rico.
>>>>
>>>> > On 17.02.2022 at 03:17, Morven Huang <morven.hu...@gmail.com> wrote:
>>>> >
>>>> > Hi Rico, do you have a code snippet? I have no problem casting int to
>>>> > string.
>>>> >
>>>> >> On Feb 17, 2022, at 12:26 AM, Rico Bergmann <i...@ricobergmann.de> wrote:
>>>> >>
>>>> >> Hi!
>>>> >>
>>>> >> I am reading a partitioned DataFrame into Spark using automatic type
>>>> >> inference for the partition columns. For one partition column the data
>>>> >> contains an integer, therefore Spark uses IntegerType for this column.
>>>> >> This is supposed to be a StringType column in general, so I tried to
>>>> >> cast the column to StringType. But this fails with AnalysisException
>>>> >> “cannot cast int to string”.
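
Since the integer type comes from automatic partition type inference, one option worth checking is to turn that inference off, so that partition columns are read as strings in the first place. Spark documents the setting spark.sql.sources.partitionColumnTypeInference.enabled for this. Whether it fits this job is an assumption on my part, since it applies to every partition column (partitionColumn3 would then also arrive as a string). As a spark-defaults.conf style fragment:

```
spark.sql.sources.partitionColumnTypeInference.enabled  false
```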
>>>> >>
>>>> >> Is this a bug? Or is it really not allowed to cast an int to a
>>>> >> string?
>>>> >>
>>>> >> I’m using Spark 3.1.1
>>>> >>
>>>> >> Best regards
>>>> >>
>>>> >> Rico.
>>>> >>
>>>> >> ---------------------------------------------------------------------
>>>> >> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>>> >>
>>>> >
>>>> >
>>>>
>>>>
>>>> --
>> Best Regards,
>> Ayan Guha
>>
>>
