I don't think I'd expect an automatic conversion from a complex type like
Map to anything; implicit casts for primitive types, on the other hand, are
relatively common.

On Wed, Nov 10, 2021 at 6:01 PM Dhiren <navanidhi...@gmail.com> wrote:

> Hi Spark Users,
> I am trying to understand Spark’s type system a little better. I see that
> Spark implicitly casts some types to String and not for others. Here is an
> example.
>
> df = spark.createDataFrame([({"test": 1.0}, 2.0, 3.0, 4.0)],
>                            ("observed", "expected", "obs_wt", "exp_wt"))
> df.select(df["expected"].like("1.0")).show()
> df.select(df["observed"].like("1.0")).show()
>
> Schema of the dataframe
>
> StructType(List(StructField(observed,MapType(StringType,DoubleType,true),true),StructField(expected,DoubleType,true),StructField(obs_wt,DoubleType,true),StructField(exp_wt,DoubleType,true)))
>
> The first select executes fine, even though the type of the “expected”
> column is double.
> The second select fails because Spark does not know how to convert a map
> to a string and test it.
>
> pyspark.sql.utils.AnalysisException: cannot resolve '`observed` LIKE '1.0'' 
> due to data type mismatch: argument 1 requires string type, however, 
> '`observed`' is of map<string,double> type.;
> 'Project [observed#0 LIKE 1.0 AS observed LIKE 1.0#15]
> +- LogicalRDD [observed#0, expected#1, obs_wt#2, exp_wt#3], false
>
> Isn’t this inconsistent? Ideally, shouldn’t it fail for "DoubleType" as
> well, since that is not "StringType"? What would be the point of types if
> conversion to String happens regardless?
>
> Can anyone from Spark community shed light on the reason for this
> behaviour?
>
>
> --
> Thanks and Regards,
> Dhiren Amar Navani
> (+1) 480-434-0661
>
> <https://www.linkedin.com/pub/dhiren-navani/41/b36/372>
>
>
