Dear Spark Development Community,

According to the PySpark documentation, it should be possible to have a MapType 
column with ArrayType keys: MapType accepts any DataType as its key type, and 
ArrayType inherits from DataType.
When I try that with PySpark 3.5.3, the show() method of the DataFrame works as 
expected, but the collect() method throws an exception:

from pyspark.sql import SparkSession
from pyspark.sql.types import MapType, ArrayType, StringType

spark = SparkSession.builder.getOrCreate()

schema = MapType(ArrayType(StringType()), StringType())
# Tuples are used as keys here because Python lists are not hashable.
data = [{("A", "B"): "foo", ("X", "Y", "Z"): "bar"}]
df = spark.createDataFrame(data, schema)
df.show()     # works
df.collect()  # throws an exception
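
I suspect the exception comes from collect() converting each map into a Python 
dict: the ArrayType keys arrive on the driver as Python lists, which are not 
hashable and so cannot be dict keys. If that is the cause, one possible 
workaround (an untested sketch; "value" is the column name createDataFrame 
assigns when given a non-struct schema) is to turn the map into an array of 
key/value structs before collecting:

from pyspark.sql.functions import map_entries

# Sketch: represent the map as an array of (key, value) structs so that
# collect() never has to build a dict with list-valued keys.
entries = df.select(map_entries("value").alias("entries"))
entries.collect()  # rows contain lists of Row(key=[...], value=...)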


Is this behavior correct?

Kind regards,

Eyck
