Re: [PYSPARK] df.collect throws exception for MapType with ArrayType as key

2025-05-23 Thread Soumasish
This looks like a bug to me. See https://github.com/apache/spark/blob/master/python/pyspark/serializers.py: when cloudpickle.loads() deserializes the row, the map key comes back as a Python list, and building the dict {["A", "B"]: "foo"} breaks because a list is unhashable and cannot be used as a dict key. In short:

  tuple ("A", "B"): Python input -> ArrayData       -> works fine
  ArrayData -> list ["A", "B"] used as a dict key   -> breaks
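A Spark-free sketch of that failure mode (plain Python, nothing here is Spark API): dict keys must be hashable, so the tuple key works while the equivalent list key raises TypeError.

    # Dict keys must be hashable: a tuple works, a list raises TypeError.
    key = ("A", "B")
    d = {key: "foo"}              # fine: tuples are hashable
    print(d[("A", "B")])          # -> foo

    try:
        d = {list(key): "foo"}    # what the read path effectively builds
    except TypeError as e:
        print(e)                  # -> unhashable type: 'list'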

[PYSPARK] df.collect throws exception for MapType with ArrayType as key

2025-05-23 Thread Eyck Troschke
Dear Spark Development Community,

According to the PySpark documentation, it should be possible to have a MapType column with ArrayType keys: MapType supports keys of any DataType, and ArrayType inherits from DataType. When I try that with PySpark 3.5.3, the show() method of the DataFrame works, but collect() throws an exception.
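For reference, a minimal reproduction sketch of what the report describes, assuming PySpark 3.5.x and a column schema of MapType(ArrayType(StringType()), StringType()); the exact exception text is not shown in the archived message, so the comments only restate what the report claims.

    from pyspark.sql import SparkSession
    from pyspark.sql.types import (
        ArrayType, MapType, StringType, StructField, StructType
    )

    spark = SparkSession.builder.appName("maptype-arraykey").getOrCreate()

    # MapType with an ArrayType key, as the documentation permits.
    schema = StructType([
        StructField("m", MapType(ArrayType(StringType()), StringType()))
    ])

    # On the Python side the key must be a tuple, since lists are unhashable.
    df = spark.createDataFrame([({("A", "B"): "foo"},)], schema)

    df.show()     # reportedly works
    df.collect()  # reportedly raises during deserialization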