Have you tried the DROPMALFORMED option?
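For reference, in Spark this is set on the reader with option("mode", "DROPMALFORMED"); the other parse modes are PERMISSIVE (the default) and FAILFAST. The snippet below is only a plain-Python toy model of what the three modes do with a bad line, not Spark code. The read_json_lines helper and the sample lines are made up for illustration, and real PERMISSIVE mode also fills the other columns with nulls.

```python
import json


def read_json_lines(lines, mode="PERMISSIVE"):
    """Toy model of Spark's parse modes for JSON lines (hypothetical helper)."""
    rows = []
    for line in lines:
        try:
            record = json.loads(line)
            if not isinstance(record, dict):
                raise ValueError("not a JSON object")
            rows.append(record)
        except ValueError:  # json.JSONDecodeError is a subclass of ValueError
            if mode == "DROPMALFORMED":
                continue  # silently skip the bad line
            if mode == "FAILFAST":
                raise  # abort on the first bad line
            # PERMISSIVE: keep the row and store the raw text in _corrupt_record
            rows.append({"_corrupt_record": line})
    return rows


sample = ['{"id": 1}', '{"id": 2', '{"id": 3}']  # the second line is malformed
dropped = read_json_lines(sample, mode="DROPMALFORMED")
kept = read_json_lines(sample, mode="PERMISSIVE")
```

With DROPMALFORMED the bad line simply disappears, so use it only if you don't need to see which records failed.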
On Mon, Jul 3, 2023, 1:34 PM Shashank Rao wrote:
> Update: Got it working by using the *_corrupt_record* field for the first
> case (record 4)
>
> schema = schema.add("_corrupt_record", DataTypes.StringType);
> Dataset ds = spark.read().schema(schema).option
Hi,
I have a nested StructType. The StructType is deeply nested and may
comprise other Structs. Now I want to update this struct at the lowest
level.
I tried withField, but it doesn't work if any of the top-level structs is
null. I would appreciate any help with this.
The example schema is:
val sche
This doesn't directly answer your question, but there are ways to do it in
Scala and PySpark. See if this helps:
https://repost.aws/questions/QUP_OJomilTO6oIgvK00VHEA/writing-data-to-kinesis-stream-from-py-spark
On Thu, Feb 16, 2023, 8:27 PM hueiyuan su wrote:
> *Component*: Spark Structured Streaming
> *Leve
I think these four steps should help:
1. Zip the arrays
2. Explode the zipped array
3. Use withColumn to get each element of the array
4. Drop the array column
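A rough plain-Python sketch of what those four steps do to a single row, assuming a row with two parallel array columns. In Spark the corresponding pieces would be arrays_zip and explode from the SQL functions, plus withColumn and drop on the DataFrame. The zip_and_explode helper and the sample column names are hypothetical.

```python
from itertools import zip_longest


def zip_and_explode(row, array_cols):
    """Turn one row holding parallel arrays into one row per array position."""
    # Step 1 (zip): pair up the i-th elements of every array column.
    # zip_longest pads shorter arrays with None, like arrays_zip pads with nulls.
    zipped = list(zip_longest(*(row[c] for c in array_cols)))

    out = []
    # Step 2 (explode): emit one output row per zipped element.
    for element in zipped:
        # Step 4 (drop): start from the row without the original array columns.
        new_row = {k: v for k, v in row.items() if k not in array_cols}
        # Step 3 (withColumn): pull each element out into its own scalar column.
        for name, value in zip(array_cols, element):
            new_row[name] = value
        out.append(new_row)
    return out


row = {"id": 1, "names": ["a", "b"], "scores": [10, 20]}
exploded = zip_and_explode(row, ["names", "scores"])
```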
Thanks
On Thu, Feb 16, 2023, 2:18 PM sam smith wrote:
> @Enrico Minack I used arrays_zip to merge values
> into one row, and then used toJSON() to export the data.
> @Bjørn explode_oute
Unsubscribe