Re: Structured Streaming foreach function

2019-06-23 Thread Magnus Nilsson
Row is a generic ordered collection of fields that most likely carries a schema of type StructType. You need to keep track of the datatypes of the fields yourself. If you want compile-time safety of datatypes (and IntelliSense support) you need to use RDDs or the Dataset[T] API. Dataset[T] might incur
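
To illustrate what "keeping track of the datatypes yourself" can look like from Python, here is a minimal sketch. It makes several assumptions: the field names `device` and `reading` are hypothetical, `EXPECTED_TYPES` is a hand-maintained table and not part of Spark, and `FakeRow` is a namedtuple stand-in so the snippet runs without a Spark installation. In PySpark the object passed to the function is a `pyspark.sql.Row`, which also supports attribute access by field name.

```python
from collections import namedtuple

# Stand-in for pyspark.sql.Row so the sketch runs without Spark (assumption).
# The field names "device" and "reading" are hypothetical.
FakeRow = namedtuple("FakeRow", ["device", "reading"])

# Expected Python type per field, maintained by hand because a Row
# gives no compile-time guarantees about its field types.
EXPECTED_TYPES = {"device": str, "reading": float}


def process_row(row):
    """Check each field's type at runtime and return a plain dict."""
    record = {}
    for name, expected in EXPECTED_TYPES.items():
        value = getattr(row, name)  # attribute access by field name
        # None is allowed through because columns may be nullable.
        if value is not None and not isinstance(value, expected):
            raise TypeError(
                "field %r: expected %s, got %s"
                % (name, expected.__name__, type(value).__name__)
            )
        record[name] = value
    # A real sink would write `record` to storage here; it is returned
    # only so that the sketch is easy to check.
    return record


print(process_row(FakeRow(device="sensor-1", reading=21.5)))
```

With a streaming DataFrame `df`, a function like this would be registered through `df.writeStream.foreach(process_row).start()`. The type checks run once per row at runtime, so they are a safety net and not a substitute for the compile-time checking that Dataset[T] offers in Scala.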

Structured Streaming foreach function

2019-06-23 Thread RanXin
I use Spark 2.4.3 with Python to build a Structured Streaming job. May I know the data type of the parameter "row" in the process_row function? The following code is how the official programming guide instructs us to use the foreach function: def process_row(row): # Write row to storage p