Hi,
We're working on Spark NLP by including multiple ML Estimators and
Transformers.
We're taking a significant performance hit on the Python side because the
columns are being recalculated recursively (and more than once) on each
stage.transform() call.
I have not been able to trace the root of the problem.
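For context, here is a minimal sketch of the mitigation we have experimented with: persisting the intermediate DataFrame after each transform so the next stage does not re-evaluate the whole lineage. The `stages` and `input` names are placeholders for our actual pipeline and data, not Spark NLP API.

```scala
import org.apache.spark.ml.Transformer
import org.apache.spark.sql.DataFrame
import org.apache.spark.storage.StorageLevel

// Workaround sketch: persist after each stage.transform() so downstream
// stages read the cached result instead of recomputing the full lineage.
def transformWithCaching(stages: Seq[Transformer],
                         input: DataFrame): DataFrame =
  stages.foldLeft(input) { (df, stage) =>
    val out = stage.transform(df)
    out.persist(StorageLevel.MEMORY_AND_DISK) // cut off lineage recomputation
    out
  }
```

This masks the symptom at the cost of memory, but we would still like to understand why the recomputation happens in the first place.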
I have the same problem described in the following StackOverflow question
(but nobody has answered it):
https://stackoverflow.com/questions/51103634/spark-streaming-schema-mismatch-using-microbatchreader-with-columns-pruning
Any idea of how to solve it (using Spark 2.3)?
Thanks,
Kineret
Hi,
I am looking at the Description column of a bank statement (CSV download)
that has the following format:
scala> account_table.printSchema
root
|-- TransactionDate: date (nullable = true)
|-- TransactionType: string (nullable = true)
|-- Description: string (nullable = true)
|-- Value: double (