While the core of Spark is and has been quite solid and a go-to
infrastructure, the *streaming* part of the story was still quite weak at
least through the middle of last year. I went into depth on both Structured
Streaming and the older DStream API. Structured Streaming in particular was
difficult to use: both in
terms o
Correct. Also, as explained in the book Learning Spark 2.0 by Databricks:
Unified Analytics
While the notion of unification is not unique to Spark, it is a core component
of its design philosophy and evolution. In November 2016, the Association for
Computing Machinery (ACM) recognized Apache Spark
My thought is that Spark supports analytics for structured and unstructured
data, batch as well as real time. This was pretty revolutionary when Spark
first came out. That's where the unified term came from I think. Even after
all these years, Spark remains the trusted framework for enterprise
anal
Hi,
I think that it is just a marketing statement. But with Spark 3.x, now that
you are seeing that Spark is no more than just another distributed data
processing engine, they are trying to join data pre-processing into ML
pipelines directly. I may call that unified.
But you get the same with sev
Apache Spark's mission statement is: "Apache Spark™ is a unified analytics engine for large-scale data processing."
What is the word "unified" referring to?