While the core of Spark is and has been quite solid and a go-to infrastructure, the *streaming* part of the story was still quite weak at least through the middle of last year. I went into depth on both Structured Streaming and the older DStream API. Structured Streaming in particular was difficult to use, both in its limitations on what it supports and in its documentation and examples. Have there been meaningful advancements in the past twelve-plus months?

On Sun, 25 Oct 2020 at 13:58, Khalid Mammadov <khalidmammad...@gmail.com> wrote:

> Correct. Also as explained in the book LearningSpark2.0 by Databricks:
>
> Unified Analytics
> While the notion of unification is not unique to Spark, it is a core
> component of its design philosophy and evolution. In November 2016, the
> Association for Computing Machinery (ACM) recognized Apache Spark and
> conferred upon its original creators the prestigious ACM Award for their
> paper describing Apache Spark as a “Unified Engine for Big Data
> Processing.” The award-winning paper notes that Spark replaces all the
> separate batch processing, graph, stream, and query engines like Storm,
> Impala, Dremel, Pregel, etc. with a unified stack of components that
> addresses diverse workloads under a single distributed fast engine.
>
> Khalid
>
> On 19 Oct 2020, at 07:03, Sonal Goyal <sonalgoy...@gmail.com> wrote:
>
> My thought is that Spark supports analytics for structured and
> unstructured data, batch as well as real time. This was pretty
> revolutionary when Spark first came out. That's where the unified term came
> from I think. Even after all these years, Spark remains the trusted
> framework for enterprise analytics.
>
> On Mon, 19 Oct 2020, 11:24 Gourav Sengupta <gourav.sengu...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I think that it is just a marketing statement. But with SPARK 3.x, now
>> that you are seeing that SPARK is no more than just another distributed
>> data processing engine, they are trying to join data pre-processing into ML
>> pipelines directly. I may call that unified.
>>
>> But you get the same with several other frameworks as well now so not
>> quite sure how unified creates a unique brand value.
>>
>> Regards,
>> Gourav Sengupta
>>
>> On Sun, Oct 18, 2020 at 6:40 PM Hulio andres <hulioand...@usa.com> wrote:
>>
>>> Apache Spark's mission statement is *Apache Spark™* is a unified
>>> analytics engine for large-scale data processing.
>>>
>>> To what is the word "unified" referring?