Another option is Apache Beam. We use it quite extensively. There are a few
options for Clojure wrappers (we use datasplash), and Beam has libraries for a
number of popular languages.
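For reference, a word-count pipeline in datasplash looks roughly like this — function names are from memory of the datasplash README, so treat it as a sketch rather than exact current API:

```clojure
(require '[clojure.string :as str]
         '[datasplash.api :as ds])

;; Sketch of a Beam word-count pipeline via datasplash.
;; ds/make-pipeline, ds/read-text-file, ds/frequencies, etc. are
;; recalled from the datasplash README and may differ slightly.
(defn word-count [pipeline-args]
  (let [p (ds/make-pipeline pipeline-args)]
    (->> (ds/read-text-file "input.txt" p)
         (ds/mapcat #(str/split % #"\s+"))
         (ds/frequencies)
         (ds/map (fn [[word n]] (str word ": " n)))
         (ds/write-text-file "word-counts"))
    (ds/run-pipeline p)))
```

The nice part is that the same pipeline definition can run on the direct runner locally or on Dataflow/Flink/Spark runners by swapping pipeline options.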
Kind Regards,
Dom Parry
On 10 Jul 2020, 08:22 +0200, Alex Ott wrote:
From a Spark perspective, I would really advise using the DataFrame API as much
as possible, including Spark Structured Streaming instead of Spark
Streaming - the main reason is more optimized execution of the code, thanks
to all the optimizations that Catalyst is able to make. But I really don't see
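For what it's worth, the DataFrame API is reachable from Clojure with plain Java interop, no wrapper library needed. A minimal local sketch (the SparkSession/Dataset calls are standard Spark SQL API; the file name and column are made up for illustration):

```clojure
(import '(org.apache.spark.sql SparkSession))

;; Build a local SparkSession via Java interop.
(def spark
  (-> (SparkSession/builder)
      (.appName "df-example")
      (.master "local[*]")
      (.getOrCreate)))

;; Read a (hypothetical) JSON file and run an aggregation.
;; Catalyst optimizes the logical plan before execution, which is
;; the advantage over hand-rolled RDD code.
(def df (-> (.read spark) (.json "events.json")))

(-> df
    (.groupBy "user_id" (into-array String []))
    (.count)
    (.show))
```

The interop is a bit noisy (note the `into-array` for varargs), but you get the Catalyst-optimized execution path for free.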
Hey Tim,
We at Amperity have used Sparkling for our Clojure Spark interop in the
past. After a few years of fighting, we eventually ended up with sparkplug (
https://github.com/amperity/sparkplug), which we now use to run all of our
production Spark jobs. There is built-in support for proper fun
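For anyone curious what that looks like, a local sparkplug job is roughly the following — namespace and function names are recalled from the sparkplug README, so check the repo for the exact API:

```clojure
(require '[sparkplug.conf :as conf]
         '[sparkplug.context :as context]
         '[sparkplug.core :as spark])

;; Rough sketch of a local sparkplug job; names may differ
;; slightly from the current release.
(def spark-conf
  (-> (conf/spark-conf)
      (conf/master "local[2]")
      (conf/app-name "sparkplug-example")))

(context/with-context [ctx spark-conf]
  (->> (spark/parallelize ctx (range 100))
       (spark/filter even?)
       (spark/map inc)
       (spark/collect)))
```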
I'm putting together a big data system centered on Spark Streaming for data
ingest and Spark SQL for querying the stored data. I've
been investigating what options there are for implementing Spark
applications using Clojure. It's been close to a decade since sparkling or
flambo have