Hi,

I guess you are not following the dev mailing list.

The Spark runner supports almost all transforms, and yes, you can use it to run your pipelines.

A PCollection is represented as an RDD, and the runner currently targets Spark 1.x.

I'm working on Spark 2.x support (still using RDDs). A vote is in progress on the mailing list on whether to support both Spark 1.x and Spark 2.x, or to upgrade to Spark 2.x and drop support for Spark 1.x.

You can take a look at the beam-samples: they all run using the Spark runner.

Regards
JB

On 11/10/2017 01:46 PM, Artur Mrozowski wrote:
Hi,
I have seen the compatibility matrix and I realize that Spark is not the most supported runner. I am curious whether it is possible to run a pipeline on Spark, say with global windows, after processing triggers and group by key (CoGroupByKey, CombineByKey). We definitely have problems executing a pipeline that runs successfully on the direct runner.

Is that a known issue? Is Flink the best option?

Best Regards
Artur

--
Jean-Baptiste Onofré
[email protected]
http://blog.nanthrax.net
Talend - http://www.talend.com
