Hello,

I would like to ask a question about the Spark runner.

Using Spark 3.1.2 downloaded from the link below,

https://www.apache.org/dyn/closer.lua/spark/spark-3.1.2/spark-3.1.2-bin-hadoop3.2.tgz

I get the error below when submitting a pipeline.
The full error is at
https://gist.github.com/yuwtennis/7b0c1dc0dcf98297af1e3179852ca693.

------------------------------------------------------------------------------------------------------------------
21/08/16 01:10:26 WARN TransportChannelHandler: Exception in connection from /192.168.11.2:35601
java.io.InvalidClassException: scala.collection.mutable.WrappedArray$ofRef; local class incompatible: stream classdesc serialVersionUID = 3456489343829468865, local class serialVersionUID = 1028182004549731694
at java.base/java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:689)
...
------------------------------------------------------------------------------------------------------------------

The SDK harness and job service are deployed as follows.

1. Job service

sudo docker run --net=host apache/beam_spark3_job_server:2.31.0 --spark-master-url=spark://localhost:7077 --clean-artifacts-per-job true

* apache/beam_spark_job_server:2.31.0 was used for Spark 2.4.8

2. SDK Harness

sudo docker run --net=host apache/beam_python3.8_sdk:2.31.0 --worker_pool

3. SDK client code

https://gist.github.com/yuwtennis/2e4c13c79f71e8f713e947955115b3e2
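For readers who cannot open the gist, a client for this kind of setup is typically wired roughly as follows. This is only a sketch and not the contents of the gist: it assumes the job service listens on its default port 8099 and the worker pool on its default port 50000, and the helper names portable_runner_args and main are hypothetical.

```python
def portable_runner_args(job_endpoint="localhost:8099",
                         worker_pool="localhost:50000"):
    """Build Beam flags for a job service plus an external worker pool.

    Both default addresses are assumptions (the usual default ports);
    adjust them to match the actual deployment.
    """
    return [
        "--runner=PortableRunner",
        f"--job_endpoint={job_endpoint}",
        "--environment_type=EXTERNAL",
        f"--environment_config={worker_pool}",
    ]


def main(extra_args=None):
    # Imported here so the flag helper can be inspected without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(portable_runner_args() + list(extra_args or []))
    # A trivial pipeline; the real client code lives in the gist above.
    with beam.Pipeline(options=options) as p:
        (p
         | "Create" >> beam.Create(["a", "b", "c"])
         | "Upper" >> beam.Map(lambda s: s.upper())
         | "Print" >> beam.Map(print))
```

Calling main() submits the pipeline to the job service, which in turn hands it to the Spark master configured with --spark-master-url.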

With Spark 2.4.8 (link below), the pipeline succeeded without any errors using the above components.

https://archive.apache.org/dist/spark/spark-2.4.8/spark-2.4.8-bin-hadoop2.7.tgz

Are there any settings I need to be aware of for Spark 3.1.2?

Thanks,
Yu Watanabe

-- 
Yu Watanabe

linkedin: www.linkedin.com/in/yuwatanabe1/
twitter:   twitter.com/yuwtennis
