This may help: I ran a simple WordCount (built with Beam 2.7) on a YARN/Spark (HDP Sandbox) cluster and it worked fine for me.
> On 14 Sep 2018, at 06:56, Romain Manni-Bucau <rmannibu...@gmail.com> wrote:
>
> Hi Charles,
>
> I didn't get enough time to check deeply, but it is clearly a dependency issue,
> and it is not in the Beam Spark runner itself but in another transitive module
> of Beam. It does not happen in the existing Spark tests because none of them
> run in a cluster (even just with 1 worker), but this seems to be a regression
> since 2.6 works OOTB.
>
> Romain Manni-Bucau
> @rmannibucau <https://twitter.com/rmannibucau> | Blog
> <https://rmannibucau.metawerx.net/> | Old Blog
> <http://rmannibucau.wordpress.com/> | Github <https://github.com/rmannibucau>
> | LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>
> On Thu, 13 Sep 2018 at 22:15, Charles Chen <c...@google.com> wrote:
> Romain and JB, can you please add the results of your investigations into the
> errors you've seen above? Given that the existing SparkRunner tests pass for
> this RC, and that the integration test you ran is in another repo that is not
> continuously tested with Beam, it is not clear how we should move forward and
> whether this is a blocking issue, unless we can find a root cause in Beam.
>
> On Wed, Sep 12, 2018 at 2:08 AM Etienne Chauchot <echauc...@apache.org> wrote:
> Hi all,
>
> From a performance and functional regression standpoint, I see no regression:
>
> I looked at the Nexmark graphs "output pcollection size" and "execution time"
> around the release cut date on the Dataflow, Spark, Flink and direct runners
> in batch and streaming modes. There seems to be no regression.
>
> Etienne
>
> On Tuesday, 11 September 2018 at 12:25 -0700, Charles Chen wrote:
>> The SparkRunner validation test (here:
>> https://beam.apache.org/contribute/release-guide/#run-validation-tests)
>> passes on my machine.
>> It looks like we are likely missing test coverage
>> where Romain is hitting issues.
>>
>> On Tue, Sep 11, 2018 at 12:15 PM Ahmet Altay <al...@google.com> wrote:
>>> Could anyone else help with looking at these issues earlier?
>>>
>>> On Tue, Sep 11, 2018 at 12:03 PM, Romain Manni-Bucau <rmannibu...@gmail.com> wrote:
>>>> I'm running this main [1] through this IT [2]. It had been working fine for
>>>> ~1 year, but 2.7.0 broke it. I didn't investigate more but can have a look
>>>> later this month if it helps.
>>>>
>>>> [1] https://github.com/Talend/component-runtime/blob/master/component-runtime-beam/src/it/serialization-over-cluster/src/main/java/org/talend/sdk/component/beam/it/clusterserialization/Main.java
>>>> [2] https://github.com/Talend/component-runtime/blob/master/component-runtime-beam/src/it/serialization-over-cluster/src/test/java/org/talend/sdk/component/beam/it/SerializationOverClusterIT.java
>>>>
>>>> On Tue, 11 Sep 2018 at 20:54, Charles Chen <c...@google.com> wrote:
>>>>> Romain: can you give more details on the failure you're encountering,
>>>>> i.e. how you are performing this validation?
>>>>>
>>>>> On Tue, Sep 11, 2018 at 9:36 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> weird, I didn't have it on Beam samples. Let me try to reproduce and I
>>>>>> will create the Jira.
>>>>>>
>>>>>> Regards
>>>>>> JB
>>>>>>
>>>>>> On 11/09/2018 11:44, Romain Manni-Bucau wrote:
>>>>>> > -1, seems spark integration is broken (tested with spark 2.3.1 and
>>>>>> > 2.2.1):
>>>>>> >
>>>>>> > 18/09/11 11:33:29 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID
>>>>>> > 0, RMANNIBUCAU, executor 0): java.lang.ClassCastException: cannot
>>>>>> > assign instance of scala.collection.immutable.List$SerializationProxy
>>>>>> > to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_
>>>>>> > of type scala.collection.Seq in instance of
>>>>>> > org.apache.spark.rdd.MapPartitionsRDD
>>>>>> > at
>>>>>> > java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2233)
>>>>>> >
>>>>>> >
>>>>>> > Also the issue Lukasz identified is important even if workarounds can be
>>>>>> > put in place so +1 to fix it as well if possible.
>>>>>> >
>>>>>> > Romain Manni-Bucau
>>>>>> > @rmannibucau <https://twitter.com/rmannibucau> | Blog
>>>>>> > <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>> > <http://rmannibucau.wordpress.com> | Github
>>>>>> > <https://github.com/rmannibucau> | LinkedIn
>>>>>> > <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>> > <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>> >
>>>>>> >
>>>>>> > On Mon, 10 Sep 2018 at 20:48, Lukasz Cwik <lc...@google.com> wrote:
>>>>>> >
>>>>>> > I found an issue where we are no longer packaging the pom.xml within
>>>>>> > the artifact jars at META-INF/maven/groupId/artifactId. More details
>>>>>> > in https://issues.apache.org/jira/browse/BEAM-5351. I wouldn't
>>>>>> > consider this a blocker but it was an easy fix
>>>>>> > (https://github.com/apache/beam/pull/6358) and users may rely on the
>>>>>> > pom.xml.
>>>>>> >
>>>>>> > Should we recut the release candidate to include this?
>>>>>> >
>>>>>> > On Mon, Sep 10, 2018 at 4:58 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>>>>>> >
>>>>>> > +1 (binding)
>>>>>> >
>>>>>> > Tested successfully on Beam Samples.
>>>>>> >
>>>>>> > Thanks!
>>>>>> >
>>>>>> > Regards
>>>>>> > JB
>>>>>> >
>>>>>> > On 07/09/2018 23:56, Charles Chen wrote:
>>>>>> > > Hi everyone,
>>>>>> > >
>>>>>> > > Please review and vote on the release candidate #1 for the version
>>>>>> > > 2.7.0, as follows:
>>>>>> > > [ ] +1, Approve the release
>>>>>> > > [ ] -1, Do not approve the release (please provide specific comments)
>>>>>> > >
>>>>>> > > The complete staging area is available for your review, which includes:
>>>>>> > > * JIRA release notes [1],
>>>>>> > > * the official Apache source release to be deployed to
>>>>>> > > dist.apache.org [2], which is signed with the key with
>>>>>> > > fingerprint 45C60AAAD115F560 [3],
>>>>>> > > * all artifacts to be deployed to the Maven Central Repository [4],
>>>>>> > > * source code tag "v2.7.0-RC1" [5],
>>>>>> > > * website pull request listing the release and publishing the API
>>>>>> > > reference manual [6].
>>>>>> > > * Java artifacts were built with Gradle 4.8 and OpenJDK
>>>>>> > > 1.8.0_181-8u181-b13-1~deb9u1-b13.
>>>>>> > > * Python artifacts are deployed along with the source release to the
>>>>>> > > dist.apache.org [2].
>>>>>> > >
>>>>>> > > The vote will be open for at least 72 hours. It is adopted by majority
>>>>>> > > approval, with at least 3 PMC affirmative votes.
>>>>>> > >
>>>>>> > > Thanks,
>>>>>> > > Charles
>>>>>> > >
>>>>>> > > [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343654
>>>>>> > > [2] https://dist.apache.org/repos/dist/dev/beam/2.7.0
>>>>>> > > [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
>>>>>> > > [4] https://repository.apache.org/content/repositories/orgapachebeam-1046/
>>>>>> > > [5] https://github.com/apache/beam/tree/v2.7.0-RC1
>>>>>> > > [6] https://github.com/apache/beam-site/pull/549
>>>>>> >
>>>>>> > --
>>>>>> > Jean-Baptiste Onofré
>>>>>> > jbono...@apache.org
>>>>>> > http://blog.nanthrax.net
>>>>>> > Talend - http://www.talend.com
>>>>>> >
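
For anyone who wants to check a staged jar for the pom.xml packaging issue Lukasz describes (BEAM-5351), here is a minimal sketch. It assumes the descriptor is expected under META-INF/maven/<groupId>/<artifactId>/pom.xml, as stated in the thread. The function name find_embedded_poms and the command-line usage are hypothetical and not part of any Beam tooling.

```python
import sys
import zipfile


def find_embedded_poms(jar_path):
    """Return the jar entries that look like embedded Maven descriptors.

    Assumption: a descriptor lives at
    META-INF/maven/<groupId>/<artifactId>/pom.xml inside the jar.
    """
    with zipfile.ZipFile(jar_path) as jar:
        return [
            name
            for name in jar.namelist()
            if name.startswith("META-INF/maven/") and name.endswith("/pom.xml")
        ]


if __name__ == "__main__":
    # Hypothetical usage: python check_pom.py some-artifact.jar [more.jar ...]
    for path in sys.argv[1:]:
        poms = find_embedded_poms(path)
        status = ", ".join(poms) if poms else "no embedded pom.xml found"
        print(f"{path}: {status}")
```

An empty result for a given jar would be consistent with the packaging problem reported above, though it says nothing about the separate Spark ClassCastException.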