Re: Spark progress feedback

2019-01-29 Thread Matt Casters
OK folks, I figured it out. For the other people desperately clinging to years-old Google results in the hope of finding any hint... The Spark requirement to work with a fat jar caused a collision in the packaging on this file: META-INF/services/org.apache.hadoop.fs.FileSystem This in turn erased cert
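For anyone wanting to check for this kind of collision, here is a minimal sketch. It assumes the collision means that one jar's copy of the service file replaced another's during fat-jar packaging; the snippet above is cut off, so what exactly got erased is not confirmed here. The helper names (`parse_service_entries`, `read_service_entries`, `merge_service_entries`) are made up for illustration and are not part of Beam, Spark, or Hadoop. The idea is to compare the providers listed in the fat jar against the union of the providers listed in the input jars.

```python
import zipfile

# Service file named in the message above.
SERVICE_FILE = "META-INF/services/org.apache.hadoop.fs.FileSystem"


def parse_service_entries(text):
    """Return provider class names from service-file text.

    Comments (after '#') and blank lines are skipped; order is preserved
    and duplicates are dropped.
    """
    entries = []
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name and name not in entries:
            entries.append(name)
    return entries


def read_service_entries(jar_path, service_file=SERVICE_FILE):
    """Return the providers a single jar registers, or [] if it has no such file."""
    with zipfile.ZipFile(jar_path) as jar:
        if service_file not in jar.namelist():
            return []
        return parse_service_entries(jar.read(service_file).decode("utf-8"))


def merge_service_entries(jar_paths, service_file=SERVICE_FILE):
    """Return the union of providers across several jars, in first-seen order."""
    merged = []
    for path in jar_paths:
        for name in read_service_entries(path, service_file):
            if name not in merged:
                merged.append(name)
    return merged
```

If `read_service_entries` on the fat jar returns fewer names than `merge_service_entries` on the jars that went into it, some registrations were lost in packaging.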

Re: Spark progress feedback

2019-01-28 Thread Matt Casters
Yeah, for this setup I used Flintrock to start up a bunch of nodes with Spark and HDFS on AWS. I'm launching the pipeline on the master, all the HDFS libraries I can think of are available, and hdfs dfs commands work fine on the master and all the slaves. It's a problem of transparency I thin

Re: Spark progress feedback

2019-01-28 Thread Juan Carlos Garcia
Matt, is the machine from which you are launching the pipeline different from the one where it should run? If that's the case, make sure the machine used for launching has all the HDFS environment variables set, as the pipeline is configured on the launching machine before it hits the worker machines.

Spark progress feedback

2019-01-28 Thread Matt Casters
Dear Beam friends, In preparation for my presentation of the Kettle Beam work in London next week, I've been trying to get Spark Beam to run, which worked in the end. The problem that resurfaced, however, is ... once again ... back with a vengeance: java.lang.IllegalArgumentException: No filesystem f