Loading a text file

2022-03-14 Thread Hinko Kocevar
I have a standalone Spark 3.2.0 cluster with two workers started on PC_A and want to run a PySpark job from PC_B. The job needs to load a text file. I keep getting file-not-found errors when I execute the job. The file "/home/bddev/parrot/words.txt" exists on PC_B but not on PC_A. tr

spark distribution build fails

2022-03-14 Thread Bulldog20630405
Using tag v3.2.1 with Java 8, I get a stack overflow when building the distribution: > alias mvn alias mvn='mvn --errors --fail-at-end -DskipTests ' > dev/make-distribution.sh --name 'hadoop-3.2' --pip --tgz -Phive -Phive-thriftserver -Pmesos -Pyarn -Pkubernetes [INFO] ---

Re: spark distribution build fails

2022-03-14 Thread Sean Owen
Try increasing the stack size in the build. It's the Xss argument you find in various parts of the pom or sbt build. I have seen this and am not sure why it happens in certain environments, but that's the workaround. On Mon, Mar 14, 2022, 8:59 AM Bulldog20630405 wrote: > > using tag v3.2.1 with java 8 getti
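
A minimal sketch of that workaround for the Maven build, assuming the build picks up JVM options from the MAVEN_OPTS environment variable. The sizes shown are illustrative assumptions, not values given in this thread; the Xss setting is the one that matters for the stack overflow.

```shell
# Assumption: the Maven-based build reads JVM options from MAVEN_OPTS.
# -Xss sets the per-thread stack size; 128m is an illustrative value,
# so raise it further if the stack overflow persists.
export MAVEN_OPTS="-Xss128m -Xmx2g -XX:ReservedCodeCacheSize=1g"

# Then rerun the same build command as in the original post, for example:
# dev/make-distribution.sh --name 'hadoop-3.2' --pip --tgz -Phive -Phive-thriftserver -Pmesos -Pyarn -Pkubernetes
```

If the build still overflows, the Xss values set inside the pom or sbt build, as mentioned above, are the other place to look.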

How Spark establishes connectivity to Hive

2022-03-14 Thread Venkatesan Muniappan
Hi Team, I want to understand how Spark connects to Hive. Does it connect to the Hive metastore directly, bypassing the Hive server? Let's say we are inserting data into a Hive table whose I/O format is Parquet. Does Spark create the Parquet file from the DataFrame/RDD/Dataset and put it in its

Re: spark distribution build fails

2022-03-14 Thread Bulldog20630405
Thanks; that worked great! On Mon, Mar 14, 2022 at 11:17 AM Sean Owen wrote: > Try increasing the stack size in the build. It's the Xss argument you find > in various parts of the pom or sbt build. I have seen this and not sure why > it happens on certain envs, but that's the workaround > > On Mo