Hello,
My Spark application is written in Scala and submitted to a Spark cluster
in standalone mode. The Spark Jobs for my application are listed in the
Spark UI like this:
Job Id Description ...
6 saveAsTextFile at Foo.scala:202
5 saveAsTextFile at Foo.scala:201
4
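To illustrate what the UI is showing: each saveAsTextFile call is an RDD action, and every action launches its own Spark job, labelled with the source line of the call site. A minimal hypothetical sketch (the names and paths below are made up, not from Foo.scala):
val records = sc.textFile("hdfs:///input/records")          // made-up input path
records.filter(_.nonEmpty).saveAsTextFile("hdfs:///out/a")   // one action -> one job (e.g. Foo.scala:201)
records.map(_.toUpperCase).saveAsTextFile("hdfs:///out/b")   // second action -> second job (e.g. Foo.scala:202)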
Hello,
I am trying to use the default Spark cluster manager in a production
environment. I will be submitting jobs with spark-submit. I wonder if the
following is possible:
1. Get the Driver ID from spark-submit. We will use this ID to keep track
of the job and kill it if necessary.
2. Whether i
>
>
> Hi Rares,
>
> The number of partitions is controlled by the HDFS input format, and one file
> may have multiple partitions if it consists of multiple blocks. In your case,
> I think there is one file with 2 splits.
>
> Thanks.
>
> Zhan Zhang
>
Hello,
I am using the Spark shell in Scala on localhost. I am using sc.textFile
to read a directory. The directory looks like this (generated by another
Spark script):
part-0
part-1
_SUCCESS
part-0 has four short lines of text, while part-1 has two short
lines of text. Th
Hi,
I have a private cluster with private IPs, 192.168.*.*, and a gateway node
with both a private IP, 192.168.*.*, and a public internet IP.
I set up the Spark master on the gateway node and set SPARK_MASTER_IP to
the private IP. I start Spark workers on the private nodes. It works fine.
The pro
Hello,
I am using takeSample from the Scala Spark 1.2.1 shell:
scala> sc.textFile("README.md").takeSample(false, 3)
and I notice that two jobs are generated on the Spark Jobs page:
Job Id Description
1 takeSample at <console>:13
0 takeSample at <console>:13
Any ideas why the two jobs are needed?
Thanks!
Rares
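For what it's worth, a rough sketch of why two jobs show up: takeSample needs the total element count before it can choose a sampling fraction, so it runs a count as one job and then a sampled collect as a second job. The code below is a simplified illustration of that two-step pattern, not the actual Spark implementation:
val rdd = sc.textFile("README.md")
val total = rdd.count()                                 // first job: total number of elements
val fraction = math.min(1.0, (3.0 * 1.2) / total)       // oversample a little to be safe
val candidates = rdd.sample(false, fraction).collect()  // second job: gather the sample
val result = scala.util.Random.shuffle(candidates.toSeq).take(3)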