Re: Submitting to a cluster behind a VPN, configuring different IP address

2015-04-02 Thread Michael Quinlan
I was able to hack around this on my similar setup by running (on the driver): $ sudo hostname ip, where ip is the same value set in the "spark.driver.host" property. This isn't a solution I would use universally, and I hope someone can fix this bug in the distribution. Regards, Mike
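For reference, a minimal sketch of setting the property in question from code (the 10.8.0.6 address and app name are placeholders for whatever VPN-side IP the driver should advertise):

    import org.apache.spark.SparkConf;

    // Placeholder address: use the VPN-side IP that executors can reach.
    // The hostname hack above suggests the machine hostname and this
    // property need to agree for the driver to bind correctly.
    SparkConf conf = new SparkConf()
        .setAppName("vpn-submit")
        .set("spark.driver.host", "10.8.0.6");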

Java Implementation of StreamingContext.fileStream

2014-09-22 Thread Michael Quinlan
I'm attempting to code a Java-only implementation accessing the StreamingContext.fileStream method, and am especially interested in setting the boolean "newFilesOnly" to false. Unfortunately, my code throws exceptions: Exception in thread "main" java.lang.InstantiationException at sun.reflec

Re: Java Implementation of StreamingContext.fileStream

2014-09-23 Thread Michael Quinlan
Thanks very much for the pointer, which validated my initial approach. It turns out that I was creating a tag for the abstract class "InputFormat.class". Using "TextInputFormat.class" instead fixed my issue. Regards, Mike
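For anyone hitting the same InstantiationException: a minimal sketch of the tag construction this fix implies (Spark 1.x era, Java; variable names are illustrative). Spark instantiates the input format reflectively, so the tag must name a concrete class:

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
    import scala.reflect.ClassTag;
    import scala.reflect.ClassTag$;

    // Tagging the abstract InputFormat.class fails when Spark tries to
    // instantiate it reflectively; tag a concrete subclass instead.
    ClassTag<LongWritable> keyTag = ClassTag$.MODULE$.apply(LongWritable.class);
    ClassTag<Text> valueTag = ClassTag$.MODULE$.apply(Text.class);
    ClassTag<TextInputFormat> formatTag = ClassTag$.MODULE$.apply(TextInputFormat.class);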

Re: Submitting multiple jobs via different threads

2014-12-12 Thread Michael Quinlan
Haoming, If the Spark UI shows that one of the jobs is in the "Waiting" state, this is a resource issue. You will need to set properties such as spark.executor.memory and spark.cores.max so that each instance takes only a portion of the available worker memory and cores. Regards, Mike
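A minimal sketch of those settings (the "2g" and "4" values and the app name are placeholders; size them so the concurrent apps together fit within the workers' total memory and cores):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    SparkConf conf = new SparkConf()
        .setAppName("job-a")                  // one of the concurrent jobs
        .set("spark.executor.memory", "2g")   // heap per executor
        .set("spark.cores.max", "4");         // total-core cap for this app
    JavaSparkContext sc = new JavaSparkContext(conf);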

Re: Using Customized Hadoop InputFormat class with Spark Streaming

2014-12-19 Thread Michael Quinlan
Soroka, You should be able to use the fileStream() method of the JavaStreamingContext. In case you need something more custom, the code below is something I developed to provide the full functionality of the Scala method, but implemented in Java. //Set these to reflect your app and input format sp
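The posted code is cut off in the archive. What follows is a reconstruction sketch, not the original: one plausible way, assuming Spark 1.x, to reach the Scala fileStream's filter and newFilesOnly parameters from Java via the underlying StreamingContext. The directory path and app name are placeholders:

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Duration;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.dstream.InputDStream;
    import scala.Tuple2;
    import scala.reflect.ClassTag;
    import scala.reflect.ClassTag$;
    import scala.runtime.AbstractFunction1;

    public class CustomFormatStream {
      public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("custom-format-stream");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, new Duration(10000));

        // Tags for key, value, and a concrete input format class
        // (substitute your customized InputFormat subclass here).
        ClassTag<LongWritable> keyTag = ClassTag$.MODULE$.apply(LongWritable.class);
        ClassTag<Text> valueTag = ClassTag$.MODULE$.apply(Text.class);
        ClassTag<TextInputFormat> fmtTag = ClassTag$.MODULE$.apply(TextInputFormat.class);

        // Path filter as a Scala Function1; scala.Boolean erases to Object here.
        AbstractFunction1<Path, Object> acceptAll = new AbstractFunction1<Path, Object>() {
          public Object apply(Path path) { return Boolean.TRUE; }
        };

        // Call the underlying Scala StreamingContext to pass the filter and
        // newFilesOnly; false also picks up files already in the directory.
        InputDStream<Tuple2<LongWritable, Text>> stream =
            jssc.ssc().fileStream("/data/incoming", acceptAll, false,
                                  keyTag, valueTag, fmtTag);

        stream.print();
        jssc.start();
        jssc.awaitTermination();
      }
    }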

Re: removing first record from RDD[String]

2014-12-23 Thread Michael Quinlan
Hafiz, You can probably use the RDD.mapPartitionsWithIndex method; see the sketch below the quoted question. Mike

On Tue, Dec 23, 2014 at 8:35 AM, Hafiz Mujadid [via Apache Spark User List] wrote:
> Is there an efficient way to drop the first line of an RDD[String]?
> Any suggestions? Thanks
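A minimal Java sketch of that approach, assuming rdd is an existing JavaRDD<String> and the data was read from a single file in order, so the header sits at the start of partition 0:

    import java.util.Iterator;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.function.Function2;

    JavaRDD<String> withoutHeader = rdd.mapPartitionsWithIndex(
        new Function2<Integer, Iterator<String>, Iterator<String>>() {
          public Iterator<String> call(Integer index, Iterator<String> it) {
            if (index == 0 && it.hasNext()) {
              it.next();  // discard the first record only
            }
            return it;
          }
        }, false);  // preservesPartitioning = false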

Re: Clean up app folders in worker nodes

2014-12-29 Thread Michael Quinlan
I'm also interested in the solution to this. Thanks, Mike

On Mon, Dec 29, 2014 at 12:01 PM, hutashan [via Apache Spark User List] wrote:
> Hello All,
> I need to clean up the app folders (including the downloaded app jars) under Spark's work folder.
> I have