If you are using 127.0.0.1, please check /etc/hosts and either comment out
the 127.0.1.1 entry or name it localhost.
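For illustration only (the host name is the one from the exception quoted below), the relevant /etc/hosts entries would look roughly like this:

127.0.0.1   localhost
127.0.1.1   localhost            # per the advice above: comment this out or name it localhost
127.0.0.1   dhcp-10-35-14-100    # make the failing host name resolvable as well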
On Sat, Mar 21, 2015 at 9:57 AM, Ted Yu wrote:
> bq. Caused by: java.net.UnknownHostException: dhcp-10-35-14-100: Name or
> service not known
>
> Can you check your DNS?
>
> Che
No, I didn't mean local cluster. I mean run in local, like in IDE.
On Mon, 16 Mar 2015 23:12 xu Peng wrote:
> Hi David,
>
> You can try the local-cluster.
>
> the numbers in local-cluster[2,2,1024] mean there are 2 workers, 2 cores
> per worker, and 1024 MB of memory per worker
>
> Best Regards
>
> Peng Xu
>
> 2015-03-16
Hi Sean,
It's getting strange now. If I run from the IDE, my executor memory is always
set to 6.7G, no matter what value I set in code. I have checked my
environment variables, and there's no value of 6.7 or 12.5.
Any idea?
Thanks,
David
On Tue, 17 Mar 2015 00:35 null wrote:
> Hi Xi Shen,
>
> You c
Hey Eason!
Weird problem indeed. More information will probably help to find the issue:
Have you searched the logs for peculiar messages?
What does your Spark environment look like? #workers, #threads, etc.?
Does it work if you create separate receivers for the topics?
Regards,
Jeff
2015-03-21 2:27
Hi,
I'm not completely sure about this either, but this is what we are doing
currently:
Configure your logging to write to STDOUT, not to a file explicitly. Spark
will capture stdout and stderr and separate the messages into an app/driver
folder structure in the configured worker directory.
We the
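In case it helps, a minimal log4j.properties along those lines (assuming the standard log4j 1.x setup that Spark ships with) might look like:

# write everything to stdout; the worker captures it per app/driver
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.out
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n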
bq. Requesting 1 new executor(s) because tasks are backlogged
1 executor was requested.
Which hadoop release are you using ?
Can you check resource manager log to see if there is some clue ?
Thanks
On Fri, Mar 20, 2015 at 4:17 PM, Manoj Samel
wrote:
> Forgot to add - the cluster is idle othe
hey mike!
you'll definitely want to increase your parallelism by adding more shards to
the stream - as well as spinning up 1 receiver per shard and unioning all the
shards per the KinesisWordCount example that is included with the kinesis
streaming package.
you'll need more cores (cluster) or t
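A rough sketch of the receiver-per-shard + union pattern described above (the stream name, endpoint, and shard count are placeholders, an existing SparkContext sc is assumed, and the exact KinesisUtils.createStream signature depends on the Spark version):

import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kinesis.KinesisUtils
import com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream

val ssc = new StreamingContext(sc, Seconds(2))
val numShards = 4  // placeholder: use your stream's actual shard count

// one receiver (input DStream) per shard, then union them into a single DStream
val shardStreams = (0 until numShards).map { _ =>
  KinesisUtils.createStream(ssc, "myStream", "https://kinesis.us-east-1.amazonaws.com",
    Seconds(2), InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2)
}
val unifiedStream = ssc.union(shardStreams)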
If you are running from your IDE, then I don't know what you are
running or in what mode. The discussion here concerns using standard
mechanisms like spark-submit to configure executor memory. Please try
these first instead of trying to directly invoke Spark, which will
require more understanding o
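For reference, the standard spark-submit flags for this look roughly like the following (all values are placeholders):

# --master can be local[*], spark://host:7077, yarn, etc.
spark-submit \
  --master local[4] \
  --driver-memory 4g \
  --executor-memory 4g \
  --class com.example.MyApp \
  my-app.jar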
Mike:
Once hadoop 2.7.0 is released, you should be able to enjoy the enhanced
performance of s3a.
See HADOOP-11571
Cheers
On Sat, Mar 21, 2015 at 8:09 AM, Chris Fregly wrote:
> hey mike!
>
> you'll definitely want to increase your parallelism by adding more shards
> to the stream - as well as s
Hi,
I wonder if someone can help suggest a solution to my problem, I had a simple
process working using Strings and now
want to convert to RDD[Char], the problem is when I end up with a nested call
as follow:
1) Load a text file into an RDD[Char]
val inputRDD = sc.textFile("myFile.txt
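Not having the rest of the snippet, one possible way to go from the loaded lines to an RDD[Char] is to flatMap each line into its characters, e.g.:

// note: textFile strips newlines, so they are not present in the result
val charRDD: org.apache.spark.rdd.RDD[Char] = sc.textFile("myFile.txt").flatMap(_.toSeq)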
Hi,
Apologies for the generic question.
I am developing predictive models for the first time, and the model will
be deployed in production very soon.
Could somebody help me with model deployment in production? I have
read quite a bit on model deployment and have read some books on Datab
I am consistently running into this ArrayIndexOutOfBoundsException issue
when using trainImplicit. I have tried changing the partitions and
switching to JavaSerializer. But they don't seem to help. I see that this
is the same as https://issues.apache.org/jira/browse/SPARK-3080. My lambda
is 0.01, r
Is there a module in spark streaming that lets you listen to
the alerts/conditions as they happen in the streaming module? Generally
spark streaming components will execute on large set of clusters like hdfs
or Cassandra, however when it comes to alerting you generally can't send it
directly from t
1. make sure your secret key doesn't have a "/" in it. If it does, generate a
new key.
2. jets3t and hadoop JAR versions need to be in sync; jets3t 0.9.0 was picked
up in Hadoop 2.4 and not AFAIK
3. Hadoop 2.6 has a new S3 client, "s3a", which is compatible with s3n data. It
uses the AWS toolkit
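A short sketch of wiring the credentials into the Hadoop configuration from Spark (the key values and bucket path are placeholders; the property names are the standard s3n/s3a ones):

// s3n (jets3t-based)
sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", "YOUR_ACCESS_KEY")
sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", "YOUR_SECRET_KEY")

// s3a (Hadoop 2.6+, AWS SDK based)
sc.hadoopConfiguration.set("fs.s3a.access.key", "YOUR_ACCESS_KEY")
sc.hadoopConfiguration.set("fs.s3a.secret.key", "YOUR_SECRET_KEY")

val lines = sc.textFile("s3n://my-bucket/path/to/data")  // or s3a://...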
Thank you for your help Akhil! We found that, starting from version 1.2.0
and beyond (v1.1.1 and below are fine), we can no longer connect remotely
from our laptop to the remote Spark cluster, although it still works if the
client is on the cluster itself. Not sure if this is related
th
I believe that you can get what you want by using HiveQL instead of the
pure programmatic API. This is a little verbose, so perhaps a specialized
function would also be useful here. I'm not sure I would call it
saveAsExternalTable as there are also "external" spark sql data source
tables that have
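To illustrate the HiveQL route (the table name, location, and schema below are made up; this assumes a HiveContext named sqlContext and a registered temp table "myData"):

sqlContext.sql("""
  CREATE EXTERNAL TABLE my_external_table (key INT, value STRING)
  STORED AS PARQUET
  LOCATION '/user/hive/warehouse/my_external_table'
""")
sqlContext.sql("INSERT INTO TABLE my_external_table SELECT key, value FROM myData")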
>
> Now, I am not able to directly use my RDD object and have it implicitly
> become a DataFrame. It can be used as a DataFrameHolder, of which I could
> write:
>
> rdd.toDF.registerTempTable("foo")
>
The rationale here was that we added a lot of methods to DataFrame and made
the implicits more
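For context, the Spark 1.3 pattern looks roughly like this (the case class and data are placeholders; the implicits come from the SQLContext):

// in the spark-shell or with a top-level case class
import sqlContext.implicits._

case class Record(id: Int, name: String)
val rdd = sc.parallelize(Seq(Record(1, "a"), Record(2, "b")))

val df = rdd.toDF()       // explicit conversion replaces the old implicit one
df.registerTempTable("foo")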
I have a couple of data frames that I pulled from SparkSQL and the primary
key of one is a foreign key of the same name in the other. I'd rather not
have to specify each column in the SELECT statement just so that I can
rename this single column.
When I try to join the data frames, I get an excep
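One possible workaround, assuming the shared key column is called "id" and the two frames are df1 and df2 (the other column names are placeholders): alias the frames so the key reference is unambiguous, then keep a single copy of it in the projection:

import org.apache.spark.sql.functions.col

val joined = df1.as("a").join(df2.as("b"), col("a.id") === col("b.id"))
val result = joined.select(col("a.id").as("id"), col("a.x"), col("b.y"))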
In the log, I saw
MemoryStorage: MemoryStore started with capacity 6.7GB
But I still cannot find where to set this storage capacity.
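In case it is useful: when everything runs in one local JVM (as it does from the IDE), the MemoryStore capacity is derived from that JVM's heap rather than from the executor memory setting; with the 1.x defaults it is roughly spark.storage.memoryFraction (0.6) * safety fraction (0.9) * max heap, which would put 6.7 GB at about a 12.5 GB heap. So the knob is the heap itself, for example a VM option in the IDE run configuration:

# IDE run configuration -> VM options (value is a placeholder)
-Xmx2g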
On Sat, 21 Mar 2015 20:30 Xi Shen wrote:
> Hi Sean,
>
> It's getting strange now. If I run from the IDE, my executor memory is always
> set to 6.7G, no matter wha
Yeah, I think it is harder to troubleshoot the properties issues in an IDE.
But the reason I stick to the IDE is that if I use spark-submit, the BLAS
native cannot be loaded. Maybe I should open another thread to discuss
that.
Thanks,
David
On Sun, 22 Mar 2015 10:38 Xi Shen wrote:
> In the log, I
Hi,
I use the *OpenBLAS* DLL, and have configured my application to work in the
IDE. When I start my Spark application from the IntelliJ IDE, I can see in the
log that the native lib is loaded successfully.
But if I use *spark-submit* to start my application, the native lib still
cannot be loaded. I saw th
Hi,
Does anyone have concrete recommendations on how to reduce Spark's logging
verbosity?
We have attempted on several occasions to address this by setting various
log4j properties, both in configuration property files and in
$SPARK_HOME/conf/spark-env.sh; however, all of those attempts have failed.
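For reference, the commonly suggested approach (assuming the standard setup) is to copy $SPARK_HOME/conf/log4j.properties.template to $SPARK_HOME/conf/log4j.properties and raise the levels there, e.g.:

# $SPARK_HOME/conf/log4j.properties
log4j.rootCategory=WARN, console
# quiet down particularly chatty components further
log4j.logger.org.apache.spark=WARN
log4j.logger.org.eclipse.jetty=ERROR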
Can you try the --driver-library-path option ?
spark-submit --driver-library-path /opt/hadoop/lib/native ...
Cheers
On Sat, Mar 21, 2015 at 4:58 PM, Xi Shen wrote:
> Hi,
>
> I use the *OpenBLAS* DLL, and have configured my application to work in
> IDE. When I start my Spark application from In
Hello,
I am trying to install Spark 1.3.0 on my mac. Earlier, I was working with
Spark 1.1.0. Now I come across this error:
sbt.ResolveException: unresolved dependency:
org.apache.spark#spark-network-common_2.10;1.3.0: configuration not public
in org.apache.spark#spark-network-common_2.10;1.3.0
bq. the BLAS native cannot be loaded
Have you tried specifying --driver-library-path option ?
Cheers
On Sat, Mar 21, 2015 at 4:42 PM, Xi Shen wrote:
> Yeah, I think it is harder to troubleshoot the properties issues in an IDE.
> But the reason I stick to the IDE is that if I use spark-submit, the
Hi Shashidhar,
Our team at PredictionIO is trying to solve the production deployment of
model. We built a powered-by-Spark framework (also certified on Spark by
Databricks) that allows a user to build models with everything available
from the Spark API, persist the model automatically with version
Hi,
I have two big RDDs, and I need to do some math on each pair of elements from
them. Traditionally this would be a nested for-loop, but with RDDs it causes a
nested RDD, which is prohibited.
Currently I am collecting one of them and then doing a nested for-loop to
avoid the nested RDD, but I would like to know if t
You can do this with the 'cartesian' product method on RDD. For example:
val rdd1 = ...
val rdd2 = ...
val combinations = rdd1.cartesian(rdd2).filter{ case (a,b) => a < b }
Reza
On Sat, Mar 21, 2015 at 10:37 PM, Xi Shen wrote:
> Hi,
>
> I have two big RDD, and I need to do some math against e
What do you mean by "not distinct"?
It does work for me:
Code:
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.{SparkContext, SparkConf}
val ssc = new StreamingContext(sc, Seconds(1))
val data =
ssc.textFileStream("/home/akhld/mobi/loca