Re: Spark on Mesos 0.20

2014-10-07 Thread RJ Nowling
I was able to reproduce it on a small 4 node cluster (1 mesos master and 3 mesos slaves) with relatively low-end specs. As I said, I just ran the log query examples with the fine-grained mesos mode. Spark 1.1.0 and mesos 0.20.1. Fairiz, could you try running the logquery example included with Sp

Local tests logging to log4j

2014-10-07 Thread Debasish Das
Hi, I have added some changes to ALS tests and I am re-running tests as: mvn -Dhadoop.version=2.3.0-cdh5.1.0 -Phadoop-2.3 -Pyarn -DwildcardSuites=org.apache.spark.mllib.recommendation.ALSSuite test I have some INFO logs in the code which I want to see on my console. They work fine if I add print

Re: TorrentBroadcast slow performance

2014-10-07 Thread Davies Liu
Could you create a JIRA for it? Maybe it's a regression after https://issues.apache.org/jira/browse/SPARK-3119. We would appreciate it if you could tell us how to reproduce it. On Mon, Oct 6, 2014 at 1:27 AM, Guillaume Pitel wrote: > Hi, > > I've had no answer to this on u...@spark.apache.org, so

Re: TorrentBroadcast slow performance

2014-10-07 Thread Matei Zaharia
Maybe there is a firewall issue that makes it slow for your nodes to connect through the IP addresses they're configured with. I see there's this 10 second pause between "Updated info of block broadcast_84_piece1" and "ensureFreeSpace(4194304) called" (where it actually receives the block). HTTP

Re: Local tests logging to log4j

2014-10-07 Thread Sean Owen
What has worked for me is to bundle log4j.properties in the root of the application's .jar file, since log4j will look for it there, and configuring log4j will turn off Spark's default log4j configuration. I don't think conf/log4j.properties is going to do anything by itself, but -Dlog4j.configura

Re: Extending Scala style checks

2014-10-07 Thread Nicholas Chammas
For starters, do we have a list of all the Scala style rules that are currently not enforced automatically but are likely well-suited for automation? Let's put such a list together in a JIRA issue and work through implementing them. Nick On Thu, Oct 2, 2014 at 12:06 AM, Cheng Lian wrote: > Sin

Re: Local tests logging to log4j

2014-10-07 Thread Debasish Das
Thanks Sean...trying them out... On Tue, Oct 7, 2014 at 12:24 PM, Sean Owen wrote: > What has worked for me is to bundle log4j.properties in the root of > the application's .jar file, since log4j will look for it there, and > configuring log4j will turn off Spark's default log4j configuration. >

Re: Spark on Mesos 0.20

2014-10-07 Thread Fairiz Azizi
Sure, could you point me to the example? The only thing I could find was https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/LogQuery.scala So do you mean running it like: MASTER="mesos://xxx:5050" ./run-example LogQuery I tried that and I can s

Unneeded branches/tags

2014-10-07 Thread Nicholas Chammas
Just curious: Are there branches and/or tags on the repo that we don’t need anymore? What are the scala-2.9 and streaming branches for, for example? And do we still need branches for older versions of Spark that we are not backporting stuff to, like branch-0.5? Nick

Re: Unneeded branches/tags

2014-10-07 Thread Reynold Xin
Those branches are no longer active. However, I don't think we can delete branches from github due to the way ASF mirroring works. I might be wrong there. On Tue, Oct 7, 2014 at 6:25 PM, Nicholas Chammas wrote: > Just curious: Are there branches and/or tags on the repo that we don’t need > any

Re: Unneeded branches/tags

2014-10-07 Thread Patrick Wendell
Actually - weirdly - we can delete old tags and it works with the mirroring. Nick if you put together a list of un-needed tags I can delete them. On Tue, Oct 7, 2014 at 6:27 PM, Reynold Xin wrote: > Those branches are no longer active. However, I don't think we can delete > branches from github d

RE: Spark SQL question: why build hashtable for both sides in HashOuterJoin?

2014-10-07 Thread Haopu Wang
Liquan, yes, for full outer join, a hash table on each side is more efficient. For the left/right outer join, it looks like one hash table should be enough. From: Liquan Pei [mailto:liquan...@gmail.com] Sent: September 30, 2014 18:34 To: Haopu Wang Cc: dev@sp
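
To illustrate the point about left/right outer joins, here is a minimal Python sketch of a left outer join that builds a hash table on only one side (the right, non-preserved side) and streams the other. This is not Spark SQL's HashOuterJoin code; the function name, the dict-based rows, and the `key` parameter are all assumptions made for illustration.

```python
from collections import defaultdict


def left_outer_hash_join(left_rows, right_rows, key):
    """Illustrative left outer join using a single hash table.

    Hypothetical helper, not Spark's implementation. Rows are plain
    dicts and `key` is the name of the join column.
    """
    # Build the only hash table, on the right (non-preserved) side.
    table = defaultdict(list)
    for right in right_rows:
        table[right[key]].append(right)

    # Stream the left (preserved) side and probe the table.
    joined = []
    for left in left_rows:
        matches = table.get(left[key])
        if matches:
            for right in matches:
                joined.append((left, right))
        else:
            # No match: keep the left row, pad the right side with None.
            joined.append((left, None))
    return joined
```

A right outer join would be the mirror image (hash the left side, stream the right). A full outer join also has to emit the unmatched rows of the hashed side, which is the case where the thread suggests hashing both sides.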

Re: How to do broadcast join in SparkSQL

2014-10-07 Thread Jianshi Huang
Looks like https://issues.apache.org/jira/browse/SPARK-1800 is not merged into master? I cannot find spark.sql.hints.broadcastTables in latest master, but it's in the following patch. https://github.com/apache/spark/commit/76ca4341036b95f71763f631049fdae033990ab5 Jianshi On Mon, Sep 29, 2014