IPython Notebook Debug Spam

2014-09-30 Thread Rick Richardson
I am experiencing significant logging spam when running PySpark in IPython Notebook. Exhibit A: http://i.imgur.com/BDP0R2U.png I have taken into consideration advice from: http://apache-spark-user-list.1001560.n3.nabble.com/Disable-all-spark-logging-td1960.html and also http://stackoverflow.com/ques

Re: IPython Notebook Debug Spam

2014-10-01 Thread Rick Richardson
to rolling file, or removed it entirely, all with the same results. On Wed, Oct 1, 2014 at 1:49 PM, Davies Liu wrote: > On Tue, Sep 30, 2014 at 10:14 PM, Rick Richardson wrote: > > I am experiencing significant logging spam when running PySpark in IPython > > Note
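
A minimal sketch of a runtime workaround (not from the thread, and `sc._jvm` is a py4j internal, so treat it as a hack): with a live SparkContext `sc` in the notebook, the JVM's log4j levels can be raised directly through the py4j gateway, sidestepping log4j.properties entirely:

    # raise log4j levels from inside the notebook session
    log4j = sc._jvm.org.apache.log4j
    log4j.LogManager.getRootLogger().setLevel(log4j.Level.WARN)
    log4j.LogManager.getLogger("org.apache.spark").setLevel(log4j.Level.WARN)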

Re: IPython Notebook Debug Spam

2014-10-01 Thread Rick Richardson
> If you want to reduce the logging in the console, you should change > /opt/spark-1.1.0/conf/log4j.properties > > log4j.rootCategory=WARN, console > log4j.logger.org.apache.spark=WARN > > On Wed, Oct 1, 2014 at 11:49 AM, Rick Richardson wrote: > > Thanks for your reply. Un

Re: IPython Notebook Debug Spam

2014-10-01 Thread Rick Richardson
--executor-memory 1g --executor-cores 1" ipython notebook --profile=pyspark On Wed, Oct 1, 2014 at 3:41 PM, Rick Richardson wrote: > I was starting PySpark as a profile within IPython Notebook as per: > > http://blog.cloudera.com/blog/2014/08/how-to-use-ipython-notebook-with-apache-spark/ > >
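
For reference, the Cloudera post cited above wires PySpark into an IPython profile with a Python startup file. A rough sketch (the startup path follows the post's convention, the py4j zip name varies by Spark release, and execfile is Python 2):

    # ~/.ipython/profile_pyspark/startup/00-pyspark-setup.py
    import os
    import sys

    spark_home = os.environ.get('SPARK_HOME')
    if not spark_home:
        raise ValueError('SPARK_HOME environment variable is not set')
    sys.path.insert(0, os.path.join(spark_home, 'python'))
    # the py4j zip name depends on the Spark release
    sys.path.insert(0, os.path.join(spark_home, 'python/lib/py4j-0.8.2.1-src.zip'))
    # start the PySpark shell, which creates the SparkContext `sc`
    execfile(os.path.join(spark_home, 'python/pyspark/shell.py'))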

Re: IPython Notebook Debug Spam

2014-10-01 Thread Rick Richardson
Out of curiosity, how do you actually launch pyspark in your set-up? On Wed, Oct 1, 2014 at 3:44 PM, Rick Richardson wrote: > Here is the other relevant bit of my set-up: > MASTER=spark://sparkmaster:7077 > IPYTHON_OPTS="notebook --pylab inline --ip=0.0.0.0" > CASSA

Re: IPython Notebook Debug Spam

2014-10-01 Thread Rick Richardson
all of the jars from the classpath and it began to use the SPARK_HOME/conf/log4j.properties. On Wed, Oct 1, 2014 at 3:46 PM, Rick Richardson wrote: > Out of curiosity, how do you actually launch pyspark in your set-up? > > On Wed, Oct 1, 2014 at 3:44 PM, Rick Richardson wrote: >
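
The resolution here was presumably that a jar on the classpath bundled its own log4j.properties, which shadows the one in SPARK_HOME/conf. A hypothetical way to check for such jars (the lib path is assumed from the /opt/spark-1.1.0 install mentioned above):

    # list jars that carry their own log4j.properties
    import glob
    import zipfile

    for jar in glob.glob('/opt/spark-1.1.0/lib/*.jar'):
        with zipfile.ZipFile(jar) as zf:
            if 'log4j.properties' in zf.namelist():
                print(jar)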

Re: Spark as Relational Database

2014-10-26 Thread Rick Richardson
Spark's API definitely covers all of the things that a relational database can do. It will probably outperform a relational star schema if your entire *working* data set fits into RAM on your cluster. It will still perform quite well if most of the data fits and some has to spill over to disk.
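
A minimal illustration of relational-style queries in PySpark, using the Spark 1.1-era SQL API (table and column names are made up):

    from pyspark.sql import SQLContext, Row

    sqlContext = SQLContext(sc)
    rows = sc.parallelize([Row(user='a', amount=10), Row(user='b', amount=25)])
    events = sqlContext.inferSchema(rows)   # yields a SchemaRDD in Spark 1.1
    events.registerTempTable('events')
    totals = sqlContext.sql('SELECT user, SUM(amount) AS total FROM events GROUP BY user')
    print(totals.collect())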

Re: Spark as Relational Database

2014-10-26 Thread Rick Richardson
>> >> For the moment, it looks like we should store these events in SQL. When >> appropriate, we will do analysis with relational queries. Or, when >> appropriate, we will extract data into working sets in Spark. >> >> I imagine this is a pretty common use case for S
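
A sketch of the extract-a-working-set pattern described here (the export path and CSV format are assumptions): dump the relevant events from the SQL store to flat files, then load, parse, and cache them in Spark for repeated analysis:

    raw = sc.textFile('hdfs:///exports/recent_events.csv')
    events = raw.map(lambda line: line.split(','))
    events.cache()   # keep the working set in cluster memory
    print(events.count())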