I was starting PySpark as a profile within IPython Notebook as per:
http://blog.cloudera.com/blog/2014/08/how-to-use-ipython-notebook-with-apache-spark/
My setup looks like:
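import os
import sys

# Locate the Spark installation from the environment.
spark_home = os.environ.get('SPARK_HOME', None)
if not spark_home:
    raise ValueError('SPARK_HOME environment variable is not set')

# Make the pyspark package and its bundled py4j importable.
sys.path.insert(0, os.path.join(spark_home, 'python'))
sys.path.insert(0, os.path.join(spark_home,
                                'python/lib/py4j-0.8.2.1-src.zip'))

# Run the PySpark shell bootstrap, which creates the SparkContext `sc`.
execfile(os.path.join(spark_home, 'python/pyspark/shell.py'))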
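
The py4j version is hard-coded in the path above, so the setup breaks whenever Spark ships a different py4j. A rough sketch of one way around that, assuming the zip keeps the py4j-<version>-src.zip naming under python/lib (pyspark_paths is a hypothetical helper name, not part of Spark):

```python
import glob
import os


def pyspark_paths(spark_home):
    """Return the sys.path entries PySpark needs under spark_home.

    Hypothetical helper: assumes the bundled py4j archive is named
    py4j-<version>-src.zip and lives in <spark_home>/python/lib.
    """
    python_dir = os.path.join(spark_home, 'python')
    pattern = os.path.join(python_dir, 'lib', 'py4j-*-src.zip')
    # Match whichever py4j source zip ships with this Spark build.
    py4j_zips = sorted(glob.glob(pattern))
    if not py4j_zips:
        raise ValueError('no py4j source zip found under %s' % python_dir)
    # If several zips match, take the last one in sorted order.
    return [python_dir, py4j_zips[-1]]
```

Each returned entry would then be passed to sys.path.insert(0, ...) in place of the two hard-coded calls.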
I also have some code to expand all of the jars (and the log4j.properties) in
SPARK_HOME and CASSANDRA_HOME and add them to the SPARK_CLASSPATH.
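
For reference, the jar expansion is roughly along these lines. This is a simplified sketch rather than my exact code; build_classpath and its parameters are illustrative names, and it assumes every file ending in .jar under the given directories should go on the classpath:

```python
import os


def build_classpath(homes, conf_dirs=()):
    """Join conf directories and every jar found under the given homes.

    Illustrative helper: conf_dirs are directories holding files such as
    log4j.properties and are placed ahead of the jars.
    """
    entries = list(conf_dirs)
    for home in homes:
        for root, dirs, files in os.walk(home):
            dirs.sort()  # make the traversal order deterministic
            for name in sorted(files):
                if name.endswith('.jar'):
                    entries.append(os.path.join(root, name))
    # os.pathsep is ':' on Linux/macOS and ';' on Windows.
    return os.pathsep.join(entries)
```

The result would be assigned to os.environ['SPARK_CLASSPATH'] before shell.py is executed, with homes set to the SPARK_HOME and CASSANDRA_HOME directories.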
I'll try your launch method and see how that goes.
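
In the meantime, a workaround along the lines of the StackOverflow question linked below is to lower the log level from inside the notebook through the py4j gateway. A minimal sketch, assuming a log4j 1.x backend and a live SparkContext; set_jvm_log_level is a made-up helper name, and sc._jvm is a private PySpark attribute that may change between releases:

```python
def set_jvm_log_level(sc, level_name='WARN'):
    """Set the root log4j level in the driver JVM behind SparkContext sc.

    Hypothetical helper; relies on the private sc._jvm py4j handle and on
    the log4j 1.x classes being present in the driver JVM.
    """
    log4j = sc._jvm.org.apache.log4j
    # Look up the static field on org.apache.log4j.Level, e.g. Level.WARN.
    level = getattr(log4j.Level, level_name)
    log4j.LogManager.getRootLogger().setLevel(level)
    return level
```

Calling set_jvm_log_level(sc) in the first notebook cell would only affect messages logged by the driver JVM after that point; anything printed while the context starts up would still appear.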
On Wed, Oct 1, 2014 at 3:31 PM, Davies Liu <[email protected]> wrote:
> How do you set up IPython to access pyspark in the notebook?
>
> I did the following, and it worked for me:
>
> $ export SPARK_HOME=/opt/spark-1.1.0/
> $ export PYTHONPATH=/opt/spark-1.1.0/python:/opt/spark-1.1.0/python/lib/py4j-0.8.2.1-src.zip
> $ ipython notebook
>
> All the logging will go to the console (not the notebook).
>
> If you want to reduce the logging in the console, you should change
> /opt/spark-1.1.0/conf/log4j.properties:
>
> log4j.rootCategory=WARN, console
> log4j.logger.org.apache.spark=WARN
>
>
> On Wed, Oct 1, 2014 at 11:49 AM, Rick Richardson
> <[email protected]> wrote:
> > Thanks for your reply. Unfortunately changing the log4j.properties within
> > SPARK_HOME/conf has no effect on pyspark for me. When I change it in the
> > master or workers the log changes have the desired effect, but pyspark seems
> > to ignore them. I have changed the levels to WARN, changed the appender to
> > rolling file, or removed it entirely, all with the same results.
> >
> > On Wed, Oct 1, 2014 at 1:49 PM, Davies Liu <[email protected]> wrote:
> >>
> >> On Tue, Sep 30, 2014 at 10:14 PM, Rick Richardson
> >> <[email protected]> wrote:
> >> > I am experiencing significant logging spam when running PySpark in
> >> > IPython Notebook
> >> >
> >> > Exhibit A: http://i.imgur.com/BDP0R2U.png
> >> >
> >> > I have taken into consideration advice from:
> >> >
> >> > http://apache-spark-user-list.1001560.n3.nabble.com/Disable-all-spark-logging-td1960.html
> >> >
> >> > also
> >> >
> >> > http://stackoverflow.com/questions/25193488/how-to-turn-off-info-logging-in-pyspark
> >> >
> >> > I have only one log4j.properties; it is in /opt/spark-1.1.0/conf
> >> >
> >> > Just before I launch IPython Notebook with a pyspark profile, I add the
> >> > dir and the properties file directly to CLASSPATH and SPARK_CLASSPATH
> >> > env vars (as you can also see from the png)
> >> >
> >> > I still haven't been able to make any change which disables this
> >> > infernal debug output.
> >> >
> >> > Any ideas (WAGs, Solutions, commiserating) would be greatly appreciated.
> >> >
> >> > ---
> >> >
> >> > My log4j.properties:
> >> >
> >> > log4j.rootCategory=INFO, console
> >> > log4j.appender.console=org.apache.log4j.ConsoleAppender
> >> > log4j.appender.console.layout=org.apache.log4j.PatternLayout
> >> > log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
> >>
> >> You should change log4j.rootCategory to WARN, console
> >>
> >> > # Change this to set Spark log level
> >> > log4j.logger.org.apache.spark=INFO
> >> >
> >> > # Silence akka remoting
> >> > log4j.logger.Remoting=WARN
> >> >
> >> > # Ignore messages below warning level from Jetty, because it's a bit verbose
> >> > log4j.logger.org.eclipse.jetty=WARN
> >> >
> >> >
> > --
> >
> >
> > “Science is the great antidote to the poison of enthusiasm and
> > superstition.” -- Adam Smith
> >
>