I spent ages on this recently, and here's what I found:

--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:///local/file/on.executor.properties"

works. Alternatively, you can also do:

--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=filename.properties" --files="path/to/filename.properties"

log4j.properties files packaged with the application don't seem to have any effect. This is likely because log4j gets initialised before your app stuff is loaded.

You can also reinitialise log4j logging as part of your application code. That also worked for us, but we went the extraJavaOptions route as it was less invasive on the application side.

-Ashic.
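For the "reinitialise log4j in application code" route, a minimal sketch could look roughly like the following, assuming log4j 1.2 on the executor classpath and a config file shipped to the executors with --files ("filename.properties" and the app/object names are placeholders, not anything from the thread):

import org.apache.log4j.{Logger, PropertyConfigurator}
import org.apache.spark.{SparkConf, SparkContext, SparkFiles}

object ExecutorLoggingSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("executor-logging-sketch"))
    sc.parallelize(1 to 100, numSlices = 4).foreachPartition { partition =>
      // Re-point log4j at the config shipped with --files; SparkFiles.get
      // resolves the local path of that file on this executor.
      // "filename.properties" is a placeholder for your executor config.
      PropertyConfigurator.configure(SparkFiles.get("filename.properties"))
      val log = Logger.getLogger("ExecutorLoggingSketch")
      partition.foreach(n => log.info(s"processed record $n"))
    }
    sc.stop()
  }
}

Calling PropertyConfigurator.configure at the start of each partition is deliberately coarse; it simply guarantees the reconfiguration has happened on every executor JVM before any per-record logging runs.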
Date: Mon, 18 Apr 2016 10:32:03 -0300
Subject: Re: Logging in executors
From: cma...@despegar.com
To: yuzhih...@gmail.com
CC: user@spark.apache.org

Thanks Ted, already checked it but it's not the same. I'm working with standalone Spark, while those examples refer to HDFS paths, so I assume the Hadoop 2 Resource Manager (YARN) is used there. I've tried all possible flavours. The only one that worked was changing spark-defaults.conf on every machine. I'll go with this for now, but the extra Java opts for the executor are definitely not working, at least for logging configuration.

Thanks,
-carlos.

On Fri, Apr 15, 2016 at 3:28 PM, Ted Yu <yuzhih...@gmail.com> wrote:

See this thread: http://search-hadoop.com/m/q3RTtsFrd61q291j1

On Fri, Apr 15, 2016 at 5:38 AM, Carlos Rojas Matas <cma...@despegar.com> wrote:

Hi guys, any clue on this? Clearly spark.executor.extraJavaOptions=-Dlog4j.configuration is not working on the executors.

Thanks,
-carlos.

On Wed, Apr 13, 2016 at 2:48 PM, Carlos Rojas Matas <cma...@despegar.com> wrote:

Hi Yong, thanks for your response. As I said in my first email, I've tried both the reference to the classpath resource (env/dev/log4j-executor.properties) and the file:// protocol. Also, the driver logging is working fine and I'm using the same kind of reference. Below the content of my classpath:

Plus this is the content of the exploded fat jar assembled with the sbt assembly plugin:

This folder is at the root level of the classpath.

Thanks,
-carlos.

On Wed, Apr 13, 2016 at 2:35 PM, Yong Zhang <java8...@hotmail.com> wrote:

Is the env/dev/log4j-executor.properties file within your jar file? Is the path matching what you specified as env/dev/log4j-executor.properties? If you read the log4j documentation here: https://logging.apache.org/log4j/1.2/manual.html

When you specify log4j.configuration=my_custom.properties, you have two options:
1) my_custom.properties has to be in the jar (or on the classpath). In your case, since you specify the package path, you need to make sure it matches the layout inside your jar file.
2) Use log4j.configuration=file:///tmp/my_custom.properties. In this case, you need to make sure the file my_custom.properties exists in the /tmp folder on ALL of your worker nodes.

Yong

Date: Wed, 13 Apr 2016 14:18:24 -0300
Subject: Re: Logging in executors
From: cma...@despegar.com
To: yuzhih...@gmail.com
CC: user@spark.apache.org

Thanks for your response Ted. You're right, there was a typo. I changed it; now I'm executing:

bin/spark-submit --master spark://localhost:7077 --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=env/dev/log4j-driver.properties" --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=env/dev/log4j-executor.properties" --class....
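To make Yong's option 2 concrete, here is a rough sketch of setting the executor option programmatically through SparkConf instead of on the spark-submit command line. The /tmp path and app name are placeholders, and as noted above the file must already exist at that path on every worker; driver-side -D options still have to go on the spark-submit command line, since the driver JVM is already running by the time this SparkConf is built:

import org.apache.spark.{SparkConf, SparkContext}

object LoggingConfExample {
  def main(args: Array[String]): Unit = {
    // Equivalent to passing --conf on spark-submit for the executor side only.
    val conf = new SparkConf()
      .setAppName("executor-logging")
      .set("spark.executor.extraJavaOptions",
           "-Dlog4j.configuration=file:///tmp/my_custom.properties")
    val sc = new SparkContext(conf)
    // ... job code ...
    sc.stop()
  }
}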
The content of this file is:

# Set everything to be logged to the console
log4j.rootCategory=INFO, FILE
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

log4j.appender.FILE=org.apache.log4j.RollingFileAppender
log4j.appender.FILE.File=/tmp/executor.log
log4j.appender.FILE.ImmediateFlush=true
log4j.appender.FILE.Threshold=debug
log4j.appender.FILE.Append=true
log4j.appender.FILE.MaxFileSize=100MB
log4j.appender.FILE.MaxBackupIndex=5
log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.FILE.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

# Settings to quiet third party logs that are too verbose
log4j.logger.org.spark-project.jetty=WARN
log4j.logger.org.spark-project.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
log4j.logger.org.apache.parquet=ERROR
log4j.logger.parquet=ERROR
log4j.logger.com.despegar.p13n=DEBUG

# SPARK-9183: Settings to avoid annoying messages when looking up nonexistent UDFs in SparkSQL with Hive support
log4j.logger.org.apache.hadoop.hive.metastore.RetryingHMSHandler=FATAL
log4j.logger.org.apache.hadoop.hive.ql.exec.FunctionRegistry=ERROR

Finally, the code in which I'm using logging on the executor is:

def groupAndCount(keys: DStream[(String, List[String])])(handler: ResultHandler) = {

  val result = keys.reduceByKey((prior, current) => prior ::: current).flatMap { case (date, keys) =>
    val rs = keys.groupBy(x => x).map { obs =>
      val (d, t) = date.split("@") match { case Array(d, t) => (d, t) }
      import org.apache.log4j.Logger
      import scala.collection.JavaConverters._
      val logger: Logger = Logger.getRootLogger
      logger.info(s"Metric retrieved $d")
      Metric("PV", d, obs._1, t, obs._2.size)
    }
    rs
  }

  result.foreachRDD((rdd: RDD[Metric], time: Time) => handler(rdd, time))
}

Originally the import and the logger object were outside the map function. I'm also using the root logger just to see if it works, but nothing gets logged. I've checked that the property is set correctly on the executor side through println(System.getProperty("log4j.configuration")) and it's OK, but still not working.

Thanks again,
-carlos.
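One way to take that check a step further is to inspect, from inside the executors, not just the log4j.configuration system property but the appenders actually attached to the root logger: if the FILE appender defined in log4j-executor.properties does not show up, the file was never loaded on the executor side regardless of what the property says. The following is a diagnostic sketch only (not from the thread), assuming log4j 1.2 on the executor classpath; the object and app names are placeholders:

import org.apache.log4j.{Appender, Logger}
import org.apache.spark.{SparkConf, SparkContext}

object ExecutorLog4jDiagnostic {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("log4j-diagnostic"))
    // Run a few tasks and inspect log4j from inside the executor JVMs.
    sc.parallelize(1 to 100, numSlices = 4).foreachPartition { _ =>
      // The -D property passed via spark.executor.extraJavaOptions, if any.
      println("log4j.configuration = " + System.getProperty("log4j.configuration"))
      // The appenders the executor's root logger actually ended up with.
      val appenders = Logger.getRootLogger.getAllAppenders
      while (appenders.hasMoreElements) {
        val appender = appenders.nextElement().asInstanceOf[Appender]
        println("root logger appender: " + appender.getName + " (" + appender.getClass.getName + ")")
      }
    }
    sc.stop()
  }
}

The println output lands in each executor's stdout, which in standalone mode is visible per executor in the worker web UI.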