Thanks Ted, already checked it but it's not the same. I'm working with standalone Spark, while the examples refer to HDFS paths, so I assume the Hadoop 2 Resource Manager is used there. I've tried all possible flavours. The only one that worked was changing spark-defaults.conf on every machine. I'll go with this for now, but the extra java opts for the executor are definitely not working, at least for the logging configuration.
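For reference, this is roughly the kind of entry that goes into conf/spark-defaults.conf on each machine; the file paths below are only examples, not the actual locations:

  # conf/spark-defaults.conf on every node; the properties files must exist locally at these paths
  spark.executor.extraJavaOptions  -Dlog4j.configuration=file:///opt/myapp/conf/log4j-executor.properties
  spark.driver.extraJavaOptions    -Dlog4j.configuration=file:///opt/myapp/conf/log4j-driver.properties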
Thanks,
-carlos.

On Fri, Apr 15, 2016 at 3:28 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> See this thread: http://search-hadoop.com/m/q3RTtsFrd61q291j1
>
> On Fri, Apr 15, 2016 at 5:38 AM, Carlos Rojas Matas <cma...@despegar.com> wrote:
>
>> Hi guys,
>>
>> any clue on this? Clearly the spark.executor.extraJavaOpts=-Dlog4j.configuration is not working on the executors.
>>
>> Thanks,
>> -carlos.
>>
>> On Wed, Apr 13, 2016 at 2:48 PM, Carlos Rojas Matas <cma...@despegar.com> wrote:
>>
>>> Hi Yong,
>>>
>>> thanks for your response. As I said in my first email, I've tried both the reference to the classpath resource (env/dev/log4j-executor.properties) and the file:// protocol. Also, the driver logging is working fine and I'm using the same kind of reference.
>>>
>>> Below is the content of my classpath:
>>>
>>> [image: Inline image 1]
>>>
>>> Plus this is the content of the exploded fat jar assembled with the sbt assembly plugin:
>>>
>>> [image: Inline image 2]
>>>
>>> This folder is at the root level of the classpath.
>>>
>>> Thanks,
>>> -carlos.
>>>
>>> On Wed, Apr 13, 2016 at 2:35 PM, Yong Zhang <java8...@hotmail.com> wrote:
>>>
>>>> Is the env/dev/log4j-executor.properties file within your jar file? Is the path matching what you specified as env/dev/log4j-executor.properties?
>>>>
>>>> If you read the log4j document here: https://logging.apache.org/log4j/1.2/manual.html
>>>>
>>>> When you specify log4j.configuration=my_custom.properties, you have 2 options:
>>>>
>>>> 1) my_custom.properties has to be in the jar (or on the classpath). In your case, since you specify the package path, you need to make sure it matches the layout inside your jar file.
>>>> 2) use log4j.configuration=file:///tmp/my_custom.properties. In this case, you need to make sure the file my_custom.properties exists in the /tmp folder on ALL of your worker nodes.
>>>>
>>>> Yong
>>>>
>>>> ------------------------------
>>>> Date: Wed, 13 Apr 2016 14:18:24 -0300
>>>> Subject: Re: Logging in executors
>>>> From: cma...@despegar.com
>>>> To: yuzhih...@gmail.com
>>>> CC: user@spark.apache.org
>>>>
>>>> Thanks for your response Ted. You're right, there was a typo. I changed it, now I'm executing:
>>>>
>>>> bin/spark-submit --master spark://localhost:7077 --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=env/dev/log4j-driver.properties" --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=env/dev/log4j-executor.properties" --class....
>>>>
>>>> The content of this file is:
>>>>
>>>> # Set everything to be logged to the FILE appender
>>>> log4j.rootCategory=INFO, FILE
>>>> log4j.appender.console=org.apache.log4j.ConsoleAppender
>>>> log4j.appender.console.target=System.err
>>>> log4j.appender.console.layout=org.apache.log4j.PatternLayout
>>>> log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
>>>>
>>>> log4j.appender.FILE=org.apache.log4j.RollingFileAppender
>>>> log4j.appender.FILE.File=/tmp/executor.log
>>>> log4j.appender.FILE.ImmediateFlush=true
>>>> log4j.appender.FILE.Threshold=debug
>>>> log4j.appender.FILE.Append=true
>>>> log4j.appender.FILE.MaxFileSize=100MB
>>>> log4j.appender.FILE.MaxBackupIndex=5
>>>> log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
>>>> log4j.appender.FILE.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
>>>>
>>>> # Settings to quiet third party logs that are too verbose
>>>> log4j.logger.org.spark-project.jetty=WARN
>>>> log4j.logger.org.spark-project.jetty.util.component.AbstractLifeCycle=ERROR
>>>> log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
>>>> log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
>>>> log4j.logger.org.apache.parquet=ERROR
>>>> log4j.logger.parquet=ERROR
>>>> log4j.logger.com.despegar.p13n=DEBUG
>>>>
>>>> # SPARK-9183: Settings to avoid annoying messages when looking up nonexistent UDFs in SparkSQL with Hive support
>>>> log4j.logger.org.apache.hadoop.hive.metastore.RetryingHMSHandler=FATAL
>>>> log4j.logger.org.apache.hadoop.hive.ql.exec.FunctionRegistry=ERROR
>>>>
>>>> Finally, the code in which I'm using logging in the executor is:
>>>>
>>>> def groupAndCount(keys: DStream[(String, List[String])])(handler: ResultHandler) = {
>>>>
>>>>   val result = keys.reduceByKey((prior, current) => {
>>>>     prior ::: current
>>>>   }).flatMap {
>>>>     case (date, keys) =>
>>>>       val rs = keys.groupBy(x => x).map(obs => {
>>>>         val (d, t) = date.split("@") match {
>>>>           case Array(d, t) => (d, t)
>>>>         }
>>>>         import org.apache.log4j.Logger
>>>>         val logger: Logger = Logger.getRootLogger
>>>>         logger.info(s"Metric retrieved $d")
>>>>         Metric("PV", d, obs._1, t, obs._2.size)
>>>>       })
>>>>       rs
>>>>   }
>>>>
>>>>   result.foreachRDD((rdd: RDD[Metric], time: Time) => {
>>>>     handler(rdd, time)
>>>>   })
>>>> }
>>>>
>>>> Originally the import and logger object were outside the map function. I'm also using the root logger just to see if it's working, but nothing gets logged. I've checked that the property is set correctly on the executor side through println(System.getProperty("log4j.configuration")) and it is OK, but still not working.
>>>>
>>>> Thanks again,
>>>> -carlos.
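P.S. A small sketch of how one could verify, from inside a task, both the -Dlog4j.configuration value and whether the properties file is actually visible on the executor's classpath. The object name ExecutorLog4jProbe and the resource path are made up for the example; it only confirms the settings, it is not a fix:

  import org.apache.log4j.LogManager

  // Illustrative helper: call this from inside a transformation or action
  // (e.g. rdd.foreachPartition) so it runs on the executors, not the driver.
  object ExecutorLog4jProbe {
    def dump(): Unit = {
      // Value passed via spark.executor.extraJavaOptions=-Dlog4j.configuration=...
      println("log4j.configuration = " + System.getProperty("log4j.configuration"))
      // Whether the classpath-style reference resolves on this JVM;
      // null here means the classpath option (Yong's option 1) cannot work as specified.
      println("resource on classpath = " +
        getClass.getClassLoader.getResource("env/dev/log4j-executor.properties"))
      // Appenders actually attached to the root logger after log4j initialisation.
      val appenders = LogManager.getRootLogger.getAllAppenders
      while (appenders.hasMoreElements) println("appender: " + appenders.nextElement())
    }
  }

Calling it from something like keys.foreachRDD(rdd => rdd.foreachPartition(_ => ExecutorLog4jProbe.dump())) should print to the executor's stdout, which in standalone mode shows up in the worker's stdout log in the web UI rather than in the FILE appender.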