Yes, spark.yarn.historyServer.address is used to access the spark history server from yarn; it is not needed if you use only the yarn history server. It may be possible to have both history servers running at the same time, but I have not tried that yet.
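For readers who do want yarn to link to the spark history server, here is a minimal sketch of the spark-side settings involved. The property names are standard Spark configuration keys; the helper name render_spark_history_conf, the host spark-history.example.com, the port 18080 and the directory hdfs:///spark-events are placeholders that do not come from this thread.

```shell
#!/bin/sh
# Sketch only: prints spark-defaults.conf lines that point yarn at a spark
# history server. Host, port and event-log directory are hypothetical
# placeholders; replace them with values from your own cluster.
render_spark_history_conf() {
    host="$1"
    port="$2"
    log_dir="$3"
    cat <<EOF
spark.eventLog.enabled            true
spark.eventLog.dir                ${log_dir}
spark.history.fs.logDirectory     ${log_dir}
spark.yarn.historyServer.address  ${host}:${port}
EOF
}

# Example call with placeholder values (no http:// scheme in the address).
render_spark_history_conf spark-history.example.com 18080 hdfs:///spark-events
```

The event-log directory appears twice on purpose: applications write to spark.eventLog.dir and the history server reads from spark.history.fs.logDirectory, so the two are assumed to point at the same location. If only the yarn history server is used, the last property can be left out, as noted above.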
Besides, as far as I have understood, yarn and spark history servers have two different purposes:
- yarn history server is for looking at your application logs after it has finished
- spark history server is for looking at your application in the spark web ui (the one with the "Stages", "Storage", "Environment" and "Executors") after it has finished

Regards,
Christophe.

On 26/02/2015 20:30, Colin Kincaid Williams wrote:

Right now I have set spark.yarn.historyServer.address in my spark configs to have yarn point to the spark-history server. Then from your mail it sounds like I should try another setting, or remove it completely. I also noticed that the aggregated log files appear in a directory in hdfs under application/spark vs. application/yarn or similar. I will review my configurations and see if I can get this working.

Thanks,
Colin Williams

On Thu, Feb 26, 2015 at 9:11 AM, Christophe Préaud <christophe.pre...@kelkoo.com> wrote:

You can see this information in the yarn web UI using the configuration I provided in my former mail (click on the application id, then on logs; you will then be automatically redirected to the yarn history server UI).

On 24/02/2015 19:49, Colin Kincaid Williams wrote:

So back to my original question. I can see the spark logs using the example above:

    yarn logs -applicationId application_1424740955620_0009

This shows yarn log aggregation working. I can see the std out and std error in that container information above. Then how can I get this information in a web-ui ? Is this not currently supported?

On Tue, Feb 24, 2015 at 10:44 AM, Imran Rashid <iras...@cloudera.com> wrote:

the spark history server and the yarn history server are totally independent. Spark knows nothing about yarn logs, and vice versa, so unfortunately there isn't any way to get all the info in one place.
On Tue, Feb 24, 2015 at 12:36 PM, Colin Kincaid Williams <disc...@uw.edu> wrote:

Looks like in my tired state, I didn't mention spark the whole time. However, it might be implied by the application log above. Spark log aggregation appears to be working, since I can run the yarn command above. I do have yarn logging set up for the yarn history server. I was trying to use the spark history-server, but maybe I should try setting spark.yarn.historyServer.address to the yarn history-server, instead of the spark history-server? I tried this configuration when I started, but didn't have much luck.

Are you getting your spark apps run in yarn client or cluster mode in your yarn history server? If so can you share any spark settings?

On Tue, Feb 24, 2015 at 8:48 AM, Christophe Préaud <christophe.pre...@kelkoo.com> wrote:

Hi Colin,

Here is how I have configured my hadoop cluster to have yarn logs available through both the yarn CLI and the _yarn_ history server (with gzip compression and 10 days retention):

1. Add the following properties in the yarn-site.xml on each node manager and on the resource manager:

    <property>
      <name>yarn.log-aggregation-enable</name>
      <value>true</value>
    </property>
    <property>
      <name>yarn.log-aggregation.retain-seconds</name>
      <value>864000</value>
    </property>
    <property>
      <name>yarn.log.server.url</name>
      <value>http://dc1-kdp-dev-hadoop-03.dev.dc1.kelkoo.net:19888/jobhistory/logs</value>
    </property>
    <property>
      <name>yarn.nodemanager.log-aggregation.compression-type</name>
      <value>gz</value>
    </property>

2. Restart yarn and then start the yarn history server on the server defined in the yarn.log.server.url property above:

    /opt/hadoop/sbin/mr-jobhistory-daemon.sh stop historyserver # should fail if historyserver is not yet started
    /opt/hadoop/sbin/stop-yarn.sh
    /opt/hadoop/sbin/start-yarn.sh
    /opt/hadoop/sbin/mr-jobhistory-daemon.sh start historyserver

It may be slightly different for you if the resource manager and the history server are not on the same machine.

Hope it will work for you as well!

Christophe.

On 24/02/2015 06:31, Colin Kincaid Williams wrote:
> Hi,
>
> I have been trying to get my yarn logs to display in the spark history-server
> or yarn history-server. I can see the log information
>
> yarn logs -applicationId application_1424740955620_0009
> 15/02/23 22:15:14 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to us3sm2hbqa04r07-comp-prod-local
>
> Container: container_1424740955620_0009_01_000002 on us3sm2hbqa07r07.comp.prod.local_8041
> ===========================================================================================
> LogType: stderr
> LogLength: 0
> Log Contents:
>
> LogType: stdout
> LogLength: 897
> Log Contents:
> [GC [PSYoungGen: 262656K->23808K(306176K)] 262656K->23880K(1005568K), 0.0283450 secs] [Times: user=0.14 sys=0.03, real=0.03 secs]
> Heap
> PSYoungGen total 306176K, used 111279K [0x00000000eaa80000, 0x0000000100000000, 0x0000000100000000)
> eden space 262656K, 33% used [0x00000000eaa80000,0x00000000effebbe0,0x00000000fab00000)
> from space 43520K, 54% used [0x00000000fab00000,0x00000000fc240320,0x00000000fd580000)
> to space 43520K, 0% used [0x00000000fd580000,0x00000000fd580000,0x0000000100000000)
> ParOldGen total 699392K, used 72K [0x00000000bff80000, 0x00000000eaa80000, 0x00000000eaa80000)
> object space 699392K, 0% used [0x00000000bff80000,0x00000000bff92010,0x00000000eaa80000)
> PSPermGen total 35328K, used 34892K [0x00000000bad80000, 0x00000000bd000000, 0x00000000bff80000)
> object space 35328K, 98% used [0x00000000bad80000,0x00000000bcf93088,0x00000000bd000000)
>
> Container: container_1424740955620_0009_01_000003 on us3sm2hbqa09r09.comp.prod.local_8041
> ===========================================================================================
> LogType: stderr
> LogLength: 0
> Log Contents:
>
> LogType: stdout
> LogLength: 896
> Log Contents:
> [GC [PSYoungGen: 262656K->23725K(306176K)] 262656K->23797K(1005568K), 0.0358650 secs] [Times: user=0.28 sys=0.04, real=0.04 secs]
> Heap
> PSYoungGen total 306176K, used 65712K [0x00000000eaa80000, 0x0000000100000000, 0x0000000100000000)
> eden space 262656K, 15% used [0x00000000eaa80000,0x00000000ed380bf8,0x00000000fab00000)
> from space 43520K, 54% used [0x00000000fab00000,0x00000000fc22b4f8,0x00000000fd580000)
> to space 43520K, 0% used [0x00000000fd580000,0x00000000fd580000,0x0000000100000000)
> ParOldGen total 699392K, used 72K [0x00000000bff80000, 0x00000000eaa80000, 0x00000000eaa80000)
> object space 699392K, 0% used [0x00000000bff80000,0x00000000bff92010,0x00000000eaa80000)
> PSPermGen total 29696K, used 29486K [0x00000000bad80000, 0x00000000bca80000, 0x00000000bff80000)
> object space 29696K, 99% used [0x00000000bad80000,0x00000000bca4b838,0x00000000bca80000)
>
> Container: container_1424740955620_0009_01_000001 on us3sm2hbqa09r09.comp.prod.local_8041
> ===========================================================================================
> LogType: stderr
> LogLength: 0
> Log Contents:
>
> LogType: stdout
> LogLength: 21
> Log Contents:
> Pi is roughly 3.1416
>
> I can see some details for the application in the spark history-server at
> this url
> http://us3sm2hbqa04r07.comp.prod.local:18080/history/application_1424740955620_0009/jobs/
> . When running in spark-master mode, I can see the stdout and stderr
> somewhere in the spark history-server. Then how do I get the information
> which I see above into the Spark history-server ?
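The yarn logs output quoted above interleaves several containers and log types. As a small sketch (not something proposed in the thread), it can be condensed to one line per container and log type with awk. The function name summarize_yarn_logs is hypothetical, and the script assumes the "Container:", "LogType:" and "LogLength:" layout shown in the quoted output; other Hadoop versions may format these lines differently.

```shell
#!/bin/sh
# Sketch only: reads `yarn logs` output on stdin and prints
# "<container id> <log type> <log length>" for each log section.
# Assumes the "Container: <id> on <node>", "LogType: <type>" and
# "LogLength: <bytes>" line layout seen in the output quoted above.
summarize_yarn_logs() {
    awk '
        /^Container:/ { container = $2 }
        /^LogType:/   { logtype = $2 }
        /^LogLength:/ { print container, logtype, $2 }
    '
}

# Usage on a cluster where log aggregation is enabled:
#   yarn logs -applicationId application_1424740955620_0009 | summarize_yarn_logs
```

This only summarizes what the yarn CLI already returns; it does not make the logs visible in the spark history server.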
Kelkoo SAS
Société par Actions Simplifiée
Share capital: € 4.168.964,30
Registered office: 158 Ter Rue du Temple 75003 Paris
425 093 069 RCS Paris

This message and its attachments are confidential and intended exclusively for their addressees. If you are not the intended recipient of this message, please destroy it and notify the sender.