Hi all! Thanks for the answers, this has been very helpful and we were able to set up a similar scheme using the env variables.
Cheers,
Gyula

On Tue, Oct 15, 2019 at 9:55 AM Paul Lam <paullin3...@gmail.com> wrote:

> +1 to Rong's approach. We use a similar solution to the log context problem
> on YARN setups. FYI.
>
> WRT container contextual information, we collect logs via ELK, so the log
> file paths (which contain the application id and container id) and the host
> are attached to the logs. But if you don't want a new log collector, you
> can also use system env variables in your log pattern. Flink sets the
> container information into system env variables, which can be found in the
> container launch script.
>
> WRT job contextual information, we've tried MDC on task threads, but it
> ended up with poor readability because the Flink system threads are not set
> with the MDC variables (in my case, user info), so now we use the user name
> from the system env as the logger pattern variable instead. However, for
> job id/name, I'm afraid they cannot be found in the default system env
> variables. You may need to find a way to set them into the system env or
> system properties.
>
> Best,
> Paul Lam
>
> > On Oct 15, 2019, at 12:50, Rong Rong <walter...@gmail.com> wrote:
> >
> > Hi Gyula,
> >
> > Sorry for the late reply. I think it is definitely a challenge in terms
> > of log visibility.
> > However, for your requirement I think you can customize your Flink job
> > by utilizing a customized log formatter/encoder (e.g. log4j.properties
> > or logback.xml) and a suitable logger implementation.
> >
> > One example you can follow is to provide customFields in your log
> > encoding [1,2] and utilize a supported Appender to append your log to a
> > file. You can also utilize a more customized appender to log the data
> > into an external database (for example, Elasticsearch, accessed via
> > Kibana).
> >
> > One challenge you might face is how to configure this contextual
> > information dynamically.
> > In our setup, this contextual information is configured as system env
> > params when the job launches, so loggers can dynamically resolve it at
> > start time.
> >
> > Please let me know if any of the suggestions above helps.
> >
> > Cheers,
> > Rong
> >
> > [1]
> > https://github.com/logstash/logstash-logback-encoder/blob/master/src/test/resources/logback-test.xml#L13
> > [2] https://github.com/logstash/logstash-logback-encoder
> >
> > On Thu, Oct 3, 2019 at 1:56 AM Gyula Fóra <gyula.f...@gmail.com> wrote:
> >
> >> Hi all!
> >>
> >> We have been thinking that it would be a great improvement to add
> >> contextual information to the Flink logs:
> >>
> >> - Container / YARN / host info in JM/TM logs
> >> - Job info (job id / job name) in task logs
> >>
> >> I think this should be similar to how the metric scopes are set up and
> >> should be able to provide the same information for logs. Ideally it
> >> would be user configurable.
> >>
> >> We are wondering what would be the best way to do this, and would like
> >> to ask for opinions or past experiences.
> >>
> >> Our natural first thought was setting NDC / MDC in the different
> >> threads, but it seems to be a somewhat fragile mechanism as it can be
> >> easily "cleared" or deleted by the user.
> >>
> >> What do you think?
> >>
> >> Gyula
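[Editor's note] As a rough illustration of the env-variable scheme the thread converges on, a logback.xml along the following lines could stamp container and job context onto every log line. This is a sketch, not a tested Flink configuration: it assumes the container launch environment exposes a `CONTAINER_ID` variable (YARN sets this for its containers) and a `JOB_NAME` variable, which is hypothetical here and would have to be exported by your own deployment scripts, as Paul suggests. Logback resolves `${...}` variables against system properties first and then environment variables, with `:-` supplying a default.

```xml
<!-- Sketch: make env-provided context part of every log line.
     CONTAINER_ID is set by YARN in the container environment;
     JOB_NAME is a hypothetical variable your launch scripts would export.
     ${log.file} is the log path system property Flink passes to the JVM. -->
<configuration>
  <appender name="file" class="ch.qos.logback.core.FileAppender">
    <file>${log.file}</file>
    <encoder>
      <!-- container/job context baked into the pattern, with fallbacks -->
      <pattern>%d{yyyy-MM-dd HH:mm:ss,SSS} [${CONTAINER_ID:-no-container}] [${JOB_NAME:-no-job}] %-5level %logger{60} - %msg%n</pattern>
    </encoder>
  </appender>
  <root level="INFO">
    <appender-ref ref="file"/>
  </root>
</configuration>
```

With logstash-logback-encoder (Rong's links [1,2]), the same idea can instead be expressed as customFields on the encoder, which emits the context as structured JSON fields that an ELK pipeline can index directly rather than parsing it out of the pattern text.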