Thank you for proposing this improvement, Wenhao. Changing the logging level dynamically at runtime is very useful when users are trying to debug their jobs. They can set the logging level to DEBUG and find out more details in the logs.
1. Could we add a REST API to query the current logging level? It would let users check the current logging level, which is especially useful for those who run their own job management platform.

2. Would it be better to add a field to the logconfig API that specifies the target JobManager/TaskManager? Currently, it seems a modified logging level is applied to all components in the cluster. If we change the logging level to DEBUG, the total volume of logs may grow rapidly, especially in large-scale clusters, and could put a heavy burden on disk usage. A target field would minimize the impact: users could change the logging level only for the TaskManager they are investigating. If users do want to change the logging level for all components, the target field could be set to "ALL".

On Tue, Jan 11, 2022 at 12:27 AM Konstantin Knauf <kna...@apache.org> wrote:

> Thank you for starting the discussion. Being able to change the logging
> level at runtime is very valuable in my experience.
>
> Instead of introducing our own API (and eventually even persistence), could
> we just periodically reload the log4j or logback configuration from the
> environment/filesystem? I only quickly googled the topic and [1,2] suggest
> that this might be possible?
>
> [1] https://stackoverflow.com/a/16216956/6422562?
> [2] https://logback.qos.ch/manual/configuration.html#autoScan
>
> On Mon, Jan 10, 2022 at 5:10 PM Wenhao Ji <predator....@gmail.com> wrote:
>
> > Hi everyone,
> >
> > Hope you enjoyed the Holiday Season.
> >
> > I would like to start the discussion on the improvement purpose
> > FLIP-210 [1] which aims to provide a way to change log levels at
> > runtime to simplify issues and bugs detection as reported in the
> > ticket FLINK-16478 [2].
> > Firstly, thanks Xingxing Di and xiaodao for their previous effort. The
> > FLIP I drafted is largely influenced by their previous designs [3][4].
> > Although we have reached some agreements under the jira comments about
> > the scope of this feature, we still have the following questions
> > listed below ready to be discussed in this thread.
> >
> > ## Question 1
> >
> > > Creating as custom DSL and implementing it for several logging backend
> > > sounds like quite a maintenance burden. Extensions to the DSL, and
> > > supported backends, could become quite an effort. (by Chesnay Schepler)
> >
> > I tried to design the API of the logging backend to stay away from the
> > details of implementations but I did not find any slf4j-specific API
> > that is available to change the log level of a logger. So what I did
> > is to introduce another kind of abstraction on top of the slf4j /
> > log4j / logback so that we will not depend on the logging provider's
> > api directly. It will be convenient for us to adopt any other logging
> > providers. Please see the "Logging Abstraction" section.
> >
> > ## Question 2
> >
> > > Do we know whether other systems support this kind of feature? If yes,
> > > how do they solve it for different logging backends? (by Till Rohrmann)
> >
> > I investigated several Java frameworks including Spark, Storm, and
> > Spring Boot. Here is what I found.
> > Spark & Storm directly depend on the log4j implementations, which
> > means they do not support any other slf4j implementation at all. They
> > simply call the log4j api directly. (see SparkContext.scala#L381 [5],
> > Utils.scala#L2443 [6] in Spark, and LogConfigManager.java#L144 [7] in
> > Storm). They are pretty different from what Flink provides.
> > However, I found Spring Boot has implemented what we are interested
> > in. Just as Flink, Spring boot also supports many slf4j
> > implementations. Users are not limited to log4j. They have the ability
> > to declare different logging frameworks by importing certain
> > dependencies. After that spring will decide the activated one by
> > scanning its classpath and context. (see LoggingSystem.java#L164 [8]
> > and LoggersEndpoint.java#L99 [9])
> >
> > ## Question 3
> >
> > Besides the questions raised in the jira comments, I also find another
> > thing that has not been discussed. Considering this feature as an MVP,
> > do we need to introduce a HighAvailabilityService to store the log
> > settings so that they can be synced to newly-joined task managers and
> > also job manager followers for consistency? This issue is included in
> > the "Limitations" section in the flip.
> >
> > Finally, thanks for your time for joining this discussion and
> > reviewing this FLIP. I would appreciate it if you could have any
> > comments or suggestions on this.
> >
> > [1]: https://cwiki.apache.org/confluence/display/FLINK/FLIP-210%3A+Change+logging+level+dynamically+at+runtime
> > [2]: https://issues.apache.org/jira/browse/FLINK-16478
> > [3]: https://docs.google.com/document/d/1Q02VSSBzlZaZzvxuChIo1uinw8KDQsyTZUut6_IDErY
> > [4]: https://docs.google.com/document/d/19AyuTHeERP6JKmtHYnCdBw29LnZpRkbTS7K12q4OfbA
> > [5]: https://github.com/apache/spark/blob/11596b3b17b5e0f54e104cd49b1397c33c34719d/core/src/main/scala/org/apache/spark/SparkContext.scala#L381
> > [6]: https://github.com/apache/spark/blob/11596b3b17b5e0f54e104cd49b1397c33c34719d/core/src/main/scala/org/apache/spark/util/Utils.scala#L2433
> > [7]: https://github.com/apache/storm/blob/3f96c249cbc17ce062491bfbb39d484e241ab168/storm-client/src/jvm/org/apache/storm/daemon/worker/LogConfigManager.java#L144
> > [8]: https://github.com/spring-projects/spring-boot/blob/main/spring-boot-project/spring-boot/src/main/java/org/springframework/boot/logging/LoggingSystem.java#L164
> > [9]: https://github.com/spring-projects/spring-boot/blob/main/spring-boot-project/spring-boot-actuator/src/main/java/org/springframework/boot/actuate/logging/LoggersEndpoint.java#L99
> >
> > Thanks,
> > Wenhao
> >
>
> --
>
> Konstantin Knauf
>
> https://twitter.com/snntrable
>
> https://github.com/knaufk
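
P.S. To make suggestion 2 a bit more concrete, here is a rough sketch of how a component might decide whether a log-level change request applies to it. Everything in it is an assumption for illustration only: the class name LogTargetSketch, the appliesTo helper, the component ids such as "taskmanager-1", and the choice to treat a missing target the same as "ALL" so that the current cluster-wide behaviour is preserved. None of this is taken from the FLIP.

```java
/** Hypothetical sketch: does a log-level change request apply to this component? */
final class LogTargetSketch {

    /** Hypothetical wildcard value for the proposed "target" field. */
    static final String ALL = "ALL";

    /**
     * Returns true if a request addressed to {@code target} should be applied
     * by the component identified by {@code componentId}.
     */
    static boolean appliesTo(String target, String componentId) {
        // Assumption: a missing or empty target keeps the cluster-wide behaviour.
        if (target == null || target.trim().isEmpty()) {
            return true;
        }
        String trimmed = target.trim();
        // "ALL" (case-insensitive) matches every component; otherwise ids must match exactly.
        return ALL.equalsIgnoreCase(trimmed) || trimmed.equals(componentId);
    }

    public static void main(String[] args) {
        // Hypothetical component ids; real ids would come from the cluster.
        String[] components = {"jobmanager", "taskmanager-1", "taskmanager-2"};
        for (String component : components) {
            System.out.println(component + " applies request: "
                    + appliesTo("taskmanager-1", component));
        }
    }
}
```

With a check like this, a request targeted at "taskmanager-1" would leave the JobManager and the other TaskManagers at their current level, so only one component starts writing DEBUG logs.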