Hi Michael,

I agree that the current default logging levels are too verbose and would welcome a review of the logs. At some scale, the logging is sure to have a performance impact, and it puts a lot of strain on any centralized log collection system people are running. With centralized logging tools (e.g. the ELK stack), I find that the poor signal-to-noise ratio bogs the tools down and makes it harder to find what you are looking for.
Thanks,
Chris

On Fri, 19 Feb 2021 at 01:33, Michael Marshall <mikemars...@gmail.com> wrote:

> Hello Pulsar Community,
>
> I'm running a Pulsar cluster with thousands of topics where each topic has
> active producers and consumers that scale up and down dynamically depending
> on load.
>
> The brokers are producing a ton of logs. Many come from the
> "org.apache.pulsar.broker.service.ServerCnx" class. Anecdotally, in the
> past 24 hours, my 5 node broker cluster has logged over 23,800,000 INFO log
> lines from that class alone. From looking at the class, I can see that any
> given connection gets several log lines in its life cycle (at least 2 on
> connecting and 2 on closing), and there are other log lines in the class as
> well.
>
> From my perspective, this level of detailed logging is a bit excessive. The
> logging about normal, successful connection activity is not actionable for
> me as an owner of a cluster with many producers/consumers, and it could be
> hiding other, more important logs.
>
> Does anyone know the reasoning for this level of detailed INFO logging from
> this class? I can see that these logs have been in the class for over 4
> years, but given that pulsar is supposed to scale to a million topics and
> each producer/consumer needs its own connection to a broker, I wouldn't
> expect this level of logging. If the community is open to it, I'd be happy
> to submit a PR demonstrating the logs that I'd like to switch from INFO to
> DEBUG level.
>
> I recognize that it's possible to filter the logs for just that class, but
> I also think it's possible that most users running pulsar don't need this
> level of detailed logging about connections to brokers, which is why I
> wanted to start this discussion on the mailing list.
>
> It's relevant to note that the coding guide on the website (
> https://pulsar.apache.org/en/coding-guide/#logging-levels) mentions the
> following about logging:
> "INFO is the level you should assume the software will be run in. INFO
> messages are things which are not bad but which the user will definitely
> want to know about every time they occur."
>
> Personally, I don't "definitely want to know" a producer's or consumer's
> connection status "every time" it changes.
>
> Thanks!
> Michael Marshall