Thanks for your replies, Chris and Joshua.

I agree that it is challenging to find the right balance with log levels
and that it can come down to preference. However, I think it's telling that
you have both implemented workarounds because the logging is too verbose
right now. It's unlikely we'll find a perfect logging configuration, but I
do think it's worth making an effort in the obvious places that are very
noisy.

I know there has been discussion about user experience lately, and I think
actionable logging that isn't too noisy is essential to user experience,
especially for new Pulsar users. In this case, I think connection-level
logging should be left to the DEBUG level. It's not something most people
will want out of the box, and anyone who does want it can turn it on.
Further, this type of logging could really impact the performance of the
broker as connections increase, and the same is true of certain types of
log filtering.

Based on Chris's comment, I'm wondering if a more thorough review of the
logs is required. If so, is that the type of work that should get a PIP?
For now, I think I'll write up a potential solution (the ServerCnx class
isn't too big) to demonstrate the changes I have in mind. If this
initiative gets support, I'm happy to take a look at other classes.
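
To make the idea concrete, here is a minimal sketch of the kind of change
being proposed. It is not the actual ServerCnx code: the class name, method
names, addresses, and messages are all hypothetical, and it uses
java.util.logging so that it runs stand-alone (FINE stands in for DEBUG),
whereas the broker, as far as I know, logs through SLF4J. Routine connect
and close events are demoted below the default level, while an event an
operator can act on stays visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class ConnectionLoggingSketch {

    // Hypothetical stand-in for the broker's per-connection handler logger.
    static final Logger log = Logger.getLogger("sketch.ConnectionHandler");

    // Routine lifecycle events: logged at FINE, which plays the role of DEBUG.
    static void onConnected(String remoteAddress) {
        log.log(Level.FINE, "New connection from {0}", remoteAddress);
    }

    static void onClosed(String remoteAddress) {
        log.log(Level.FINE, "Closed connection from {0}", remoteAddress);
    }

    // An event an operator can act on stays above the default INFO level.
    static void onAuthFailure(String remoteAddress, String reason) {
        log.log(Level.WARNING, "Authentication failed for {0}: {1}",
                new Object[] {remoteAddress, reason});
    }

    // Simulates a few connection events with the logger set to the given
    // level and returns the records that were actually emitted.
    static List<LogRecord> emittedAt(Level level) {
        List<LogRecord> records = new ArrayList<>();
        Handler capture = new Handler() {
            @Override public void publish(LogRecord record) { records.add(record); }
            @Override public void flush() { }
            @Override public void close() { }
        };
        log.setUseParentHandlers(false); // keep the demo off the console
        log.addHandler(capture);
        log.setLevel(level);
        try {
            onConnected("10.0.0.1:50000");
            onAuthFailure("10.0.0.2:50001", "expired token");
            onClosed("10.0.0.1:50000");
        } finally {
            log.removeHandler(capture);
        }
        return records;
    }

    public static void main(String[] args) {
        System.out.println("INFO level: " + emittedAt(Level.INFO).size() + " line(s)");
        System.out.println("FINE level: " + emittedAt(Level.FINE).size() + " line(s)");
    }
}
```

At the default INFO level only the authentication failure is emitted; the
connect and close lines appear only once the logger is lowered to FINE, so
the detail is still available to anyone who wants to turn it on.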

Thanks!
Michael Marshall

On Fri, Feb 19, 2021 at 7:53 AM Joshua Odmark <j...@pandio.com> wrote:

> We have had the same issue Michael.
>
> In my experience log levels come down to opinion many times. It also comes
> down to use cases or implementations.
>
> Because of that, our solution was to create a module at our local collector
> that has the ability to rewrite log levels and in some cases filter the
> noise completely.
>
> The full log is kept local but rotated swiftly. The central logging gets
> the filtered set.
>
> This was the best solution from our perspective because it keeps the log
> levels the same for all current pulsar users but allows us to fine tune
> them for each of our installs.
>
> Hope this helps.
>
> On Fri, Feb 19, 2021 at 6:20 AM Chris Bartholomew <
> chris.bartholo...@kesque.com> wrote:
>
> > Hi Michael,
> >
> > I agree that the current default logging levels are too verbose and would
> > welcome a review of the logs. At some scale, the logging is sure to have a
> > performance impact and puts a lot of strain on any centralized log
> > collection system people are running. When using centralized logging tools
> > (e.g., the ELK stack), I find the signal-to-noise ratio bogs down these
> > tools, making it harder to find what you are looking for.
> >
> > Thanks,
> > Chris
> >
> > On Fri, 19 Feb 2021 at 01:33, Michael Marshall <mikemars...@gmail.com>
> > wrote:
> >
> > > Hello Pulsar Community,
> > >
> > > I'm running a Pulsar cluster with thousands of topics where each topic has
> > > active producers and consumers that scale up and down dynamically
> > > depending on load.
> > >
> > > The brokers are producing a ton of logs. Many come from the
> > > "org.apache.pulsar.broker.service.ServerCnx" class. Anecdotally, in the
> > > past 24 hours, my 5 node broker cluster has logged over 23,800,000 INFO
> > > log lines from that class alone. From looking at the class, I can see that
> > > any given connection gets several log lines in its life cycle (at least 2
> > > on connecting and 2 on closing), and there are other log lines in the
> > > class as well.
> > >
> > > From my perspective, this level of detailed logging is a bit excessive.
> > > The logging about normal, successful connection activity is not actionable
> > > for me as an owner of a cluster with many producers/consumers, and it
> > > could be hiding other, more important logs.
> > >
> > > Does anyone know the reasoning for this level of detailed INFO logging
> > > from this class? I can see that these logs have been in the class for over
> > > 4 years, but given that pulsar is supposed to scale to a million topics and
> > > each producer/consumer needs its own connection to a broker, I wouldn't
> > > expect this level of logging. If the community is open to it, I'd be happy
> > > to submit a PR demonstrating the logs that I'd like to switch from INFO to
> > > DEBUG level.
> > >
> > > I recognize that it's possible to filter the logs for just that class, but
> > > I also think it's possible that most users running pulsar don't need this
> > > level of detailed logging about connections to brokers, which is why I
> > > wanted to start this discussion on the mailing list.
> > >
> > > It's relevant to note that the coding guide on the website (
> > > https://pulsar.apache.org/en/coding-guide/#logging-levels) mentions the
> > > following about logging:
> > > "INFO is the level you should assume the software will be run in. INFO
> > > messages are things which are not bad but which the user will definitely
> > > want to know about every time they occur."
> > >
> > > Personally, I don't "definitely want to know" a producer's or consumer's
> > > connection status "every time" it changes.
> > >
> > > Thanks!
> > > Michael Marshall
> > >
> >
>
