+1 

This sort of throttling ability is very important and useful to have. 

A minor suggestion on the implementation proposal - it feels like it’d far 
simpler to localize the change to the definition of Warning, Error etc instead 
of having to replace each and every call to those functions. ie Check if the 
throttling is enabled and call the throttled version inside the definition of 
the respective methods?

> On Oct 16, 2020, at 2:13 PM, Brian Neradt 
> <brian.ner...@verizonmedia.com.invalid> wrote:
> 
> dev@trafficserver.apache.org,
> 
> We've had issues with excessive logging under conditions where something
> basic goes wrong. For instance the log pipe mechanism will emit a warning
> or error log for every logging event if the pipe's buffer fills up or the
> reader goes down. This can result in thousands of log messages per second.
> 
> I propose we add throttled versions of our current logging functions that
> will emit a message on a certain periodicity (currently 60 seconds as a
> default):
> 
> ThrottledStatus
> ThrottledNote
> ThrottledWarning
> ThrottledError
> 
> I've created a draft PR with this implemented and used in a few places:
> https://github.com/apache/trafficserver/pull/7279
> 
> I've verified in our production simulation environment that this patch
> achieves throttling as desired. I disabled our log pipe reader, ran a few
> hundred thousand transactions for a minute, and the messages look like the
> following:
> 
> [Oct 16 17:52:55.097] [LOG_FLUSH] ERROR: Failed to write log to
> /some/path.pipe: [tried 298, wrote 0, Broken pipe]
> [Oct 16 17:53:55.280] [LOG_FLUSH] ERROR: Skipped the following message
> 257999 times.
> [Oct 16 17:53:55.280] [LOG_FLUSH] ERROR: Failed to write log
> to /some/path.pipe: [tried 298, wrote 0, Broken pipe]
> 
> Notice that rather than having about 260,000 log messages, the throttling
> results in three.
> 
> To use these new throttling functions, we simply change the Error, Warning,
> etc., log function call to the Throttled version:
> 
> -        Warning("File:%s was closed, have dropped (%d) bytes.",
> logfile->get_name(), total_bytes);
> +        ThrottledWarning("File:%s was closed, have dropped (%d) bytes.",
> logfile->get_name(), total_bytes);
> 
> This change should be pretty low risk since from an implementation
> standpoint only the messages we convert to these throttled versions are
> impacted.
> 
> Let me know if you have any thoughts or ideas for improvement, either with
> the design or the implementation.
> 
> Thanks!
> -- 
> Brian Neradt
> 
> 1908 S. First Street
> 
> Champaign, IL 61820

Reply via email to