With TSDEBUG there is no way to updated the global variable after you start up, even with traffic_ctl. A plugin can set the variable directly, but that seems hacky.
Debug should already be checking global variables to see if debug is enabled first before calling into PCRE. -Bryan > On Jun 17, 2019, at 12:35 PM, Walt Karas <wka...@verizonmedia.com.INVALID> > wrote: > > It looked like a simple fast change. Are you saying you can see that > TSDEBUG would not start printing debug messages if you enabled the global > flag with traffic_cntl ? > > An alternative would be to have the core functions that print the debug > messages check the global flag first before they call PCRE to check if the > specific tag is enabled. That would improvement would have effect for both > core and plugin debug messages. If wouldn't eliminate the parameter push > and function call overhead, but the testing indicates that is much less > significant. > > On Mon, Jun 17, 2019 at 2:01 PM Bryan Call <bc...@apache.org> wrote: > >> Walt, >> >> TSDEBUG uses diags_on_for_plugins and that looks to only be set on startup >> and then just calls TSDebug(). It doesn’t make sense to replace TSDebug >> with TSDEBUG. >> >> I don’t see why you are trying to optimize just the plugin debug. When >> debug is enabled it has a performance impact on the entire system, not just >> for plugins. >> >> -Bryan >> >> >>> On Jun 14, 2019, at 4:51 PM, Leif Hedstrom <zw...@apache.org> wrote: >>> >>> >>> >>>> On Jun 14, 2019, at 5:18 PM, Walt Karas <wka...@verizonmedia.com.INVALID> >> wrote: >>>> >>>> I got some crude performance numbers for TSDebug() and TSDEBUG() >>>> when proxy.config.diags.debug.enabled is 0. The good news is that >>>> TSDebug() looks to take about 8 nanoseconds (the raw data is below). >> So, >>>> if we take 1000 requests per core per seconds as a ballpark estimate of >> TS >>>> performance, it would take a lot of them to have a significant impact. >> But >>>> TSDebug() takes about 30 times as long as TSDEBUG(). So it seems >>>> worthwhile to make the tiny effort to change TSDEBUG() so it won't >> swallow >>>> a following 'else' clause. I also think we should rename it to >> TSFastDbg(). >>> >>> The numbers sounds good, but why on earth keep the old Debug? Why would >> I chose to use a slow API over a fast API? That just makes no sense. >>> >>> if we do this, just change TSDebug(). If it’s incompatible, we document >> it with appropriate warnings. >>> >>> Does any of this affect / improve the core Debug() statements? Those are >> probably 10x - 100x more critical to get fast, because they are >> *everywhere*. >>> >>> — leif >>> >>>> >>>> We could rename TSDebug() to TSDebug_(), and TSDEBUG() to TSDebug(). >> But I >>>> think the very small benefit is outweighed by the potential risk. TS >> seems >>>> to be rife with race conditions, so a change that will alter the length >> of >>>> lots of code paths could create yet more instability. >>>> >>>> Here is the raw data (6 tests, 3 repetitions). Done with >>>> https://github.com/apache/trafficserver/pull/5638 . I don't >> understand why >>>> it's not more consistent. If anyone knows a better way to time this, >>>> please let me know. >>>> >>>> Performance for TSDebug() with no parameters, 1000000000 repetitions, >>>> microseconds user=7670000 system=20000 >>>> Performance for TSDEBUG() with no parameters, 1000000000 repetitions, >>>> microseconds user=260000 system=0 >>>> Performance for TSDebug() with parameters, 1000000000 repetitions, >>>> microseconds user=8030000 system=50000 >>>> Performance for TSDEBUG() with parameters, 1000000000 repetitions, >>>> microseconds user=280000 system=0 >>>> Performance for TSDebug() with volatile parameters, 1000000000 >> repetitions, >>>> microseconds user=8830000 system=30000 >>>> Performance for TSDEBUG() with volatile parameters, 1000000000 >> repetitions, >>>> microseconds user=270000 system=0 >>>> >>>> Performance for TSDebug() with no parameters, 1000000000 repetitions, >>>> microseconds user=7520000 system=50000 >>>> Performance for TSDEBUG() with no parameters, 1000000000 repetitions, >>>> microseconds user=260000 system=0 >>>> Performance for TSDebug() with parameters, 1000000000 repetitions, >>>> microseconds user=8040000 system=30000 >>>> Performance for TSDEBUG() with parameters, 1000000000 repetitions, >>>> microseconds user=260000 system=0 >>>> Performance for TSDebug() with volatile parameters, 1000000000 >> repetitions, >>>> microseconds user=8450000 system=50000 >>>> Performance for TSDEBUG() with volatile parameters, 1000000000 >> repetitions, >>>> microseconds user=260000 system=0 >>>> >>>> Performance for TSDebug() with no parameters, 1000000000 repetitions, >>>> microseconds user=7470000 system=80000 >>>> Performance for TSDEBUG() with no parameters, 1000000000 repetitions, >>>> microseconds user=260000 system=0 >>>> Performance for TSDebug() with parameters, 1000000000 repetitions, >>>> microseconds user=8120000 system=50000 >>>> Performance for TSDEBUG() with parameters, 1000000000 repetitions, >>>> microseconds user=270000 system=0 >>>> Performance for TSDebug() with volatile parameters, 1000000000 >> repetitions, >>>> microseconds user=8500000 system=60000 >>>> Performance for TSDEBUG() with volatile parameters, 1000000000 >> repetitions, >>>> microseconds user=270000 system=0 >>>> >>>> On Wed, Jun 12, 2019 at 12:58 PM Walt Karas <wka...@verizonmedia.com> >> wrote: >>>> >>>>> Since I'm still relatively new to this application, there are limits to >>>>> what I can take on on my own initiative. I don't see how anything I've >>>>> proposed would interfere with any higher-priority work. >>>>> >>>>> On Wed, Jun 12, 2019 at 12:41 PM Leif Hedstrom <zw...@apache.org> >> wrote: >>>>> >>>>>> Also, I’d like to reiterate my call to arms on v9.0.0: >>>>>> >>>>>> Please, everyone lets focus on the things that are critical to ATS >>>>>> v9.0.0 readiness! >>>>>> >>>>>> >>>>>> This includes: >>>>>> >>>>>> 1) Fixing all the critical bugs (and there’s a lot of them) >>>>>> >>>>>> 2) Testing “master" >>>>>> >>>>>> 3) Doing the incompatible changes that needs to be done before >> we >>>>>> can release >>>>>> >>>>>> 4) Removing features that we do not want to support for the next >>>>>> 3+ years >>>>>> >>>>>> 5) Make sure that v9.0.0 has the functionality that you >>>>>> *absolutely* need to be able >>>>>> to use it >>>>>> >>>>>> >>>>>> I kinda feel that several of these recent proposal do not fit into >> this, >>>>>> and in fact, are a fair amount of distraction. I urge everyone to not >> go >>>>>> down this route until we’ve got v9.0.0 stable and functional, and >> instead >>>>>> focus on the v9.0.0 tasks at hand. I don’t think this one for >> example fits >>>>>> into any of those 5 points above. >>>>>> >>>>>> My $.01. >>>>>> >>>>>> — Leif >>>>>> >>>>>> P.s >>>>>> https://github.com/apache/trafficserver/issues/5457 >>>>>> >>>>>> >>>>>>> On Jun 12, 2019, at 11:30 AM, Walt Karas <wka...@verizonmedia.com >> .INVALID> >>>>>> wrote: >>>>>>> >>>>>>> I would guess the performance increase would be well under 1%. But >>>>>> it's a >>>>>>> fairly simple change. The biggest performance increase I think would >>>>>> come >>>>>>> from avoiding a PCRE match, to see if the specific debug tag was >>>>>> enabled. >>>>>>> For some TSDebug calls, there can be many arguments to push, and >>>>>> perhaps a >>>>>>> fair amount of calculation to generate the correct arguments. >>>>>>> >>>>>>> On Wed, Jun 12, 2019 at 12:21 PM Bryan Call <bc...@apache.org> >> wrote: >>>>>>> >>>>>>>> I am surprised you are seeing this as a performance issue. Looking >> at >>>>>> the >>>>>>>> code it looks like 1 function call and a couple pointer >> dereferences. >>>>>>>> >>>>>>>> If you wanted to speed up TSDebug and make the code harder to >> maintain >>>>>> you >>>>>>>> could create a structure that looks like like DiagsConfig in ts.h >> and >>>>>> then >>>>>>>> set a pointer used in the API to look at the gobal DiagsConfig. >> Then >>>>>> you >>>>>>>> could have a macro that would do basically what on() does. This >> would >>>>>>>> bypass a function call and would have 1 more pointer dereference >> than >>>>>> your >>>>>>>> example. >>>>>>>> >>>>>>>> -Bryan >>>>>>>> >>>>>>>> >>>>>>>>> On Jun 12, 2019, at 10:00 AM, Walt Karas <wka...@verizonmedia.com >>>>>> .INVALID> >>>>>>>> wrote: >>>>>>>>> >>>>>>>>> What do you mean by "exposing diags"? >>>>>>>>> >>>>>>>>> On Wed, Jun 12, 2019 at 11:50 AM Bryan Call <bc...@apache.org> >> wrote: >>>>>>>>> >>>>>>>>>> I would rather see TSDebug be optimized and exposing diags >> instead of >>>>>>>>>> creating a new type of debug. >>>>>>>>>> >>>>>>>>>> -Bryan >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> On Jun 12, 2019, at 9:44 AM, Walt Karas <wka...@verizonmedia.com >>>>>>>> .INVALID> >>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> I don't believe so. The issue I'm trying to address is that >> there >>>>>> is >>>>>>>>>> still >>>>>>>>>>> significant execution time overhead for debug output calls when >>>>>>>>>>> proxy.config.diags.debug.enabled is 0. This change would make >> the >>>>>>>>>>> execution time overhead negligible (if TSFastDbg is used or >> TSDebug >>>>>> is >>>>>>>>>>> changed to be like the proposed TSFastDbg). Another way of >> putting >>>>>> it >>>>>>>>>> is, >>>>>>>>>>> I want to clean up the existing TSDEBUG alternative to TSDebug. >>>>>>>>>>> >>>>>>>>>>> On Wed, Jun 12, 2019 at 11:31 AM Bryan Call <bc...@apache.org> >>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> How would this be controlled on the command line when running >>>>>>>>>>>> trafic_server -T? >>>>>>>>>>>> >>>>>>>>>>>> It looks like it is a departure from how debug tags and >> controlled >>>>>>>> now. >>>>>>>>>>>> >>>>>>>>>>>> -Bryan >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> On Jun 11, 2019, at 1:21 PM, Walt Karas < >> wka...@verizonmedia.com >>>>>>>>>> .INVALID> >>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> Any +1's for simply replacing TSDebug with the proposed >> TSFastDbg >>>>>> ? >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, Apr 8, 2019 at 6:20 PM Leif Hedstrom <zw...@apache.org >>> >>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Apr 4, 2019, at 8:05 AM, Walt Karas < >> wka...@verizonmedia.com >>>>>>>>>>>> .INVALID> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Because senility. probably "on" should be a non-pointer of >> type >>>>>>>> const >>>>>>>>>>>>>> volatile char, and InkAPI in the core can just keep a pointer >> to >>>>>> the >>>>>>>>>>>> whole >>>>>>>>>>>>>> structure for updating. >>>>>>>>>>>>>> >>>>>>>>>>>>>> But I'm leaning more to fixing up the capability that already >>>>>> kinda >>>>>>>>>>>> there. >>>>>>>>>>>>>> Which is optimizimg only for the case when all debug output is >>>>>>>>>> disabled. >>>>>>>>>>>>>> Especially since maybe zwoop would be on my side for once :o) >> . >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> My bad, whatever Walt said, I take the opposite stance. >>>>>>>>>>>>>> >>>>>>>>>>>>>> — leif >>>>>>>>>>>>>> >>>>>>>>>>>>>> P.s >>>>>>>>>>>>>> J/K! >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Thu, Apr 4, 2019 at 8:51 AM Alan Carroll >>>>>>>>>>>>>> <solidwallofc...@verizonmedia.com.invalid> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> This could enable a significant speed up for debug tags. One >>>>>> point >>>>>>>> is >>>>>>>>>>>> that >>>>>>>>>>>>>> when the debug tag string is set, at that point the debug >> objects >>>>>>>>>> could >>>>>>>>>>>> be >>>>>>>>>>>>>> updated to the correct state according to the debug tag >> string, >>>>>>>> rather >>>>>>>>>>>> than >>>>>>>>>>>>>> checking it every time a debug message is logged. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I have to ask, why is there a pointer to the enable flag, >>>>>> instead of >>>>>>>>>>>> just >>>>>>>>>>>>>> the enable flag? Your logic doesn't even check that flag, it >>>>>> checks >>>>>>>>>>>> whether >>>>>>>>>>>>>> the pointer is null. >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Wed, Apr 3, 2019 at 11:52 AM Leif Hedstrom < >> zw...@apache.org> >>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Apr 3, 2019, at 10:47 AM, Walt Karas < >> wka...@verizonmedia.com >>>>>>>>>>>>>> >>>>>>>>>>>>>> .INVALID> >>>>>>>>>>>>>> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> It looks like the TS API already has less granular version of >>>>>> this: >>>>>>>>>>>>>> >>>>>>>>>>>>>> extern int diags_on_for_plugins; >>>>>>>>>>>>>> >>>>>>>>>>>>>> #define TSDEBUG \ >>>>>>>>>>>>>> >>>>>>>>>>>>>> if (diags_on_for_plugins) \ >>>>>>>>>>>>>> >>>>>>>>>>>>>> TSDebug >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> A problem with this is if you write: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> if (flag) >>>>>>>>>>>>>> >>>>>>>>>>>>>> TSDEBUG(yadayada); >>>>>>>>>>>>>> >>>>>>>>>>>>>> else >>>>>>>>>>>>>> >>>>>>>>>>>>>> do_something(); >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> then the else will be associated with the if hidden in the >> macro. >>>>>>>> I'd >>>>>>>>>>>>>> prefer to change to something like: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> extern const volatile char TS_detail_dbg_output_enabled; >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> #define TSFastDbg(_TAG, ...) \ >>>>>>>>>>>>>> do { \ >>>>>>>>>>>>>> if (TS_detail_dbg_output_enabled) \ >>>>>>>>>>>>>> TSDebug(_TAG, __VA_ARGS__); \ >>>>>>>>>>>>>> >>>>>>>>>>>>>> } while(0) >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yeh, that kinda sounds like a bug, doesn’t it? >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> The negative is that we'd only see the benefit when all debug >>>>>> output >>>>>>>>>>>>>> >>>>>>>>>>>>>> was >>>>>>>>>>>>>> >>>>>>>>>>>>>> disabled. But, in any case, that is the main benefit, to >> avoid >>>>>>>>>> impacts >>>>>>>>>>>>>> during normal operation. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Right, we should always optimize for the common cases, >> obviously. >>>>>>>> When >>>>>>>>>>>>>> there’s a decision to make, where you can either make it >>>>>> “equally” >>>>>>>>>>>>>> expensive always, or more expensive in a rare case, but >> cheaper >>>>>> in >>>>>>>> the >>>>>>>>>>>>>> common case, we should likely chose the latter. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Looking in src/tscore/Diags.cc, it looks like the debug enable >>>>>> flag >>>>>>>> is >>>>>>>>>>>>>> >>>>>>>>>>>>>> only >>>>>>>>>>>>>> >>>>>>>>>>>>>> set at startup time. So we wouldn't be able to enable >> TSFastDbg >>>>>>>> using >>>>>>>>>>>>>> traffic_cntl without adding complexity. >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Tue, Apr 2, 2019 at 5:48 PM Walt Karas < >>>>>> wka...@verizonmedia.com> >>>>>>>>>>>>>> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Add this new structure to apidefs.h: >>>>>>>>>>>>>> >>>>>>>>>>>>>> typedef struct TSFastDbgCntl_ >>>>>>>>>>>>>> { >>>>>>>>>>>>>> const char * const tag; // nul-terminated string >>>>>>>>>>>>>> const volatile char * const on; // pointer to 1-byte flag >>>>>>>>>>>>>> } >>>>>>>>>>>>>> TSFastDbgCntl; >>>>>>>>>>>>>> >>>>>>>>>>>>>> Add this new API function, that returns a pointer to an >> instance >>>>>> of >>>>>>>>>>>>>> >>>>>>>>>>>>>> the >>>>>>>>>>>>>> >>>>>>>>>>>>>> above structure: >>>>>>>>>>>>>> >>>>>>>>>>>>>> TSFastDbgCntl * TSCreateFastDbgCntl(const char *tag); >>>>>>>>>>>>>> >>>>>>>>>>>>>> Add this macro, which would be used in place of direct use of >>>>>>>>>>>>>> >>>>>>>>>>>>>> TSDebug(). >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> #define TSFastDbg(_FD_CNTL, ...) \ >>>>>>>>>>>>>> do { \ >>>>>>>>>>>>>> if ((_FD_CNTL)->on) \ >>>>>>>>>>>>>> TSDebug((_FD_CNTL)->tag, __VA_ARGS__); \ >>>>>>>>>>>>>> } while(0) >>>>>>>>>>>>>> >>>>>>>>>>>>>> The first parameter to TSFastDbg() is a pointer that was >>>>>> returned by >>>>>>>>>>>>>> TSCreateFastDbgCntl(). The remaining parameters are printf >>>>>>>> parameters >>>>>>>>>>>>>> (format string and values matching format specifiers). The >> core >>>>>>>> would >>>>>>>>>>>>>> >>>>>>>>>>>>>> be >>>>>>>>>>>>>> >>>>>>>>>>>>>> responsible for changing the value of the 'on' fields in the >>>>>>>>>>>>>> >>>>>>>>>>>>>> TSFastDbgCntl >>>>>>>>>>>>>> >>>>>>>>>>>>>> instances. For direct calls to TSDebug(), even if the tag is >>>>>>>>>>>>>> >>>>>>>>>>>>>> disabled, >>>>>>>>>>>>>> >>>>>>>>>>>>>> all >>>>>>>>>>>>>> >>>>>>>>>>>>>> the parameters have to be prepared and pushed onto the >> stack. As >>>>>>>> was >>>>>>>>>>>>>> discussed in IRC, you have to feel guilty about doing >> converting >>>>>> a >>>>>>>>>>>>>> string_view to a string and outputting it with c_str(). >> Because >>>>>> the >>>>>>>>>>>>>> >>>>>>>>>>>>>> code >>>>>>>>>>>>>> >>>>>>>>>>>>>> to do this will be executed even if the debug tag is disabled. >>>>>>>>>>>>>> Furthermore, for each call to TSDebug, it's necessary to do an >>>>>>>>>>>>>> >>>>>>>>>>>>>> associative >>>>>>>>>>>>>> >>>>>>>>>>>>>> lookup on the tag string to determine that output for the tag >> is >>>>>>>>>>>>>> >>>>>>>>>>>>>> disabled. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> We would want to put equivalent capabilities in Diags.h in the >>>>>> core >>>>>>>>>>>>>> >>>>>>>>>>>>>> for >>>>>>>>>>>>>> >>>>>>>>>>>>>> debug output. The implementation is non-trivial, but I think >> it >>>>>>>> would >>>>>>>>>>>>>> >>>>>>>>>>>>>> take >>>>>>>>>>>>>> >>>>>>>>>>>>>> at most a week including the Au test additions. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I looked at making TSFastDbg() an inline function rather than >> a >>>>>>>> macro: >>>>>>>>>>>>>> https://godbolt.org/z/IfVbBk >>>>>>>>>>>>>> The opitimization works well if you compile the code with the >>>>>> inline >>>>>>>>>>>>>> function as C, but the optimization is poor if you compile it >> as >>>>>>>> C++. >>>>>>>>>>>>>> >>>>>>>>>>>>>> This >>>>>>>>>>>>>> >>>>>>>>>>>>>> difference exists for both gcc and clang. It's weird. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> >>>>>> >>>>>> >>> >> >>