Walt, TSDEBUG uses diags_on_for_plugins and that looks to only be set on startup and then just calls TSDebug(). It doesn’t make sense to replace TSDebug with TSDEBUG.
I don’t see why you are trying to optimize just the plugin debug. When debug is enabled it has a performance impact on the entire system, not just for plugins. -Bryan > On Jun 14, 2019, at 4:51 PM, Leif Hedstrom <zw...@apache.org> wrote: > > > >> On Jun 14, 2019, at 5:18 PM, Walt Karas <wka...@verizonmedia.com.INVALID> >> wrote: >> >> I got some crude performance numbers for TSDebug() and TSDEBUG() >> when proxy.config.diags.debug.enabled is 0. The good news is that >> TSDebug() looks to take about 8 nanoseconds (the raw data is below). So, >> if we take 1000 requests per core per seconds as a ballpark estimate of TS >> performance, it would take a lot of them to have a significant impact. But >> TSDebug() takes about 30 times as long as TSDEBUG(). So it seems >> worthwhile to make the tiny effort to change TSDEBUG() so it won't swallow >> a following 'else' clause. I also think we should rename it to TSFastDbg(). > > The numbers sounds good, but why on earth keep the old Debug? Why would I > chose to use a slow API over a fast API? That just makes no sense. > > if we do this, just change TSDebug(). If it’s incompatible, we document it > with appropriate warnings. > > Does any of this affect / improve the core Debug() statements? Those are > probably 10x - 100x more critical to get fast, because they are *everywhere*. > > — leif > >> >> We could rename TSDebug() to TSDebug_(), and TSDEBUG() to TSDebug(). But I >> think the very small benefit is outweighed by the potential risk. TS seems >> to be rife with race conditions, so a change that will alter the length of >> lots of code paths could create yet more instability. >> >> Here is the raw data (6 tests, 3 repetitions). Done with >> https://github.com/apache/trafficserver/pull/5638 . I don't understand why >> it's not more consistent. If anyone knows a better way to time this, >> please let me know. >> >> Performance for TSDebug() with no parameters, 1000000000 repetitions, >> microseconds user=7670000 system=20000 >> Performance for TSDEBUG() with no parameters, 1000000000 repetitions, >> microseconds user=260000 system=0 >> Performance for TSDebug() with parameters, 1000000000 repetitions, >> microseconds user=8030000 system=50000 >> Performance for TSDEBUG() with parameters, 1000000000 repetitions, >> microseconds user=280000 system=0 >> Performance for TSDebug() with volatile parameters, 1000000000 repetitions, >> microseconds user=8830000 system=30000 >> Performance for TSDEBUG() with volatile parameters, 1000000000 repetitions, >> microseconds user=270000 system=0 >> >> Performance for TSDebug() with no parameters, 1000000000 repetitions, >> microseconds user=7520000 system=50000 >> Performance for TSDEBUG() with no parameters, 1000000000 repetitions, >> microseconds user=260000 system=0 >> Performance for TSDebug() with parameters, 1000000000 repetitions, >> microseconds user=8040000 system=30000 >> Performance for TSDEBUG() with parameters, 1000000000 repetitions, >> microseconds user=260000 system=0 >> Performance for TSDebug() with volatile parameters, 1000000000 repetitions, >> microseconds user=8450000 system=50000 >> Performance for TSDEBUG() with volatile parameters, 1000000000 repetitions, >> microseconds user=260000 system=0 >> >> Performance for TSDebug() with no parameters, 1000000000 repetitions, >> microseconds user=7470000 system=80000 >> Performance for TSDEBUG() with no parameters, 1000000000 repetitions, >> microseconds user=260000 system=0 >> Performance for TSDebug() with parameters, 1000000000 repetitions, >> microseconds user=8120000 system=50000 >> Performance for TSDEBUG() with parameters, 1000000000 repetitions, >> microseconds user=270000 system=0 >> Performance for TSDebug() with volatile parameters, 1000000000 repetitions, >> microseconds user=8500000 system=60000 >> Performance for TSDEBUG() with volatile parameters, 1000000000 repetitions, >> microseconds user=270000 system=0 >> >> On Wed, Jun 12, 2019 at 12:58 PM Walt Karas <wka...@verizonmedia.com> wrote: >> >>> Since I'm still relatively new to this application, there are limits to >>> what I can take on on my own initiative. I don't see how anything I've >>> proposed would interfere with any higher-priority work. >>> >>> On Wed, Jun 12, 2019 at 12:41 PM Leif Hedstrom <zw...@apache.org> wrote: >>> >>>> Also, I’d like to reiterate my call to arms on v9.0.0: >>>> >>>> Please, everyone lets focus on the things that are critical to ATS >>>> v9.0.0 readiness! >>>> >>>> >>>> This includes: >>>> >>>> 1) Fixing all the critical bugs (and there’s a lot of them) >>>> >>>> 2) Testing “master" >>>> >>>> 3) Doing the incompatible changes that needs to be done before we >>>> can release >>>> >>>> 4) Removing features that we do not want to support for the next >>>> 3+ years >>>> >>>> 5) Make sure that v9.0.0 has the functionality that you >>>> *absolutely* need to be able >>>> to use it >>>> >>>> >>>> I kinda feel that several of these recent proposal do not fit into this, >>>> and in fact, are a fair amount of distraction. I urge everyone to not go >>>> down this route until we’ve got v9.0.0 stable and functional, and instead >>>> focus on the v9.0.0 tasks at hand. I don’t think this one for example fits >>>> into any of those 5 points above. >>>> >>>> My $.01. >>>> >>>> — Leif >>>> >>>> P.s >>>> https://github.com/apache/trafficserver/issues/5457 >>>> >>>> >>>>> On Jun 12, 2019, at 11:30 AM, Walt Karas <wka...@verizonmedia.com.INVALID> >>>> wrote: >>>>> >>>>> I would guess the performance increase would be well under 1%. But >>>> it's a >>>>> fairly simple change. The biggest performance increase I think would >>>> come >>>>> from avoiding a PCRE match, to see if the specific debug tag was >>>> enabled. >>>>> For some TSDebug calls, there can be many arguments to push, and >>>> perhaps a >>>>> fair amount of calculation to generate the correct arguments. >>>>> >>>>> On Wed, Jun 12, 2019 at 12:21 PM Bryan Call <bc...@apache.org> wrote: >>>>> >>>>>> I am surprised you are seeing this as a performance issue. Looking at >>>> the >>>>>> code it looks like 1 function call and a couple pointer dereferences. >>>>>> >>>>>> If you wanted to speed up TSDebug and make the code harder to maintain >>>> you >>>>>> could create a structure that looks like like DiagsConfig in ts.h and >>>> then >>>>>> set a pointer used in the API to look at the gobal DiagsConfig. Then >>>> you >>>>>> could have a macro that would do basically what on() does. This would >>>>>> bypass a function call and would have 1 more pointer dereference than >>>> your >>>>>> example. >>>>>> >>>>>> -Bryan >>>>>> >>>>>> >>>>>>> On Jun 12, 2019, at 10:00 AM, Walt Karas <wka...@verizonmedia.com >>>> .INVALID> >>>>>> wrote: >>>>>>> >>>>>>> What do you mean by "exposing diags"? >>>>>>> >>>>>>> On Wed, Jun 12, 2019 at 11:50 AM Bryan Call <bc...@apache.org> wrote: >>>>>>> >>>>>>>> I would rather see TSDebug be optimized and exposing diags instead of >>>>>>>> creating a new type of debug. >>>>>>>> >>>>>>>> -Bryan >>>>>>>> >>>>>>>> >>>>>>>>> On Jun 12, 2019, at 9:44 AM, Walt Karas <wka...@verizonmedia.com >>>>>> .INVALID> >>>>>>>> wrote: >>>>>>>>> >>>>>>>>> I don't believe so. The issue I'm trying to address is that there >>>> is >>>>>>>> still >>>>>>>>> significant execution time overhead for debug output calls when >>>>>>>>> proxy.config.diags.debug.enabled is 0. This change would make the >>>>>>>>> execution time overhead negligible (if TSFastDbg is used or TSDebug >>>> is >>>>>>>>> changed to be like the proposed TSFastDbg). Another way of putting >>>> it >>>>>>>> is, >>>>>>>>> I want to clean up the existing TSDEBUG alternative to TSDebug. >>>>>>>>> >>>>>>>>> On Wed, Jun 12, 2019 at 11:31 AM Bryan Call <bc...@apache.org> >>>> wrote: >>>>>>>>> >>>>>>>>>> How would this be controlled on the command line when running >>>>>>>>>> trafic_server -T? >>>>>>>>>> >>>>>>>>>> It looks like it is a departure from how debug tags and controlled >>>>>> now. >>>>>>>>>> >>>>>>>>>> -Bryan >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> On Jun 11, 2019, at 1:21 PM, Walt Karas <wka...@verizonmedia.com >>>>>>>> .INVALID> >>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> Any +1's for simply replacing TSDebug with the proposed TSFastDbg >>>> ? >>>>>>>>>>> >>>>>>>>>>> On Mon, Apr 8, 2019 at 6:20 PM Leif Hedstrom <zw...@apache.org> >>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Apr 4, 2019, at 8:05 AM, Walt Karas <wka...@verizonmedia.com >>>>>>>>>> .INVALID> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Because senility. probably "on" should be a non-pointer of type >>>>>> const >>>>>>>>>>>> volatile char, and InkAPI in the core can just keep a pointer to >>>> the >>>>>>>>>> whole >>>>>>>>>>>> structure for updating. >>>>>>>>>>>> >>>>>>>>>>>> But I'm leaning more to fixing up the capability that already >>>> kinda >>>>>>>>>> there. >>>>>>>>>>>> Which is optimizimg only for the case when all debug output is >>>>>>>> disabled. >>>>>>>>>>>> Especially since maybe zwoop would be on my side for once :o) . >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> My bad, whatever Walt said, I take the opposite stance. >>>>>>>>>>>> >>>>>>>>>>>> — leif >>>>>>>>>>>> >>>>>>>>>>>> P.s >>>>>>>>>>>> J/K! >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Apr 4, 2019 at 8:51 AM Alan Carroll >>>>>>>>>>>> <solidwallofc...@verizonmedia.com.invalid> wrote: >>>>>>>>>>>> >>>>>>>>>>>> This could enable a significant speed up for debug tags. One >>>> point >>>>>> is >>>>>>>>>> that >>>>>>>>>>>> when the debug tag string is set, at that point the debug objects >>>>>>>> could >>>>>>>>>> be >>>>>>>>>>>> updated to the correct state according to the debug tag string, >>>>>> rather >>>>>>>>>> than >>>>>>>>>>>> checking it every time a debug message is logged. >>>>>>>>>>>> >>>>>>>>>>>> I have to ask, why is there a pointer to the enable flag, >>>> instead of >>>>>>>>>> just >>>>>>>>>>>> the enable flag? Your logic doesn't even check that flag, it >>>> checks >>>>>>>>>> whether >>>>>>>>>>>> the pointer is null. >>>>>>>>>>>> >>>>>>>>>>>> On Wed, Apr 3, 2019 at 11:52 AM Leif Hedstrom <zw...@apache.org> >>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Apr 3, 2019, at 10:47 AM, Walt Karas <wka...@verizonmedia.com >>>>>>>>>>>> >>>>>>>>>>>> .INVALID> >>>>>>>>>>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> It looks like the TS API already has less granular version of >>>> this: >>>>>>>>>>>> >>>>>>>>>>>> extern int diags_on_for_plugins; >>>>>>>>>>>> >>>>>>>>>>>> #define TSDEBUG \ >>>>>>>>>>>> >>>>>>>>>>>> if (diags_on_for_plugins) \ >>>>>>>>>>>> >>>>>>>>>>>> TSDebug >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> A problem with this is if you write: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> if (flag) >>>>>>>>>>>> >>>>>>>>>>>> TSDEBUG(yadayada); >>>>>>>>>>>> >>>>>>>>>>>> else >>>>>>>>>>>> >>>>>>>>>>>> do_something(); >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> then the else will be associated with the if hidden in the macro. >>>>>> I'd >>>>>>>>>>>> prefer to change to something like: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> extern const volatile char TS_detail_dbg_output_enabled; >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> #define TSFastDbg(_TAG, ...) \ >>>>>>>>>>>> do { \ >>>>>>>>>>>> if (TS_detail_dbg_output_enabled) \ >>>>>>>>>>>> TSDebug(_TAG, __VA_ARGS__); \ >>>>>>>>>>>> >>>>>>>>>>>> } while(0) >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Yeh, that kinda sounds like a bug, doesn’t it? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> The negative is that we'd only see the benefit when all debug >>>> output >>>>>>>>>>>> >>>>>>>>>>>> was >>>>>>>>>>>> >>>>>>>>>>>> disabled. But, in any case, that is the main benefit, to avoid >>>>>>>> impacts >>>>>>>>>>>> during normal operation. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Right, we should always optimize for the common cases, obviously. >>>>>> When >>>>>>>>>>>> there’s a decision to make, where you can either make it >>>> “equally” >>>>>>>>>>>> expensive always, or more expensive in a rare case, but cheaper >>>> in >>>>>> the >>>>>>>>>>>> common case, we should likely chose the latter. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Looking in src/tscore/Diags.cc, it looks like the debug enable >>>> flag >>>>>> is >>>>>>>>>>>> >>>>>>>>>>>> only >>>>>>>>>>>> >>>>>>>>>>>> set at startup time. So we wouldn't be able to enable TSFastDbg >>>>>> using >>>>>>>>>>>> traffic_cntl without adding complexity. >>>>>>>>>>>> >>>>>>>>>>>> On Tue, Apr 2, 2019 at 5:48 PM Walt Karas < >>>> wka...@verizonmedia.com> >>>>>>>>>>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Add this new structure to apidefs.h: >>>>>>>>>>>> >>>>>>>>>>>> typedef struct TSFastDbgCntl_ >>>>>>>>>>>> { >>>>>>>>>>>> const char * const tag; // nul-terminated string >>>>>>>>>>>> const volatile char * const on; // pointer to 1-byte flag >>>>>>>>>>>> } >>>>>>>>>>>> TSFastDbgCntl; >>>>>>>>>>>> >>>>>>>>>>>> Add this new API function, that returns a pointer to an instance >>>> of >>>>>>>>>>>> >>>>>>>>>>>> the >>>>>>>>>>>> >>>>>>>>>>>> above structure: >>>>>>>>>>>> >>>>>>>>>>>> TSFastDbgCntl * TSCreateFastDbgCntl(const char *tag); >>>>>>>>>>>> >>>>>>>>>>>> Add this macro, which would be used in place of direct use of >>>>>>>>>>>> >>>>>>>>>>>> TSDebug(). >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> #define TSFastDbg(_FD_CNTL, ...) \ >>>>>>>>>>>> do { \ >>>>>>>>>>>> if ((_FD_CNTL)->on) \ >>>>>>>>>>>> TSDebug((_FD_CNTL)->tag, __VA_ARGS__); \ >>>>>>>>>>>> } while(0) >>>>>>>>>>>> >>>>>>>>>>>> The first parameter to TSFastDbg() is a pointer that was >>>> returned by >>>>>>>>>>>> TSCreateFastDbgCntl(). The remaining parameters are printf >>>>>> parameters >>>>>>>>>>>> (format string and values matching format specifiers). The core >>>>>> would >>>>>>>>>>>> >>>>>>>>>>>> be >>>>>>>>>>>> >>>>>>>>>>>> responsible for changing the value of the 'on' fields in the >>>>>>>>>>>> >>>>>>>>>>>> TSFastDbgCntl >>>>>>>>>>>> >>>>>>>>>>>> instances. For direct calls to TSDebug(), even if the tag is >>>>>>>>>>>> >>>>>>>>>>>> disabled, >>>>>>>>>>>> >>>>>>>>>>>> all >>>>>>>>>>>> >>>>>>>>>>>> the parameters have to be prepared and pushed onto the stack. As >>>>>> was >>>>>>>>>>>> discussed in IRC, you have to feel guilty about doing converting >>>> a >>>>>>>>>>>> string_view to a string and outputting it with c_str(). Because >>>> the >>>>>>>>>>>> >>>>>>>>>>>> code >>>>>>>>>>>> >>>>>>>>>>>> to do this will be executed even if the debug tag is disabled. >>>>>>>>>>>> Furthermore, for each call to TSDebug, it's necessary to do an >>>>>>>>>>>> >>>>>>>>>>>> associative >>>>>>>>>>>> >>>>>>>>>>>> lookup on the tag string to determine that output for the tag is >>>>>>>>>>>> >>>>>>>>>>>> disabled. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> We would want to put equivalent capabilities in Diags.h in the >>>> core >>>>>>>>>>>> >>>>>>>>>>>> for >>>>>>>>>>>> >>>>>>>>>>>> debug output. The implementation is non-trivial, but I think it >>>>>> would >>>>>>>>>>>> >>>>>>>>>>>> take >>>>>>>>>>>> >>>>>>>>>>>> at most a week including the Au test additions. >>>>>>>>>>>> >>>>>>>>>>>> I looked at making TSFastDbg() an inline function rather than a >>>>>> macro: >>>>>>>>>>>> https://godbolt.org/z/IfVbBk >>>>>>>>>>>> The opitimization works well if you compile the code with the >>>> inline >>>>>>>>>>>> function as C, but the optimization is poor if you compile it as >>>>>> C++. >>>>>>>>>>>> >>>>>>>>>>>> This >>>>>>>>>>>> >>>>>>>>>>>> difference exists for both gcc and clang. It's weird. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>> >>>> >