Walt,

TSDEBUG uses diags_on_for_plugins and that looks to only be set on startup and 
then just calls TSDebug().  It doesn’t make sense to replace TSDebug with 
TSDEBUG.

I don’t see why you are trying to optimize just the plugin debug.  When debug 
is enabled it has a performance impact on the entire system, not just for 
plugins.

-Bryan


> On Jun 14, 2019, at 4:51 PM, Leif Hedstrom <zw...@apache.org> wrote:
> 
> 
> 
>> On Jun 14, 2019, at 5:18 PM, Walt Karas <wka...@verizonmedia.com.INVALID> 
>> wrote:
>> 
>> I got some crude performance numbers for TSDebug() and TSDEBUG()
>> when proxy.config.diags.debug.enabled is 0.  The good news is that
>> TSDebug() looks to take about 8 nanoseconds (the raw data is below).  So,
>> if we take 1000 requests per core per seconds as a ballpark estimate of TS
>> performance, it would take a lot of them to have a significant impact.  But
>> TSDebug() takes about 30 times as long as TSDEBUG().  So it seems
>> worthwhile to make the tiny effort to change TSDEBUG() so it won't swallow
>> a following 'else' clause.  I also think we should rename it to TSFastDbg().
> 
> The numbers sounds good, but why on earth keep the old Debug? Why would I 
> chose to use a slow API over a fast API? That just makes no sense.
> 
> if we do this, just change TSDebug().  If it’s incompatible, we document it 
> with appropriate warnings.
> 
> Does any of this affect / improve the core Debug() statements? Those are 
> probably 10x - 100x more critical to get fast, because they are *everywhere*.
> 
> — leif
> 
>> 
>> We could rename TSDebug() to TSDebug_(), and TSDEBUG() to TSDebug().  But I
>> think the very small benefit is outweighed by the potential risk.  TS seems
>> to be rife with race conditions, so a change that will alter the length of
>> lots of code paths could create yet more instability.
>> 
>> Here is the raw data (6 tests, 3 repetitions).  Done with
>> https://github.com/apache/trafficserver/pull/5638 .  I don't understand why
>> it's not more consistent.  If anyone knows a better way to time this,
>> please let me know.
>> 
>> Performance for TSDebug() with no parameters, 1000000000 repetitions,
>> microseconds user=7670000 system=20000
>> Performance for TSDEBUG() with no parameters, 1000000000 repetitions,
>> microseconds user=260000 system=0
>> Performance for TSDebug() with parameters, 1000000000 repetitions,
>> microseconds user=8030000 system=50000
>> Performance for TSDEBUG() with parameters, 1000000000 repetitions,
>> microseconds user=280000 system=0
>> Performance for TSDebug() with volatile parameters, 1000000000 repetitions,
>> microseconds user=8830000 system=30000
>> Performance for TSDEBUG() with volatile parameters, 1000000000 repetitions,
>> microseconds user=270000 system=0
>> 
>> Performance for TSDebug() with no parameters, 1000000000 repetitions,
>> microseconds user=7520000 system=50000
>> Performance for TSDEBUG() with no parameters, 1000000000 repetitions,
>> microseconds user=260000 system=0
>> Performance for TSDebug() with parameters, 1000000000 repetitions,
>> microseconds user=8040000 system=30000
>> Performance for TSDEBUG() with parameters, 1000000000 repetitions,
>> microseconds user=260000 system=0
>> Performance for TSDebug() with volatile parameters, 1000000000 repetitions,
>> microseconds user=8450000 system=50000
>> Performance for TSDEBUG() with volatile parameters, 1000000000 repetitions,
>> microseconds user=260000 system=0
>> 
>> Performance for TSDebug() with no parameters, 1000000000 repetitions,
>> microseconds user=7470000 system=80000
>> Performance for TSDEBUG() with no parameters, 1000000000 repetitions,
>> microseconds user=260000 system=0
>> Performance for TSDebug() with parameters, 1000000000 repetitions,
>> microseconds user=8120000 system=50000
>> Performance for TSDEBUG() with parameters, 1000000000 repetitions,
>> microseconds user=270000 system=0
>> Performance for TSDebug() with volatile parameters, 1000000000 repetitions,
>> microseconds user=8500000 system=60000
>> Performance for TSDEBUG() with volatile parameters, 1000000000 repetitions,
>> microseconds user=270000 system=0
>> 
>> On Wed, Jun 12, 2019 at 12:58 PM Walt Karas <wka...@verizonmedia.com> wrote:
>> 
>>> Since I'm still relatively new to this application, there are limits to
>>> what I can take on on my own initiative.  I don't see how anything I've
>>> proposed would interfere with any higher-priority work.
>>> 
>>> On Wed, Jun 12, 2019 at 12:41 PM Leif Hedstrom <zw...@apache.org> wrote:
>>> 
>>>> Also, I’d like to reiterate my call to arms on v9.0.0:
>>>> 
>>>>  Please, everyone lets focus on the things that are critical to ATS
>>>> v9.0.0 readiness!
>>>> 
>>>> 
>>>> This includes:
>>>> 
>>>>       1) Fixing all the critical bugs (and there’s a lot of them)
>>>> 
>>>>       2) Testing “master"
>>>> 
>>>>       3) Doing the incompatible changes that needs to be done before we
>>>> can release
>>>> 
>>>>       4) Removing features that we do not want to support for the next
>>>> 3+ years
>>>> 
>>>>       5) Make sure that v9.0.0 has the functionality that you
>>>> *absolutely* need to be able
>>>>       to use it
>>>> 
>>>> 
>>>> I kinda feel that several of these recent proposal do not fit into this,
>>>> and in fact, are a fair amount of distraction. I urge everyone to not go
>>>> down this route until we’ve got v9.0.0 stable and functional, and instead
>>>> focus on the v9.0.0 tasks at hand.  I don’t think this one for example fits
>>>> into any of those 5 points above.
>>>> 
>>>> My $.01.
>>>> 
>>>> — Leif
>>>> 
>>>> P.s
>>>>  https://github.com/apache/trafficserver/issues/5457
>>>> 
>>>> 
>>>>> On Jun 12, 2019, at 11:30 AM, Walt Karas <wka...@verizonmedia.com.INVALID>
>>>> wrote:
>>>>> 
>>>>> I would guess the performance increase would be well under 1%.  But
>>>> it's a
>>>>> fairly simple change.  The biggest performance increase I think would
>>>> come
>>>>> from avoiding a PCRE match, to see if the specific debug tag was
>>>> enabled.
>>>>> For some TSDebug calls, there can be many arguments to push, and
>>>> perhaps a
>>>>> fair amount of calculation to generate the correct arguments.
>>>>> 
>>>>> On Wed, Jun 12, 2019 at 12:21 PM Bryan Call <bc...@apache.org> wrote:
>>>>> 
>>>>>> I am surprised you are seeing this as a performance issue.  Looking at
>>>> the
>>>>>> code it looks like 1 function call and a couple pointer dereferences.
>>>>>> 
>>>>>> If you wanted to speed up TSDebug and make the code harder to maintain
>>>> you
>>>>>> could create a structure that looks like like DiagsConfig in ts.h and
>>>> then
>>>>>> set a pointer used in the API to look at the gobal DiagsConfig.  Then
>>>> you
>>>>>> could have a macro that would do basically what on() does.  This would
>>>>>> bypass a function call and would have 1 more pointer dereference than
>>>> your
>>>>>> example.
>>>>>> 
>>>>>> -Bryan
>>>>>> 
>>>>>> 
>>>>>>> On Jun 12, 2019, at 10:00 AM, Walt Karas <wka...@verizonmedia.com
>>>> .INVALID>
>>>>>> wrote:
>>>>>>> 
>>>>>>> What do you mean by "exposing diags"?
>>>>>>> 
>>>>>>> On Wed, Jun 12, 2019 at 11:50 AM Bryan Call <bc...@apache.org> wrote:
>>>>>>> 
>>>>>>>> I would rather see TSDebug be optimized and exposing diags instead of
>>>>>>>> creating a new type of debug.
>>>>>>>> 
>>>>>>>> -Bryan
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Jun 12, 2019, at 9:44 AM, Walt Karas <wka...@verizonmedia.com
>>>>>> .INVALID>
>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> I don't believe so.  The issue I'm trying to address is that there
>>>> is
>>>>>>>> still
>>>>>>>>> significant execution time overhead for debug output calls when
>>>>>>>>> proxy.config.diags.debug.enabled is 0.  This change would make the
>>>>>>>>> execution time overhead negligible (if TSFastDbg is used or TSDebug
>>>> is
>>>>>>>>> changed to be like the proposed TSFastDbg).  Another way of putting
>>>> it
>>>>>>>> is,
>>>>>>>>> I want to clean up the existing TSDEBUG alternative to TSDebug.
>>>>>>>>> 
>>>>>>>>> On Wed, Jun 12, 2019 at 11:31 AM Bryan Call <bc...@apache.org>
>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> How would this be controlled on the command line when running
>>>>>>>>>> trafic_server -T?
>>>>>>>>>> 
>>>>>>>>>> It looks like it is a departure from how debug tags and controlled
>>>>>> now.
>>>>>>>>>> 
>>>>>>>>>> -Bryan
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On Jun 11, 2019, at 1:21 PM, Walt Karas <wka...@verizonmedia.com
>>>>>>>> .INVALID>
>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Any +1's for simply replacing TSDebug with the proposed TSFastDbg
>>>> ?
>>>>>>>>>>> 
>>>>>>>>>>> On Mon, Apr 8, 2019 at 6:20 PM Leif Hedstrom <zw...@apache.org>
>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Apr 4, 2019, at 8:05 AM, Walt Karas <wka...@verizonmedia.com
>>>>>>>>>> .INVALID>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Because senility. probably "on" should be a non-pointer of type
>>>>>> const
>>>>>>>>>>>> volatile char, and InkAPI in the core can just keep a pointer to
>>>> the
>>>>>>>>>> whole
>>>>>>>>>>>> structure for updating.
>>>>>>>>>>>> 
>>>>>>>>>>>> But I'm leaning more to fixing up the capability that already
>>>> kinda
>>>>>>>>>> there.
>>>>>>>>>>>> Which is optimizimg only for the case when all debug output is
>>>>>>>> disabled.
>>>>>>>>>>>> Especially since maybe zwoop would be on my side for once :o) .
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> My bad, whatever Walt said, I take the opposite stance.
>>>>>>>>>>>> 
>>>>>>>>>>>> — leif
>>>>>>>>>>>> 
>>>>>>>>>>>> P.s
>>>>>>>>>>>> J/K!
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Thu, Apr 4, 2019 at 8:51 AM Alan Carroll
>>>>>>>>>>>> <solidwallofc...@verizonmedia.com.invalid> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> This could enable a significant speed up for debug tags. One
>>>> point
>>>>>> is
>>>>>>>>>> that
>>>>>>>>>>>> when the debug tag string is set, at that point the debug objects
>>>>>>>> could
>>>>>>>>>> be
>>>>>>>>>>>> updated to the correct state according to the debug tag string,
>>>>>> rather
>>>>>>>>>> than
>>>>>>>>>>>> checking it every time a debug message is logged.
>>>>>>>>>>>> 
>>>>>>>>>>>> I have to ask, why is there a pointer to the enable flag,
>>>> instead of
>>>>>>>>>> just
>>>>>>>>>>>> the enable flag? Your logic doesn't even check that flag, it
>>>> checks
>>>>>>>>>> whether
>>>>>>>>>>>> the pointer is null.
>>>>>>>>>>>> 
>>>>>>>>>>>> On Wed, Apr 3, 2019 at 11:52 AM Leif Hedstrom <zw...@apache.org>
>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Apr 3, 2019, at 10:47 AM, Walt Karas <wka...@verizonmedia.com
>>>>>>>>>>>> 
>>>>>>>>>>>> .INVALID>
>>>>>>>>>>>> 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> It looks like the TS API already has less granular version of
>>>> this:
>>>>>>>>>>>> 
>>>>>>>>>>>> extern int diags_on_for_plugins;
>>>>>>>>>>>> 
>>>>>>>>>>>> #define TSDEBUG             \
>>>>>>>>>>>> 
>>>>>>>>>>>> if (diags_on_for_plugins) \
>>>>>>>>>>>> 
>>>>>>>>>>>> TSDebug
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> A problem with this is if you write:
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> if (flag)
>>>>>>>>>>>> 
>>>>>>>>>>>> TSDEBUG(yadayada);
>>>>>>>>>>>> 
>>>>>>>>>>>> else
>>>>>>>>>>>> 
>>>>>>>>>>>> do_something();
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> then the else will be associated with the if hidden in the macro.
>>>>>> I'd
>>>>>>>>>>>> prefer to change to something like:
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> extern const volatile char TS_detail_dbg_output_enabled;
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> #define TSFastDbg(_TAG, ...) \
>>>>>>>>>>>> do { \
>>>>>>>>>>>> if (TS_detail_dbg_output_enabled) \
>>>>>>>>>>>> TSDebug(_TAG, __VA_ARGS__); \
>>>>>>>>>>>> 
>>>>>>>>>>>> } while(0)
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Yeh, that kinda sounds like a bug, doesn’t it?
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> The negative is that we'd only see the benefit when all debug
>>>> output
>>>>>>>>>>>> 
>>>>>>>>>>>> was
>>>>>>>>>>>> 
>>>>>>>>>>>> disabled.  But, in any case, that is the main benefit, to avoid
>>>>>>>> impacts
>>>>>>>>>>>> during normal operation.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Right, we should always optimize for the common cases, obviously.
>>>>>> When
>>>>>>>>>>>> there’s a decision to make, where you can either make it
>>>> “equally”
>>>>>>>>>>>> expensive always, or more expensive in a rare case, but cheaper
>>>> in
>>>>>> the
>>>>>>>>>>>> common case, we should likely chose the latter.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Looking in src/tscore/Diags.cc, it looks like the debug enable
>>>> flag
>>>>>> is
>>>>>>>>>>>> 
>>>>>>>>>>>> only
>>>>>>>>>>>> 
>>>>>>>>>>>> set at startup time.  So we wouldn't be able to enable TSFastDbg
>>>>>> using
>>>>>>>>>>>> traffic_cntl without adding complexity.
>>>>>>>>>>>> 
>>>>>>>>>>>> On Tue, Apr 2, 2019 at 5:48 PM Walt Karas <
>>>> wka...@verizonmedia.com>
>>>>>>>>>>>> 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Add this new structure to apidefs.h:
>>>>>>>>>>>> 
>>>>>>>>>>>> typedef struct TSFastDbgCntl_
>>>>>>>>>>>> {
>>>>>>>>>>>> const char * const tag; // nul-terminated string
>>>>>>>>>>>> const volatile char * const on; // pointer to 1-byte flag
>>>>>>>>>>>> }
>>>>>>>>>>>> TSFastDbgCntl;
>>>>>>>>>>>> 
>>>>>>>>>>>> Add this new API function, that returns a pointer to an instance
>>>> of
>>>>>>>>>>>> 
>>>>>>>>>>>> the
>>>>>>>>>>>> 
>>>>>>>>>>>> above structure:
>>>>>>>>>>>> 
>>>>>>>>>>>> TSFastDbgCntl * TSCreateFastDbgCntl(const char *tag);
>>>>>>>>>>>> 
>>>>>>>>>>>> Add this macro, which would be used in place of direct use of
>>>>>>>>>>>> 
>>>>>>>>>>>> TSDebug().
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> #define TSFastDbg(_FD_CNTL, ...) \
>>>>>>>>>>>> do { \
>>>>>>>>>>>> if ((_FD_CNTL)->on) \
>>>>>>>>>>>> TSDebug((_FD_CNTL)->tag, __VA_ARGS__); \
>>>>>>>>>>>> } while(0)
>>>>>>>>>>>> 
>>>>>>>>>>>> The first parameter to TSFastDbg() is a pointer that was
>>>> returned by
>>>>>>>>>>>> TSCreateFastDbgCntl().  The remaining parameters are printf
>>>>>> parameters
>>>>>>>>>>>> (format string and values matching format specifiers).  The core
>>>>>> would
>>>>>>>>>>>> 
>>>>>>>>>>>> be
>>>>>>>>>>>> 
>>>>>>>>>>>> responsible for changing the value of the 'on' fields in the
>>>>>>>>>>>> 
>>>>>>>>>>>> TSFastDbgCntl
>>>>>>>>>>>> 
>>>>>>>>>>>> instances.  For direct calls to TSDebug(), even if the tag is
>>>>>>>>>>>> 
>>>>>>>>>>>> disabled,
>>>>>>>>>>>> 
>>>>>>>>>>>> all
>>>>>>>>>>>> 
>>>>>>>>>>>> the parameters have to be prepared and pushed onto the stack.  As
>>>>>> was
>>>>>>>>>>>> discussed in IRC, you have to feel guilty about doing converting
>>>> a
>>>>>>>>>>>> string_view to a string and outputting it with c_str().  Because
>>>> the
>>>>>>>>>>>> 
>>>>>>>>>>>> code
>>>>>>>>>>>> 
>>>>>>>>>>>> to do this will be executed even if the debug tag is disabled.
>>>>>>>>>>>> Furthermore, for each call to TSDebug, it's necessary to do an
>>>>>>>>>>>> 
>>>>>>>>>>>> associative
>>>>>>>>>>>> 
>>>>>>>>>>>> lookup on the tag string to determine that output for the tag is
>>>>>>>>>>>> 
>>>>>>>>>>>> disabled.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> We would want to put equivalent capabilities in Diags.h in the
>>>> core
>>>>>>>>>>>> 
>>>>>>>>>>>> for
>>>>>>>>>>>> 
>>>>>>>>>>>> debug output.  The implementation is non-trivial, but I think it
>>>>>> would
>>>>>>>>>>>> 
>>>>>>>>>>>> take
>>>>>>>>>>>> 
>>>>>>>>>>>> at most a week including the Au test additions.
>>>>>>>>>>>> 
>>>>>>>>>>>> I looked at making TSFastDbg() an inline function rather than a
>>>>>> macro:
>>>>>>>>>>>> https://godbolt.org/z/IfVbBk
>>>>>>>>>>>> The opitimization works well if you compile the code with the
>>>> inline
>>>>>>>>>>>> function as C, but the optimization is poor if you compile it as
>>>>>> C++.
>>>>>>>>>>>> 
>>>>>>>>>>>> This
>>>>>>>>>>>> 
>>>>>>>>>>>> difference exists for both gcc and clang.  It's weird.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
> 

Reply via email to