RE: r1344347 - stream buffering in config_file.c

Bert Huijben Thu, 03 Apr 2014 12:36:07 -0700


> -----Original Message-----
> From: Julian Foad [mailto:julianf...@btopenworld.com]
> Sent: donderdag 3 april 2014 20:15
> To: Bert Huijben
> Cc: Ivan Zhakov; Subversion Development
> Subject: Re: r1344347 - stream buffering in config_file.c
> 
> >> URL: http://svn.apache.org/viewvc?rev=1344347&view=rev
> 
> >> Log:
> >> Remove about 15% of the cpu overhead of svnserve and httpd while
> running
> >> the test suite by introducing a very simple parser buffer in the config
file
> >> parser code.
> [...]
> >>  * subversion/libsvn_subr/config_file.c
> >>    (parse_context_t): Add buffer and position variables. Use EOF as no
> >> ungetc var.
> [...]
> >>  Modified: subversion/trunk/subversion/libsvn_subr/config_file.c
> >>
> ==========================================================
> ====================
> >>  +
> >>  +  /* Parser buffer for getc() to avoid call overhead into several
libraries
> >>  +     for every character */
> >>  +  char parser_buffer[SVN_STREAM_CHUNK_SIZE];
> [...]
> 
> This commit implements buffering of a generic stream, directly in the
config
> file parser. Would it not be better to implement buffering of a generic
stream
> as a generic module?


Perhaps...

But a patch like that didn't allow speeding up the test suite by 15% in one
afternoon. This was a very local patch with a very huge result. (And as it
was very local it is easy to remove it when somebody else optimizes it in a
different place). 

The stream api is optimized for using larger buffers while the config parser
is really specific in that it reads byte by byte. And the callback on
callback on callback wrapping (stream over svn_io over apr... none of them
inlineable by the compiler) was getting very expensive here, while there are
as far as I know no other users that have the same needs.
(Almost every other location uses per line parsing, which is already
optimized)

This was (at the time I applied the patch) by far the most expensive code in
both mod_dav_svn and svnserve when running the testsuite on Windows (on a
ramdrive), as we are reading the config files (e.g. fsfs.conf and the authz
files) over and over again.

I would guess this code should be easy to abstract... if/when there are more
parts of Subversion that need it, but I haven't found other performance
critical code paths that would need this.

And just abstracting it, just for abstracting will add more overhead... (The
current code is all in one C file and will be completely inlined by every
compiler... Which is impossible if we need 3 callback and/or shared library
calls for reading every byte... Even worse when that is taking out mutexes
such as the apr read code)

        Bert

RE: r1344347 - stream buffering in config_file.c

Reply via email to