I've gone ahead and moved the default hooks-env to hooks-env.tmpl. We'll still try to open the file but we won't bother to read it in or parse a completely commented out file unless the user puts a file in place. The .tmpl is a common pattern with hooks already so it should add any confusion.
On Thu, Mar 28, 2013 at 2:33 PM, Bert Huijben <b...@qqmail.nl> wrote: > > >> -----Original Message----- >> From: Branko Čibej [mailto:br...@wandisco.com] >> Sent: donderdag 28 maart 2013 22:06 >> To: dev@subversion.apache.org >> Subject: Re: Opening the repository hooks environment file >> >> On 28.03.2013 21:50, Bert Huijben wrote: >> > >> >> -----Original Message----- >> >> From: MARTIN PHILIP [mailto:codematt...@ntlworld.com] On Behalf Of >> >> Philip Martin >> >> Sent: donderdag 28 maart 2013 19:32 >> >> To: Bert Huijben >> >> Cc: dev@subversion.apache.org >> >> Subject: Opening the repository hooks environment file >> >> >> >> "Bert Huijben" <b...@qqmail.nl> writes: >> >> >> >>> The reading of one file for each access to the repository is a more >> >>> than measurable slowdown when profiling operations. (Reading >> fsfs.conf >> >>> over and over again is one of the most expensive things apache worker >> >>> processes do when I profiled them. I think stefan2 optimized some of >> >>> this away) >> >> We have already picked up one new file on every access in 1.8: the hooks >> >> environment file. This appears to be opened and parsed for every time >> >> mod_dav_svn opens the repository, both read and write operations. >> >> >> >> Perhaps we should require an explict config setting to enable the hooks >> >> file so that we can avoid opening it when it is empty? Or perhaps we >> >> could make the opening/parsing lazy and delay it until running a hook >> >> thus avoiding it for read operations? >> > Would be nice if we can read it on first use (after the hook exists check?) >> > and then cache it. >> > >> > I'm not sure why this didn't turn op in the performance traces though. >> Maybe >> > because this file doesn't exist by default? >> >> Sure it does, "svnadmin create" will create a template. I think you're >> overestimating the cost of reading such small files; it'll mostly stay >> in RAM once it's been read, and parsing it is not all that expensive. > > I'm pretty sure you remember that with Subversion 1.6 running 'svn update' on > ^/subversion/branches took one and a half minutes on Windows to obtain the > directory locks, before it even started doing anything update related? > > (And this was without a virusscanner on a for that time fast desktop harddisk) > > Simple file operations may be cheap on one system (E.g. linux) but don't use > that as a measurement for other systems. That one and a half minute operation > took less than half a second on ext3/ext4. > And this problem existed on many Subversion releases without anybody > noticing... The common suggestion was "Windows is slow". > > Loading 'fsfs.conf' is a slow operation. Maybe because it is too big to fit > in a buffer; maybe because our parsing is inefficient and even more > inefficient on systems where we use "\r\n" as EOL, but it really is > performance critical on Windows. > > I don't say that we should stop reading fsfs.conf, but why should we > introduce a 'fs.conf' at the repos level for things that don't belong at that > level and most likely 99% of our users will never use? > > fsfs contains things like sharding information which are essential for > functionality. It is not a diagnostics configuration file. > > > > > Reading a very tiny file hundreds of thousands times during a typical ra-serf > test run as on our buildbot makes its impact huge. (Even when using a > ramdrive). This will certainly have an impact on production Subversion > servers with similar loads. > > But as noted earlier... our testsuite uses uncommon small files. > But the changes applied to the files might be in the more common space. > > > Bert >