On Mon, Jun 12, 2017 at 9:44 PM, John D. Ament <johndam...@apache.org> wrote:
> On Mon, Jun 12, 2017 at 9:24 PM Sam Ruby <ru...@intertwingly.net> wrote:
>
>> On Mon, Jun 12, 2017 at 9:06 PM, Sam Ruby <ru...@intertwingly.net> wrote:
>> > On Mon, Jun 12, 2017 at 7:59 PM, John D. Ament <johndam...@apache.org>
>> wrote:
>> >> On Mon, Jun 12, 2017 at 7:55 PM Sam Ruby <ru...@intertwingly.net>
>> wrote:
>> >>
>> >>> On Mon, Jun 12, 2017 at 7:44 PM,  <johndam...@apache.org> wrote:
>> >>> > ---
>> >>> >  lib/whimsy/asf/svn.rb         | 11 +++++++++++
>> >>> >  www/roster/public_podlings.rb |  7 ++++++-
>> >>> >  2 files changed, 17 insertions(+), 1 deletion(-)
>> >>> >
>> >>> > diff --git a/lib/whimsy/asf/svn.rb b/lib/whimsy/asf/svn.rb
>> >>> > index 134609c..64a596e 100644
>> >>> > --- a/lib/whimsy/asf/svn.rb
>> >>> > +++ b/lib/whimsy/asf/svn.rb
>> >>> > @@ -141,6 +141,17 @@ module ASF
>> >>> >        return revision, content
>> >>> >      end
>> >>> >
>> >>> > +    def self.updateSimple(path)
>> >>> > +      cmd = ['svn', 'update', path, '--non-interactive']
>> >>>
>> >>> This will undoubtedly fail as the $apache::user (www-data) does not
>> >>> have write access to those directories.
>> >>
>> >> Err so should we run cron as whimsysvn ?
>> >
>> > That's indeed possible, but then it probably can't write to the web
>> directory.
>> >
>> > Also from reading, bad things can happen if two processes are updating
>> > the same directory at the same time.  This can be fixed via file
>> > locking.  My gitpubsub logic solves this by running the puppet agent
>> > itself, and puppet ensures that there is only one agent running at one
>> > time.
>> >
>> > I learned all this the hard way on the original whimsy_vm where
>> > directories often got 'wedged' and needed manual intervention for
>> > cleanup.  That's why I instituted a hard separation between what can
>> > be updated in each process.
>>
>> Adding to my answer: this decision (which can be changed if that's what
>> we collectively want to do) was to prefer slightly stale data over
>> data that (at best) might occasionally stop updating, and (at worst)
>> can become corrupt.
>>
>> The /srv/svn files update every 10 minutes.  For most purposes, that
>> is fast enough.
>>
>> Programs like the board agenda tool, the secretary mail tool, and now
>> the roster take great care to update svn in separate tmp directories.
>>
> This is a very valuable piece of information.  My main concern isn't roster
> but instead the podlings information.
>
> Shane and I were jokingly talking about this on hipchat - we should switch
> all of this to be pubsub.  I'm more convinced that this is correct.

You would still need to use flock(*) or equivalent, but definitely doable.
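
Roughly, the locking could look like the following. This is only a
minimal sketch: with_exclusive_lock and the lock file path in the
usage comment are names made up for illustration, not existing whimsy
code.

```ruby
# Hypothetical helper (not part of whimsy): run the block while holding
# an exclusive advisory lock on lock_path, so that two processes never
# update the same working copy at the same time.
def with_exclusive_lock(lock_path)
  File.open(lock_path, File::RDWR | File::CREAT, 0644) do |file|
    file.flock(File::LOCK_EX) # blocks until any other holder releases
    yield
  end # closing the file releases the lock
end

# Usage (assumed lock path; the svn command is the one from the patch):
#
#   with_exclusive_lock('/srv/svn/.update.lock') do
#     system('svn', 'update', path, '--non-interactive')
#   end
```

The lock is advisory, so it only helps if every process that updates
the directory goes through the same lock file.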

The code for pubsub is basically the same for svn as it is for git.
The only real difference is that the notification is 'commit' instead
of 'push'.

https://github.com/apache/whimsy/blob/master/tools/pubsub.rb
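
To illustrate the 'commit' vs 'push' point, here is a rough sketch of
telling the two apart. It assumes each notification arrives as one
JSON object whose top-level key names the event; that shape is my
assumption for the example, and notification_kind is a made-up name,
so check pubsub.rb for what the streams really look like.

```ruby
require 'json'

# Hypothetical helper: classify one notification line.
# ASSUMPTION: the event type is the top-level key of a JSON object.
def notification_kind(line)
  event = JSON.parse(line)
  return :other unless event.is_a?(Hash)
  return :commit if event.key?('commit') # svn-style notification
  return :push if event.key?('push')     # git-style notification
  :other                                 # anything else is ignored
rescue JSON::ParserError
  :other                                 # not valid JSON
end
```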

The other thing to be aware of is that pubsub is only available for
publicly readable sources.  So things like foundation and documents
can't be done this way.

> Where's the logic that does the clone/svn checkout in a tmp directory?

Plenty of places.  Here is one:

https://github.com/apache/whimsy/blob/master/www/roster/views/actions/ppmc.json.rb#L71
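
The general shape is something like the sketch below. It is not the
code at that link: with_tmp_checkout and its runner parameter are
names invented for this example, and the choice of svn options is an
assumption.

```ruby
require 'tmpdir'

# Hypothetical helper (not whimsy's actual code): check out svn_url
# into a throwaway directory, yield its path, and let Dir.mktmpdir
# delete it afterwards. The shared working copy is never touched.
# The runner parameter exists only so the command can be stubbed.
def with_tmp_checkout(svn_url, runner: ->(*cmd) { system(*cmd) })
  Dir.mktmpdir do |tmpdir|
    ok = runner.call('svn', 'checkout', '--depth', 'files',
                     '--non-interactive', svn_url, tmpdir)
    raise "svn checkout failed for #{svn_url}" unless ok
    yield tmpdir
  end
end
```

Since every request works in its own directory, a failed or
interrupted update cannot wedge anything that other processes use.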

"git grep tmpdir" to find more.

>> - Sam Ruby

(*) https://ruby-doc.org/core-2.4.0/File.html#method-i-flock
