On Wed, 2011-01-26 at 14:36 -0800, Daniel Pittman wrote:
> On Wed, Jan 26, 2011 at 13:56, Jason Wright <jwri...@google.com> wrote:
> > On Wed, Jan 26, 2011 at 1:17 PM, Daniel Pittman <dan...@puppetlabs.com> 
> > wrote:
> >
> >> For what it is worth I have been looking at this quietly in the
> >> background, and come to the conclusion that to progress further I am
> >> going to have to either reproduce this myself (failed, so far), or get
> >> a bit of state instrumentation into that code to track down exactly
> >> what conditions are being hit to trigger the failure.
> >
> > I haven't been able to reproduce it either.  So far, I've tried
> > annexing a bunch of machines and running puppetd in a tight loop
> > against an otherwise idle puppetmaster VM and I can get the rate of
> > API calls and catalog compiles up to the correct level for one of our
> > busy VMs, but no 500s (or even 400s) so far.  If this fails, I have
> > some code which fetches pluginsync metadata and then proceeds to make
> > fileserver calls for every .rb listed.  I'll start using that to generate
> > traffic, since these are the sorts of operations which get the most
> > errors.
> >
> >> Sounds like a good next step might be for y'all to let me know when
> >> you might look at being able to do that instrumentation, and I can try
> >> and send you a satisfactory patch to trial?
> >
> > What instrumentation would you be looking for?
> 
> Specifically, around the "not mounted" fault, in the 'splitpath'
> method, identify what the value of 'mount' in the outer 'unless' is,
> and what @mounts and mount_name contain.  My hope would be to use that
> to narrow down the possible causes, and either confirm or eliminate a
> thread race or something.

There are some thread races in this codepath: 
* we currently know that all cached_attrs (and splitpath uses one
through the environment's module accessor) are subject to a thread race
in 0.25; a short illustration follows below.
* there is another one when reading fileserver.conf (in readconfig).

But since Passenger should normally make sure there is only one thread
in a given running puppet process, we should be immune to both.
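
To make the cached_attrs point concrete, the pattern at risk is the
usual check-then-populate of a memoized value. The following is a pure
illustration, not the actual Puppet code; the method body is invented
around the readconfig call mentioned above:

  # Illustration only: the nil check and the population step are not
  # atomic, so two threads can both see @mounts as nil, and one can
  # then read an empty or half-built hash while the other is still
  # running readconfig.
  def mounts
    unless @mounts
      @mounts = {}
      readconfig   # assumed here to fill @mounts from fileserver.conf
    end
    @mounts
  end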

> I doubt that will be the complete data set, but it should help move
> forward.  Annoyingly, I don't have a super-solid picture of what the
> problem is at this stage, because it looks like it shouldn't be
> possible to hit the situation but, clearly, it is getting there...

Yes, so we're certainly missing something, and instrumenting this
codepath will help us understand the root cause.
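
Something along those lines would probably be enough. The following is
a rough sketch only: the mount, @mounts and mount_name names are the
ones from your description, the surrounding unless is paraphrased
rather than copied from the 0.25 source, and only the Puppet.warning
lines are the actual addition.

  unless mount
    Puppet.warning "splitpath: mount is #{mount.inspect} for mount_name #{mount_name.inspect}"
    Puppet.warning "splitpath: known mounts are #{@mounts.keys.sort.inspect}"
    # the existing 'not mounted' failure path continues here, unchanged
  end

That would at least tell us whether @mounts is empty, half-populated,
or simply missing that one mount when the error fires.
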
-- 
Brice Figureau
Follow the latest Puppet Community evolutions on www.planetpuppet.org!
