I've got a very, very flaky network and many remote hosts that phone home 
hourly to pick up Puppet updates. Some complete really quickly; others 
can take minutes for a do-nothing agent run. My server is 4.3 and the clients 
are mostly 3.8.6, but some are 4.3 as well. It's a mix of CentOS (6 & 7) and 
Fedora (21+).

Every so often, I get the "Could not retrieve file metadata for ... : end of 
file reached" error on clients. It's usually random -- some will run fine 
for a few days, then suddenly exhibit this once or twice, then be fine again.

To try to get to a state where my errors actually mean something, I started 
cranking up the http_keepalive_timeout value. I'll readily admit that I'm 
not sure I completely understand how to bound it. I started at 30s, went to 
3m, and am now sitting at 30m on the server and 29m on the agents.
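For reference, this is roughly how I have it set in puppet.conf right now. 
I'm assuming the [master] section is the right place on the server side and 
[agent] on the clients, which may or may not be correct for a Puppet Server 
4.3 setup:

    # puppet.conf on the server (value in seconds; 1800 = 30m)
    # Not certain the [master] section actually governs Puppet Server's
    # connection handling -- this is just where I put it.
    [master]
        http_keepalive_timeout = 1800

    # puppet.conf on the agents (1740 = 29m)
    [agent]
        http_keepalive_timeout = 1740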

How big should this be? Big enough to encapsulate a complete successful run, 
or just the expected duration of a single file request? What's the downside 
of cranking this up? The affected file has changed since I raised the value, 
so I think it's having some effect, but I'm also seeing more failures (though 
that may be a red herring if our network is acting up today).
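
To double-check what each box is actually using, I've been printing the 
resolved setting with puppet config print (assuming that subcommand behaves 
the same on the 3.8.6 agents as it does on 4.3):

    # show the value the agent will actually use; run on the client
    puppet config print http_keepalive_timeout --section agent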

What's a good guideline for properly setting this value?

