I've got a very, very flaky network and many remote hosts which phone home hourly to pick up puppet updates. Some complete really quickly; others can take minutes for a do-nothing agent run. My server is on 4.3 and my clients are mostly 3.8.6, though some are on 4.3 as well. It's a mix of CentOS (6 & 7) and Fedora (21+).
Every so often, I get the "Could not retrieve file metadata for ...: end of file reached" error on clients. It seems random -- a host will run fine for days, then suddenly hit this once or twice, then be fine again.

To try to get to a state where my errors actually mean something, I started cranking up the http_keepalive_timeout value. I'll readily admit that I'm not sure I completely understand how to bound it. I started at 30s, went to 3m, and am now sitting at 30m on the server and 29m on the agents. How big should this be? Big enough to cover a complete successful run, or just the expected duration of a single file request? What's the downside of cranking it up? The affected file has changed since I raised the value, so I think it's having some effect, but I'm also seeing more failures (though that may be a red herring if our network is acting up today).

What's a good guideline for setting this value properly?
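In case it helps, here's roughly what I've ended up with so far (the section names are my guess at the relevant ones; I set the value in puppet.conf on both sides):

    # puppet.conf on the 4.3 server
    [main]
    http_keepalive_timeout = 30m

    # puppet.conf on the agents (3.8.6 and 4.3)
    [agent]
    http_keepalive_timeout = 29m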