Re: "was not able to copy file" - critical corruption during cf-agent run

2010-11-10 Thread Mark Burgess
Can you prove that Cfengine corrupted the file? And can you show us the result of the corruption? Without this, the speculations about bugs in Cfengine are merely speculations.
M
On 11/09/2010 09:24 PM, Frans Lawaetz wrote:
> So you are quite right that there is more to the story. I dug around

Re: "was not able to copy file" - critical corruption during cf-agent run

2010-11-09 Thread Seva Gluschenko
Frans, can you show not just one mycopy promise (which I'm pretty sure is innocent with respect to that issue), but the second promise as well? From my point of view it seems like the second promise broke the file while the first one was simply unable to repair it that unhappy time.
2010/11/9 Fr

Re: "was not able to copy file" - critical corruption during cf-agent run

2010-11-09 Thread Frans Lawaetz
So you are quite right that there is more to the story. I dug around in my bundles and found that there was overlap with respect to this file. A generic "centos_5" promise included update of limits.conf whereas further down in the bundle I had a more specific class "centos_5.special_hosts" which

Re: "was not able to copy file" - critical corruption during cf-agent run

2010-11-09 Thread Frans Lawaetz
Mike, cf-serverd was terminated by pkill during the cron restart of cf3 services. pkill defaults to SIGTERM. I will attempt to reproduce using a test environment and a looping cf-serverd / cf-agent script that sigterms cf-serverd at increasing time intervals after cf-agent executes. Frans
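A rough sketch of the kind of reproduction loop described above: start a process, wait a delay, send SIGTERM, and record how it exited, then repeat with increasing delays. Everything here is an assumption for illustration; `run_and_terminate`, `sweep`, and `STAND_IN` are hypothetical names, and a sleeping Python child stands in for cf-serverd so the loop can be tried without touching a real Cfengine install. It assumes a POSIX system.

```python
import signal
import subprocess
import sys
import time


def run_and_terminate(cmd, delay):
    """Start cmd, wait `delay` seconds, send SIGTERM, return the exit code.

    On POSIX a negative return code -N means the child was killed by signal N.
    """
    proc = subprocess.Popen(cmd)
    time.sleep(delay)
    proc.send_signal(signal.SIGTERM)
    return proc.wait(timeout=10)


def sweep(cmd, delays):
    """Repeat the run once per delay and collect (delay, exit_code) pairs."""
    return [(delay, run_and_terminate(cmd, delay)) for delay in delays]


# Hypothetical stand-in for the daemon under test: a child that just sleeps.
STAND_IN = [sys.executable, "-c", "import time; time.sleep(30)"]

if __name__ == "__main__":
    for delay, code in sweep(STAND_IN, [0.1, 0.2, 0.4]):
        print(f"delay={delay}s exit_code={code}")
```

In a real test the stand-in command would be replaced by the server invocation, and each iteration would also trigger a client copy and compare the copied file against its source.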

Re: "was not able to copy file" - critical corruption during cf-agent run

2010-11-09 Thread Seva Gluschenko
I'm pretty sure there's some uncovered detail behind that issue. Usually that detail hides in promises. I understand that one rarely feels happy about showing their promises to the public since they can tell too much about the system, but this is a case where an exact snapshot of the promises would really s

Re: "was not able to copy file" - critical corruption during cf-agent run

2010-11-09 Thread Mike Hoskins
Apologies for email without enough coffee -- Please s/cf-execd/cf-serverd/g in my comments. I'd be mostly curious if this is easily reproducible, or simply an edge case that needs to be identified.
On 11/9/10 9:16 AM, "Mike Hoskins" wrote:
> If he'd kill -9'd cf-execd, I'd expect corruption. Since t

Re: "was not able to copy file" - critical corruption during cf-agent run

2010-11-09 Thread Mike Hoskins
If he'd kill -9'd cf-execd, I'd expect corruption. Since the output he pasted looked like it was a signal 15, I would have expected it to be caught and cleaned up after (e.g. finish in-progress transfers). Further, the cf2 behavior of copying to a temp file and then moving into place does still w
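For readers unfamiliar with the "copy to a temp file, then move into place" pattern mentioned above, here is a minimal sketch of the general technique. It is not Cfengine's code; `copy_atomically` and the `.tmp-copy-` prefix are hypothetical names chosen for the example. The idea is that the destination only ever holds the complete old file or the complete new file, because the final step is a rename within the same directory.

```python
import os
import shutil
import tempfile


def copy_atomically(src, dst):
    """Copy src to dst via a temp file in dst's directory, then rename.

    The temp file lives next to dst so the final rename stays on one
    filesystem. If the copy is interrupted, dst is left untouched.
    """
    dst_dir = os.path.dirname(os.path.abspath(dst))
    fd, tmp_path = tempfile.mkstemp(dir=dst_dir, prefix=".tmp-copy-")
    try:
        with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
            shutil.copyfileobj(inp, out)
            out.flush()
            os.fsync(out.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, dst)   # swap the finished copy into place
    except BaseException:
        # Remove the partial temp file; the original dst is still intact.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this shape, a transfer cut short by a signal leaves at worst a stray temp file, not a truncated destination.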