Well, Cfengine does indeed report that a file was corrupted in transfer.
It doesn't report replacing the old file, though, so don't take the
message too seriously.
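
For illustration, the copy-then-rename mechanism discussed in this thread
(copy to <filename>.cfnew, verify, then move into place) could be sketched
roughly as below. This is a Python sketch, not Cfengine's actual code: the
function name safe_copy and the size-only check are assumptions, loosely
modeled on the "(dest 0 and src 1844)" message quoted further down.

```python
import os
import shutil


def safe_copy(src, dest, suffix=".cfnew"):
    """Copy src to dest via a staging file so dest is never half-written.

    Hypothetical helper for illustration; not part of Cfengine.
    """
    staging = dest + suffix
    try:
        # Step 1: copy next to the target, never onto the target itself.
        shutil.copyfile(src, staging)

        # Step 2: verify the staged copy before touching dest.
        src_size = os.path.getsize(src)
        new_size = os.path.getsize(staging)
        if new_size != src_size:
            raise IOError(
                "new file %s seems to have been corrupted in transit "
                "(dest %d and src %d), aborting" % (staging, new_size, src_size)
            )

        # Step 3: rename into place; on POSIX this is atomic when both
        # paths are on the same filesystem.
        os.replace(staging, dest)
    except Exception:
        # On any failure, drop the staging file and leave the old dest as is.
        if os.path.exists(staging):
            os.remove(staging)
        raise
```

Under these assumptions an interrupted transfer fails at step 1 or 2, so
the old file would stay in place and only the staging copy is discarded.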

2010/11/9 Bas van der Vlies <b...@sara.nl>:
>
> On 9 nov 2010, at 10:22, Seva Gluschenko wrote:
>
>> Of course, Cfengine3 acts the same way: the file never gets installed
>> directly in place of an older file.
>>
>
> Maybe I am misreading the info, but the mail is about a file getting
> corrupted during a copy.  To my knowledge this should not happen when this
> mechanism is used.
>
>> 2010/11/9 Bas van der Vlies <b...@sara.nl>:
>>>
>>> On 9 nov 2010, at 08:37, Seva Gluschenko wrote:
>>>
>>>> Frans,
>>>>
>>>> since you're terminating cf-serverd in the middle of a file transfer,
>>>> the receiving agent reasonably treats it as a corruption. There's
>>>> nothing wrong with that. On the other hand, why terminate cf-serverd
>>>> when you just need to restart cf-execd? Modify your promise and feel
>>>> safe.
>>>>
>>>
>>> I thought cfengine has some logic for transferring files (I think this is
>>> cfengine2 style; I did not check it for cfengine3):
>>>  * first copy it to <filename>.cfnew
>>>  * if this succeeds and the copy is correct, move it to <filename>
>>>
>>> This is to avoid corruption like this. A server can crash, and you don't
>>> want the clients to suffer from it by ending up with corrupted files.
>>>
>>>
>>>> 2010/11/8 Frans Lawaetz <fr...@broadinstitute.org>:
>>>>> Hi-
>>>>>
>>>>> I recently implemented a "service cfengine3 restart" weekly cron job as a
>>>>> workaround to the MAX_FD bug that others and I have seen.  I neglected
>>>>> to exclude the master from the restart, so when cf-serverd was killed a
>>>>> number of hosts complained about in-flight transfers or not being able to
>>>>> reach the master.  This is quite reasonable; however, I found one host
>>>>> that suffered a complete loss or corruption of its limits.conf file.  It
>>>>> essentially bricked the system, requiring a rebuild.
>>>>>
>>>>> Here is the sequence:
>>>>>
>>>>> cron job restarts cf3.  cf3 reports to syslog:
>>>>>
>>>>> Nov  7 04:23:06 cfengine3 cf-serverd[14585]:  Received signal 15 (SIGTERM) while doing [lock.independent.server_cfengine.-cfengine3.the_server_daemon_2542_MD5=5b2c904169606aa9b27ec369fd13e016]
>>>>> Nov  7 04:23:06 cfengine3 cf-serverd[14585]:  Received signal 15 (SIGTERM) while doing [lock.independent.server_cfengine.-cfengine3.the_server_daemon_2542_MD5=5b2c904169606aa9b27ec369fd13e016]
>>>>> Nov  7 04:23:06 cfengine3 cf-serverd[14585]:  Logical start time Fri Oct 29 04:41:01 2010
>>>>> Nov  7 04:23:06 cfengine3 cf-serverd[14585]:  This sub-task started really at Thu Oct 28 12:28:34 2010
>>>>> Nov  7 04:23:06 cfengine3 cf-serverd[14585]:  Logical start time Thu Oct 28 12:28:34 2010
>>>>> Nov  7 04:23:06 cfengine3 cf-serverd[14585]:  This sub-task started really at Thu Oct 28 12:28:34 2010
>>>>> Nov  7 04:23:06 cfengine3 cf-serverd[14585]:  Received signal 15 (SIGTERM) while doing [lock.independent.server_cfengine.-cfengine3.the_server_daemon_2542_MD5=5b2c904169606aa9b27ec369fd13e016]
>>>>> Nov  7 04:23:06 cfengine3 cf-serverd[14585]:  Logical start time Fri Oct 29 04:41:01 2010
>>>>> Nov  7 04:23:06 cfengine3 cf-serverd[14585]:  This sub-task started really at Thu Oct 28 12:28:34 2010
>>>>>
>>>>>
>>>>> cf3 on the client host emailed me at approximately the time of the restart
>>>>> that it failed to copy limits.conf
>>>>>
>>>>>
>>>>> date: Sun, Nov 7, 2010 at 4:23 AM
>>>>> subject: community [hap10.broadinstitute.org/192.168.32.34]
>>>>>
>>>>> Was not able to copy /cfengine/farm/etc/security/limits.conf.crdwga to
>>>>> /etc/security/limits.conf
>>>>> I: Made in version 'not specified' of '/var/cfengine/inputs/farm.cf' near
>>>>> line 279
>>>>>
>>>>>
>>>>> I have noticed similar failures on other hosts before, but cf3
>>>>> usually makes a note that it aborted the transaction:
>>>>>
>>>>> !! New file /etc/security/limits.conf.cfnew seems to have been corrupted in transit (dest 0 and src 1844), aborting!
>>>>> Was not able to copy /cfengine/farm/etc/security/limits.conf to
>>>>> /etc/security/limits.conf
>>>>>
>>>>> Immediately after the failure on the host in question it started reporting
>>>>> over the network that limits.conf was corrupt.
>>>>>
>>>>> Nov  7 04:23:02 hap10 crond[13650]: pam_limits(crond:session): cannot read settings from /etc/security/limits.conf: No such file or directory
>>>>> Nov  7 04:23:02 hap10 crond[13650]: pam_limits(crond:session): error parsing the configuration file: '/etc/security/limits.conf'
>>>>>
>>>>> I was of course unable to log in to the system to investigate further, so
>>>>> I rebuilt it.
>>>>>
>>>>> I've since excluded the master from the weekly restart, but I am alarmed
>>>>> that there is a use case where cf-agent can corrupt a file.  Any ideas on
>>>>> how this might have happened and whether there are any added safeguards
>>>>> that can be put in place?
>>>>>
>>>>> The client is running cfengine3-community 3.0.5 and the master is running
>>>>> 3.1.0b2.  Both are on CentOS5.5 x86_64.
>>>>>
>>>>> Thanks,
>>>>> Frans
>>>>>
>>>>> _______________________________________________
>>>>> Help-cfengine mailing list
>>>>> Help-cfengine@cfengine.org
>>>>> https://cfengine.org/mailman/listinfo/help-cfengine
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> SY, Seva Gluschenko.
>>>
>>> --
>>> Bas van der Vlies
>>> b...@sara.nl
>>>
>>>
>>>
>>>
>>
>>
>>
>> --
>> SY, Seva Gluschenko.
>
> --
> Bas van der Vlies
> b...@sara.nl
>
>
>
>



-- 
SY, Seva Gluschenko.