On 9 nov 2010, at 10:53, Seva Gluschenko wrote:

> Well, Cfengine does indeed report that a file was corrupted in transfer.
> It doesn't report replacing the old file, though, so don't
> treat the message that seriously.
> 

? It was not possible to log in to the system due to a corrupted file.

> 2010/11/9 Bas van der Vlies <b...@sara.nl>:
>> 
>> On 9 nov 2010, at 10:22, Seva Gluschenko wrote:
>> 
>>> Of course, Cfengine3 acts the same way: the file never gets installed
>>> directly in place of an older file.
>>> 
>> 
>> Maybe I am misreading the info, but the mail is about a file getting
>> corrupted during a copy. To my knowledge this should not happen when this
>> mechanism is used.
>> 
>>> 2010/11/9 Bas van der Vlies <b...@sara.nl>:
>>>> 
>>>> On 9 nov 2010, at 08:37, Seva Gluschenko wrote:
>>>> 
>>>>> Frans,
>>>>> 
>>>>> since you're terminating cf-serverd in the middle of a file transfer,
>>>>> the receiving agent reasonably treats it as a corruption. There's
>>>>> nothing wrong with it. On the other hand, why terminate cf-serverd
>>>>> when you just need to restart cf-execd? Modify your promise and feel
>>>>> safe.
>>>>> 
>>>> 
>>>> I thought cfengine has some logic for transferring files (I think this is
>>>> cfengine2 style; I did not check it for cfengine3):
>>>>  * first copy it to <filename>.cfnew
>>>>  * if this succeeds and the copy is correct, move it to <filename>
>>>> 
>>>> This is to avoid corruption like this. A server can crash, and you don't
>>>> want the clients to suffer from files that are corrupted.
>>>> 
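The two steps above can be sketched roughly as follows. This is only a minimal illustration of the stage-then-rename idea in Python, not Cfengine's actual code: the function name install_via_staging and its parameters are made up for the example, and only the .cfnew suffix comes from the thread.

```python
import os


def install_via_staging(data: bytes, dest: str, suffix: str = ".cfnew") -> None:
    """Write data to dest + suffix first, then rename it over dest.

    Hypothetical helper for illustration; not part of Cfengine.
    """
    staging = dest + suffix
    try:
        with open(staging, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())  # push the bytes to disk before renaming
        # On POSIX a rename within one filesystem is atomic, so readers see
        # either the complete old file or the complete new file.
        os.replace(staging, dest)
    except BaseException:
        # Any failure leaves the existing dest untouched; drop the partial copy.
        if os.path.exists(staging):
            os.remove(staging)
        raise
```

With this pattern an interrupted transfer should at worst leave a stray staging file behind, while the file at the destination path stays as it was.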
>>>> 
>>>>> 2010/11/8 Frans Lawaetz <fr...@broadinstitute.org>:
>>>>>> Hi-
>>>>>> 
>>>>>> I recently implemented a "service cfengine3 restart" weekly cron job as a
>>>>>> workaround to the MAX_FD bug that others and myself have seen.  I neglected
>>>>>> to except the master from the restart so when cf-serverd was killed a number
>>>>>> of hosts complained about in-flight transfers or not being able to reach the
>>>>>> master.  This is quite reasonable however I found one host that suffered a
>>>>>> complete loss or corruption of its limits.conf file.  It essentially bricked
>>>>>> the system, requiring a rebuild.
>>>>>> 
>>>>>> Here is the sequence:
>>>>>> 
>>>>>> cron job restarts cf3.  cf3 reports to syslog:
>>>>>> 
>>>>>> Nov  7 04:23:06 cfengine3 cf-serverd[14585]:  Received signal 15 (SIGTERM) while doing [lock.independent.server_cfengine.-cfengine3.the_server_daemon_2542_MD5=5b2c904169606aa9b27ec369fd13e016]
>>>>>> Nov  7 04:23:06 cfengine3 cf-serverd[14585]:  Received signal 15 (SIGTERM) while doing [lock.independent.server_cfengine.-cfengine3.the_server_daemon_2542_MD5=5b2c904169606aa9b27ec369fd13e016]
>>>>>> Nov  7 04:23:06 cfengine3 cf-serverd[14585]:  Logical start time Fri Oct 29 04:41:01 2010
>>>>>> Nov  7 04:23:06 cfengine3 cf-serverd[14585]:  This sub-task started really at Thu Oct 28 12:28:34 2010
>>>>>> Nov  7 04:23:06 cfengine3 cf-serverd[14585]:  Logical start time Thu Oct 28 12:28:34 2010
>>>>>> Nov  7 04:23:06 cfengine3 cf-serverd[14585]:  This sub-task started really at Thu Oct 28 12:28:34 2010
>>>>>> Nov  7 04:23:06 cfengine3 cf-serverd[14585]:  Received signal 15 (SIGTERM) while doing [lock.independent.server_cfengine.-cfengine3.the_server_daemon_2542_MD5=5b2c904169606aa9b27ec369fd13e016]
>>>>>> Nov  7 04:23:06 cfengine3 cf-serverd[14585]:  Logical start time Fri Oct 29 04:41:01 2010
>>>>>> Nov  7 04:23:06 cfengine3 cf-serverd[14585]:  This sub-task started really at Thu Oct 28 12:28:34 2010
>>>>>> 
>>>>>> 
>>>>>> cf3 on the client host emailed me at approximately the time of the restart
>>>>>> that it failed to copy limits.conf
>>>>>> 
>>>>>> 
>>>>>> date: Sun, Nov 7, 2010 at 4:23 AM
>>>>>> subject: community [hap10.broadinstitute.org/192.168.32.34]
>>>>>> 
>>>>>> Was not able to copy /cfengine/farm/etc/security/limits.conf.crdwga to
>>>>>> /etc/security/limits.conf
>>>>>> I: Made in version 'not specified' of '/var/cfengine/inputs/farm.cf' near
>>>>>> line 279
>>>>>> 
>>>>>> 
>>>>>> I have noticed other similar such failures on other hosts before but cf3
>>>>>> usually makes a note that it aborted the transaction:
>>>>>> 
>>>>>> !! New file /etc/security/limits.conf.cfnew seems to have been corrupted in
>>>>>> transit (dest 0 and src 1844), aborting!
>>>>>> Was not able to copy /cfengine/farm/etc/security/limits.conf to
>>>>>> /etc/security/limits.conf
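The "(dest 0 and src 1844)" part of that message reads like a size comparison between the staged copy and the source. A rough Python sketch of such a guard follows, assuming the check is a plain byte-count comparison; the helper name promote_if_complete and its arguments are hypothetical and not taken from Cfengine.

```python
import os


def promote_if_complete(staged: str, dest: str, src_size: int) -> bool:
    """Rename staged over dest only if staged has exactly src_size bytes.

    Hypothetical helper for illustration; returns True when dest was replaced.
    """
    staged_size = os.path.getsize(staged) if os.path.exists(staged) else 0
    if staged_size != src_size:
        # Size mismatch: treat the copy as corrupted in transit and abort,
        # removing the partial file and leaving dest exactly as it was.
        if os.path.exists(staged):
            os.remove(staged)
        return False
    os.replace(staged, dest)
    return True
```

Under these assumptions a truncated transfer (0 bytes staged against 1844 expected) is rejected and the previous limits.conf stays in place.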
>>>>>> 
>>>>>> Immediately after the failure on the host in question it started reporting
>>>>>> over the network that limits.conf was corrupt.
>>>>>> 
>>>>>> Nov  7 04:23:02 hap10 crond[13650]: pam_limits(crond:session): cannot read settings from /etc/security/limits.conf: No such file or directory
>>>>>> Nov  7 04:23:02 hap10 crond[13650]: pam_limits(crond:session): error parsing the configuration file: '/etc/security/limits.conf'
>>>>>> 
>>>>>> I was of course unable to log in to the system to investigate further, so
>>>>>> I rebuilt it.
>>>>>> 
>>>>>> I've since excepted the master from the weekly restart but I am alarmed that
>>>>>> there is a use case where cf-agent can corrupt a file.  Any ideas on how
>>>>>> this might have happened and whether there are any added safeguards that can
>>>>>> be put in place?
>>>>>> 
>>>>>> The client is running cfengine3-community 3.0.5 and the master is running
>>>>>> 3.1.0b2.  Both are on CentOS5.5 x86_64.
>>>>>> 
>>>>>> Thanks,
>>>>>> Frans
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> _______________________________________________
>>>>>> Help-cfengine mailing list
>>>>>> Help-cfengine@cfengine.org
>>>>>> https://cfengine.org/mailman/listinfo/help-cfengine
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> SY, Seva Gluschenko.
>>>> 
>>>> --
>>>> Bas van der Vlies
>>>> b...@sara.nl
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> SY, Seva Gluschenko.
>> 
>> --
>> Bas van der Vlies
>> b...@sara.nl
>> 
>> 
>> 
>> 
> 
> 
> 
> -- 
> SY, Seva Gluschenko.

--
Bas van der Vlies
b...@sara.nl



