Ah, cool. I see your changes in SVN. I was going about it the hard way. I was 
writing a patch that replaces malloc on systems that don't have a GNU libc 
style malloc (i.e. one where malloc(0) returns a real pointer). I have seen 
this done in some GNU tools (like patch). I'll compile svn 675 and let you know. 

Thanks!

cat malloc.c:

#if HAVE_CONFIG_H
# include <config.h>
#endif
#undef malloc

#include <sys/types.h>

void *malloc ();

/* Allocate an N-byte block of memory from the heap.
   If N is zero, allocate a 1-byte block.  */

void *
rpl_malloc (size_t n)
{
  if (n == 0)
    n = 1;
  return malloc (n);
}
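
For what it's worth, here is a rough standalone sketch of the same idea applied at a call site like the one in client_protocol.c. The names alloc_nonzero and alloc_for_transaction are made up for illustration and are not part of cfengine. The only point is that rounding a zero-length request up to one byte makes a NULL return mean a real allocation failure, whatever the platform's malloc(0) does.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not from cfengine): same idea as rpl_malloc,
   but under its own name so it can sit next to the system malloc. */
static void *alloc_nonzero(size_t n)
{
    if (n == 0)
        n = 1;              /* never ask libc for 0 bytes */
    return malloc(n);
}

/* Hypothetical stand-in for the call site: returns a buffer sized for
   the received length, or NULL if the length is negative or the
   allocation really failed. The caller frees the result. */
static char *alloc_for_transaction(int encrypted_len)
{
    if (encrypted_len < 0)
        return NULL;        /* illegal length, nothing to allocate */
    return alloc_nonzero((size_t) encrypted_len);
}
```

With something like this, a zero-length reply would no longer trip the NULL check on AIX. Whether a zero-length challenge is valid in the protocol at all is a separate question.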

On Dec 9, 2009, at 2:30 PM, Mark Burgess wrote:

> 
> Matt, could you try svn and see if this helps please.
> 
> Matt Richards wrote:
>> I am a little closer now. I added some debugging information in 
>> client_protocol.c (~line 338):
>> 
>> Debug("Receive counter challenge from server\n");
>> 
>> /* proposition S3 */
>> memset(in,0,CF_BUFSIZE);
>> encrypted_len = ReceiveTransaction(conn->sd,in,NULL);
>> 
>> if (encrypted_len < 0)
>>   {
>>   CfOut(cf_error,"","Protocol transaction sent illegal cipher length");
>>   return false;
>>   }
>> 
>> if ((decrypted_cchall = malloc(encrypted_len)) == NULL)
>>   {
>>   snprintf(MATT_MESS,CF_BUFSIZE,"memory failure TWO:encrypted_len:%d",encrypted_len);
>>   FatalError(MATT_MESS);
>>   }
>> 
>> cf-agent dies with FatalError:
>> Fatal cfengine error: memory failure TWO:encrypted_len:0
>> 
>> 
>> It appears that encrypted_len is indeed zero on the challenge response 
>> to the policy host. On AIX, malloc(0) returns NULL, which in turn 
>> triggers the fatal memory error in cf-agent. From the timestamps, the 
>> client fails first, then cf-serverd on the policy host core dumps two 
>> seconds later. 
>> 
>> I don't know enough about the SSL communication between client and host, so 
>> I need a little help here. Is it possible for an encrypted length to be 
>> zero?
>> 
>> 
>> 
>> On Dec 7, 2009, at 8:55 AM, Mark Burgess wrote:
>> 
>>> Perhaps you have access to some fancy tools, like purify, insight etc
>>> that might help debug this. It sounds like some kind of heap corruption.
>>> 
>>> M
>>> 
>>> Matt Richards wrote:
>>>> Well, I hate to say this, but I am still having this problem (svn 657
>>>> now). However, I am getting closer. When cf-serverd core dumps, I get
>>>> a corresponding "Fatal cfengine error: memory failure" on the client.
>>>> I am not sure which one dies first, but I am guessing the client
>>>> (cf-agent). I don't understand why it would get a memory failure, the
>>>> code is just doing a regular malloc, and the machines (random, never
>>>> the same one twice) in question have plenty of memory. I will dig
>>>> (pulling my socks up) more, but it is just odd.
>> _______________________________________________
>> Help-cfengine mailing list
>> Help-cfengine@cfengine.org
>> https://cfengine.org/mailman/listinfo/help-cfengine
> 
> -- 
> Mark Burgess
> 
> -------------------------------------------------
> Professor of Network and System Administration
> Oslo University College, Norway
> 
> Personal Web: http://www.iu.hio.no/~mark
> Office Telf : +47 22453272
> -------------------------------------------------

