Ah, cool. I see your changes in SVN. I was going about it the hard way: writing a patch that replaces malloc on systems that don't have a GNU libc-style malloc (i.e. one where malloc(0) returns a real pointer rather than NULL). I have seen this done in some GNU tools (like patch). I'll compile svn 675 and let ya know.
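
A quick way to check how a given platform's malloc treats a zero-byte request is a probe along these lines. This is only a sketch: the function name malloc_zero_returns_null is made up for illustration and is not part of cfengine or autoconf. The C standard allows either behaviour, so both results are legal.

```c
#include <stdlib.h>

/* Hypothetical probe (not from cfengine or autoconf).
   Returns 1 if malloc(0) yields NULL on this platform,
   0 if it yields a real pointer that can be passed to free(). */
int malloc_zero_returns_null(void)
{
    void *p = malloc(0);

    if (p == NULL)
        return 1;   /* zero-byte requests come back as NULL */

    free(p);        /* a non-NULL result must still be freed */
    return 0;       /* GNU libc-style behaviour */
}
```

Going by this thread, one would expect 1 on the AIX machines and 0 on a GNU libc system, but that is an expectation to verify by calling the function from a small main() and printing the result, not a measured outcome.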
Thanks!

cat malloc.c:

#if HAVE_CONFIG_H
# include <config.h>
#endif
#undef malloc

#include <sys/types.h>

char *malloc ();

/* Allocate an N-byte block of memory from the heap.
   If N is zero, allocate a 1-byte block.  */

char *
rpl_malloc (size_t n)
{
  if (n == 0)
    n = 1;
  return malloc (n);
}

On Dec 9, 2009, at 2:30 PM, Mark Burgess wrote:

>
> Matt, could you try svn and see if this helps please.
>
> Matt Richards wrote:
>> I am a little closer now. I added some debugging information in
>> client_protocol.c (~line 338):
>>
>> Debug("Receive counter challenge from server\n");
>>
>> /* proposition S3 */
>> memset(in,0,CF_BUFSIZE);
>> encrypted_len = ReceiveTransaction(conn->sd,in,NULL);
>>
>> if (encrypted_len < 0)
>>    {
>>    CfOut(cf_error,"","Protocol transaction sent illegal cipher length");
>>    return false;
>>    }
>>
>> if ((decrypted_cchall = malloc(encrypted_len)) == NULL)
>>    {
>>    snprintf(MATT_MESS,CF_BUFSIZE,"memory failure TWO:encrypted_len:%d",encrypted_len);
>>    FatalError(MATT_MESS);
>>    }
>>
>> cf-agent dies with FatalError:
>> Fatal cfengine error: memory failure TWO:encrypted_len:0
>>
>>
>> It appears that the encrypted_len is indeed zero on the challenge response
>> to the policy host. On AIX, that will result in a NULL malloc - which in
>> turn fatals with a memory error in cf-agent. From the timestamps, the client
>> who fails first, then cf-serverd on the policy host core dumps two seconds
>> later.
>>
>> I don't know enough about the SSL communication between client and host, so
>> I need a little help here. Is it possible that a encrypted length can be
>> zero?
>>
>>
>>
>> On Dec 7, 2009, at 8:55 AM, Mark Burgess wrote:
>>
>>> Perhaps you have access to some fancy tools, like purify, insight etc
>>> that might help debug this. It sounds like some kind of heap corruption.
>>>
>>> M
>>>
>>> Matt Richards wrote:
>>>> Well, I hate to say this, but I am still having this problem (svn 657
>>>> now). However, I am getting closer. When cf-serverd core dumps, I get
>>>> a corresponding "Fatal cfengine error: memory failure" on the client.
>>>> I am not sure which one dies first, but I am guessing the client
>>>> (cf-agent). I don't understand why it would get a memory failure, the
>>>> code is just doing a regular malloc, and the machines (random, never
>>>> the same one twice) in question have plenty of memory. I will dig
>>>> (pulling my soxs up) more, but it is just odd.
>> _______________________________________________
>> Help-cfengine mailing list
>> Help-cfengine@cfengine.org
>> https://cfengine.org/mailman/listinfo/help-cfengine
>
> --
> Mark Burgess
>
> -------------------------------------------------
> Professor of Network and System Administration
> Oslo University College, Norway
>
> Personal Web: http://www.iu.hio.no/~mark
> Office Telf : +47 22453272
> -------------------------------------------------

_______________________________________________
Help-cfengine mailing list
Help-cfengine@cfengine.org
https://cfengine.org/mailman/listinfo/help-cfengine