Hi,

I had another look at the segmentation fault in my application. What I
observe is that OPENSSL_malloc returns NULL in EVP_DigestInit_ex in
digest.c:

    if (ctx->digest != type)
        {
        if (ctx->digest && ctx->digest->ctx_size)
            OPENSSL_free(ctx->md_data);
        ctx->digest=type;
        if (type->ctx_size)
            {
            ctx->md_data=OPENSSL_malloc(type->ctx_size);
            if (ctx->md_data == 0L)
                {
                printf("OPENSSL_malloc has returned NULL!!\n");
                }
            }
        }

The "OPENSSL_malloc has returned NULL!!" message does get printed. The
function later returns ctx->digest->init(ctx), and init in turn returns
SHA1_Init(ctx->md_data), which is where the application faults, as shown
in the stack trace below:

> #0 SHA1_Init (c=0x0) at sha_locl.h:150
> #1 0x405b2bb0 in init (ctx=0x0) at m_sha1.c:72
> #2 0x405afc91 in EVP_DigestInit_ex (ctx=0x4d606230,
> type=0x4061f620, impl=0x0) at digest.c:207
> #3 0x405ac08e in ssleay_rand_add (buf=0x0, num=0,
> add= 2.5863007356866632e-306) at md_rand.c:263
> #4 0x405ace6e in RAND_add (buf=0x8a269f8,
> num=144861688, entropy=0) at rand_lib.c:151

Before running the application I set:

    ulimit -s unlimited

And while creating each thread, I set the stack size with:

    pthread_attr_setstacksize(&attr, 1024*1536);

Is synchronization or a memory constraint the reason OPENSSL_malloc
returns NULL when 1000 threads are active?

Thanks,
Prabhu. S

On Oct 20, 2007 12:27 AM, Prabhu S <[EMAIL PROTECTED]> wrote:

> SHA_CTX *c is getting corrupted. GDB indicated ctx=0x0 in init(). However,
> that was not the case.
>
> static int init(EVP_MD_CTX *ctx)
> {
>     if(ctx != 0L)
>     {
>         return SHA1_Init(ctx->md_data);
>     }
>     else
>     {
>         printf("ctx is NULL\n"); // Never seen, though the stack trace says so.
>         while(1)
>         {
>             sleep(1);
>         }
>         return SHA1_Init(ctx->md_data);
>     }
> }
>
> int HASH_INIT (SHA_CTX *c)
> {
>     if(c == 0L)
>     {
>         printf("ctx is NULL -> SHA1_Init \n"); // This gets printed, app crashes and rightly so.
>     }
>     c->h0=INIT_DATA_h0;
>     c->h1=INIT_DATA_h1;
>     c->h2=INIT_DATA_h2;
>     c->h3=INIT_DATA_h3;
>     c->h4=INIT_DATA_h4;
>     c->Nl=0;
>     c->Nh=0;
>     c->num=0;
>     return 1;
> }
>
> Thanks,
> Prabhu. S
>
> On 10/18/07, Prabhu S <[EMAIL PROTECTED]> wrote:
> >
> > At times the following traces are obtained as well:
> >
> > (gdb) bt
> > #0 MD5_Init (c=0x0) at md5_dgst.c:75
> > #1 0x405b2a90 in init (ctx=0x0) at m_md5.c:73
> > #2 0x405afc91 in EVP_DigestInit_ex (ctx=0x8e29b44, type=0x4061f560,
> > impl=0x0) at digest.c:207
> > #3 0x403819f5 in ssl3_init_finished_mac (s=0x8e298c8) at s3_enc.c:521
> > #4 0x4037d0bc in ssl3_connect (s=0x8e298c8) at s3_clnt.c:232
> > #5 0x4038feb8 in SSL_connect (s=0x8e298c8) at ssl_lib.c:850
> > (gdb)
> >
> > And:
> >
> > #0 X509_VERIFY_PARAM_new () at x509_vpm.c:91
> > 91 x509_vpm.c: No such file or directory.
> > in x509_vpm.c
> > (gdb) bt
> > #0 X509_VERIFY_PARAM_new () at x509_vpm.c:91
> > #1 0x4038d978 in SSL_new (ctx=0x42f44448) at ssl_lib.c:297
> > #2 0x00000000 in ?? ()
> > (gdb)
> >
> > And:
> >
> > #0 SHA1_Init (c=0x0) at sha_locl.h:150
> > 150 sha_locl.h: No such file or directory.
> > in sha_locl.h
> > (gdb) bt
> > #0 SHA1_Init (c=0x0) at sha_locl.h:150
> > #1 0x405b2bb0 in init (ctx=0x0) at m_sha1.c:72
> > #2 0x405afc91 in EVP_DigestInit_ex (ctx=0x8fb2ef4, type=0x4061f620,
> > impl=0x0) at digest.c:207
> > #3 0x40381a15 in ssl3_init_finished_mac (s=0x8fad288) at s3_enc.c:522
> > #4 0x4037d0bc in ssl3_connect (s=0x8fad288) at s3_clnt.c:232
> > #5 0x4038feb8 in SSL_connect (s=0x8fad288) at ssl_lib.c:850
> >
> > On 10/18/07, Prabhu S <[EMAIL PROTECTED]> wrote:
> > >
> > > David,
> > >
> > > The OpenSSL version that I use is openssl-0.9.8e. Your guess about
> > > the methods being called is right. It appears to be stack corruption.
> > >
> > > Gayathri,
> > >
> > > I don't suspect gdb. I checked the CTX status in HASH_INIT
> > > (SHA_CTX *c) under stress; 'c' was indeed NULL and the application
> > > immediately dumped core.
> > >
> > > Regards,
> > > Prabhu. S
> > >
> > > On 10/18/07, Gayathri S <[EMAIL PROTECTED]> wrote:
> > > >
> > > > The stack trace showing a null sha1 transform kind of caught my
> > > > attention here. I wouldn't go by the GDB call trace because it's
> > > > obviously a memory leak and the gdb stack could have been corrupted.
> > > > Many times I see 0x0 in the frames, but when you actually try to
> > > > print the ctx address it is valid. CTX is definitely valid here.
> > > >
> > > > Prabhu, earlier I was assuming you were using the Linux sha1 in the
> > > > kernel, which is a loadable module, and I realise you're just using
> > > > plain openssl from userspace and linking with libcrypto. Linux sha1
> > > > has a limitation on the sha1_tfm structure; perhaps libcrypto sha1
> > > > is the same way?
> > > >
> > > > It's obvious that you have run out of sha1_tfms, which is why it
> > > > helps when you actually sleep, as other threads would have released
> > > > theirs.
> > > >
> > > > If you don't mind sending your client code snippet, I could debug.
> > > > My email id would be [EMAIL PROTECTED]
> > > >
> > > > Thanks
> > > > --Gayathri
> > > >
> > > > > Even reducing the thread stack size didn't help.
> > > > > I observe that the thread creation as such is not a problem. I
> > > > > create about 1000 threads and delay the SSL_connect in each thread
> > > > > for about 10 sec.
> > > > > Once the delay expires and each client makes connections to the
> > > > > server, the seg fault occurs.
> > > >
> > > > You know, looking back at your original trace, it seems I may have
> > > > jumped to conclusions.
> > > > It's hard to be sure because I don't know what OpenSSL version you
> > > > are using, so the line numbers don't tell me anything, but check
> > > > this out:
> > > >
> > > > > #0 SHA1_Init (c=0x0) at sha_locl.h:150
> > > > > #1 0x405b2bb0 in init (ctx=0x0) at m_sha1.c:72
> > > > > #2 0x405afc91 in EVP_DigestInit_ex (ctx=0x4d606230, type=0x4061f620,
> > > > > impl=0x0) at digest.c:207
> > > > > #3 0x405ac08e in ssleay_rand_add (buf=0x0, num=0, add=
> > > > > 2.5863007356866632e-306) at md_rand.c:263
> > > > > #4 0x405ace6e in RAND_add (buf=0x8a269f8, num=144861688, entropy=0)
> > > > > at rand_lib.c:151
> > > >
> > > > I'm guessing frame #2 is this:
> > > >
> > > > return ctx->digest->init(ctx);
> > > >
> > > > Which calls this:
> > > >
> > > > static int init(EVP_MD_CTX *ctx)
> > > > { return SHA1_Init(ctx->md_data); }
> > > >
> > > > Notice that 'init' was called with a NULL context. But the context
> > > > cannot have been NULL in frame #2, because if it were,
> > > > ctx->digest would have faulted.
> > > > So it looks like the stack in frame #2 cannot have led to the stack
> > > > in frame #1.
> > > >
> > > > This is not a memory exhaustion issue or a failure to check for
> > > > NULL. It looks like stack corruption. The real puzzle is why stack
> > > > corruption would only occur with a large number of threads.
> > > >
> > > > I'm thinking perhaps there's some concurrency issue with
> > > > ssleay_rand_add, but I've been over it twice and I don't see any
> > > > issue. The md context would be unique for each thread, so it should
> > > > be safe.
> > > >
> > > > Maybe someone will read this and it will resonate with something
> > > > they know?
> > > > If you can, please tell us what version of OpenSSL this was.
> > > > This will allow people to understand the line numbers better and
> > > > make sure they're not looking at code that has whatever bit you
> > > > already fixed.
> > > >
> > > > DS
> > > >
> > > > ______________________________________________________________________
> > > > OpenSSL Project                                 http://www.openssl.org
> > > > User Support Mailing List                    openssl-users@openssl.org
> > > > Automated List Manager                           [EMAIL PROTECTED]
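
On the OPENSSL_malloc question at the top of this thread: the digest.c
snippet quoted there prints a message when the allocation fails, but
execution still reaches ctx->digest->init(ctx) with md_data set to NULL,
which would be consistent with the SHA1_Init (c=0x0) frame. A minimal
sketch of the defensive pattern (check the allocation and return failure
instead of calling init) might look like the following. This is not
OpenSSL code: demo_md, demo_ctx, demo_state, demo_digest_init and the
allocator hook are hypothetical stand-ins so that the example compiles on
its own, and it says nothing about why the allocation fails in the first
place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for EVP_MD / EVP_MD_CTX; NOT the OpenSSL types. */
struct demo_ctx;

typedef struct demo_md {
    size_t ctx_size;                    /* bytes of per-digest state */
    int (*init)(struct demo_ctx *ctx);  /* digest-specific initialiser */
} demo_md;

typedef struct demo_ctx {
    const demo_md *digest;
    void *md_data;
} demo_ctx;

/* Hypothetical per-digest state, loosely modelled on the fields in the post. */
typedef struct demo_state {
    unsigned long h0;
    unsigned int num;
} demo_state;

/* Allocator hook (an assumption of this sketch) so failure can be simulated. */
typedef void *(*demo_alloc_fn)(size_t size);

/* Simulates an out-of-memory condition. */
void *demo_failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}

/* Digest initialiser: only safe to call when md_data is non-NULL. */
int demo_state_init(demo_ctx *ctx)
{
    demo_state *s = ctx->md_data;
    s->h0 = 0;
    s->num = 0;
    return 1;
}

/*
 * Same shape as the quoted digest.c fragment, but an allocation failure
 * returns 0 instead of falling through to type->init() with NULL md_data.
 * Returns 1 on success, 0 on failure.
 */
int demo_digest_init(demo_ctx *ctx, const demo_md *type, demo_alloc_fn alloc)
{
    assert(ctx != NULL && type != NULL && alloc != NULL);

    if (ctx->digest != type) {
        if (ctx->digest && ctx->digest->ctx_size)
            free(ctx->md_data);
        ctx->md_data = NULL;
        ctx->digest = type;
        if (type->ctx_size) {
            ctx->md_data = alloc(type->ctx_size);
            if (ctx->md_data == NULL) {
                ctx->digest = NULL;  /* leave ctx empty so a retry reallocates */
                return 0;
            }
        }
    }
    return type->init(ctx);
}
```

A caller would pass malloc as the allocator in normal use and treat a
return value of 0 as "digest not initialised", for example by abandoning
the connection attempt on that thread rather than continuing.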