Re: Segmentation fault in application creating too many threads. - OPENSSL_malloc fails.

Prabhu S Mon, 03 Dec 2007 19:10:26 -0800

Hi,

I took another look at the segmentation fault in my application. What I
observe is that OPENSSL_malloc returns NULL in the 'int EVP_DigestInit_ex'
function in digest.c:


 if (ctx->digest != type)
 {
       if (ctx->digest && ctx->digest->ctx_size)
             OPENSSL_free(ctx->md_data);
       ctx->digest=type;
       if (type->ctx_size)
       {
             ctx->md_data=OPENSSL_malloc(type->ctx_size);
             /* debug check added while investigating the crash */
             if(ctx->md_data == 0L)
             {
                 printf("OPENSSL_malloc has returned NULL!!\n");
             }
       }
 }

I observe that "OPENSSL_malloc has returned NULL!!" does get printed.
The function later returns ctx->digest->init(ctx); init in turn returns
SHA1_Init(ctx->md_data), where the application faults as shown in the
stack trace below:

> #0  SHA1_Init (c=0x0) at sha_locl.h:150
> #1  0x405b2bb0 in init (ctx=0x0) at m_sha1.c:72
> #2  0x405afc91 in EVP_DigestInit_ex (ctx=0x4d606230, type=0x4061f620, impl=0x0) at digest.c:207
> #3  0x405ac08e in ssleay_rand_add (buf=0x0, num=0, add=2.5863007356866632e-306) at md_rand.c:263
> #4  0x405ace6e in RAND_add (buf=0x8a269f8, num=144861688, entropy=0) at rand_lib.c:151
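
For illustration, here is a minimal sketch of the same allocation path that
stops at the failed allocation instead of going on to call init with a NULL
md_data. It is not the OpenSSL code: demo_md, demo_md_ctx, demo_alloc,
failing_alloc, demo_init and demo_digest_init are hypothetical stand-ins so
the failure path can be exercised without the library.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for EVP_MD and EVP_MD_CTX (not the OpenSSL types). */
typedef struct demo_md {
    size_t ctx_size;               /* bytes of per-digest state */
    int (*init)(void *md_data);    /* digest-specific initialiser */
} demo_md;

typedef struct demo_md_ctx {
    const demo_md *digest;
    void *md_data;
} demo_md_ctx;

/* Allocator hook, swappable so that an allocation failure can be simulated. */
static void *(*demo_alloc)(size_t) = malloc;

/* Simulated allocator that always fails. */
static void *failing_alloc(size_t n) { (void)n; return NULL; }

/* Stand-in for SHA1_Init: it writes to the state, so NULL would fault. */
static int demo_init(void *md_data)
{
    *(unsigned char *)md_data = 0;
    return 1;
}

/* Returns 1 on success and 0 if the digest state could not be allocated. */
int demo_digest_init(demo_md_ctx *ctx, const demo_md *type)
{
    if (ctx->digest != type) {
        if (ctx->digest && ctx->digest->ctx_size) {
            free(ctx->md_data);
            ctx->md_data = NULL;
        }
        ctx->digest = type;
        if (type->ctx_size) {
            ctx->md_data = demo_alloc(type->ctx_size);
            if (ctx->md_data == NULL) {
                fprintf(stderr, "allocation of %lu bytes failed\n",
                        (unsigned long)type->ctx_size);
                ctx->digest = NULL;   /* leave the context consistent */
                return 0;             /* do not fall through to init() */
            }
        }
    }
    return ctx->digest->init(ctx->md_data);
}
```

With a check like this the caller sees a failure return rather than a fault
inside the init routine, which at least separates "out of memory" from
"corrupted stack" when reading a trace.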

Before running the application I set: ulimit -s unlimited. And while
creating the threads, I set the stack size as:
    pthread_attr_setstacksize(&attr, 1024*1536);


Are synchronization or memory constraints the issue here, such that
OPENSSL_malloc returns NULL when 1000 threads are active?
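
As a rough way to gauge the memory side of that question, the sketch below
works out how much address space the thread stacks alone reserve. The helper
names are hypothetical, and the 3 GiB user address space is an assumption
(the 0x40... addresses in the trace suggest a 32-bit process, but that is a
guess, not something confirmed here).

```c
#include <stdio.h>

/* Bytes of address space reserved for thread stacks alone. */
unsigned long long stack_reservation_bytes(unsigned long long threads,
                                           unsigned long long stack_bytes)
{
    return threads * stack_bytes;
}

/* Whole mebibytes in a byte count (rounds down). */
unsigned long long bytes_to_mib(unsigned long long bytes)
{
    return bytes / (1024ULL * 1024ULL);
}

/* Print the stack reservation next to an ASSUMED user address-space size. */
void print_stack_budget(unsigned long long threads,
                        unsigned long long stack_bytes,
                        unsigned long long user_space_bytes)
{
    unsigned long long reserved = stack_reservation_bytes(threads, stack_bytes);

    printf("%llu threads x %llu bytes = %llu MiB of stack reservations\n",
           threads, stack_bytes, bytes_to_mib(reserved));
    printf("assumed user address space: %llu MiB\n",
           bytes_to_mib(user_space_bytes));
}
```

Calling print_stack_budget(1000, 1024ULL * 1536ULL, 3ULL << 30) reports
1500 MiB of stack reservations against an assumed 3072 MiB, before counting
the heap, the shared libraries and the per-connection SSL buffers.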

Thanks,
Prabhu. S

On Oct 20, 2007 12:27 AM, Prabhu S <[EMAIL PROTECTED]> wrote:

> SHA_CTX *c is getting corrupted. GDB indicated ctx=0x0 in init(). However,
> that was not the case.
>
> static int init(EVP_MD_CTX *ctx)
> {
>         if(ctx != 0L)
>         {
>           return SHA1_Init(ctx->md_data);
>         }
>         else
>         {
>            /* Never seen, though the stack trace says so. */
>            printf("ctx is NULL\n");
>            while(1)
>            {
>                 sleep(1);
>            }
>            return SHA1_Init(ctx->md_data);
>         }
> }
>
> int HASH_INIT (SHA_CTX *c)
>     {
>     if(c == 0L)
>     {
>       /* This gets printed; the app crashes, and rightly so. */
>       printf("ctx is NULL -> SHA1_Init \n");
>     }
>     c->h0=INIT_DATA_h0;
>     c->h1=INIT_DATA_h1;
>     c->h2=INIT_DATA_h2;
>     c->h3=INIT_DATA_h3;
>     c->h4=INIT_DATA_h4;
>     c->Nl=0;
>     c->Nh=0;
>     c->num=0;
>     return 1;
>     }
>
>
> Thanks,
> Prabhu. S
>
> On 10/18/07, Prabhu S <[EMAIL PROTECTED]> wrote:
> >
> > At times the following traces are obtained as well:
> >
> > (gdb) bt
> > #0  MD5_Init (c=0x0) at md5_dgst.c:75
> > #1  0x405b2a90 in init (ctx=0x0) at m_md5.c:73
> > #2  0x405afc91 in EVP_DigestInit_ex (ctx=0x8e29b44, type=0x4061f560,
> > impl=0x0) at digest.c:207
> > #3  0x403819f5 in ssl3_init_finished_mac (s=0x8e298c8) at s3_enc.c:521
> > #4  0x4037d0bc in ssl3_connect (s=0x8e298c8) at s3_clnt.c:232
> > #5  0x4038feb8 in SSL_connect (s=0x8e298c8) at ssl_lib.c:850
> > (gdb)
> >
> > And:
> >
> > #0  X509_VERIFY_PARAM_new () at x509_vpm.c:91
> > 91      x509_vpm.c: No such file or directory.
> >         in x509_vpm.c
> > (gdb) bt
> > #0  X509_VERIFY_PARAM_new () at x509_vpm.c:91
> > #1  0x4038d978 in SSL_new (ctx=0x42f44448) at ssl_lib.c:297
> > #2  0x00000000 in ?? ()
> > (gdb)
> >
> > And:
> > #0  SHA1_Init (c=0x0) at sha_locl.h:150
> > 150     sha_locl.h: No such file or directory.
> >         in sha_locl.h
> > (gdb) bt
> > #0  SHA1_Init (c=0x0) at sha_locl.h:150
> > #1  0x405b2bb0 in init (ctx=0x0) at m_sha1.c:72
> > #2  0x405afc91 in EVP_DigestInit_ex (ctx=0x8fb2ef4, type=0x4061f620,
> > impl=0x0) at digest.c:207
> > #3  0x40381a15 in ssl3_init_finished_mac (s=0x8fad288) at s3_enc.c:522
> > #4  0x4037d0bc in ssl3_connect (s=0x8fad288) at s3_clnt.c:232
> > #5  0x4038feb8 in SSL_connect (s=0x8fad288) at ssl_lib.c:850
> >
> >
> >
> > On 10/18/07, Prabhu S <[EMAIL PROTECTED] > wrote:
> > >
> > > David,
> > >
> > > The OpenSSL version that I use is openssl-0.9.8e. Your guess about
> > > methods being called is right. It appears to be stack corruption.
> > >
> > > Gayathri,
> > >
> > > I don't suspect gdb. I checked the CTX status in HASH_INIT
> > > (SHA_CTX *c) under stress: 'c' was indeed NULL and the application
> > > immediately dumped core.
> > >
> > > Regards,
> > > Prabhu. S
> > >
> > >
> > > On 10/18/07, Gayathri S <[EMAIL PROTECTED] > wrote:
> > > >
> > > >
> > > > The stack trace showing a null sha1 transform kind of caught my
> > > > attention here. I wouldn't go by the GDB call trace because it's
> > > > obviously a memory leak and the gdb stack could have been corrupted.
> > > > Many a time I see 0x0 in the frames, but when you actually try to
> > > > print the ctx address it is valid. CTX is definitely valid here.
> > > >
> > > > Prabhu, earlier I was assuming you were using the Linux sha1 in the
> > > > kernel, which is a loadable module, and I realise you're just using
> > > > plain openssl from userspace and linking with libcrypto. Linux sha1
> > > > has a limitation on the sha1_tfm structure; perhaps libcrypto sha1 is
> > > > the same way?
> > > >
> > > > It's obvious that you have run out of sha1_tfms, which is why
> > > > sleeping helps, as other threads would have released theirs.
> > > >
> > > > If you don't mind sending your client code snippet, I could debug.
> > > > My email id would be [EMAIL PROTECTED]
> > > >
> > > > Thanks
> > > > --Gayathri
> > > >
> > > >
> > > >
> > > > > Even reducing the thread stack size didn't help.
> > > > > I observe that the thread creation as such is not a problem. I create
> > > > > about 1000 threads, delay in each thread the SSL_connect for about 10
> > > > > sec.
> > > > > Once the delay expires and each client makes connections to the server
> > > > > the seg fault occurs.
> > > >
> > > > You know, looking back at your original trace, it seems I may have
> > > > jumped to conclusions. It's hard to be sure because I don't know what
> > > > OpenSSL version you are using, so the line numbers don't tell me
> > > > anything, but check this out:
> > > >
> > > > > #0  SHA1_Init (c=0x0) at sha_locl.h:150
> > > > > #1  0x405b2bb0 in init (ctx=0x0) at m_sha1.c:72
> > > > > #2  0x405afc91 in EVP_DigestInit_ex (ctx=0x4d606230, type=0x4061f620, impl=0x0) at digest.c:207
> > > > > #3  0x405ac08e in ssleay_rand_add (buf=0x0, num=0, add=2.5863007356866632e-306) at md_rand.c:263
> > > > > #4  0x405ace6e in RAND_add (buf=0x8a269f8, num=144861688, entropy=0) at rand_lib.c:151
> > > >
> > > > I'm guessing frame #2 is this:
> > > >
> > > >          return ctx->digest->init(ctx);
> > > >
> > > > Which calls this:
> > > >
> > > > static int init(EVP_MD_CTX *ctx)
> > > >          { return SHA1_Init(ctx->md_data); }
> > > >
> > > > Notice that 'init' was called with a NULL context. But the context
> > > > cannot have been NULL in frame 2, because if it was, ctx->digest would
> > > > have faulted. So it looks like the stack in frame #2 cannot have led
> > > > to the stack in frame #1.
> > > >
> > > > This is not a memory exhaustion issue or a failure to check for
> > > > NULL. It looks like stack corruption. The real puzzle is why stack
> > > > corruption would only occur with a large number of threads.
> > > >
> > > > I'm thinking perhaps there's some concurrency issue with
> > > > ssleay_rand_add, but I've been over it twice and I don't see any
> > > > issue. The md context would be unique for each thread, so it should be
> > > > safe.
> > > >
> > > > Maybe someone will read this and it will resonate with something
> > > > they know?
> > > > If you can, please tell us what version of OpenSSL this was. This will
> > > > allow people to understand the line numbers better and make sure
> > > > they're not looking at code that has whatever bit you already fixed.
> > > >
> > > >         DS
> > > >
> > > >
> > > > ______________________________________________________________________
> > > >
> > > > OpenSSL Project
> > > > http://www.openssl.org
> > > > User Support Mailing List
> > > > openssl-users@openssl.org
> > > > Automated List Manager
> > > > [EMAIL PROTECTED]
> > > >
> > > >
> > > >
> > > > ********************************************************************************
> > > > This email message (including any attachments) is for the sole use
> > > > of the intended recipient(s)
> > > > and may contain confidential, proprietary and privileged
> > > > information. Any unauthorized review,
> > > > use, disclosure or distribution is prohibited. If you are not the
> > > > intended recipient,
> > > > please immediately notify the sender by reply email and destroy all
> > > > copies of the original message.
> > > > Thank you.
> > > >
> > > > Intoto Inc.
> > > >
> > > >
> > > > ______________________________________________________________________
> > > > OpenSSL Project
> > > > http://www.openssl.org
> > > > User Support Mailing List
> > > > openssl-users@openssl.org
> > > > Automated List Manager
> > > > [EMAIL PROTECTED]
> > > >
> > >
> > >
> >
>
