Re: [BUGS] BUG #2246: Bad malloc interactions: ecpg, openssl
On Mon, 13 Feb 2006, Stephen Frost wrote:

> * Andrew Klosterman ([EMAIL PROTECTED]) wrote:
> > (gdb) bt
> > #0 0x401c3851 in kill () from /lib/libc.so.6
> > #1 0x40139dd5 in EF_Abort () from /usr/lib/libefence.so.0
> > #2 0x40139823 in memalign () from /usr/lib/libefence.so.0
> > #3 0x401399ad in malloc () from /usr/lib/libefence.so.0
> > #4 0x40139a10 in calloc () from /usr/lib/libefence.so.0
> > #5 0x404a182f in krb5_set_default_tgs_ktypes () from /usr/lib/libkrb5.so.3
> > #6 0x402c8b3f in ?? () from /usr/lib/libpq.so.4
> > #7 0x402ded88 in ?? () from /usr/lib/libpq.so.4
> > #8 0x in ?? ()
> >
> > Looks like something fishy going on between libpq and libkrb5. I'm
> > especially suspicious since I'm not using kerberos for authentication at
> > all.
>
> Seems kind of unlikely... What exact (.deb) versions of libpq and
> Postgres are you using? You originally posted w/ 8.1.0 but perhaps on
> the client you had something more recent?
>
> Thanks,
>
> Stephen

Running "aptitude show X", where "X" is the package name, and applying
appropriate filtering gives the following results on my development
systems:

Package: libpq-dev
Version: 8.1.0-3

Package: libpq3
Version: 1:7.4.9-2

Package: libpq4
Version: 8.1.0-3

Package: postgresql-8.1
Version: 8.1.0-3

Package: postgresql-contrib-8.1
Version: 8.1.0-3

Package: postgresql-server-dev-8.1
Version: 8.1.0-3

Package: postgresql-client-8.1
Version: 8.1.0-3

Package: postgresql-common
Version: 39

(I frequently update and upgrade my installations...)

--Andrew J. Klosterman
[EMAIL PROTECTED]

---(end of broadcast)---
TIP 4: Have you searched our list archives?
       http://archives.postgresql.org
Re: [BUGS] BUG #2246: Bad malloc interactions: ecpg, openssl
On Monday, February 13, 2006 21:25:30 -0500 Stephen Frost <[EMAIL PROTECTED]> wrote:

* Andrew Klosterman ([EMAIL PROTECTED]) wrote:
> Seems kind of unlikely... What exact (.deb) versions of libpq and
> Postgres are you using? You originally posted w/ 8.1.0 but perhaps on
> the client you had something more recent?

aptitude install build-essential debhelper cdbs bison perl libperl-dev \
    tk8.4-dev flex libreadline5-dev libssl-dev zlib1g-dev \
    libpam0g-dev libxml2-dev libkrb5-dev libxslt1-dev python-dev \
    gettext bzip2 fakeroot

You might want to add valgrind to this list. It instruments the program
at the machine-code level and performs extensive checks for memory
errors and uses of undefined values while the program runs. It has
helped me fix every SIGSEGV I have ever encountered that was not caused
by infinite recursion.

Kind regards,
Jens Schicke

--
Jens Schicke          [EMAIL PROTECTED]
asco GmbH             http://www.asco.de
Mittelweg 7           Tel 0531/3906-127
38106 Braunschweig    Fax 0531/3906-400
Re: [BUGS] BUG #2246: Bad malloc interactions: ecpg, openssl
On Feb 13 04:01, Andrew Klosterman wrote:
> I threw in a pthread mutex around the code making the database connections
> for each of my threads. The problem is still there ("corrupted
> double-linked list").
> ...
> Program received signal SIGILL, Illegal instruction.
> [Switching to Thread 16384 (LWP 24753)]
> 0x401c3851 in kill () from /lib/libc.so.6
> (gdb) bt
> #0 0x401c3851 in kill () from /lib/libc.so.6
> #1 0x40139dd5 in EF_Abort () from /usr/lib/libefence.so.0
> #2 0x40139823 in memalign () from /usr/lib/libefence.so.0
> #3 0x401399ad in malloc () from /usr/lib/libefence.so.0
> #4 0x40139a10 in calloc () from /usr/lib/libefence.so.0
> #5 0x404a182f in krb5_set_default_tgs_ktypes () from /usr/lib/libkrb5.so.3
> #6 0x402c8b3f in ?? () from /usr/lib/libpq.so.4
> #7 0x402ded88 in ?? () from /usr/lib/libpq.so.4
> #8 0x in ?? ()

I have run into other thread-safety issues caused by the libc shipped in
the Debian repositories. For instance, getpwuid_r() is broken in
Debian's current stable libc package, and this causes a similar memory
leak in the libpq code.

IMHO, testing the code with a newer libc version may resolve this.
Otherwise, for an exact answer - as Tom said - we need libpq symbols in
the backtrace.

Regards.
Re: [BUGS] BUG #2257: Can' stop server while autovacuum is running
"Evgeny Gridasov" <[EMAIL PROTECTED]> writes:
> autovacuum process (when active) did not respond to kill (TERM). Only kill
> -9 helped to stop autovacuum process.

autovacuum does respond to shutdown requests, but in poking at this I
found that btree index vacuuming may fail to notice a pending interrupt
for long periods, if you've got large indexes. I've committed a fix for
that.

regards, tom lane
Re: [BUGS] BUG #2246: Bad malloc interactions: ecpg, openssl
* Andrew Klosterman ([EMAIL PROTECTED]) wrote:
> Alright, I have built a system with the symbols left into the binaries.
[...]
> Again, it is showing a bad malloc in what appears to be some code using
> kerberos. But there's nothing in my setup that I can think of right now
> that should induce a connection to be set up using kerberos.

The Kerberos libraries are still called when support for them has been
compiled in. They generally don't cause any problems though. For some
reason the line numbers in the backtrace line up but the function names
don't quite (perhaps inlining). Anyhow, the error is being reported
down in 'krb5_init_context()' so either something strange is happening
or it's actually a Kerberos bug after all.

The reason the Kerberos libraries are called is to get the 'username' to
use, which is determined prior to actually connecting to the backend
(and finding out what authentication mechanism the backend thinks we
should be trying).

It's kind of a chicken-and-egg here because the backend decides what
authentication mechanism to ask for based off the username (at least in
part) through pg_hba.conf, so you can't find out the authentication
method until you know the username, so all methods to find the username
have to be exhausted. You could avoid this by explicitly passing
'user=' into the connection parameters though... Would be interesting
to know what happens then...

Might also be interesting to look into the Kerberos libraries to see why
they're attempting to malloc(0); perhaps there's a bug there when
Kerberos isn't set up on the machine?

Thanks,

Stephen
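A minimal sketch of the 'user=' suggestion, for anyone who wants to try it from plain C rather than ECPG. It only builds the connection string; the helper name build_conninfo and the example values are hypothetical, and it assumes the values contain no spaces, quotes, or backslashes (those would need quoting in a real conninfo string). The result is meant to be handed to libpq's PQconnectdb(), which takes such a string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "dbname=<db> user=<user>" into buf.
 * Assumption: neither value contains spaces, quotes, or backslashes.
 * Returns 0 on success, -1 if the user name is empty or buf is too small. */
int build_conninfo(char *buf, size_t buflen,
                   const char *dbname, const char *user)
{
    assert(buf != NULL && dbname != NULL && user != NULL);

    /* The whole point is to name the user explicitly, so refuse "". */
    if (strlen(user) == 0)
        return -1;

    int n = snprintf(buf, buflen, "dbname=%s user=%s", dbname, user);
    if (n < 0 || (size_t) n >= buflen)
        return -1;              /* encoding error or truncated output */

    /* A caller would now pass buf to PQconnectdb(buf). */
    return 0;
}
```

With the user named up front, the client no longer has to guess a default user name before connecting, which is the step under suspicion in this thread.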
Re: [BUGS] BUG #2246: Bad malloc interactions: ecpg, openssl
* Andrew Klosterman ([EMAIL PROTECTED]) wrote:
> On Tue, 14 Feb 2006, Stephen Frost wrote:
>
> > It's kind of a chicken-and-egg here because the backend decides what
> > authentication mechanism to ask for based off the username (at least in
> > part) through pg_hba.conf, so you can't find out the authentication
> > method until you know the username so all methods to find the username
> > have to be exhausted. You could avoid this by explicitly passing
> > 'user=' into the connection parameters though... Would be interesting
> > to know what happens then...
>
> When asking about "explicitly passing 'user=' in to the connection
> parameters" do you mean that the EXEC SQL CONNECT line that ecpg parses
> should specify a username?

Oh, I see now. You're not using PQconnectdb but rather PQsetdbLogin, or
at least, that's what ECPG is using. This means that you can't pass in
your own conninfo string. During the PQsetdbLogin call, libpq calls
connectOptions1 with an empty conninfo string, which makes libpq think
there's no username set, which in turn makes it ask the Kerberos
libraries for a username...

As an initial comment, it seems like it'd be a good thing to update ECPG
to use PQconnectdb. It's possible this is exposed already in some way
but I'm not familiar enough with ECPG to know.

Another approach would be to have PQsetdbLogin build up a conninfo
string and pass that into connectOptions1 instead of calling
connectOptions1 with an empty string and then changing the values
afterwards. That'd probably be too large of a change to get in as a
bugfix though. An alternative might be to move the pg_fe_getauthname()
call to connectOptions2 as it's actually a bit more work than one might
expect and if that can be avoided then that's probably all to the good.
I'm a little worried about whether that would work for all the various
ways to use libpq to connect to the database...

Sorry I don't have a simple answer. :/ In the end it seems like the
Kerberos libraries should be able to survive Kerberos not being
configured or whatever is going on to make it try to malloc 0-bytes...

Thanks,

Stephen
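The idea of deferring the default-username lookup can be sketched in isolation. This is not libpq code: resolve_user and lookup_default_user are hypothetical stand-ins (the latter plays the role the thread assigns to pg_fe_getauthname()), and the counter exists only to show when the expensive path runs.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how often the expensive fallback ran (illustration only). */
static int fallback_calls = 0;

/* Hypothetical stand-in for an expensive default-username lookup. */
static const char *lookup_default_user(void)
{
    fallback_calls++;
    return "os_user";           /* placeholder value, not a real lookup */
}

/* Resolve the user name late: consult the fallback only when the
 * caller supplied no name at all. */
const char *resolve_user(const char *explicit_user)
{
    if (explicit_user != NULL && explicit_user[0] != '\0')
        return explicit_user;   /* fast path: no fallback call */

    const char *name = lookup_default_user();
    assert(name != NULL);
    return name;
}

/* Accessor so callers can observe how often the fallback was used. */
int get_fallback_calls(void)
{
    return fallback_calls;
}
```

Doing the lookup only after all explicit settings are known means a caller who supplies a name never reaches the fallback, whichever entry point was used to set that name.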
Re: [BUGS] BUG #2246: Bad malloc interactions: ecpg, openssl
Stephen Frost <[EMAIL PROTECTED]> writes:
> Another approach would be to have PQsetdbLogin build up a conninfo
> string and pass that into connectOptions1 instead of calling
> connectOptions1 with an empty string and then changing the values
> afterwards. That'd probably be too large of a change to get in as a
> bugfix though. An alternative might be to move the pg_fe_getauthname()
> call to connectOptions2 as it's actually a bit more work than one might
> expect and if that can be avoided then that's probably all to the good.

Right offhand I like the idea of pushing it into connectOptions2 --- can
you experiment with that? Seems like there is no reason to call
Kerberos if the user supplies the name to connect as.

> Sorry I don't have a simple answer. :/ In the end it seems like the
> Kerberos libraries should be able to survive Kerberos not being
> configured or whatever is going on to make it try to malloc 0-bytes...

We may be spending too much time on this one point --- as long as
Kerberos isn't *writing* into the zero-length alloc, there is nothing
illegal immoral or fattening about malloc(0). Can you get ElectricFence
to not abort right here but continue on to the real problem?

regards, tom lane
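To illustrate the malloc(0) point: standard C allows malloc(0) to return either NULL or a unique pointer, and either result may be passed to free(). The sketch below uses a hypothetical function, alloc_fill_free, that writes exactly as many bytes as it asked for, which is zero bytes when the request is zero. On the Electric Fence side, its man page describes an EF_ALLOW_MALLOC_0 environment setting that makes it tolerate zero-size requests; whether that is available in the version installed here is an assumption to check.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocates n bytes, writes to exactly n of them, and frees the block.
 * Returns the number of bytes written (0 on allocation failure). */
size_t alloc_fill_free(size_t n)
{
    unsigned char *p = malloc(n);   /* n may legitimately be 0 */

    if (p == NULL && n > 0)
        return 0;                   /* genuine out-of-memory */

    size_t written = 0;
    for (size_t i = 0; i < n; i++) {  /* zero iterations when n == 0 */
        p[i] = 0xAB;
        written++;
    }
    assert(written == n);           /* never touched more than requested */

    free(p);                        /* free(NULL) is a no-op */
    return written;
}
```

The zero-length request is harmless here because the loop never dereferences the pointer; the trouble would only start if some caller wrote through it.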
Re: [BUGS] BUG #2246: Bad malloc interactions: ecpg, openssl
* Tom Lane ([EMAIL PROTECTED]) wrote:
> Stephen Frost <[EMAIL PROTECTED]> writes:
> > Another approach would be to have PQsetdbLogin build up a conninfo
> > string and pass that into connectOptions1 instead of calling
> > connectOptions1 with an empty string and then changing the values
> > afterwards. That'd probably be too large of a change to get in as a
> > bugfix though. An alternative might be to move the pg_fe_getauthname()
> > call to connectOptions2 as it's actually a bit more work than one might
> > expect and if that can be avoided then that's probably all to the good.
>
> Right offhand I like the idea of pushing it into connectOptions2 --- can
> you experiment with that? Seems like there is no reason to call
> Kerberos if the user supplies the name to connect as.

Sure thing, I'll take a look at this, probably tomorrow night or
Thursday evening.

> > Sorry I don't have a simple answer. :/ In the end it seems like the
> > Kerberos libraries should be able to survive Kerberos not being
> > configured or whatever is going on to make it try to malloc 0-bytes...
>
> We may be spending too much time on this one point --- as long as
> Kerberos isn't *writing* into the zero-length alloc, there is nothing
> illegal immoral or fattening about malloc(0). Can you get ElectricFence
> to not abort right here but continue on to the real problem?

Good point.

Stephen
Re: [BUGS] BUG #2246: Bad malloc interactions: ecpg, openssl
Andrew Klosterman <[EMAIL PROTECTED]> writes:
> (gdb) print *conn
> ...
> allow_ssl_try = 1 '\001', wait_ssl_try = 0 '\0', ssl = 0x806d1d0,
> peer = 0x807e430,
> ...
> *** glibc detected *** corrupted double-linked list: 0x0807e428 ***

Hm, it looks like the problem is associated with whatever was allocated
just before conn->peer (which is returned by SSL_get_peer_certificate
called from open_client_SSL). Can you get efence or some other tool to
produce a trace of malloc calls so we can determine what that is?

regards, tom lane
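One rough way to get such a trace for allocations made by code you can recompile is a logging wrapper around malloc. The sketch below is an assumption-laden illustration, not an efence or libpq feature: traced_malloc, find_nearest_below, and the fixed-size log are all hypothetical, it is not thread-safe, it does not track free(), and it cannot see allocations made inside prebuilt libraries. glibc's mtrace() facility in <mcheck.h> is another option for those.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_TRACE 1024

/* One logged allocation: where it starts, how big, and who asked. */
struct trace_entry {
    void *ptr;
    size_t size;
    const char *tag;
};

static struct trace_entry trace_log[MAX_TRACE];
static size_t trace_count = 0;

/* malloc wrapper that records each successful allocation with a
 * caller-supplied tag and prints it to stderr. Not thread-safe. */
void *traced_malloc(size_t size, const char *tag)
{
    assert(tag != NULL);
    void *p = malloc(size);
    if (p != NULL && trace_count < MAX_TRACE) {
        trace_log[trace_count].ptr = p;
        trace_log[trace_count].size = size;
        trace_log[trace_count].tag = tag;
        trace_count++;
        fprintf(stderr, "malloc(%zu) = %p [%s]\n", size, p, tag);
    }
    return p;
}

/* Returns the tag of the logged block with the highest start address
 * that is not above addr, or NULL if no logged block qualifies. */
const char *find_nearest_below(const void *addr)
{
    const struct trace_entry *best = NULL;
    for (size_t i = 0; i < trace_count; i++) {
        uintptr_t start = (uintptr_t) trace_log[i].ptr;
        if (start <= (uintptr_t) addr &&
            (best == NULL || start > (uintptr_t) best->ptr))
            best = &trace_log[i];
    }
    return best != NULL ? best->tag : NULL;
}
```

Given an address reported by glibc, find_nearest_below() names the logged allocation at or immediately below it, which is the kind of lookup needed to identify a block's neighbour.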