Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > I am still reading email from yesterday, but this is a new patch in the
> > past 2 days.  The problem is that time differences were overflowing int
> > values if the vacuum took a long time, or something like that.  The fix
> > is to cast one to long long.
>
> That's no fix --- it will break the code on compilers without long long.

Here are the emails describing the problem.  It seems they should look at
how we do time differences in the backend as an example.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
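
For illustration, a minimal sketch of a time difference that needs no
64-bit type at all: keep seconds and microseconds as separate fields and
borrow one second when the microsecond part goes negative.  This is not
taken from the backend source; the helper name timeval_subtract is
hypothetical, and the only assumption is the POSIX struct timeval from
<sys/time.h>.

```c
#include <assert.h>
#include <sys/time.h>

/*
 * Hypothetical helper (not from the thread or the backend):
 * returns later - earlier, normalized so that 0 <= tv_usec < 1000000.
 * No multiplication is involved, so nothing can overflow a 32-bit long
 * as long as the two timestamps themselves fit.
 */
static struct timeval
timeval_subtract(struct timeval later, struct timeval earlier)
{
    struct timeval diff;

    /* Both inputs are expected to be normalized already. */
    assert(later.tv_usec >= 0 && later.tv_usec < 1000000);
    assert(earlier.tv_usec >= 0 && earlier.tv_usec < 1000000);

    diff.tv_sec = later.tv_sec - earlier.tv_sec;
    diff.tv_usec = later.tv_usec - earlier.tv_usec;

    if (diff.tv_usec < 0)
    {
        /* Borrow one second to make the microsecond part non-negative. */
        diff.tv_sec -= 1;
        diff.tv_usec += 1000000;
    }
    return diff;
}
```

With the timestamps used in the test program below (now =
1070565077.216477, then = 1070561568.668963) this gives 3508 seconds and
547514 microseconds, the same interval as the 3508547514 microseconds
reported as "long long diff".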

From: Vivek Khera <[EMAIL PROTECTED]>
Date: Thu, 4 Dec 2003 16:20:22 -0500
To: "Matthew T. O'Connor" <[EMAIL PROTECTED]>
cc: <[EMAIL PROTECTED]>
Subject: Re: [PERFORM] autovacuum daemon stops doing work after about an hour

>>>>> "MTO" == Matthew T O'Connor <[EMAIL PROTECTED]> writes:

MTO> Could this be the recently reported bug where time goes backwards on
MTO> FreeBSD?  Can anyone who knows more about this problem chime in, I know it
MTO> was recently discussed on Hackers.

Time does not go backwards -- the now and then variables are properly
incrementing in time as you see from the debugging output.

The error appears to be with the computation of the "diff".  It is
either a C programming error, or a compiler error.  I'm not a C "cop"
so I can't tell you which it is.

Witness this program, below, compiled as "cc -g -o t t.c" and the
output here:

% ./t
seconds = 3509
seconds1 = 3509000000
useconds = -452486
stepped diff = 3508547514
seconds2 = -785967296
seconds3 = 3509000000
diff = -786419782
long long diff = 3508547514
%

Apparently, if you compute (now.tv_sec - then.tv_sec) * 1000000 all at
once, it overflows since the RHS is all computed using longs rather
than long longs.  The fix is to cast at least one of the values to long
long on the RHS, as in the computation of seconds3 below.  Compare
that to the computation of seconds2 and you'll see that this is the
cause.

I'd be curious to see the output of this program on other platforms
and other compilers.  I'm using gcc 2.95.4 as shipped with FreeBSD
4.8+.

That all being said, you should never sleep less than the base time,
and never for more than a max amount, perhaps 1 hour?

--cut here--
#include <sys/time.h>
#include <stdio.h>
#include <stdlib.h>		/* for exit() */

int
main(void)
{
  struct timeval now, then;
  long long diff = 0;
  long long seconds, seconds1, seconds2, seconds3, useconds;

  now.tv_sec = 1070565077L;
  now.tv_usec = 216477L;

  then.tv_sec = 1070561568L;
  then.tv_usec = 668963L;

  /* computed in steps: each intermediate is already a long long */
  seconds = now.tv_sec - then.tv_sec;
  printf("seconds = %lld\n",seconds);
  seconds1 = seconds * 1000000;
  printf("seconds1 = %lld\n",seconds1);
  useconds = now.tv_usec - then.tv_usec;
  printf("useconds = %lld\n",useconds);

  diff = seconds1 + useconds;
  printf("stepped diff = %lld\n",diff);

  /* this appears to be the culprit... it should be same as seconds1 */
  /* (the product is computed in long, so it overflows where long is 32 bits) */
  seconds2 = (now.tv_sec - then.tv_sec) * 1000000;
  printf("seconds2 = %lld\n",seconds2);

  /* seems we need to cast long's to long long's for this computation */
  seconds3 = ((long long)now.tv_sec - (long long)then.tv_sec) * 1000000;
  printf("seconds3 = %lld\n",seconds3);

  diff = (now.tv_sec - then.tv_sec) * 1000000 + (now.tv_usec - then.tv_usec);
  printf ("diff = %lld\n",diff);

  diff = ((long long)now.tv_sec - (long long)then.tv_sec) * 1000000 + (now.tv_usec - then.tv_usec);
  printf ("long long diff = %lld\n",diff);

  exit(0);
}
--cut here--

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

From: Vivek Khera <[EMAIL PROTECTED]>
Date: Thu, 4 Dec 2003 16:22:09 -0500
To: "Matthew T. O'Connor" <[EMAIL PROTECTED]>
cc: <[EMAIL PROTECTED]>
Subject: Re: [PERFORM] autovacuum daemon stops doing work after about an hour

Actually, you can simplify the fix thusly:

  diff = (long long)(now.tv_sec - then.tv_sec) * 1000000 + (now.tv_usec - then.tv_usec);

From: Larry Rosenman <[EMAIL PROTECTED]>
Date: Thu, 04 Dec 2003 15:25:36 -0600
To: Vivek Khera <[EMAIL PROTECTED]>, "Matthew T. O'Connor" <[EMAIL PROTECTED]>
cc: [EMAIL PROTECTED]
Subject: Re: [PERFORM] autovacuum daemon stops doing work after about an

--On Thursday, December 04, 2003 16:20:22 -0500 Vivek Khera
<[EMAIL PROTECTED]> wrote:

>>>>>> "MTO" == Matthew T O'Connor <[EMAIL PROTECTED]> writes:
>
> MTO> Could this be the recently reported bug where time goes backwards on
> MTO> FreeBSD?  Can anyone who knows more about this problem chime in, I
> MTO> know it was recently discussed on Hackers.
>
> Time does not go backwards -- the now and then variables are properly
> incrementing in time as you see from the debugging output.
>
> The error appears to be with the computation of the "diff".  It is
> either a C programming error, or a compiler error.  I'm not a C "cop"
> so I can't tell you which it is.
>
> Witness this program, below, compiled as "cc -g -o t t.c" and the
> output here:
>
> % ./t
> seconds = 3509
> seconds1 = 3509000000
> useconds = -452486
> stepped diff = 3508547514
> seconds2 = -785967296
> seconds3 = 3509000000
> diff = -786419782
> long long diff = 3508547514
> %
>
> apperantly, if you compute (now.tv_sec - then.tv_sec) * 1000000 all at
> once, it overflows since the RHS is all computed using longs rather
> than long longs.  Fix is to cast at least one of the values to long
> long on the RHS, as in the computation of seconds3 below.  compare
> that to the computation of seconds2 and you'll see that this is the
> cause.
>
> I'd be curious to see the output of this program on other platforms
> and other compilers.  I'm using gcc 2.95.4 as shipped with FreeBSD
> 4.8+.

this is with the UnixWare compiler:

$ cc -O -o testvk testvk.c
$ ./testvk
seconds = 3509
seconds1 = 3509000000
useconds = -452486
stepped diff = 3508547514
seconds2 = -785967296
seconds3 = 3509000000
diff = -786419782
long long diff = 3508547514
$

I think this is a C bug.

> That all being said, you should never sleep less than the base time,
> and never for more than a max amount, perhaps 1 hour?

-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: [EMAIL PROTECTED]
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749

From: Vivek Khera <[EMAIL PROTECTED]>
Date: Thu, 4 Dec 2003 16:37:12 -0500
To: [EMAIL PROTECTED]
Subject: Re: [PERFORM] autovacuum daemon stops doing work after about an

>>>>> "LR" == Larry Rosenman <[EMAIL PROTECTED]> writes:

>> I'd be curious to see the output of this program on other platforms
>> and other compilers.  I'm using gcc 2.95.4 as shipped with FreeBSD
>> 4.8+.

LR> this is with the UnixWare compiler:
LR> $ cc -O -o testvk testvk.c
LR> $ ./testvk
LR> seconds = 3509
LR> seconds1 = 3509000000
LR> useconds = -452486
LR> stepped diff = 3508547514
LR> seconds2 = -785967296
LR> seconds3 = 3509000000
LR> diff = -786419782
LR> long long diff = 3508547514
LR> $

LR> I think this is a C bug.

Upon further reflection, I think so too.  The entire RHS is longs, so
the arithmetic is done in longs, then assigned to a long long when done
(after things have overflowed).
Forcing any one of the RHS values to be long long causes the arithmetic
to all be done using long longs, and then you get the numbers you
expect.

I think you only notice this in autovacuum when it takes a long time to
complete the work, like my example of about 3500 seconds.
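
The numbers in the output can be reproduced without relying on signed
overflow, which is undefined behaviour in C.  What follows is only a
minimal sketch, assuming a 32-bit two's-complement long as on the
platforms reported in this thread; wrap_to_int32 and elapsed_usec_wide
are hypothetical names, and int64_t from <stdint.h> stands in for
long long.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the value a 32-bit two's-complement integer would
 * hold if the true result were 'value'.  The reduction modulo 2^32 goes
 * through an unsigned type, where wraparound is well defined.
 */
static int64_t
wrap_to_int32(int64_t value)
{
    uint32_t low = (uint32_t) value;    /* value modulo 2^32 */

    if (low > (uint32_t) INT32_MAX)
        return (int64_t) low - INT64_C(4294967296);
    return (int64_t) low;
}

/*
 * Hypothetical helper: elapsed microseconds computed entirely in 64 bits,
 * which is what casting one operand to long long achieves.
 */
static int64_t
elapsed_usec_wide(int64_t sec_diff, int64_t usec_diff)
{
    /* The microsecond difference of two normalized timevals is below 1 s. */
    assert(usec_diff > -1000000 && usec_diff < 1000000);

    return sec_diff * 1000000 + usec_diff;
}
```

wrap_to_int32((int64_t) 3509 * 1000000) gives -785967296, the seconds2
value above, while elapsed_usec_wide(3509, -452486) gives 3508547514.
A 32-bit product first overflows once the interval passes 2147 seconds
(INT32_MAX / 1000000), roughly 36 minutes.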