Re: [GENERAL] Linux v.s. Mac OS-X Performance
On 11/13/07 10:02 AM, "Scott Ribe" <[EMAIL PROTECTED]> wrote: > What you're referring to must be that the kernel was essentially > single-threaded, with a single "kernel-funnel" lock. (Because the OS > certainly supported threads, and it was certainly possible to write > highly-threaded applications, and I don't know of any performance problems > with threaded applications.) > > This has been getting progressively better, with each release adding more > in-kernel concurrency. Which means that 10.5 probably obsoletes all prior > postgres benchmarks on OS X. While I've never seen this documented anywhere, it empirically looks like 10.5 also (finally) adds CPU affinity to better utilize instruction caching. On a dual-CPU system under 10.4, one CPU-bound process would use two CPUs at 50% each. Under 10.5 it uses one CPU at 100%. I never saw any resolution to this thread - were the original tests on the Opteron and OS X identical, or were they two different workloads? Wes ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Linux v.s. Mac OS-X Performance
Regarding the various kernel bottlenecks, have there been any tests with Google's malloc (libtcmalloc)? Perhaps PostgreSQL isn't heavily threaded enough to make a difference, but on one of our heavily threaded applications (unrelated to Postgres), it made a night and day difference. Instead of memory and CPU usage growing and growing, both stabilized quickly at less than half of what the linux malloc produced. Linux (2.6) RH malloc stinks in heavily threaded applications. The benchmark that got us looking at this was a MySQL benchmark showing performance scaling by number of threads on various linux operating systems. The difference in our application by simply relinking at run time (LD_PRELOAD) with libtcmalloc was astounding. Wes ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
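As a rough illustration of the run-time relinking described above (the library path and program name are only examples, not taken from the post):

    # preload Google's tcmalloc for one run of a threaded application
    LD_PRELOAD=/usr/lib/libtcmalloc.so ./my_threaded_app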
[GENERAL] REINDEX on large DB vs. DROP INDEX/CREATE INDEX
Seems like it would be a common question, but I'm having problems finding an answer in the archives on this... I have a large database (now about 2 billion records), and about once a year I have been dropping and recreating the indexes. Recreating the foreign key constraints takes as long or longer than the index creation. Last year the whole process took about 72 hours. This year the DB is bigger. I'm running 8.1.4. Assume I have exclusive access to the DB.

1. Is there any advantage to doing the DROP/CREATE over just doing a REINDEX DATABASE. I seem to remember DROP/CREATE had some advantages, but can't find the information. Is there a performance hit with REINDEX during creation because of locking issues?

2. I'm assuming REINDEX would avoid the time involved in recreating the foreign key constraints?

3. With a REINDEX DATABASE, how can I monitor progress?

Wes

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/
Re: [GENERAL] REINDEX on large DB vs. DROP INDEX/CREATE INDEX
On 1/24/08 12:48 PM, "Tom Lane" <[EMAIL PROTECTED]> wrote: > Wes <[EMAIL PROTECTED]> writes: >> I'm running 8.1.4. Assume I have exclusive access to the DB. > > You really ought to update to 8.1.something-newer, but I digress. I was planning on upgrading to 8.x at the same time as this reindex - just do a dump/reload, but... I guess I've been asleep at the wheel and didn't realize 8.1.11 was out. Since that wouldn't require a DB reload, I guess that would be highly recommended? >> 1. Is there any advantage to doing the DROP/CREATE over just doing a REINDEX >> DATABASE. > > No, not if you don't mind exclusive locks. DROP together with CREATE > INDEX CONCURRENTLY might be nicer if you were trying to do this without > completely shutting down the DB, but if you aren't running normal > operations then just use REINDEX. I was thinking there was some clean up that didn't happen with REINDEX, related to disk space allocation maybe? Perhaps this was just on older 7.x versions - something I vaguely recall back under 7.x when I was having some corruption issues. >> 3. With a REINDEX DATABASE, how can I monitor progress? > > It should give you a NOTICE after each table. Is there anything that shows up in ps for each index it is working on? > BTW, what have you got maintenance_work_mem set to? It is currently set to 983025. Not sure where I got that strange number from. It's a 2 GB machine. I've been trying to get more, but when it runs fine day to day, it's kind of hard to justify. Lots of disks, not so much memory. I guess I should also turn off fsync for the duration. Wes ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
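A sketch of the kind of per-session settings being discussed, assuming an 8.1-era server where maintenance_work_mem is an integer in kB (the values and database name are examples, not recommendations):

    -- in the session that will do the maintenance work
    SET maintenance_work_mem = 1048576;   -- about 1 GB, for this session only
    REINDEX DATABASE mydb;                -- 'mydb' is a placeholder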
Re: [GENERAL] REINDEX on large DB vs. DROP INDEX/CREATE INDEX
On 1/25/08 5:40 AM, "Gregory Stark" <[EMAIL PROTECTED]> wrote: > It shouldn't make a big difference. fsync only happens at the end of a > transaction or at a checkpoint. > > Since you're concerned with very long operations the slowdown at the end of > the transaction won't make a big difference. Ok, I didn't realize that. > Checkpoints could be an issue, but you would be better off just raising > checkpoint_segments and/or checkpoint_timeout to make sure you don't get one > more often than once every few minutes when you're doing large operations like > this. I have checkpoint_segments set to 60, and no warnings showing up in the log. Wes ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] REINDEX on large DB vs. DROP INDEX/CREATE INDEX
Just a follow-up on this... The REINDEX took about 2 1/2 days. I didn't gain much disk space back - a full backup takes just as long as before, but the vacuum time dropped from 30 hours to 3 hours. Wes >> 1. Is there any advantage to doing the DROP/CREATE over just doing a REINDEX >> DATABASE. > > No, not if you don't mind exclusive locks. DROP together with CREATE > INDEX CONCURRENTLY might be nicer if you were trying to do this without > completely shutting down the DB, but if you aren't running normal > operations then just use REINDEX. > >> 2. I'm assuming REINDEX would avoid the time involved in recreating the >> foreign key constraints? > > Right, that's one reason to do it that way. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] REINDEX on large DB vs. DROP INDEX/CREATE INDEX
On 2/4/08 9:53 AM, "Vivek Khera" <[EMAIL PROTECTED]> wrote: > what you need to do is compare the relpages from the pg_class table > for that index before and after. > > if you didn't get much disk space back, make sure you have no long > running transactions that may have kept some older files open. I can check that next time, but I only reindex about once a year. There definitely should be no outstanding transactions. The reason for the huge change in the vacuum time is that the indexes are scanned in index order instead of disk order. I understand that is fixed in 8.2 or 8.3 (don't recall which I saw it in), but have never gotten confirmation from anyone on that. Wes ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
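The before/after check Vivek describes might look like this (the index name is just an example; relpages is refreshed by VACUUM/ANALYZE, so compare the values from before and after the REINDEX):

    SELECT relname, relpages, reltuples
    FROM pg_class
    WHERE relname = 'my_large_index';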
[GENERAL] Vacuum time degrading
We've been doing a vacuum every night. With about 254 million rows, it was taking about 0.8 hours. Now a few months later with about 456 million rows it is taking 4.8 hours. I did check the archives and didn't find anything like this. Why is the vacuum time not going up linearly? Since vacuum has to sequentially read the entire database, I would have expected a linear increase - about 1.5 hours now. There are currently no deletes or modifies to the database - only inserts. This is on PostgreSQL 7.4.5, RedHat ES 3.0. Wes ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] Vacuum time degrading
On 2/28/05 6:53 PM, "Tom Lane" <[EMAIL PROTECTED]> wrote: > If you are suffering bloat, the fastest route to a solution would > probably be to CLUSTER your larger tables. Although VACUUM FULL > would work, it's likely to be very slow. How can there be bloat if there are no deletes or modifies? Even if there were deletes or modifies (there will be in about another year and a half), if a vacuum is being performed every night, how can there be bloat? The vacuum should release the dead space and it should be reused. Am I missing something? >> There are currently no deletes or modifies to the database - only inserts. > > You *certain* about that? It's hard to see how the vacuum time wouldn't > be linear in table size if there's nothing to do and no dead space. Absolutely sure. The only case that would approach a delete is if a batch load fails, the transaction is rolled back. That very seldom happens. Why am I running vacuum nightly if I have no deletes or updates, you ask? Two reasons - to have it in the cron schedule for when there are deletes (there will never be updates), and as a check on database integrity. If there is a database problem, vacuum at least has a chance of flagging it since it reads the entire database. This was instigated after we had a couple of instances of corruption a while back that went undetected for too long. I'm also doing a weekly pg_dumpall as an additional check/fallback. > Again, VACUUM VERBOSE info would be informative (it's sufficient to look > at your larger tables for this). I'll set that up to run tonight and see if it gives any clues. Last night, vacuum ran over 5 hours. Wes ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] Vacuum time degrading
On 2/28/05 6:53 PM, "Tom Lane" <[EMAIL PROTECTED]> wrote: > It's hard to see how the vacuum time wouldn't > be linear in table size if there's nothing to do and no dead space. I am doing 'vacuum analyze' rather than just 'vacuum'. Could that have anything to do with the non-linear behavior? Wes ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] Vacuum time degrading
On 2/28/05 6:53 PM, "Tom Lane" <[EMAIL PROTECTED]> wrote:

> Again, VACUUM VERBOSE info would be informative (it's sufficient to look
> at your larger tables for this).

It took 5.2 hours again tonight to do the vacuum. I don't see anything out of the ordinary - no explanation for the non-linear increases in vacuum time. This is what shows up at the end:

INFO: free space map: 93 relations, 282 pages stored; 1712 total pages needed
DETAIL: Allocated FSM size: 1000 relations + 100 pages = 5920 kB shared memory.

The tables all show something like:

INFO: "blah": found 0 removable, 366326534 nonremovable row versions in 3241829 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.

Wes

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] Vacuum time degrading
On 3/2/05 12:16 PM, "Tom Lane" <[EMAIL PROTECTED]> wrote: > Would you post the complete VACUUM VERBOSE log? The CPU/elapsed time lines > would help us identify where the time is going. I'll send it to you directly - it's rather long. >> DETAIL: Allocated FSM size: 1000 relations + 100 pages = 5920 kB shared >> memory. > Well, you don't have a problem with FSM being too small anyway ;-) Nope... Preparation for when deletes start kicking in down the road. If I can only do a vacuum once a week, I've got to have lots of space. Wes ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] Vacuum time degrading
On 3/2/05 12:16 PM, "Tom Lane" <[EMAIL PROTECTED]> wrote:

> Would you post the complete VACUUM VERBOSE log? The CPU/elapsed time lines
> would help us identify where the time is going.

Mailed. I do see stats like:

CPU 518.88s/25.17u sec elapsed 10825.33 sec.
CPU 884.96s/64.35u sec elapsed 13793.13 sec.
CPU 132.46s/6.66u sec elapsed 2435.42 sec.
CPU 49.25s/4.15u sec elapsed 414.71 sec.

This is a dual-CPU, hyperthreaded (which probably makes little difference here) 2.4 GHz RedHat 3.0 system. The database is on an 8-disk SCSI hardware RAID 5 with 10k rpm disks. Pg_xlog is on a separate volume. I thought it was a 2.6 kernel, but it looks like it is 2.4.20. I need to monitor the system when the vacuum is running to see if sar/top show anything. I wonder if it's hitting the kswapd thrashing problem?

Wes

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [GENERAL] Vacuum time degrading
Watching the system as vacuum is running, I can see that we are encountering the kswapd/kscand problem in the 2.4.20 kernel. This could very well account for the non-linear increase in vacuum time. This problem is fixed in the 2.6 kernel, but we can't upgrade because DELL is dragging their feet in releasing hardware monitoring compatible with 2.6 kernel. So, we're going to try a 2.4.29 kernel and hope that the problem is fixed there. With any luck, by Friday I'll know if the kswapd problem is fixed in 2.4.29 and if that solves the excessive vacuum times. Wes ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] Vacuum time degrading
Well, the good news is that the 2.4.29 kernel solved the kswapd problem. The bad news is that it didn't help the vacuum time. In fact, the vacuum time is now over 6 hours instead of 5 hours. Whether that is a direct result of the 2.4.29 kernel, or a coincidence, I don't know at this time. I guess the next step is to try reindexing a couple of the big indexes and see if that helps. Wes ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Recovering real disk space
On 3/30/05 12:09 PM, "Adam Siegel" <[EMAIL PROTECTED]> wrote: > How can we physically recover "real" disk space for the rows that were > deleted from the table? vacuum full Wes ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] Vacuum time degrading
On 3/2/05 10:50 PM, "Tom Lane" <[EMAIL PROTECTED]> wrote: > It wouldn't be easy --- there are some locking considerations that say > btbulkdelete needs to scan the index in the same order that an ordinary > scan would do. See the nbtree README for details. Just a follow-up on this.. The vacuum time has been steadily increasing at a seemingly increasing rate, although there are no deletes or updates to the database. The current DB size is just over 500 million rows. Last week it was up to 6.84 hours to do a vacuum. Over the weekend I reindexed all the major indexes. The two largest indexes took about 10 hours to reindex both. After the reindexing, the vacuum took only 1.44 hours. This is pretty much a linear scaling from the original vacuum time I reported. So, the increasing vacuum times would appear to be as Tom suggested - due to the fact that vacuum processes indexes in index order, not physical disk order. I guess we add a periodic reindex to our maintenance procedures... Wes ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] Vacuum time degrading
On 4/4/05 8:50 AM, "Tom Lane" <[EMAIL PROTECTED]> wrote: > That doesn't follow from what you said. Did you check that the physical > sizes of the indexes were comparable before and after the reindex? No, how do I do that (or where is it documented how to do it)? How is it not consistent? I believe you suggested the reindex. The initial timing was shortly after a database reload. The index would have been built in sorted order, correct? This was the 1 hour time. After a period of months, the index values have been inserted in random order. The DB size is up 50% but the vacuum time is up a factor of 6+. Presumably the index is being read by moving the heads all over the place. I reindex, and the index is rebuilt in sorted order. Vacuum is now down to 1.5 hours - a linear scaling from the original numbers. The locality of reference in reading the indexes in order should be much better. Wes ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [GENERAL] Vacuum time degrading
On 4/5/05 11:15 AM, "Tom Lane" <[EMAIL PROTECTED]> wrote:

> I didn't say it wasn't consistent, just that it doesn't prove the
> point. The speedup you saw could have been from elimination of index
> bloat more than from bringing the index into physically sorted order.
> An estimate of the overall database size doesn't really tell us how
> much this particular table's indexes changed in size.

Ok, now I follow. Taking the biggest indexes:

The weekend before:

INFO: index "message_recipients_i_recip_date" now contains 393961361 row versions in 2435100 pages
INFO: index "message_recipients_i_message" now contains 393934394 row versions in 1499853 pages

After reindex:

INFO: index "message_recipients_i_recip_date" now contains 401798357 row versions in 1765613 pages
INFO: index "message_recipients_i_message" now contains 401787237 row versions in 1322974 pages

Wes

---(end of broadcast)---
TIP 8: explain analyze is your friend
[GENERAL] To SPAM or not to SPAM...
It seems that for some time this list has been mirrored to Usenet without changing the originating email addresses. Since Usenet is a spammer's Nirvana, that one little act results in everyone that posts to the postgres lists becoming SPAM fodder. Can something not be changed in the mailing list-Usenet gateway such that the originating email addresses are either removed or scrambled, so that posting to the mailing list doesn't result in your email address being plastered all over Usenet? People that intentionally post to Usenet generally don't use a replyable email address. Wes ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[GENERAL] 8.1 'make check' fails
System: Mac OS X 10.4.2
PostgreSQL: 8.1 b2

Running 'make check', I get the following failure:

rm -rf ./testtablespace
mkdir ./testtablespace
/bin/sh ./pg_regress --temp-install --top-builddir=../../.. --temp-port=55432 --schedule=./parallel_schedule --multibyte=SQL_ASCII --load-language=plpgsql
== creating temporary installation ==
pg_regress: installation failed
Examine ./log/install.log for the reason.
make[2]: *** [check] Error 2
make[1]: *** [check] Error 2
make: *** [check] Error 2

Looking at the install log, I see it apparently does not handle directories that contain blanks - the paths are not being quoted:

gzip -d -c ./postgres.tar.gz | ( cd /Volumes/G4 Boot/Users/pgsql/src/postgres/postgresql-8.1beta2/src/test/regress/./tmp_check/install/usr/local/pgsql8.1b2/doc/html && /usr/bin/tar xf - )
for file in man1/*.1 man7/*.7 ; do \
  /bin/sh ../config/install-sh -c -m 644 $file /Volumes/G4 Boot/Users/pgsql/src/postgres/postgresql-8.1beta2/src/test/regress/./tmp_check/install/usr/local/pgsql8.1b2/man/$file || exit; \
done
cp: /Volumes/G4 is a directory (not copied).
make[4]: *** [install] Error 1
make[3]: *** [install] Error 2

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org
Re: [GENERAL] 8.1 'make check' fails
On 10/2/05 7:48 AM, "William ZHANG" <[EMAIL PROTECTED]> wrote: > Yes, the Makefiles cannot deal with spaces correctly. > Seems we should avoid use the `complicated' path. Such paths are normal on systems with a GUI interface. They are not out of the ordinary nor complicated. Wes ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] 8.1 'make check' fails
On 10/2/05 4:13 PM, "Martijn van Oosterhout" wrote:

> Problem is, the space is also the word seperator. So if you have a
> space seperated list of words (ie normal for make), names with spaces
> are a pain.
>
> It occurred to me that you might be able to replace the spaces in the
> paths with question marks. Then any invokation of the shell will expand
> the question mark back to a space as part of parameter expansion.

The normal build seems to work, as does 'make install' (although the install path is /usr/local). I would have thought it was just a matter of quoting file paths in all the makefiles. Instead of

gzip -d -c $(srcdir)/postgres.tar.gz | ( cd $(DESTDIR)$(docdir)/html...

it should be

gzip -d -c "$(srcdir)/postgres.tar.gz" | ( cd "$(DESTDIR)$(docdir)/html"...

But it's not quite that simple. See below. To resolve my immediate problem, I can just temporarily rename my hard drive. I'm just reporting this as a problem that should be fixed.

Wes

install.log has:

make -C doc install
make[4]: Nothing to be done for `install'.
make -C src install
/bin/sh ../config/mkinstalldirs "/Volumes/G4 Boot/Users/pgsql/src/postgres/postgresql-8.1beta2/src/test/regress/./tmp_check/install/usr/local/pgsql8.1b2/lib/pgxs/src"
mkdir /Volumes/G4/Boot
mkdir /Volumes/G4/Boot/Users
mkdir /Volumes/G4/Boot/Users/pgsql
mkdir /Volumes/G4/Boot/Users/pgsql/src
mkdir /Volumes/G4/Boot/Users/pgsql/src/postgres
mkdir /Volumes/G4/Boot/Users/pgsql/src/postgres/postgresql-8.1beta2
mkdir /Volumes/G4/Boot/Users/pgsql/src/postgres/postgresql-8.1beta2/src
mkdir /Volumes/G4/Boot/Users/pgsql/src/postgres/postgresql-8.1beta2/src/test
mkdir /Volumes/G4/Boot/Users/pgsql/src/postgres/postgresql-8.1beta2/src/test/regress
mkdir /Volumes/G4/Boot/Users/pgsql/src/postgres/postgresql-8.1beta2/src/test/regress/./tmp_check
mkdir /Volumes/G4/Boot/Users/pgsql/src/postgres/postgresql-8.1beta2/src/test/regress/./tmp_check/install
mkdir /Volumes/G4/Boot/Users/pgsql/src/postgres/postgresql-8.1beta2/src/test/regress/./tmp_check/install/usr
mkdir /Volumes/G4/Boot/Users/pgsql/src/postgres/postgresql-8.1beta2/src/test/regress/./tmp_check/install/usr/local
mkdir /Volumes/G4/Boot/Users/pgsql/src/postgres/postgresql-8.1beta2/src/test/regress/./tmp_check/install/usr/local/pgsql8.1b2
mkdir /Volumes/G4/Boot/Users/pgsql/src/postgres/postgresql-8.1beta2/src/test/regress/./tmp_check/install/usr/local/pgsql8.1b2/lib
mkdir /Volumes/G4/Boot/Users/pgsql/src/postgres/postgresql-8.1beta2/src/test/regress/./tmp_check/install/usr/local/pgsql8.1b2/lib/pgxs
mkdir /Volumes/G4/Boot/Users/pgsql/src/postgres/postgresql-8.1beta2/src/test/regress/./tmp_check/install/usr/local/pgsql8.1b2/lib/pgxs/src
/bin/sh ../config/install-sh -c -m 644 Makefile.global "/Volumes/G4 Boot/Users/pgsql/src/postgres/postgresql-8.1beta2/src/test/regress/./tmp_check/install/usr/local/pgsql8.1b2/lib/pgxs/src/Makefile.global"
/bin/sh ../config/install-sh -c -m 644 Makefile.port "/Volumes/G4 Boot/Users/pgsql/src/postgres/postgresql-8.1beta2/src/test/regress/./tmp_check/install/usr/local/pgsql8.1b2/lib/pgxs/src/Makefile.port"
/bin/sh ../config/install-sh -c -m 644 "./Makefile.shlib" "/Volumes/G4 Boot/Users/pgsql/src/postgres/postgresql-8.1beta2/src/test/regress/./tmp_check/install/usr/local/pgsql8.1b2/lib/pgxs/src/Makefile.shlib"
/bin/sh ../config/install-sh -c -m 644 "./nls-global.mk" "/Volumes/G4 Boot/Users/pgsql/src/postgres/postgresql-8.1beta2/src/test/regress/./tmp_check/install/usr/local/pgsql8.1b2/lib/pgxs/src/nls-global.mk"
make -C port install
/bin/sh ../../config/install-sh -c -m 644 libpgport.a /Volumes/G4 Boot/Users/pgsql/src/postgres/postgresql-8.1beta2/src/test/regress/./tmp_check/install/usr/local/pgsql8.1b2/lib
cp: /Volumes/G4 is a directory (not copied).
make[5]: *** [install] Error 1
make[4]: *** [install] Error 2
make[3]: *** [install] Error 2

---(end of broadcast)---
TIP 6: explain analyze is your friend
[GENERAL] Database-based alternatives to tsearch2?
I've seen questions asked on the list about alternatives to tsearch2, but not for the type of full text indexing I'm looking for. I'm looking for a non index-based full text indexing - one that stores the information as table data instead of index data. I do not need to implement SQL operators for searches. The application library would need to implement the actual word search. Indexes are too fragile. Our documents will be offline, and re-indexing would be impossible. Additionally, as I understand it, tsearch2 doesn't scale to the numbers I need (hundreds of millions of documents). Is anyone aware of any such solutions for PostgreSQL, open source or otherwise? Thanks Wes ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
[GENERAL] Index vacuum improvements in 8.2
From the 8.2 release notes:

    Speed up vacuuming of B-Tree indexes (Heikki Linnakangas, Tom)

From <http://archives.postgresql.org/pgsql-patches/2006-05/msg7.php>:

    "2. Vacuum is now done in one phase, scanning the index in physical order.
    That significantly speeds up index vacuums of large indexes that don't fit
    into memory. However, btbulkdelete doesn't know if the vacuum is a full or
    lazy one. The patch just assumes it's a lazy vacuum, but the API really
    needs to be changed to pass that information earlier than at vacuum_cleanup."

Are these talking about the same thing - 8.2 now vacuums indexes in disk order instead of index sort order? If so, that will make a tremendous difference in vacuum speed for us (when we can afford the downtime to do a dump/upgrade/reload). For us, vacuum starts at about 4 hours after a reindex, then goes downhill. We reindex when it starts taking 24 hours to vacuum. The above referenced posting is from May 1, 2006. Is pgsql 8.2 the first release to have this code?

Wes

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Ridiculous load
On 12/9/04 9:23 AM, "Peter Haworth" <[EMAIL PROTECTED]> wrote: > It runs RHEL ES v3, kernel 2.4.21-20.ELsmp > It's generally a very stable box which runs a number of postgresql > instances. But last night we hit very high load averages - 10+, vs the > normal 0-2. > The culprit appeared to be kswapd, which was using huge amounts of cpu. > I'd like to know why! There were some major changes in swap management somewhere in the linux 2.4 kernel (also RH 3.0 ES). I don't offhand remember exactly which level. Under a heavy I/O load, we also saw kswapd going nuts with the 2.4.21 kernel, destroying system performance. The only solution we found was to upgrade to a 2.6 kernel. The problem has not reoccurred since then. Our database is currently about 67 gigabytes and is growing at about 1 million records per day. We are using Postgresql 7.4.x. Wes ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] OSX 10.3.7 broke Postgresql 8.0.0b5?
On 12/18/04 11:03 AM, "Scott Ribe" <[EMAIL PROTECTED]> wrote: > All the recent OS X .x updates have replaced /etc/rc. For all I know they > may always have done so, but of course prior to 10.3 we didn't have to > modify /etc/rc directly. You shouldn't be modifying /etc/rc directly anyway. Create a startup script in /Library/StartupItems, or /etc/mach_init.d. I believe the latter is new with 10.3 and is now the preferred method. Granted, modifying /etc/rc is a lot easier, but you pay for it later when an update wipes out your changes and you beat your head against the wall trying to find the problem. Wes ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
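For reference, a StartupItems-based launcher is roughly shaped like the sketch below (install paths, data directory, and user are assumptions; a matching StartupParameters.plist is also needed in the same /Library/StartupItems/PostgreSQL directory):

    #!/bin/sh
    # /Library/StartupItems/PostgreSQL/PostgreSQL - minimal sketch
    . /etc/rc.common

    StartService()
    {
        ConsoleMessage "Starting PostgreSQL"
        su - postgres -c '/usr/local/pgsql/bin/pg_ctl start -D /usr/local/pgsql/data -l /usr/local/pgsql/logfile'
    }

    StopService()
    {
        su - postgres -c '/usr/local/pgsql/bin/pg_ctl stop -D /usr/local/pgsql/data -m fast'
    }

    RestartService()
    {
        StopService
        StartService
    }

    RunService "$1"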
Re: [GENERAL] hundreds of millions row dBs
> We're getting about 64 million rows inserted in about 1.5 hrs into a > table with a multiple-column primary key - that's the only index. > That's seems pretty good to me - SQL Loader takes about 4 hrs to do the > same job. As I recall, the last time we rebuilt our database, it took about 3 hours to import 265 million rows of data. It then took another 16 hours to rebuild all the indexes. I think the entire pg_dumpall/reload process took about 21 hours +/-. I wonder what it will be like with 1.5 billion rows... Wes ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] hundreds of millions row dBs
> Out of curiosity, what value of sort_mem were you using? > > (In PG 8.0, the sort memory setting used by CREATE INDEX will be > maintenance_work_mem not work_mem, which should help in getting larger > values to be used. But in existing releases you usually need to think > about a manual tweak.) Normally it is set to 65535. However, during the load I bump it up to 655350. The system has 2GB ECC memory. > Here is an instance where a really big ram disk might be handy. > You could create a database on a big ram disk and load it, then build > the indexes. I'm afraid we don't have quite that much RAM... With just under 400 million rows right now, it is 74 GB. That will probably grow to around 300 GB or so before it stabilizes. > Actually, if you have a RAM disk, just change the $PGDATA/base/nnn/pgsql_tmp > subdirectory into a symlink to some temp directory on the RAM disk. > Should get you pretty much all the win with no need to move stuff around > afterwards. > > You have to be sure the RAM disk is bigger than your biggest index though. Hmm. That's a thought. I expect our largest index will still be bigger than available RAM though. How can I check index sizes? We already have pg_xlog on a dedicated mirrored disk. Would it help significantly to give pgsql_tmp its own mirrored disk? PGDATA is on an 8 disk hardware RAID 5. Wes ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
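One rough way to answer the "how can I check index sizes" question on these releases is to look at relpages in pg_class (each page is normally 8 kB); a sketch:

    -- largest indexes first; relpages is refreshed by VACUUM/ANALYZE
    SELECT relname, relpages
    FROM pg_class
    WHERE relkind = 'i'
    ORDER BY relpages DESC
    LIMIT 10;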
Re: [GENERAL] [HACKERS] Much Ado About COUNT(*)
On 1/13/05 9:50 AM, "Greg Stark" <[EMAIL PROTECTED]> wrote: > I think the Oracle syntax looks like > > SELECT * FROM foo SAMPLE (0.1) > > I don't think I would have picked this syntax but it seems like a better idea > to copy the existing practice rather than invent a new one. Of course, in Oracle 'count(*)' is instantaneous. It doesn't have to count the physical records one by one. Wes ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [GENERAL] [HACKERS] Much Ado About COUNT(*)
On 1/13/05 6:44 PM, "Greg Stark" <[EMAIL PROTECTED]> wrote: > That's simply false. Oracle does indeed have to count the records one by one. > > It doesn't have to read and ignore the dead records since they're in a > separate place (but on the other hand it sometimes have to go read that > separate place when it sees records that were committed after your > transaction). > > It can also do index-only scans, which often helps, but it's still not > instantaneous. Ok, I stand corrected - I was given some wrong information. However, my experience has been that count(*) on Oracle is a whole lot faster than PostgreSQL - what appeared instantaneous on Oracle took some time on PostgreSQL. That was one of the first things I noticed when moving a database application to PostgreSQL. I've since disposed of the Oracle database, so can't go back and retest. Wes ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] [HACKERS] Much Ado About COUNT(*)
On 1/14/05 12:47 PM, "Frank D. Engel, Jr." <[EMAIL PROTECTED]> wrote: > It's probably too messy to be worthwhile this > way, though. More trouble than it would be worth. It would be rather useful if there was a way to get a reasonably accurate count (better than analyze provides) in a very short period. When you've got a relatively wide table that has hundreds of millions to over a billion rows, and you need to report on how many rows in the table, that can take a long time. Wes ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] [HACKERS] Much Ado About COUNT(*)
On 1/14/05 5:37 PM, "Martijn van Oosterhout" wrote: > If that's all you want, what about the row estimate from pg_class? It > has the number of rows active at last vacuum... For really large tables > I imagine it'd be easily close enough... For showing the changes in a given day (or even week), that isn't accurate enough. Wes ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[GENERAL] Shared memory and Mac OS X
The problem with not being able to set shmmax and shmall in a startup script in Mac OS X is not that you are setting them too late in the boot process. It is that you can set them only once. In fact, you can set them from a terminal window after booting and logging in - as long as they haven't already been set. If you comment out the sysctl's for shmall and shmmax in /etc/rc, reboot, then log in and bring up a terminal window, you will see that they are set to -1. You can then set them to whatever you want (as root of course). However, if you try to set the value again, the first value remains set. Unfortunately, this doesn't help much, as Apple's default /etc/rc sets values. If you comment out these, you can set the values in a startup script, but you're still screwed when the next update re-enables the settings in /etc/rc, which are set before SystemStarter is called. I guess the best you can do is have a startup script that notifies you if the values are wrong. Wes ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
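A sketch of the "notify if the values are wrong" check suggested above (the desired values and the logging method are assumptions):

    #!/bin/sh
    # compare the live kernel settings against what postgres was tuned for
    WANT_SHMMAX=167772160
    WANT_SHMALL=65536
    CUR_SHMMAX=`sysctl -n kern.sysv.shmmax`
    CUR_SHMALL=`sysctl -n kern.sysv.shmall`
    if [ "$CUR_SHMMAX" != "$WANT_SHMMAX" ] || [ "$CUR_SHMALL" != "$WANT_SHMALL" ]; then
        echo "shared memory settings wrong: shmmax=$CUR_SHMMAX shmall=$CUR_SHMALL" | logger -t pgsql-check
    fi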
Re: [GENERAL] Shared memory and Mac OS X
On 1/20/05 10:27 PM, "Jonel Rienton" <[EMAIL PROTECTED]> wrote: > have you tried using /etc/sysctl.conf and saving the shmax value there? Unfortunately, the -p parameter does not appear to be valid, nor does 'strings -a' show 'conf' in the binary (unlike the RedHat sysctl). Wes ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [GENERAL] Shared memory and Mac OS X
On 1/21/05 1:50 AM, "Jonel Rienton" <[EMAIL PROTECTED]> wrote: > I have found this at the docs of Postgres: Yes, I knew about that. My email was in regards to an earlier discussion on why you had to update /etc/rc as of 10.3 (probably should have dug that one up and replied to it). The previous statement had been that they had to be in /etc/rc because startup scripts were now too late in the boot process. I was just clarifying the reason. Wes ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[GENERAL] How are foreign key constraints built?
How are foreign key constraints built? In loading my database into PostgreSQL 8.0, on the command:

ALTER TABLE ONLY TABLEA ADD CONSTRAINT "$1" FOREIGN KEY (mkey) REFERENCES tableb(mkey) ON DELETE CASCADE;

I ended up with the following in pg_tmp as it is adding the constraint:

-rw-------   1 postgres  admin  1073741824 23 Jan 06:09 pgsql_tmp2559.25
-rw-------   1 postgres  admin  1073741824 23 Jan 06:24 pgsql_tmp2559.26
-rw-------   1 postgres  admin  1073741824 23 Jan 06:39 pgsql_tmp2559.27
-rw-------   1 postgres  admin   636526592 23 Jan 06:48 pgsql_tmp2559.28
-rw-------   1 postgres  admin  1073741824 23 Jan 11:51 pgsql_tmp2559.29
-rw-------   1 postgres  admin  1073741824 23 Jan 11:34 pgsql_tmp2559.30
-rw-------   1 postgres  admin  1073741824 23 Jan 11:36 pgsql_tmp2559.31
-rw-------   1 postgres  admin  1073741824 23 Jan 11:37 pgsql_tmp2559.32
-rw-------   1 postgres  admin  1073741824 23 Jan 11:38 pgsql_tmp2559.33
-rw-------   1 postgres  admin  1073741824 23 Jan 11:39 pgsql_tmp2559.34
-rw-------   1 postgres  admin  1073741824 23 Jan 11:41 pgsql_tmp2559.35
-rw-------   1 postgres  admin  1073741824 23 Jan 11:42 pgsql_tmp2559.36
-rw-------   1 postgres  admin  1073741824 23 Jan 11:43 pgsql_tmp2559.37
-rw-------   1 postgres  admin  1073741824 23 Jan 11:45 pgsql_tmp2559.38
-rw-------   1 postgres  admin  1073741824 23 Jan 11:46 pgsql_tmp2559.39
-rw-------   1 postgres  admin  1073741824 23 Jan 11:47 pgsql_tmp2559.40
-rw-------   1 postgres  admin  1073741824 23 Jan 11:49 pgsql_tmp2559.41
-rw-------   1 postgres  admin  1073741824 23 Jan 11:50 pgsql_tmp2559.42
-rw-------   1 postgres  admin   603136000 23 Jan 11:51 pgsql_tmp2559.43

I believe files 25-28 are about the size of one of the indexes, but 29-43 is about 35% larger than the tmp files when building the other index (the other index is about 10GB). There's no problem here, I'd just like to understand what it is doing. I expected adding the foreign key constraint would just use the existing indexes to verify the database is currently consistent. Is this just working space to more efficiently build the initial constraint, or does it actually write this to the database?

Wes

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
    subscribe-nomail command to [EMAIL PROTECTED] so that your
    message can get through to the mailing list cleanly
[GENERAL] Tablespaces and primary keys
Implicit indexes created by a constraint do not appear to honor the default tablespace. The index gets created in the "null" tablespace. I took pg_dumpall output and modified the schema to place everything in specific tablespaces. When the statement:

ALTER TABLE ONLY addresses ADD CONSTRAINT addresses_pkey PRIMARY KEY (address_key);

is executed, this results in an implicitly created index:

NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index "addresses_pkey" for table "addresses"

However, it ignores my previous command:

SET default_tablespace = indexes;

and creates the index without a tablespace. Of course, I can do an

ALTER INDEX addresses_pkey SET tablespace indexes;

but that's a bit of a hassle where there are a number of very large indexes. I can't pre-create the index, as the constraint will try to create its own index anyway. Is there something I'm overlooking, or is this a bug?

Wes

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
    subscribe-nomail command to [EMAIL PROTECTED] so that your
    message can get through to the mailing list cleanly
Re: [GENERAL] How are foreign key constraints built?
On 1/23/05 1:01 PM, "Tom Lane" <[EMAIL PROTECTED]> wrote: > If you like you can try the operation with "set > enable_seqscan = off", but I bet it will take longer. Ouch! That was a big mistake. Since inquiring minds want to know, I decided to give that a try. The expected outcome is to beat the heck out of the index disks as it read one index and referenced the other to see if the value existed. What appears to have happened is that it went through the same process as before, but read each data record via the index. It still created all the pgsql_tmp files, the data disk was still the heaviest hit (expected no or little access there), and it beat the living daylights out of my swap - pageins/outs like crazy. The I/O on the index disks was negligible compared to the data and swap disks. I won't try that again... Wes ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[GENERAL] Indexed leading substring searches - worked, now doesn't
I know my leading substring searches used to be done via indexes. We specifically tested that. Since the last time I tested it, the database has probably been reloaded to fix a corruption problem. Now all I can get is sequential leading substring searches. In the examples below, the database was vacuumed last night.

The database is very large (currently about 100 GB, and will be 100's of gigabytes), so performance is important. This particular table in the example has only about 2.8 million rows. PostgreSQL version is 7.4.5. Prior to the reload, it was 7.4.1.

The locale is showing up as en_US.iso885915. As far as I know, it was always this (default RedHat install), so I don't understand why it worked before. Did something change between 7.4.1 and 7.4.5? I suppose it's possible that I specified locale=C on the original database and don't remember that... I'm not going to have to "initdb --locale=C" again, am I?

I looked at index classes, and that doesn't appear to be something I want to do, due to performance. What kind of performance hit do you actually take by using an index class?

Wes

pg_controldata shows:

Maximum length of locale name:  128
LC_COLLATE:                     en_US.iso885915
LC_CTYPE:                       en_US.iso885915

narc=> \d addresses
              Table "public.addresses"
   Column    |          Type          | Modifiers
-------------+------------------------+-----------
 address_key | numeric(12,0)          | not null
 address     | character varying(255) | not null
Indexes:
    "addresses_pkey" primary key, btree (address_key)
    "addresses_i_address" btree (address)

narc=> select count(*) from addresses;
  count
---------
 2829640
(1 row)

narc=> explain select * from addresses where address = 'blah';
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Index Scan using addresses_i_address on addresses  (cost=0.00..2.81 rows=1 width=40)
   Index Cond: ((address)::text = 'blah'::text)
(2 rows)

narc=> explain select * from addresses where address like 'blah%';
                          QUERY PLAN
--------------------------------------------------------------
 Seq Scan on addresses  (cost=0.00..61244.68 rows=2 width=40)
   Filter: ((address)::text ~~ 'blah%'::text)
(2 rows)

narc=> explain analyze select * from addresses where address like 'blah%';
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Seq Scan on addresses  (cost=0.00..61244.68 rows=2 width=40) (actual time=1445.386..8913.435 rows=6 loops=1)
   Filter: ((address)::text ~~ 'blah%'::text)
 Total runtime: 8913.504 ms
(3 rows)

Something else doesn't make sense. I did the same query a few minutes ago and the cost was totally different:

narc=> explain select * from addresses where address like 'blah%';
                            QUERY PLAN
-------------------------------------------------------------------
 Seq Scan on addresses  (cost=1.00..100061244.67 rows=2 width=40)
   Filter: ((address)::text ~~ 'blah%'::text)
(2 rows)

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [GENERAL] Indexed leading substring searches - worked, now
On 2/3/05 2:29 PM, "Tom Lane" <[EMAIL PROTECTED]> wrote: > You could use the alternative operator classes for pattern searches, but > if you want the same indexes to also serve for normal text sorting, > C locale is the better bet. As an interim solution until I can reload the database (takes a weekend), can I define two indexes on the same field, one using operator classes and one not (and have them automatically used as appropriate)? Because of the time involved, I'm trying to hold off on another reload until we upgrade to 8.x. Wes ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
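For what it's worth, the "two indexes on the same column" idea would look roughly like this (the second index name is invented); the planner can use the pattern_ops index for LIKE 'prefix%' searches and the plain index for ordinary comparisons and sorting:

    -- address is varchar(255), so varchar_pattern_ops is the relevant operator class
    CREATE INDEX addresses_i_address ON addresses (address);          -- already exists per the earlier \d output
    CREATE INDEX addresses_i_address_like ON addresses (address varchar_pattern_ops);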
[GENERAL] Deadlock Detected (revisited)
I haven't seen anything to indicate that 8.x improves foreign key reference locking and fixes the foreign key 'deadlock detected' issue. Has that been addressed in 8.1? I sort my records before loading, but am still getting 'deadlock detected' (still running 7.3.4). I have:

B references C
B references A
A references C

Any new records for C are loaded first (both A and B depend on C). A is loaded next (B depends on A). B is loaded last. Both A and B are loaded in 'Table C' sorted order. However, a deadlock can occur with one process's A records and another's B records, due to the fact that both have a foreign key reference to C. The only way I know of to eliminate the deadlocks without serializing the processes is to remove the foreign key reference from either A or B. However, that jeopardizes referential integrity if something goes wrong with the load process. I've never quite understood why a READ of a record with a foreign key reference results in the referenced record being locked with more than a shared lock.

Wes

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
    subscribe-nomail command to [EMAIL PROTECTED] so that your
    message can get through to the mailing list cleanly
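To make the reference structure concrete, it is roughly this shape (table and column names are invented for illustration):

    CREATE TABLE c (c_key integer PRIMARY KEY);
    CREATE TABLE a (a_key integer PRIMARY KEY,
                    c_key integer NOT NULL REFERENCES c);
    CREATE TABLE b (b_key integer PRIMARY KEY,
                    a_key integer NOT NULL REFERENCES a,
                    c_key integer NOT NULL REFERENCES c);
    -- Before 8.1, each foreign key check took an exclusive row lock on the
    -- referenced row in c, so two sessions inserting rows of a and b that
    -- reference the same c rows in different relative orders could deadlock.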
Re: [GENERAL] Deadlock Detected (revisited)
On 11/25/05 2:40 AM, "Stephan Szabo" <[EMAIL PROTECTED]> wrote: > 8.1 should be using the new shared row locks for doing the checks. This > should fix the case mentioned. Hmm. I'm looking in the "What's new" for 8.1 and see that now. I could have sworn it wasn't there the last time I looked. Sorry. > Up until now, we didn't have one to get on a per-record basis. Ok, I guess I'm confused... From the 7.3.2 manual (6.5 updates): "Multiversion concurrency control (MVCC) This removes our old table-level locking, and replaces it with a locking system that is superior to most commercial database systems. In a traditional system, each row that is modified is locked until committed, preventing reads by other users. MVCC uses the natural multiversion nature of PostgreSQL to allow readers to continue reading consistent data during writer activity. Writers continue to use the compact pg_log transaction system. This is all performed without having to allocate a lock for every row like traditional database systems. So, basically, we no longer are restricted by simple table-level locking; we have something better than row-level locking." So, until 8.1 PostgreSQL had "something better than row-level locking" for some things, but no row locking when needed? Or was it that row locking was there, but just no shared row locking? Wes ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Deadlock Detected (revisited)
On 11/25/05 9:12 PM, "Stephan Szabo" <[EMAIL PROTECTED]> wrote: > The latter, the row locks before were single owner and were such that a > second lock request for the same row would wait for the first to be > released. Now effectively you have two levels of locks at the row level, > the weaker of which conflicts with the stronger but not with itself. Now I understand. Thanks for clarifying. It sounds like there may be some performance improvements that will be beneficial to us also. Hopefully I can convince them to agree to the 8.1 upgrade. Besides verifying our application, it takes a full weekend to do the dump/load - the database is just under a billion rows. Wes ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[GENERAL] Excessive vacuum times
Some time ago I reported a problem where right after an index rebuild vacuum might take a couple of hours, but gets slower and slower at a seemingly increasing rate as time goes on. What took a couple of hours after a full index rebuild now takes 12 hours or more. The problem was determined to be due to the fact that indexes are vacuumed in index order, not in disk storage order. I don't see anything about this in the "What's new" for 8.1. Has anything been done to resolve this? Scheduled index rebuilds are not feasible. I really don't even want to take a chance on starting one. The database is just a hair under one billion rows, and could take the entire weekend or more to rebuild. Wes ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Excessive vacuum times
On 12/12/05 5:26 PM, "Tom Lane" <[EMAIL PROTECTED]> wrote: >> The problem was determined to be due to the fact that indexes are vacuumed >> in index order, not in disk storage order. I don't see anything about this >> in the "What's new" for 8.1. Has anything been done to resolve this? > > No. Avoiding that would require a new approach to > vacuum-vs-ordinary-indexscan interlocking, so it won't happen until > someone has a Bright Idea (tm). Any ideas on how I might I reconfigure to mitigate the issue? Separating the most offending indexes to separate drives probably isn't an option. Wes ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[GENERAL] ./configure --with-openssl=path fails
Because of our build environment, we need to build PostgreSQL specifying the location of the openssl libraries to insure everyone is using the same libraries, regardless of the system software is built and executed on. At 7.4.5, we used: ./configure --with-openssl=path This no longer works with 8.x. See: <http://archives.postgresql.org/pgsql-committers/2003-11/msg00278.php> What is the correct way to work around this restriction? Set LD_LIBRARY_PATH? This is on a variety of unix platforms. Wes ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] ./configure --with-openssl=path fails
On 1/6/06 2:47 PM, "Tom Lane" <[EMAIL PROTECTED]> wrote: >> ./configure --with-openssl=path >> This no longer works with 8.x. See: >> <http://archives.postgresql.org/pgsql-committers/2003-11/msg00278.php> >> What is the correct way to work around this restriction? > > Use --with-includes and --with-libraries as needed. That allows me to build, but doesn't set the library search path in the binaries ("-Wl,-rpath /blah/blah" on linux or "-R /blah/blah" for Solaris). This prevents programs from loading, since they can't find the libraries if they are not in the default library search path of the user. Wes ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] ./configure --with-openssl=path fails
On 1/6/06 2:47 PM, "Tom Lane" <[EMAIL PROTECTED]> wrote: >> ./configure --with-openssl=path >> This no longer works with 8.x. See: >> <http://archives.postgresql.org/pgsql-committers/2003-11/msg00278.php> >> What is the correct way to work around this restriction? > > Use --with-includes and --with-libraries as needed. That doesn't get the library paths into the binary. If the libraries are not in the default system search path, the user is screwed. Is there a way to solve this? I think the previous --with-openssl=path set the search path so the libraries would always be found. Wes ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
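One hedged workaround (not from the thread) is to push the run-time search path in through LDFLAGS when configuring, since --with-libraries only affects the link-time search path; the paths below are examples:

    # GNU ld syntax; on Solaris use -R/opt/ssl/lib instead of -Wl,-rpath,...
    LDFLAGS="-L/opt/ssl/lib -Wl,-rpath,/opt/ssl/lib" \
        ./configure --with-openssl \
                    --with-includes=/opt/ssl/include \
                    --with-libraries=/opt/ssl/lib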
[GENERAL] Finding orphan records
I'm trying to find/delete all records in table A that are no longer referenced by tables B or C. There are about 4 million records in table A, and several hundred million in tables B and C. Is there something more efficient than:

select address_key, address from addresses
  where ( not exists(select 1 from B where BField=addresses.address_key limit 1) )
    and ( not exists(select 1 from C where CField=addresses.address_key limit 1) )

Of course, all fields above are indexed. There are foreign key references in B and C to A. Is there some way to safely leverage that?

Wes

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org
Re: [GENERAL] Finding orphan records
On 1/12/06 12:23 AM, "Jonel Rienton" <[EMAIL PROTECTED]> wrote:

> Resending sample query, darn where clause didn't wrap
>
> select a.*,b.* from a
> left outer join b on a.id = b.a_id
> where b.id is null;

I tried something along those lines a while back, and it was orders of magnitude slower. The above produces:

explain select address_key, address from addresses left outer join messages on addresses.address_key=originator where originator is null;
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
 Merge Left Join  (cost=35684870.14..38457295.97 rows=4090203 width=40)
   Merge Cond: ("outer".address_key = "inner".originator)
   Filter: ("inner".originator IS NULL)
   ->  Index Scan using addresses_pkey on addresses  (cost=0.00..97213.17 rows=4090203 width=40)
   ->  Sort  (cost=35684870.14..36129462.74 rows=177837040 width=11)
         Sort Key: messages.originator
         ->  Seq Scan on messages  (cost=0.00..7215718.40 rows=177837040 width=11)

This appears to be very inefficient. B is almost two orders of magnitude larger than A. C is about 3-4 times as big as B (record count). My statement (with the same single 'B' table as above) produces:

narc=> explain select address_key, address from addresses where ( not exists(select 1 from messages where originator=addresses.address_key limit 1) );
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
 Seq Scan on addresses  (cost=0.00..3398462.98 rows=2045102 width=40)
   Filter: (NOT (subplan))
   SubPlan
     ->  Limit  (cost=0.00..0.81 rows=1 width=0)
           ->  Index Scan using messages_i_orig_mdate on messages  (cost=0.00..35148.46 rows=43301 width=0)
                 Index Cond: (originator = $0)

Which seems like it should be much more efficient.

Wes

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Finding orphan records
On 1/12/06 10:37 AM, "Greg Stark" <[EMAIL PROTECTED]> wrote: > If it's only a factor of 3-4 then the merge join should be faster. If it's > really two orders of magnitude (100x?) then the nested loop below would be > faster. I think in 8.1 (and I think in 8.0 too) the planner is capable of > coming up with both plans for the NOT IN query though. Yep, it's really two orders of magnitude. A is about 4 million records. B is around 200 million, and C is about 800 million records. That's why I figured a sequential pass on A with index lookups on B and C would be fastest. It took about 3.25 hours to complete the SELECT. I'm working on the 8.1 upgrade now. My biggest concern is making sure no bugs have crept in to ecpg that would affect us (bitten once before, so being careful). So far everything else is looking good with 8.1. I'm pondering dumping the keys for A to a file, and B and C to another file, sorting with uniqueness on the B/C file, then programmatically determining which keys have been freed. I'm pretty sure this will be much faster, but I had hoped to avoid an external process like this. Wes ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
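A sketch of that external approach (table C's name and column are placeholders, since they are not named in the thread):

    # dump candidate keys and referencing keys as plain text, one per line
    psql -At -d narc -c "SELECT address_key FROM addresses" | sort > a.keys
    { psql -At -d narc -c "SELECT originator FROM messages"
      psql -At -d narc -c "SELECT c_column FROM table_c"     # placeholder names
    } | sort -u > bc.keys
    # keys present in A but never referenced by B or C
    comm -23 a.keys bc.keys > orphan.keys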
[GENERAL] ECPG and COPY
I found this in the archives: <http://archives.postgresql.org/pgsql-interfaces/2004-04/msg5.php> With 8.x, is it still true that with ECPG, it is not possible to use COPY inline - I'd have to write the data to a file, then 'COPY FROM filename' (or use direct libpq API calls instead of ECPG)? Wes ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[GENERAL] ECPG and COPY and PQputCopyData - don't get errors
Since I can't use ECPG COPY, I'm trying to use libpq PQputCopyData. Not realizing I had to use CSV format to include column header names, I used:

PQexec(conn, "COPY detail FROM stdin")

I sent the column headers as the first line of the input. While I now know this is wrong, the problem here is that I got no error response back. In the postmaster log, I see:

ERROR: invalid input syntax for type numeric: "messageid"
CONTEXT: COPY detail, line 1, column messagesize: "messageid"

However, every call to PQputCopyData, as well as PQputCopyEnd, returned a result of 1. Ok, so I fixed the above error, but any other data load errors are only showing up in the postgres log file. I tried defining

PQsetNoticeReceiver(conn, &pgsqlNoticeReceiver, (void *) &dummy);

For testing, pgsqlNoticeReceiver just writes a message and aborts. However, pgsqlNoticeReceiver is never called. What am I missing? I need to abort the transaction on any errors in the COPY.

Wes

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
    subscribe-nomail command to [EMAIL PROTECTED] so that your
    message can get through to the mailing list cleanly
Re: [GENERAL] ECPG and COPY and PQputCopyData - don't get errors
On 2/26/06 12:12 PM, "Tom Lane" <[EMAIL PROTECTED]> wrote: > That just indicates that the data was sent to the backend without error. > You forgot to check the copy command result afterward. Ok, after re-reading the docs for the 10th time, I see I missed the paragraph about calling PQgetResult after the PQputCopyEnd. I was just checking the return code and PQerrorMessage. I think what threw me off was: "If a COPY command is issued via PQexec in a string that could contain additional commands, the application must continue fetching results via PQgetResult after completing the COPY sequence. Only when PQgetResult returns NULL is it certain that the PQexec command string is done and it is safe to issue more commands." (looked like this was needed only if multiple commands were in the PQexec) and "PQgetResult Waits for the next result from a prior PQsendQuery, PQsendQueryParams, PQsendPrepare, or PQsendQueryPrepared call, and returns it." (doesn't mention COPY). I have it working now. Thanks. Wes ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
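A hedged sketch of the working sequence described in this thread: start the COPY with PQexec, send rows with PQputCopyData, and fetch the real success/failure with PQgetResult after PQputCopyEnd. The table layout, sample row, and error handling are assumptions, not the original code.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <libpq-fe.h>

    static void copy_detail(PGconn *conn)
    {
        const char *row = "42\t1024\tsome value\n";   /* example row, tab-delimited text format */
        PGresult *res = PQexec(conn, "COPY detail FROM STDIN");

        if (PQresultStatus(res) != PGRES_COPY_IN)
        {
            fprintf(stderr, "COPY did not start: %s", PQerrorMessage(conn));
            PQclear(res);
            exit(1);
        }
        PQclear(res);

        if (PQputCopyData(conn, row, (int) strlen(row)) != 1 ||
            PQputCopyEnd(conn, NULL) != 1)
        {
            fprintf(stderr, "sending COPY data failed: %s", PQerrorMessage(conn));
            exit(1);
        }

        /* this is the part that was originally missed: the COPY command's
           actual result only comes back from PQgetResult */
        while ((res = PQgetResult(conn)) != NULL)
        {
            if (PQresultStatus(res) != PGRES_COMMAND_OK)
                fprintf(stderr, "COPY failed: %s", PQresultErrorMessage(res));
            PQclear(res);
        }
    }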
Re: [GENERAL] Wisconsin Circuit Court Access (WCCA) on
On 3/13/06 5:50 PM, "Kevin Grittner" <[EMAIL PROTECTED]> wrote: > The license for the commercial product contains a clause which > prohibits disclosing benchmarks of their product without their written > permission. (Heaven only knows why they would include such a clause.) > My comment is not based on any formal benchmarks, but on the graphs > produced by our monitoring of the production software under real loads, > and on the query metrics from our middle tier software in the production > environment. Even though it would be a huge stretch to call the > comparison a benchmark under these conditions, this is a litigious > society. I'm sure you understand my concern. > > Short of being compelled by law to open our records, I'm not > comfortable providing any performance comparison which names the > vendor. Hmm.. What vendor throws such a clause into all their licensing agreements. Sounds suspiciously like Microsoft... I'm not a lawyer, but I believe things such as what vendors a government entity is using is required by law to be public information. Wes ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[GENERAL] Build failures on RedHat 3.0 with openssl/kerberos
I try to build 8.1.3 with:

    ./configure --prefix=/usr/local/pgsql8.1.3 --with-openssl --with-pam --enable-thread-safety

It fails the openssl test, saying openssl/ssl.h is unavailable. Digging deeper, I find that it is because the test program with #include <openssl/ssl.h> is failing because it can't include krb5.h. Based on another post, I tried adding "--with-krb5". That explicitly aborted with it unable to find krb5.h. I then tried:

    ./configure --prefix=/usr/local/pgsql8.1.3 --with-openssl --with-pam --enable-thread-safety --with-krb5 --with-includes=/usr/kerberos/include

Now it gets past both the openssl and kerberos tests, but bites the dust with:

    configure: error:
    *** Thread test program failed.  Your platform is not thread-safe.
    *** Check the file 'config.log' for the exact reason.
    ***
    *** You can use the configure option --enable-thread-safety-force
    *** to force threads to be enabled.  However, you must then run
    *** the program in src/tools/thread and add locking function calls
    *** to your applications to guarantee thread safety.

If I remove the --with-krb5, it works. Why does enabling Kerberos break threads? I haven't been able to find any issues in the archives with krb5 and threads. Am I missing something here?

Wes

---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Build failures on RedHat 3.0 with openssl/kerberos
On 3/14/06 2:55 PM, "Louis Gonzales" <[EMAIL PROTECTED]> wrote: > Did you try to ./configure w/out "--enable-thread-safety?" I recently > compiled postgreSQL 8.0.1 on Solaris and _needed_ --enable-thread-safety > strictly for building Slony-I against postgresql with that feature enabled. > > What is the reason you are compiling this _with_ the feature? > If it's necessary, then you may need to --with-includes= and/or --with-libs= > with additional include directories, such as /usr/include:/usr/include/sys > where-ever the thread .h files are for your OS. > > This configure attempt could be failing, because it can't locate the > correct thread headers and/or libraries Why would I not want to specify enable-thread-safety? I want to be able to write threaded programs. --enable-thread-safety works fine until I enable --with-krb5, so it is finding the thread libraries. Wes ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[GENERAL] invalid UTF-8 byte sequence detected
Based on a couple of other posts, I think I'm out of luck, but I'm hoping something might have changed recently. I'm loading a very high volume of data with COPY using libpq - about 100+ million rows per day. The problem is that the COPY sometimes aborts with invalid UTF-8 byte sequence detected I'd prefer not to add to the overhead by pre-validating every string, since PostgreSQL validates it already. Is there a way to get the server to replace invalid characters with something like blank instead of generating a fatal error? Wes ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] invalid UTF-8 byte sequence detected
On 3/15/06 11:42 PM, "mike" <[EMAIL PROTECTED]> wrote: > If you don't care to store those characters then maybe you should change > the database character set to use SQL_ASCII instead of UTF-8. I believe > ASCII will quietly discard those characters when converting from UTF-8. I thought about that, but I do want to allow UTF-8 to be stored. I just want it to replace illegal characters with some valid character so that invalid records will load. Even if I preprocess the data, I can do no more than that. Wes ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
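For what it's worth, the suggestion above amounts to something like the following at database-creation time (the database name is hypothetical, and note the caveats about SQL_ASCII raised in the follow-ups before doing this):

    CREATE DATABASE maildb WITH ENCODING 'SQL_ASCII';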
Re: [GENERAL] invalid UTF-8 byte sequence detected
On 3/16/06 12:13 AM, "Michael Glaesemann" <[EMAIL PROTECTED]> wrote: > SQL_ASCII is *not* ASCII. See the "What's a good default encoding?" > thread on this same list from today. > > http://archives.postgresql.org/pgsql-general/2006-03/msg00685.php > > I don't believe it will discard anything on import if the database is > SQL_ASCII encoded. That might be worth a shot. I don't really understand the ramifications, though, especially given Tom's warning. I guess as long as I don't care about things like sort order for those fields, it may not matter much. Wes ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] invalid UTF-8 byte sequence detected
On 3/16/06 12:41 AM, "Michael Glaesemann" <[EMAIL PROTECTED]> wrote:

> I hope you didn't take my comments as a suggestion: I think you
> *should* preprocess your data and use UTF8 as the default encoding
> (or another encoding that includes all of the characters you hope to
> use) for your database.

Yeah, after thinking about it some more, it probably isn't such a good idea - it would make the database act rather strange with non-U.S. data. I really hate to have to burn the cycles to validate anything that looks like extended UTF-8 characters, only to have Postgres do it again. But maybe there's no choice.

Wes

---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[GENERAL] ERROR: end-of-copy marker corrupt
Version: 8.1.3
System: RedHat Linux 3.0 ES

In doing a bulk load with libpq (PQexec, PQputCopyData, PQputCopyEnd), the COPY is failing with:

    COPY (d) command did not complete successfully: ERROR: end-of-copy marker corrupt
    CONTEXT: COPY detail, line 47201: ""

There are 50,000 lines in this transaction, so this is not the last line. Other postings indicate this is from a '\.'. On line 47201, I do see a '\.' embedded in the middle of the line. The documentation states that this should be a problem only if those are the only two characters on a line. A posting from December 2005 against 8.1.0 (bug #2114) seems to indicate this should be fixed.

Am I missing something, or is there still a problem? Do I still need to change '\.' to '\\.'? Should I be using PQescapeString on strings being passed to COPY?

Wes

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
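As a hedged illustration (the thread's resolution isn't shown here): in text-format COPY the backslash is the escape character, so a literal backslash in a field has to be doubled by the client, meaning a value containing "\." would be sent as "\\.". The table and values below are made up and tab-delimited. PQescapeString is not the right tool for this; it escapes SQL string literals, not COPY data.

    COPY detail (messageid, subject) FROM STDIN;
    47201	a subject containing \\. in the middle of the line
    \.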
[GENERAL] Partitioning - when is it too many tables?
I'm working on a database that will (at current load) add 100 million records per day to a database, and needs to keep around 6 months of data online. Of course, we don't want the loads to be running all day while queries are going on. Using COPY with indexes active runs great with an empty database, then drops precipitously as the index size grows. It looks like I should be able to hit the numbers if I insert the data using COPY with no indexes, then add the indexes. I'm looking at partitioning with one table per day. So, we'd be looking at about 180 tables with 100 million rows each. Searches would typically be within a single day, although they could span multiple days. This keeps the indexes a more or less reasonable size, and allows quick deleting of the old data. Is there any problem with 180 child tables? How many would be too many (e.g. if I did one table per 6 hours?) Thanks Wes ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
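A brief sketch of the per-day layout being considered, with hypothetical column names, using 8.1-style inheritance partitioning plus CHECK constraints so constraint exclusion can skip days a query can't match:

    -- Parent table holds no data itself; one child per day.
    CREATE TABLE detail (
        messageid   bigint,
        msg_date    date,
        recipient   text
    );

    CREATE TABLE detail_20060401 (
        CHECK (msg_date = DATE '2006-04-01')
    ) INHERITS (detail);

    -- Bulk load the child without indexes, then index it afterwards:
    -- COPY detail_20060401 FROM '/path/to/20060401.dat';
    CREATE INDEX detail_20060401_recipient_idx
        ON detail_20060401 (recipient, msg_date);

    -- Needed (in 8.1) for the planner to prune children by their CHECKs.
    SET constraint_exclusion = on;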
[GENERAL] Where is client/server compatibility documented?
Where can I find documentation on what client library version interoperate with which server versions? I've been unable to find anything in the manual or archives. The only thing I've found is some information in the archives on certain psql versions. Is there a compatibility matrix somewhere for the client libraries? Specifically, I have a C application (ecpg/libpq), as well as perl DBI, built against PostgreSQL 7.4.1. Am I safe in upgrading the server to 8.1.3 before the application can be rebuilt? I've seen no problems on the test server, but... There's no system catalog monkey business - just plain old SQL. Wes ---(end of broadcast)--- TIP 6: explain analyze is your friend
[GENERAL] Unexplained lock creating table
I have a C application (libpq) that uses partitioning. I create parent tables 'header' and 'detail'. The application opens multiple connections, creates the child tables, and uses COPY to import the data:

    open connection 1
      begin
      create table header_1
      COPY into header_1
    open connection 2
      begin
      create table header_2
      COPY into header_2
    open connection 3
      begin
      create table header_3
      COPY into header_3
    [ potentially more connections ]
    end copy 1
    end copy 2
    end copy 3
    commit 1
    commit 2
    commit 3

After the database is initialized, I run the application. It creates table header_1 and initiates the copy (transaction still pending). However, when it tries to create table header_2, it hangs. A ps shows:

    postgres: test trace [local] COPY
    postgres: test trace [local] CREATE TABLE waiting

However, if I force table header_1 to be created outside the COPY transaction (using psql, manually committing the transaction from within gdb, etc.), then run the application, it works regardless of the number of open connections/transactions. I then drop all the child tables, leaving the parent table, and rerun the application. It again works for all connections.

The problem occurs only when the database has been freshly initialized and no child table has ever existed. I confirm this by:

1. Reinitialize database.
2. Run application. Verify hang occurs. I can rerun step 2 any number of times and it continues to hang.
3. Create header_dummy using psql.
4. Drop header_dummy.
5. Run application - works.

I can repeat this with the 'detail' table. It is 100% reproducible. What's going on?

Wes

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Unexplained lock creating table
> If this is the first child table of "header", which it presumably is, then
> the pg_class row for "header" has to be updated to set relhassubclass. Since
> you do not commit the create before continuing, this backend continues to
> hold a row lock on the updated, uncommitted pg_class row while you do...

[snip]

> relhassubclass isn't reset to false when all child tables of a parent
> table are removed. So next time through there is no need to update the
> pg_class row for the parent table.
>
> So the simplest workaround is probably to ensure that you create at least
> one partition in each table at the outset, before trying to actually load
> any data.

That's exactly what I was looking for. Now I know it is safe to do a 'create table', 'drop table' in my database creation script to circumvent the problem - or to be 100% safe, should I not do the 'drop table'? I was afraid I might have to move the 'create table' outside of the transactions.

Many thanks

Wes

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
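A sketch of the workaround described above, assuming the parent table is named 'header' as in the original post:

    -- Creating one child sets pg_class.relhassubclass on "header", and (per
    -- the explanation quoted above) it is not reset when the child is
    -- dropped, so later CREATE TABLE ... INHERITS inside the open COPY
    -- transactions no longer needs to update the parent's pg_class row.
    CREATE TABLE header_dummy () INHERITS (header);
    DROP TABLE header_dummy;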
[GENERAL] Concurrency problem building indexes
Postgres: 8.1.3 OS: Mac OS X 10.4.6 I've run into another concurrency issue - parallel building of indexes. When I try to build multiple indexes at a time, I randomly get: ERROR: tuple concurrently updated The following thread talks about this, but there is no answer. <http://archives.postgresql.org/pgsql-hackers/2002-07/msg00969.php> How can I safely build indexes in parallel? At this point, I'm only trying to build two at a time. I will be building indexes for tables with any where from a few rows to 100 million rows on a daily basis - I need to maximize performance. Wes ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[GENERAL] Leading substrings - alternatives with 8.1.3?
Back in the 7.4 era, to make leading substrings be indexed, I had to set locale=C. See thread: <http://archives.postgresql.org/pgsql-general/2005-02/msg00159.php> I couldn't find any updates to this issue in the archives since then. With 8.1.3, the default locale for my system is en_US.UTF8. Leading substring searches result in a sequential search instead of an indexed search. Do I still have to initdb to locale=C, or is there a better option now? Wes ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Leading substrings - alternatives with 8.1.3?
On 5/1/06 12:47 PM, "Martijn van Oosterhout" wrote:

> Check the documentation, but if you declare an index with (for example)
> text_pattern_ops, then LIKE will work even if the rest of your DB is
> UTF-8.

My understanding of using operator classes is that I'd have to create two indexes for each column - one with and one without the operator class. That is also what was indicated in the original thread. Defining multiple indexes on a given column isn't feasible, due to the database size (100 million rows per day).

Wes

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
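To make the quoted suggestion concrete, a hedged sketch (the table and column names are guesses based on the rest of the thread): a pattern-ops index is what makes LIKE 'prefix%' indexable in a non-C locale, and it would be in addition to any ordinary index on the same column.

    CREATE INDEX addresses_address_pattern_idx
        ON addresses (address text_pattern_ops);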
[GENERAL] Online backups and tar
1. Section 23.3.2 (Making a Base backup) of the 8.1.0 doc says:

"Also, some versions of GNU tar consider it an error if a file is changed while tar is copying it. There does not seem to be any very convenient way to distinguish this error from other types of errors, other than manual inspection of tar's messages. GNU tar is therefore not the best tool for making base backups."

What then, on linux (Red Hat 3.0 ES), is the recommended backup program for online backups? It appears cpio does the same thing.

2. During a restore, postmaster tries to restore a file '0001.history'. Neither that file, nor any *.history file, is anywhere to be found. I can not find this documented anywhere. What is this file? Does it have to do with timelines?

Wes

---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
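As an aside, whichever archiver ends up being used, the file-level copy is bracketed by the backup functions (8.1-era calls; the label string is arbitrary):

    SELECT pg_start_backup('nightly base backup');
    -- ... copy the cluster's data directory with the chosen archiver ...
    SELECT pg_stop_backup();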
[GENERAL] Adding foreign key constraints without integrity check?
Is there a way to add a foreign key constraint without having to wait for it to check the consistency of all existing records? If a database is being reloaded (pg_dumpall then load), it really shouldn't be necessary to check the referential integrity - or at least I should be able to stipulate that I am accepting that risk. My database reload is currently taking about 6 hours to load the data, 42 hours to reindex, and about another 40 hours or so to check the foreign key constraints (about 1.2 billion rows). That's a very long time to be out of commission. I'd really like to eliminate that second 40 hours so I can get it down to a normal weekend. Wes ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
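For context, this is the shape of statement whose check phase dominates the reload; the columns follow the constraint quoted later in the thread, while the constraint name here is illustrative:

    ALTER TABLE detail
        ADD CONSTRAINT detail_recipient_fkey
        FOREIGN KEY (recipient) REFERENCES addresses (address_key);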
Re: [GENERAL] Adding foreign key constraints without integrity
On 6/18/06 10:48 PM, "Tom Lane" <[EMAIL PROTECTED]> wrote:

>> My database reload is currently taking about 6 hours to load the data, 42
>> hours to reindex, and about another 40 hours or so to check the foreign key
>> constraints (about 1.2 billion rows).
>
> What PG version is this, and what have you got maintenance_work_mem set to?
> Undersized m_w_m would hurt both index build and FK checking ...

Sorry, forgot that info. PG is 8.1.4 (I'm testing an upgrade from 7.3.4 to 8.1.4). maintenance_work_mem is set to 983025 - 1.5 times the previous value of 655350. Current RSS of postmaster is about 1.3 GB. System memory is 2GB (would like more, but...).

Data is on one array (hardware RAID 5 of eight 73 GB 10k SCSI drives), indexes on another (hardware RAID 5 of five 15K 146 GB SCSI drives on a separate channel), and pg_xlog on a third (RAID 1 on a third channel). There are two 2.4 GHz Xeon processors.

Wes

---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Adding foreign key constraints without integrity
> You could create the fk-constraints _first_, then disable them, load
> the data, reindex, and reenable them afterwards.
>
> pg_dump/pg_restore can enable and disable fk-constraints before restoring
> the data, I believe. It does so by tweaking the system catalogs.

Are you referring to '--disable-triggers' on pg_dump? Will this work for foreign key constraints? The doc talks about triggers, but doesn't say anything about FK constraints (are these implemented as triggers?). I don't use pg_restore, just psql.

> The only problem I can see is that creating the fk-constraints might create
> some indices too. But maybe you can manually drop those indices afterwards - I
> don't know if the fk really _depends_ on the index, or if it creates it only
> for convenience.

I don't see any indexes being added to the table beyond what I add, and those added as a primary key constraint. Currently, pg_dump outputs the FK constraints after the indexes are built, as the last steps. If I try to add the FK constraints after loading the database definitions, but without any indexes, I'm not sure what would happen.

Wes

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Adding foreign key constraints without integrity
> What PG version is this, and what have you got maintenance_work_mem set to? > Undersized m_w_m would hurt both index build and FK checking ... Looking at the stats again, I did see something unexpected. With MWM set to 983025, index builds were running about 1.3 GB for both RSS and virtual. Adding foreign key constraints, RSS is about 1.1 GB but virtual is slightly over 2 GB. Wes ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Adding foreign key constraints without integrity
On 6/19/06 3:24 PM, "Scott Marlowe" <[EMAIL PROTECTED]> wrote:

> Are you sure that's really the problem? Do you have indexes on the
> referring tables (i.e. the foreign key that points to the other table's
> primary key). Not having an index on the subordinate table makes each
> and every check on the FK->PK relationship require a seq scan of the
> subordinate table.

Yes, I am sure I have the indexes for both sides (I just double checked). If not, wouldn't I have abysmal load performance in production, or does that function differently? The constraint that just finished after 18+ hours has a non-composite index on both sides. The other constraint:

    ADD CONSTRAINT "$2" FOREIGN KEY (recipient) REFERENCES addresses(address_key);

has a normal index on address_key. The recipient index is a (recipient, date) composite index. This index has always been used when searching on just recipient, and the last time I rebuilt the database (a year ago), the FK addition was in line with expectations. In every case, it takes about the same time to add the foreign key constraint as to create the index, maybe a little more. This is true regardless of whether one of the indexes is composite or not.

One constraint build just finished after a little more than 18 hours. The table where the constraint is being created has about 900 million rows. The 'references x' table has about 200 million rows. It churns along, eventually eating up about 50 GB or so in the 'base' directory. When that stops growing, it took maybe another hour or two to complete.

I'd just like to be able to tell it to simply add the constraint without doing any checking. I know the data is consistent. I just dumped it from a database that has referential integrity enabled. Even if there were an error that crept into the old database, I don't care - just add the constraint so I can get back online. Right now I'm looking at around 80-90 hours total to do a reload. What's it going to be a year or two from now? I could be out of commission for a week.

Wes

---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Adding foreign key constraints without integrity
On 6/19/06 3:47 PM, "Jim Nasby" <[EMAIL PROTECTED]> wrote: > FWIW, RAID5 isn't normally a good recipe for good database performance. Understood, but RAID 1 isn't always feasible. The database performs very well with the current hardware configuration - I have no complaints. The only time we have issues is during index rebuilds or a reload. > This won't help with constraint checking, but you should consider > turning fsync off during your restore. I didn't think about doing that, but I wonder how much it would help? The actual data load only takes about 6 hours or so. It's all the index building and constraint checking that takes the time. This time seems to be spent mostly writing temp files and sorting. > Also, why are you frequently dumping and restoring? We have to periodically reindex due to the fact that vacuum processes indexes in index order instead of disk space order. A 'vacuum full' is currently taking about 24 hours. After a reindex, I expect that to drop to 3-4 hours until the index becomes fragmented. I don't think much can be done about the time to reindex (or the order vacuum processes indexes). I wouldn't say we dump/reload frequently. Right now I'm doing a trial run for the 8.x upgrade - I have to make sure I can complete it in the allotted time. However, about a year ago, I had to do multiple reloads in a 2 week period. We had database corruption, and had to reload from source to get a good database - multiple times. We had a defective disk array that wasn't reporting any errors but kept corrupting the database. Besides PG upgrades, I have to consider how long it takes to recover from system failures if I can't use the offline file system backup. I'll be switching to online backups once we get upgraded, but if a reload fails there, I'll again have to fall back to the weekly source backup. Wes ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] Adding foreign key constraints without integrity
On 6/19/06 4:48 PM, "Scott Marlowe" <[EMAIL PROTECTED]> wrote: > > I think Jim was thinking of RAID 1+0. If you gotta use RAID 5, at least > have a contoller with several hundred megs of battery backed cache. > Better yet, use a RAID controller with a gig or so of BBU cache and run > RAID 1+0 on it. If you can. Yes, I realize that is ideal (I meant to write RAID 10 not 1). That takes hardware I don't have, and can't justify spending the money on right now. I/O doesn't appear to be a major issue, although I wasn't constantly monitoring it. Most of the time, the I/O load was not that high - CPU was pegged. Normal production updates and queries are well within (not at all low) expectations. > Actually, the same question about why your frequently dumping and > restoring applies to full vacuums. Why are you doing them? A properly > running database should not need vacuum fulls. Nor reindexing. Simple... VACUUM FULL reads the entire database. Between a full source dump and a full vacuum, I've got a better chance of finding and recovering from corruption sooner. Once bit... There is no (last time I checked) utility to verify full database integrity, and there are no CRC's on data blocks. A pg_dumpall doesn't touch the indexes. While this won't catch everything, seems pretty good at catching corrupted indexes. I've seen no more effective ways of verifying the entire database. The kind of hardware problem we encountered last year is rare, but it does happen. I've seen similar things many times over the years. RAID doesn't help you when a controller mirrors garbage. > You should look into slony. You can replicate from one version of pgsql > to another, a feature intended to be used for updating. I'd love to, but that requires hardware I don't have. Wes ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Adding foreign key constraints without integrity
BTW, I do appreciate any and all suggestions. I hope my last message didn't come across otherwise. That's not what was intended. It's just that adding/replacing hardware is not an option right now. Maybe next year... I'm still trying to dig up another 2GB memory. The database actually performs extremely well with the current hardware in the vast majority of cases. I just wish I could cut off the wasted 40+ hours adding the foreign key constraints - it would cut the reload time in half. It's too bad the suggestion of creating the FK before hand didn't pan out. That looked like a good shot. Wes ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Adding foreign key constraints without integrity
On 6/20/06 5:07 AM, "Florian G. Pflug" <[EMAIL PROTECTED]> wrote: > My suggestion was to create the fk _before_ loading the data, and disable it > similarly to what "--disable-triggers" doest. It turned out, however, that a > FK always depends on a unique index (be it a primary key, or not), which > prevents > my plan from working :-( That was a great idea - too bad it didn't pan out. I don't suppose there's any (reasonable) way to directly insert into the system tables to create the constraint? I could knock almost 2 days off of the almost 4 days to reload if I could solve this. Wes ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Adding foreign key constraints without integrity
On 6/20/06 8:17 PM, "Florian G. Pflug" <[EMAIL PROTECTED]> wrote: > Maybe you could ask at some postgresql support companies how much effort it > would > be to add a "without check" flag to "alter table add constraint foreign key", > and > how much they'd charge for it... Or if I get ambitious, dig into the code myself if I can figure out where to start... Wes ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Adding foreign key constraints without integrity
On 6/22/06 2:57 PM, "Jim Nasby" <[EMAIL PROTECTED]> wrote: > If nothing else, you should bring it up on -hackers and ask to have > this added as a TODO. It seems like a worth-while addition to pg_dump/ > restore to me... Thanks for the suggestion. > To answer another of your emails in this thread... a LAZY vacuum of > the entire database will read every table and index in the database. > Wanting to read the entire database to check for corruption is no > reason to do a VACUUM FULL. Sorry, I misspoke. I do not do a VACUUM FULL. I do a VACUUM VERBOSE ANALYZE on a weekly basis. The verbose allows me to see which tables/indexes are getting bad (vacuum time wise). This is currently taking about 24 hours on a weekend (very little production load competing with it). The time drops dramatically after a reindex, then creeps up again as the indexes are updated in random order. > Also, if you're curious about restore > time for your upgrade, you should be doing the restore to an 8.1.4 > database, not to your current version. There's been a ton of > performance improvements made. In fact, I'm wondering if constraint > checking in restore has been improved... Yes, that is what I did. I'm in the process of testing an upgrade from 7.3.x to 8.1.4 - export from 7.3.x and import into 8.1.4. Unfortunately, I'm sitting at about 90 hours when I've got about an 80 hour window on a long weekend... Wes ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[GENERAL] Lock changes with 8.1 - what's the right lock?
8.1 improved locking for foreign key references but had an unexpected consequence to our application - no parallel loads in our application.

The application does an EXCLUSIVE lock on 'addresses'. It then gets all of the keys from 'addresses' it needs, and adds new ones encountered in this load. It then completes the transaction, releases the exclusive lock, and inserts the other table's records using the values read from/inserted into 'addresses'. There are foreign key constraints between the various tables and 'addresses' to insure referential integrity.

Previously (pgsql 7.4.5), multiple loads would run simultaneously - and occasionally got 'deadlock detected' with the foreign key locks even though they were referenced in sorted order. When loading tables other than 'addresses', foreign key locks did not prevent other jobs from grabbing the exclusive lock on 'addresses'. With 8.1.4, the foreign key locks prevent other instances from grabbing the lock, so they wait until the first job is complete - only one job loads at a time.

About EXCLUSIVE locks, the manual says: "...only reads from the table can proceed in parallel with a transaction holding this lock mode."

What is now the appropriate lock? It needs to:

1. Prevent others from updating the table
2. Block other jobs that are requesting the same lock (if job 2 does a SELECT and finds nothing, it will try to create the record that job 1 may already have created in its transaction).
3. Not conflict with foreign key reference locks

SHARE does not appear to be appropriate - it would fail #2. Maybe "SHARE UPDATE EXCLUSIVE"?

Wes

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] Lock changes with 8.1 - what's the right lock?
> Did you try the example I posted? What happens if you do the > following? First set up a couple of tables: I no longer have 7.4 running. I'll have to check and see if can test. > Is it possible that your foreign key constraints are DEFERRED in > 7.4 but not in 8.1? That would change transaction behavior that > depends on locks. No, there was no change to the database other than to start using table spaces in 8.1. Previously when multiple jobs were running concurrently, they'd block while updating 'addresses', then take off and complete - unless they hit a foreign key that someone else had referenced in which case they'd block again. Now one job waits at the initial addresses lock until the other is completely done running (i.e. not updating tables that reference addresses). > An ordinary SELECT acquires ACCESS SHARE, which conflicts only with > the strongest lock, ACCESS EXCLUSIVE. If you're modifying a table > that has a foreign key constraint then 8.1 does a SELECT FOR SHARE > on the referenced table and earlier versions do a SELECT FOR UPDATE; > both cases acquire ROW SHARE on the referenced table. Two lock > types are stronger than SHARE ROW EXCLUSIVE: EXCLUSIVE and ACCESS > EXCLUSIVE, both of which conflict with ROW SHARE. So instead of > saying that SHARE ROW EXCLUSIVE is the weakest lock that meets your > requirements, I should have said that it's the only lock that does. That's what I thought from reading the manual, but wasn't sure. Thanks. I found the information that led me to believe the locking was fixed in 8.1. The problem I had with 7.x was the 'deadlock detected' if the foreign keys weren't referenced in sorted order, as the records were locked with more than a share lock. >> I haven't seen anything to indicate that 8.x improves foreign key >> refererence locking and fixes the foreign key 'deadlock detected' issue. >> Has that been addressed in 8.1? > > 8.1 should be using the new shared row locks for doing the checks. This > should fix the case mentioned. > >> I've never quite understood why a READ of a record with a foreign key >> reference results in the referenced record being locked with more than a >> shared lock. > > Up until now, we didn't have one to get on a per-record basis. and >> So, until 8.1 PostgreSQL had "something better than row-level locking" for >> some things, but no row locking when needed? Or was it row locking is >> there, but just no shared row locking? > > The latter, the row locks before were single owner and were such that a > second lock request for the same row would wait for the first to be > released. Now effectively you have two levels of locks at the row level, > the weaker of which conflicts with the stronger but not with itself. The > thing about MVCC is that readers do not have to get either lock if they > aren't trying to prevent modifications. Wes ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Lock changes with 8.1 - what's the right lock?
> Unless I'm misunderstanding you or a bug was fixed between 7.4.5
> and 7.4.13 (the version I'm running), I'm not convinced that last
> statement is true. EXCLUSIVE conflicts with all lock types except
> ACCESS SHARE; foreign key references prior to 8.1 use SELECT FOR
> UPDATE and in 8.1 they use SELECT FOR SHARE, but in both cases they
> acquire ROW SHARE on the referenced table, which conflicts with
> EXCLUSIVE.

My apologies for being so unclear. I had intended to just indicate that the problem occurred when we upgraded from 7.4.5 to 8.1, as opposed to this being fixed specifically in 8.1. I didn't realize this was fixed in a 7.4 release - I thought it was 8.x. The last time I'd checked, it hadn't been fixed in 7.x.

>> What is now the appropriate lock? It needs to:
>>
>> 1. Prevent others from updating the table
>> 2. Block other jobs that are requesting the same lock (if job 2 does a
>> SELECT and finds nothing, it will try to create the record that job 1 may
>> already have created in its transaction).
>> 3. Not conflict with foreign key reference locks
>
> SHARE ROW EXCLUSIVE is the weakest lock that meets these requirements.
> It conflicts with itself (#2) and with ROW EXCLUSIVE, which UPDATE,
> DELETE, and INSERT acquire (#1), but doesn't conflict with ROW SHARE,
> which is what SELECT FOR UPDATE/SHARE acquire (#3).

Thanks for the confirmation. Is there any stronger lock that would not block SELECT foreign key references? I didn't find any documentation on what type of lock is grabbed when a foreign key is referenced during SELECT (or other).

Wes

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
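A sketch of what the load's setup step looks like with the lock level settled on above; the work inside the transaction is paraphrased from the description earlier in the thread:

    BEGIN;
    LOCK TABLE addresses IN SHARE ROW EXCLUSIVE MODE;
    -- look up the address keys this load needs and insert any new ones ...
    COMMIT;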
[GENERAL] How to determine a database is intact?
On a nightly basis, we shut the database down and do a file system backup. A short chronology of our database problem:

8/21 - count(*) of user tables succeeded (done once a week to get statistics)

8/23 - A specific search on a specific value (one out of over 2 million) caused postmaster to SEGV. I dropped the index in question and rebuilt it. All appeared ok.

8/28 - count(*) failed - postmaster aborted on a SEGV. I attempted a pg_dumpall. Postmaster aborted with SEGV about 240 MB into the dump (the database is about 50 GB). I reloaded the 8/25 database and attempted a pg_dumpall. Same failure. I reloaded from the 8/21 file system dump. Pg_dumpall worked on this one. I spent the next almost 48 hours dumping and reloading 240 million rows and reprocessing the several million additions since 8/21. I didn't dare use the 8/21 database without reloading because I didn't know if it was good or not.

Question: How can we tell that a database is intact? In the above example, pg_dumpall worked on the 8/21 database. Did it become corrupt between 8/21 and 8/23, or was it already corrupt and got worse? Pg_dumpall tells you nothing about the condition of indexes. Could a corrupt index corrupt data blocks?

I'm looking at doing a pg_dumpall on a weekly basis so that we have a point in time where we know we have a recoverable database. When the database reaches several hundred GB and over a billion rows, this isn't a great solution, and doesn't address the overall database integrity.

Back to the original question... How can I verify the complete integrity of a database - especially a very large one where a reload or full index rebuild could take on the order of days?

Wes

---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] How to determine a database is intact?
On 9/3/04 3:11 AM, "Richard Huxton" <[EMAIL PROTECTED]> wrote: > You shouldn't have to verify anything. PG's job is to never corrupt your > data, and providing your hardware is good it should do so. If you are > getting problems almost daily that would suggest a RAM/disk problem to > me (sig 11 usually implies RAM). Can't guarantee it's not PG but it's > record of reliability is pretty good. I believe SEGV typically just indicates it de-referenced a bad pointer (i.e. NULL or out of range). The problem is not occurring on a daily basis. The database has been in service since December of last year. It's just that the symptoms progressed from no apparent symptoms, to a clearly corrupt DB. My guess is that some minor corruption fed upon itself until the DB couldn't even be dumped. > Steps I'd take: > 1. Check your version number against the release notes and see if you > should upgrade. You don't mention your version, but it's always worth > having the last dot-release (7.2.5, 7.3.7, 7.4.5) > 2. Schedule time to run memory/disk tests against your hardware. Finding > 48 hours might not be easy, but you need to know where you stand. > 3. Setup slony or some other replication so I can schedule my downtime. I thought I mentioned the level in my original mail - 7.4.1. We are planning on running some diagnostics. Whether there is a bug in PostgreSQL, or there was a memory hit, or whatever doesn't really matter to the original question. The database can become corrupt. How can I tell that a database is fully intact at any given point in time? If I reload from a system backup before the known corruption, how can I be sure that the original corruption that precipitated the failure is not still there and will again rear its ugly head? Wes ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [GENERAL] How to determine a database is intact?
On 9/5/04 9:04 AM, "Jan Wieck" <[EMAIL PROTECTED]> wrote: > 24 hours to do what? The pg_dumpall, the restore or both? There's more than 250 million rows. If I remember right, it's ballpark 25% data reload, 75% index/foreign constraint rebuild. Pg_dumpall is something like 3 hours or so. Wes ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] How to determine a database is intact?
On 9/9/04 11:07 AM, "Bruce Momjian" <[EMAIL PROTECTED]> wrote:

> Believe it or not, we haven't gotten many requests for this feature,
> partly because such corruption is so rare. Also, any checker isn't
> going to find a change from "Baker" to "Faker" in a text field.

Yep, unless you added a CRC (and accepted the performance hit) to each record, the best you could do is verify that the database is consistent. That would still be quite valuable, though - all block headers are valid, indexes don't point out into oblivion, etc.

I expect there are only a handful of huge databases running a heavy load - the vast majority are probably no larger than tens (maybe hundreds) of megabytes, or do not experience a heavy update load.

Wes

---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [GENERAL] How to determine a database is intact?
On 9/9/04 9:27 AM, "Vivek Khera" <[EMAIL PROTECTED]> wrote: > I shaved significant time from 7.4.x restores by bumping up the > checkpoint_segments to 50. My dumps currently take about 1.5 hours > over a 100Mbit ethernet (server is on gig ethernet, dumping client is > on 100Mbit). Mine are already set to 30, but in a recent re-index, I saw warnings in the log so need to bump it up some more. I'm going to likely be doing a reload this weekend, so will include this in the run. Wes ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[GENERAL] Returning a row from a function with an appended array field
I have an adjacency list kind of table:

    CREATE TABLE thingy (
        id int,
        parent int
    );

I'd like to be able to write a procedural function that returns a row or rows from this table with an appended field that represents the children. Something like this pseudo code:

    FOR row IN SELECT * FROM thingy
    LOOP
        RETURN NEXT row,[SELECT id FROM thingy WHERE parent_id = id]
    END LOOP,
    RETURN

Any help much appreciated,

Wes

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Returning a row from a function with an appended array field
On 11/9/2011 7:34 PM, David Johnston wrote:
> On Nov 9, 2011, at 20:19, Wes Cravens wrote:
>
>> I have an adjacency list kind of table
>>
>> CREATE TABLE thingy (
>>     id int,
>>     parent int
>> );
>>
>> I'd like to be able to write a procedural function that returns a row or
>> rows from this table with an appended field that represents the children.
>>
>> Something like this pseudo code:
>>
>> FOR row IN SELECT * FROM thingy
>> LOOP
>>     RETURN NEXT row,[SELECT id FROM thingy WHERE parent_id = id]
>> END LOOP,
>> RETURN
>>
>> Any help much appreciated,
>>
>> Wes
>
> Use "WITH RECURSIVE" instead of a function.

I apologize but I don't know how that would work. An example would help.

Also... my pseudo code above was a little flawed:

    RETURN NEXT row,[SELECT id FROM thingy WHERE parent_id = id]

should be...

    RETURN NEXT row,[SELECT id FROM thingy WHERE parent_id = row.id]

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Returning a row from a function with an appended array field
On 11/9/2011 7:19 PM, Wes Cravens wrote:
> I have an adjacency list kind of table
>
> CREATE TABLE thingy (
>     id int,
>     parent int
> );
>
> I'd like to be able to write a procedural function that returns a row or
> rows from this table with an appended field that represents the children.

Just in case someone else wants an answer to this thread... and despite the fact that the postgresql documentation is excellent and has plenty of examples, WITH RECURSIVE is still a bad solution... I already needed an independent get_children function:

    CREATE OR REPLACE FUNCTION get_children ( lookup_id INT ) RETURNS int[] AS $$
        SELECT array_agg( id )
        FROM ( SELECT id FROM thingy WHERE parent_id = $1 ORDER BY id ) t;
    $$ LANGUAGE 'sql';

And I just used that in a view to get what I wanted:

    CREATE OR REPLACE VIEW thingy_view AS
        SELECT *, get_children(id) AS children FROM thingy;

I then updated all of my other get_ accessor postgresql functions to use the view instead of the base table.

FTW

Wes

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Returning a row from a function with an appended array field
On 11/10/2011 12:05 PM, David Johnston wrote: > -Original Message- > From: pgsql-general-ow...@postgresql.org > [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Wes Cravens > Sent: Thursday, November 10, 2011 11:54 AM > To: pgsql-general@postgresql.org > Subject: Re: [GENERAL] Returning a row from a function with an appended > array field > > On 11/9/2011 7:19 PM, Wes Cravens wrote: >> I have an adjacency list kind of table >> >> CREATE TABLE thingy ( >> id int, >> parent int >> ); >> >> I'd like to be able to write a procedural function that returns a row >> or rows from this table with an appended field that represents the > children. > > > If you only care about one level of hierarchy then, yes, WITH RECURSIVE is > overkill. You want to use WITH RECURSIVE in those situations where the > depth of the hierarchy is unknown. Yes agreed... WITH RECURSIVE would be handy for something like get_ancestors or get_descendents. Wes -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
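For completeness, a minimal sketch of the recursive form mentioned above, reusing the thingy(id, parent_id) columns from the function earlier in the thread; the starting id is made up:

    WITH RECURSIVE descendants AS (
        SELECT id, parent_id FROM thingy WHERE id = 1    -- root of interest
        UNION ALL
        SELECT t.id, t.parent_id
        FROM thingy t
        JOIN descendants d ON t.parent_id = d.id
    )
    SELECT id FROM descendants;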
Re: [GENERAL] Vacuum time degrading
On 3/2/05 3:51 PM, "Tom Lane" <[EMAIL PROTECTED]> wrote:

> I was going to suggest
> REINDEXing those indexes to see if that cuts the vacuum time at all.

The problem with that is it takes a very long time. I've got a couple of things to try yet on the kswapd problem. If that doesn't work, maybe I can rebuild one of the indexes and see how much that one improves.

I wasn't aware that the indexes were scanned non-sequentially. The under one hour time was probably shortly after a full reload. Any chance of changing that behavior to scan in physical storage order?

The index from the largest table that has:

    CPU 216.15s/18.13u sec elapsed 2110.84 sec.

is inserted in sequential order. The index with:

    CPU 518.88s/25.17u sec elapsed 10825.33 sec.

has records inserted in essentially a random order, and is also something like twice as large (key size).

We're going to try to test the 2.4.29 kernel tomorrow.

Wes

---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] password
What platform? Solaris, FreeBSD, Linux, Windows?

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Michael Uvhagen
Sent: Friday, October 21, 2005 6:10 AM
To: pgsql-general@postgresql.org
Subject: [GENERAL] password

Hi. I was installing postgresql 8.0 and let it generate a password for me, then i changed my mind and went back to change the password, but then i was stuck. How do i find the password it generated for me? How do i change it? How can i start over and install postgresql somehow? Thanks!

/Mike
Re: [GENERAL] PostgreSQL vs mySQL, any performance difference for
For what it may be worth, executing the same commands in MySQL 5.0.15-nt-max (Win XP Pro), the following is received:

    mysql> create table test (i1 int);
    Query OK, 0 rows affected (0.41 sec)

    mysql> insert into test values (123913284723498723423);
    ERROR 1264 (22003): Out of range value adjusted for column (i1) at row 1

    mysql> select * from test;
    Empty set (0.03 sec)

Finally an improvement!

-Original Message-

    mysql> create table test (i1 int);
    Query OK, 0 rows affected (0.07 sec)

    mysql> insert into test values (123913284723498723423);
    Query OK, 1 row affected, 2 warnings (0.07 sec)

    mysql> select * from test;
    +------------+
    | i1         |
    +------------+
    | 2147483647 |
    +------------+
    1 row in set (0.00 sec)

And other fun things.

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] PostgreSQL vs mySQL, any performance difference for
set sql_mode='MYSQL323';
Query OK, 0 rows affected (0.00 sec)

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Scott Marlowe
Sent: Tuesday, October 25, 2005 3:24 PM
To: Wes Williams
Cc: 'Jan'; pgsql-general@postgresql.org
Subject: Re: [GENERAL] PostgreSQL vs mySQL, any performance difference for

Cool. Does it still error out after issuing:

    set sql_mode='MYSQL323';

??? Just wondering if bounds checking is still optional but is now opt-out instead of opt-in, or if it's something that you can no longer turn off. The whole idea of correct behaviour being an option is pretty bad, but hopefully at least the defaults are for proper behaviour, or are heading that way.

On Tue, 2005-10-25 at 13:52, Wes Williams wrote:
> For what it may be worth, executing the same commands into MySQL
> 5.0.15-nt-max (Win XP Pro) the following it received:
>
> mysql> create table test (i1 int);
> Query OK, 0 rows affected (0.41 sec)
>
> mysql> insert into test values (123913284723498723423);
> ERROR 1264 (22003): Out of range value adjusted for column (i1) at row 1
>
> mysql> select * from test;
> Empty set (0.03 sec)
>
> Finally an improvement!
>
> -Original Message-
> mysql> create table test (i1 int);
> Query OK, 0 rows affected (0.07 sec)
>
> mysql> insert into test values (123913284723498723423);
> Query OK, 1 row affected, 2 warnings (0.07 sec)
>
> mysql> select * from test;
> +------------+
> | i1         |
> +------------+
> | 2147483647 |
> +------------+
> 1 row in set (0.00 sec)
>
> And other fun things.
>
> ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster

---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Why database is corrupted after re-booting
Type the following at the Windows command prompt (start, run, "cmd"):

    convert c: /fs:ntfs /v

It will complain about locked files and perform the convert at the next reboot, which you should do immediately.

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Joshua D. Drake
Sent: Wednesday, October 26, 2005 1:10 PM
To: Andrus
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Why database is corrupted after re-booting

On Wed, 2005-10-26 at 19:14 +0300, Andrus wrote:
> > To change partition types you need to re-format (resetting partitions
> > will lose data structure - reformat required).
>
> Troy,
>
> Whole my IDE drive is 20 GB FAT32 C: drive booting XP
> I have a lot of data in this drive so it is not possible to re-format. Also
> I do'nt want to create two logical disks in single drive.
>
> Is this prevents data corruption for Postgres, is there some utility which
> can convert C: drive to NTFS ?
> Can Partition Magic help ?

XP at least on install I believe has the ability to convert to NTFS. Have you tried just right clicking on your C: selecting properties and then seeing if there is a convert option?

> Andrus
>
> ---(end of broadcast)---
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match

-- The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564 PostgreSQL Replication, Consulting, Custom Development, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster