Re: [GENERAL] PostgreSQL with Oracle OC4J Application server
You should try directing this request to the postgres JDBC list. Have you looked at http://jdbc.postgresql.org/documentation/80/datasource.html?

John Sidney-Woollett

Sharon Abu wrote:
Dear PostgreSQL experts, first I want to mention that I'm new to PostgreSQL. I have a J2EE application which is deployed on an Oracle OC4J application server and an Oracle 9i DB. I'm trying to migrate to PostgreSQL (for cost reduction), so I have installed PostgreSQL 8.0.1 on a Win 2K platform and I've defined a small DB (just for test...). My next step is to configure it to work with OC4J (DataSource configuration), and I couldn't find any documentation on it. Does anyone know how to do it? Appreciate any help :-) Thanks in advance - Sharon
[GENERAL] Nagios plugin to check slony replication
I've finally got around to writing the two nagios plugins which I am using to check our slony cluster (on our linux servers). I'm posting them in case anyone else wants them or to use them as a basis for something else. These are based on Christopher Browne's scripts that ship with slony.

The two scripts perform different tasks.

check_slon checks to see that the slon daemon is in the process list and optionally checks for any error or warning messages in the slon log file. It is called using two or three parameters: the clustername, the dbname and (optionally) the location of the log file. This script is to be executed on each node in the cluster (both master and slaves).

check_sloncluster checks that active receiver nodes are confirming sync within 10 seconds of the master. I'm not entirely sure that this is the best strategy, and if you know otherwise, I'd love to hear. Requires two parameters: the clustername and the dbname. This script is executed on the master database only.

These scripts are designed to run on the host on which they are checking. With a little modification, they could check remote servers on the network. They are quite simplistic and may not be suitable for your environment. You are free to modify the code to suit your own needs.

John Sidney-Woollett

check_slon
==========

#!/bin/sh
# nagios plugin that checks whether the slon daemon is running
# if the 3rd parameter (LOGFILE) is specified then the log file is
# checked to see if the last entry is a WARN or FATAL message
#
# three possible exit statuses:
# 0 = OK
# 1 = Warning (warning in slon log file)
# 2 = Fatal Error (slon not running, or error in log file)
#
# script requires two or three parameters:
# CLUSTERNAME - name of slon cluster to be checked
# DBNAME - name of database being replicated
# LOGFILE - (optional) location of the slon log file
#
# Author: John Sidney-Woollett
# Created: 26-Feb-2005
# Copyright 2005

# check parameters are valid
if [[ $# -lt 2 || $# -gt 3 ]]
then
  echo "Invalid parameters need CLUSTERNAME DBNAME [LOGFILE]"
  exit 2
fi

# assign parameters
CLUSTERNAME=$1
DBNAME=$2
LOGFILE=$3

# check to see whether the slon daemon is running
SLONPROCESS=`ps -auxww | egrep "[s]lon $CLUSTERNAME" | egrep "dbname=$DBNAME" | awk '{print $2}'`
if [ ! -n "$SLONPROCESS" ]
then
  echo "no slon process active"
  exit 2
fi

# if the logfile is specified, check it exists
# and check for the word ERROR or WARN in the last line
if [ -n "$LOGFILE" ]
then
  # check for log file
  if [ -f "$LOGFILE" ]
  then
    LOGLINE=`tail -1 $LOGFILE`
    LOGSTATUS=`tail -1 $LOGFILE | awk '{print $1}'`
    if [ $LOGSTATUS = "FATAL" ]
    then
      echo "$LOGLINE"
      exit 2
    elif [ $LOGSTATUS = "WARN" ]
    then
      echo "$LOGLINE"
      exit 1
    fi
  else
    echo "$LOGFILE not found"
    exit 2
  fi
fi

# otherwise all looks to be OK
echo "OK - slon process $SLONPROCESS"
exit 0

check_sloncluster
=================

#!/bin/sh
# nagios plugin that checks whether the slave nodes in a slony cluster
# are being updated from the master
#
# possible exit statuses:
# 0 = OK
# 2 = Error, one or more slave nodes are not sync'ing with the master
#
# script requires two parameters:
# CLUSTERNAME - name of slon cluster to be checked
# DBNAME - name of master database
#
# Author: John Sidney-Woollett
# Created: 26-Feb-2005
# Copyright 2005

# check parameters are valid
if [[ $# -ne 2 ]]
then
  echo "Invalid parameters need CLUSTERNAME DBNAME"
  exit 2
fi

# assign parameters
CLUSTERNAME=$1
DBNAME=$2

# setup the query to check the replication status
SQL="select case
  when ttlcount = okcount then 'OK - '||okcount||' nodes in sync'
  else 'ERROR - '||ttlcount-okcount||' of '||ttlcount||' nodes not in sync'
  end as syncstatus
from (
  -- determine total active receivers
  select (select count(distinct sub_receiver)
            from _$CLUSTERNAME.sl_subscribe
           where sub_active = true) as ttlcount,
  (
  -- determine active nodes syncing within 10 seconds
  select count(*) from (
    select st_received,
           st_last_received_ts - st_last_event_ts as cfmdelay
      from _$CLUSTERNAME.sl_status
     where st_received in (
       select distinct sub_receiver
         from _$CLUSTERNAME.sl_subscribe
        where sub_active = true
     )
  ) as t1
  where cfmdelay < interval '10 secs') as okcount
) as t2"

# query the master database
CHECK=`/usr/local/pgsql/bin/psql -c "$SQL" --tuples-only -U postgres $DBNAME`
if [ ! -n "$CHECK" ]
then
  echo "ERROR querying $DBNAME"
  exit 2
fi

# echo the result of the query
echo $CHECK

# and check the return status
STATUS=`echo $CHECK | awk '{print $1}'`
if [ $STATUS = "OK" ]
then
  exit 0
else
  exit 2
fi
Re: [GENERAL] problem with distinct rows
If the created_by table includes an artist number/position to indicate the first, second artist etc, eg

create table created_by (
  work_id integer,
  artist_id integer,
  position integer,
  primary key (work_id, artist_id, position)
);

then you can simply use the following query

select a.artist_name, w.title, w.inventory_number
from artist a, works w, created_by cb
where w.work_id = cb.work_id
and a.artist_id = cb.artist_id
and cb.position = 1
order by a.artist_name, w.title;

If you don't have a position or similar field in the created_by table, you will have more difficulty, as you're finding. An alternative approach is to create a function which arbitrarily returns one artist name for a work, and then sort on that, but it won't be as efficient.

select artist_name, title, inventory_number
from (
  select GetArtistName(w.inventory_number) as artist_name, w.title, w.inventory_number
  from works w
) as t
order by artist_name, title

BTW, I haven't checked any of this, but hopefully it will give you some pointers or ideas.

John Sidney-Woollett

tony wrote:
Hello, I am having a problem with returning distinct rows. This is probably a newbie question but here goes: tables are artist, created_by and works; the join is on created_by.work_id and created_by.artist_id. A work of art can have two or more artists listed as creators in created_by. In the inventory lists we don't care - we only need one reference to each work AND (this is the part that hurts) they must be ordered alphabetically by the _first_ artist's name. Example:

artist_name : title : inventory_number
Bernadette Genée et Alain Le Borgne : Pièce à conviction : 004090101

should be after "F" and before "H". But if I do DISTINCT ON inventory_number I must order by inventory_number then artist_name, which totally defeats my purpose. I have also played with GROUP BY and HAVING. Clues much appreciated.

Tony Grant
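(The GetArtistName helper above isn't defined in the post; a minimal PL/pgSQL sketch of what it could look like, with the table, column names and types assumed from the thread, is:)

-- Hypothetical helper: returns one artist name for a work, identified
-- by its inventory_number as in the query above. Picks the alphabetically
-- first artist so the choice is at least deterministic.
CREATE OR REPLACE FUNCTION GetArtistName(varchar) RETURNS varchar AS '
DECLARE
  p_inventory_number ALIAS FOR $1;
  v_name varchar;
BEGIN
  SELECT a.artist_name INTO v_name
    FROM works w, created_by cb, artist a
   WHERE w.inventory_number = p_inventory_number
     AND cb.work_id = w.work_id
     AND a.artist_id = cb.artist_id
   ORDER BY a.artist_name
   LIMIT 1;
  RETURN v_name;
END' LANGUAGE plpgsql;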
Re: [GENERAL] problem with distinct rows
Add a SORTNAME field to the artist table and use that for sorting. This will help you deal with diacritics and accented characters by transposing them to a regular character instead. The sortname value for Genée would be "genee". With a few changes to your data model (and possibly application) you will probably end up making your life a little easier.

John Sidney-Woollett

Martijn van Oosterhout wrote:
You can put the DISTINCT ON() query as a subquery, and an ORDER BY in the outer query. Sorting by surname is tricky since you need to tell the computer how to find it... Hope this helps,

On Tue, Mar 08, 2005 at 10:03:48AM +0100, tony wrote:
Hello, I am having a problem with returning distinct rows. This is probably a newbie question but here goes: tables are artist, created_by and works; the join is on created_by.work_id and created_by.artist_id. A work of art can have two or more artists listed as creators in created_by. In the inventory lists we don't care - we only need one reference to each work AND (this is the part that hurts) they must be ordered alphabetically by the _first_ artist's name. Example:

artist_name : title : inventory_number
Bernadette Genée et Alain Le Borgne : Pièce à conviction : 004090101

should be after "F" and before "H". But if I do DISTINCT ON inventory_number I must order by inventory_number then artist_name, which totally defeats my purpose. I have also played with GROUP BY and HAVING. Clues much appreciated.

Tony Grant
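(A minimal sketch of that idea, assuming the artist table from the thread; the translate() call is only illustrative - a real application would probably maintain the sort key from code:)

-- Add a sort key column, populate it with a lower-cased,
-- accent-stripped copy of the name, then order by it.
ALTER TABLE artist ADD COLUMN sortname varchar;

UPDATE artist
   SET sortname = lower(translate(artist_name, 'éèêëàâäîïôöûüç', 'eeeeaaaiioouuc'));

SELECT artist_name FROM artist ORDER BY sortname;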
Re: [GENERAL] normal user dump gives error because of plpgsql
I'm pretty sure I had the same problem when using pg_restore. If pl/pgsql is installed in template1, then the restore fails. And I couldn't find any solution to this on the list either.

I definitely want pl/pgsql in template1 so that any databases I create have access to pl/pgsql without having to explicitly install it in each new db I create in the cluster.

Recreating the database using psql dbname < dump.sql does work OK, but not pg_restore. Can pg_restore be made to ignore the error?

John Sidney-Woollett

Martijn van Oosterhout wrote:
On Thu, Mar 10, 2005 at 01:31:26PM +0100, Janning Vygen wrote:
Hi, I have a normal user with rights to create a db. template1 contains language plpgsql. The user wants to dump his db, drop his db, create it again, and use the dump file to fill it. It gives errors because of CREATE LANGUAGE statements inside the dump.

I don't believe there is. But it's not really needed, you get an error and the restore continues. The only thing different is that you don't get a message, the result is the same.
Re: [GENERAL] normal user dump gives error because of plpgsql
Tom Lane wrote:
John Sidney-Woollett <[EMAIL PROTECTED]> writes: I'm pretty sure I had the same problem when using pg_restore. If pl/pgsql is installed in template1, then the restore fails. And I couldn't find any solution to this on the list either.

You're supposed to restore into a database cloned from template0, not template1.

That's interesting because I normally create my databases using template1 so that I don't have to install pl/pgsql before I start adding pgsql functions.

So what you're saying is that if you know you're just about to restore into a new DB (using pg_restore), then create the new DB from template0 instead of template1.

I've just spotted the Notes section for pg_restore in the 7.4.x docs which confirms this. Strange that I didn't see it before... need to RTFM better! Thanks.

John Sidney-Woollett
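(For anyone following along, the sequence looks roughly like this; the database and dump file names are only placeholders.)

-- Create the target database as a clone of template0, so the
-- CREATE LANGUAGE plpgsql statement in the dump does not collide
-- with a language already installed via template1:
CREATE DATABASE mydb TEMPLATE template0;

-- then restore into it from the shell, e.g.
--   pg_restore -d mydb mydb.dump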
[GENERAL] Download mirrors not found?
Can anyone explain why none of the links to download the 8.0.3 source work? I'm trying to use links on this page:

http://wwwmaster.postgresql.org/download/mirrors-ftp?file=source/v8.0.3/postgresql-8.0.3.tar.gz

All the hostnames fail to resolve. Can anyone help me to get an 8.0.3 set of source files?

Thanks

John Sidney-Woollett
Re: [GENERAL] Download mirrors not found?
Hmm, you're right, they do seem to be working OK now. Thanks for your response - maybe it was just a temporary glitch.

John Sidney-Woollett

Michael Fuhr wrote:
On Mon, Jun 06, 2005 at 08:13:10AM +0100, John Sidney-Woollett wrote: Can anyone explain why none of the links to download the 8.0.3 source work? I'm trying to use links on this page: http://wwwmaster.postgresql.org/download/mirrors-ftp?file=source/v8.0.3/postgresql-8.0.3.tar.gz All the hostnames fail to resolve.

I just ran a script to check all of them and they all resolved for me on Mon 6 Jun around 12:50 UTC. Are you still having trouble? Might you be experiencing local DNS problems? Maybe somebody who runs the postgresql.org DNS can say whether there were any problems around the time you checked (presumably around 07:00 UTC, based on your message time).
[GENERAL] Please HELP - URGENT - transaction wraparound error
Oh my god! DB is pg 7.4.6 on linux.

2005-10-27 05:55:55 WARNING: some databases have not been vacuumed in 2129225822 transactions
HINT: Better vacuum them within 18257825 transactions, or you may have a wraparound failure.
2005-10-28 05:56:58 WARNING: some databases have not been vacuumed in over 2 billion transactions
DETAIL: You may have already suffered transaction-wraparound data loss.

We have cron scripts that perform FULL vacuums:

# vacuum template1 every sunday
35 2 * * 7 /usr/local/pgsql/bin/vacuumdb --analyze --verbose template1
# vacuum live DB every day
35 5 * * * /usr/local/bin/psql -c "vacuum verbose analyze" -d bp_live -U postgres --output /home/postgres/cronscripts/live/vacuumfull.log

Questions:

1) Why do we have data corruption? I thought we were doing everything we needed to stop any wraparound... Are the pg docs inadequate, or did I misunderstand what needed to be done?

2) What can I do to recover the data? I have full daily backups from midnight each day using

/usr/local/pgsql/bin/pg_dump $DATABASE > $BACKUPFILE

plus I have this database replicated using Slon 1.1.0 to another 7.4.6 database. I can fail over to the slave server, but what do I need to do to rebuild the original database? Should I fail over now?!! And then start rebuilding the old master database (using slon, I presume)?

How do I stop this EVER happening again??!!!

Thanks for help

John
Re: [GENERAL] Please HELP - URGENT - transaction wraparound error
Martin, thanks for the feedback.

I had a look around and couldn't see any data loss (but wasn't really sure where to start looking). I decided to switch over to the slave which is now our live database. The old master with the problem has already been re-inited (although I have a cold backup of the data dir), plus dump files that I can restore from.

I checked pg_database (on the new master) and I don't really understand what it is saying. Is the datvacuumxid vs 3553547043 the significant information? I see in our new database:

  datname  | datdba | encoding | datistemplate | datallowconn | datlastsysoid | datvacuumxid | datfrozenxid |          datacl
-----------+--------+----------+---------------+--------------+---------------+--------------+--------------+--------------------------
 bp_live   |      1 |        6 | f             | t            |         17140 |    332321570 |   3553547043 |
 bp_test   |      1 |        6 | f             | t            |         17140 |    332265474 |   3553490947 |
 template1 |      1 |        0 | t             | t            |         17140 |    332241177 |   3553466653 | {postgres=C*T*/postgres}
 template0 |      1 |        0 | t             | f            |         17140 |          464 |          464 | {postgres=C*T*/postgres}

(datpath and datconfig were empty for all rows.)

Are we going to get the same problem with this database?

What's also worrying me is that the warning message is in fact misleading!!??

2005-10-28 05:56:58 WARNING: some databases have not been vacuumed in over 2 billion transactions
DETAIL: You may have already suffered transaction-wraparound data loss.

And I'm wondering if I have in fact destroyed a perfectly good database and data set... I read the link you gave (before) but found it hard to work out what you actually need to do to protect yourself. We DID vacuum the databases nightly, and template1 once a week. So I still don't understand why we got this error. Can someone explain in simple language?

Can someone also give me a detailed "you need to do this, and this and this..." explanation to prevent this happening again (either on our master or slave databases). For example, must you do a vacuum full instead of a vacuum analyze on a 7.4.x database to prevent wraparound issues?

BTW, for those not using **Slony** - you should check it out. It has saved my bacon three times this year! Due to:
1) server failure - hardware crash, needed BIOS flash, complete OS reinstall etc
2) disk full - corrupted pg data
3) oid wraparound (today's problem)

Any further help that anyone can give is much appreciated.

Thanks

John

Martijn van Oosterhout wrote:
On Sun, Oct 30, 2005 at 08:50:18AM +0000, John Sidney-Woollett wrote: Oh my god! DB is pg 7.4.6 on linux

Firstly, check pg_database, it should tell you which databases need to be vacuumed. Any database you regularly vacuumed is fine, so maybe the corruption is in some other database you don't remember?

1) Why do we have data corruption? I thought we were doing everything we needed to stop any wraparound... Are the pg docs inadequate, or did I misunderstand what needed to be done?

You *may* have corruption. Anything you vacuumed recently should be fine.

2) What can I do to recover the data?

Check whether anything is lost first.

How do I stop this EVER happening again??!!!

Have you read this: http://www.postgresql.org/docs/7.4/static/maintenance.html#VACUUM-FOR-WRAPAROUND

Hope this helps,
Re: [GENERAL] Please HELP - URGENT - transaction wraparound error
Hi Tom

You're not wrong about panicking! This is the worst Sunday I've had in a while... No Sunday lunch or time with the kids... :(

This database supports a (normally 24/7) website and we couldn't tolerate any possibility of data corruption. I had to make a judgement call on preventing any/further data loss or corruption, and switching over to the slave seemed the safest thing to do (based on my ignorance of the wraparound problem).

I can restore the file system backup of pgsql/data to another database server and then get the info from pg_database. Or I can import a dump file from 15 minutes before I re-inited the database... What exactly am I looking for though? We don't use OIDs when creating tables...

Could Slon 1.1.0 be causing a problem for us? It must be creating and deleting bucket loads of records as part of its regular activity...

What am I likely to have missed in my vacuuming? Because whatever I did wrong is going to break our current live database at some point soon.

Thanks

John

Tom Lane wrote:
John Sidney-Woollett <[EMAIL PROTECTED]> writes: I decided to switch over to the slave which is now our live database. The old master with the problem has already been re-inited (although I have a cold backup of the data dir), plus dump files that I can restore from.

You panicked much too quickly and destroyed the evidence ... unless by "cold backup" you mean a filesystem backup, in which case what you should do is restore that and take a look at what's in its pg_database. I think there's no question that there is some omission in your vacuuming procedures, and you need to find out what it is.

regards, tom lane
Re: [GENERAL] Please HELP - URGENT - transaction wraparound error
OK, I restored the pgsql/data to another server and started up postgres and this is what I got:

SELECT datname, age(datfrozenxid) FROM pg_database;
   datname    |     age
--------------+-------------
 mail_lxtreme | -2074187459
 bp_live      |  1079895636
 template1    |  1076578064
 template0    | -2074187459
(4 rows)

mail_lxtreme is a test mail db and I don't care about it. So it could have been deleted without any worries...

Which databases are a problem? Is it template0, or bp_live and template1?

Thanks

John

Tom Lane wrote:
John Sidney-Woollett <[EMAIL PROTECTED]> writes: I can restore the file system backup of pgsql/data to another database server and then get the info from pg_database. Or I can import a dump file from 15 minutes before I re-inited the database...

Importing a dump will tell you nothing at all, as all the data will be freshly loaded.

What exactly am I looking for though?

SELECT datname, age(datfrozenxid) FROM pg_database; where the second column approaches 2 billion. Alternatively, wait a few weeks and note which entries in your live database are increasing rather than staying near 1 billion.

regards, tom lane
Re: [GENERAL] Please HELP - URGENT - transaction wraparound error
Hmm. I'm pretty sure that database mail_lxtreme was unused (no connections/activity) - I didn't think that it would need to be vacuumed at all...

Just out of curiosity, would the wraparound error (for mail_lxtreme) actually have affected data in bp_live? Could I just have deleted mail_lxtreme and then continued to use bp_live as though nothing had happened? Or had database bp_live already been damaged by the wraparound?

Thanks for your great help/advice - it's much appreciated.

John

Tom Lane wrote:
John Sidney-Woollett <[EMAIL PROTECTED]> writes: OK, I restored the pgsql/data to another server and started up postgres and this is what I got:

SELECT datname, age(datfrozenxid) FROM pg_database;
   datname    |     age
--------------+-------------
 mail_lxtreme | -2074187459
 bp_live      |  1079895636
 template1    |  1076578064
 template0    | -2074187459
(4 rows)

mail_lxtreme is a test mail db and I don't care about it. So it could have been deleted without any worries... Which databases are a problem? Is it template0, or bp_live and template1?

mail_lxtreme is exactly the problem. You weren't vacuuming it... (template0 is a special case and can be ignored.)

regards, tom lane
Re: [GENERAL] Please HELP - URGENT - transaction wraparound error
Martijn

Thanks for the answers/thoughts...

Vacuuming the databases hammers the server, so the vacuums are spread out at different times during the night/morning. Plus template1 is vacuumed once a week.

I guess I was unlucky to have missed the vacuum on that unused database (due to my misunderstanding), and not to have been scanning the serverlog more frequently (if at all recently!).

My solution is to create a nagios script that scans entries in serverlog looking for WARN or ERROR messages in the past xx minutes. With this in place, I would have caught this error weeks ago before it bit me in the ass!

Stressful day, but learnt a lot...

Thanks to everyone for their input - great product and great support!

John

Martijn van Oosterhout wrote:
On Sun, Oct 30, 2005 at 06:41:45PM +0000, John Sidney-Woollett wrote: Hmm. I'm pretty sure that database mail_lxtreme was unused (no connections/activity) - I didn't think that it would need to be vacuumed at all...

A database that is never used still needs to be vacuumed. The only exception is if you VACUUM FREEZE, which puts the entire database in a frozen state which will never need vacuuming. This is how template0 is configured. Of course, once you make changes...

Just out of curiosity, would the wraparound error (for mail_lxtreme) actually have affected data in bp_live?

I doubt it but (thinking shared tables) I'll have to defer to someone more knowledgeable.

Could I just have deleted mail_lxtreme and then continued to use bp_live as though nothing had happened? Or had database bp_live already been damaged by the wraparound?

Everything would probably have been fine. BTW, I would have thought this message would have been appearing for the last billion transactions or so, didn't anyone notice?

To solve this forever, set up a cronjob for once a month: vacuumdb -a

This will vacuum every database, even if you don't know the names or where they came from. AIUI when you vacuum a database whose transactions are over a billion transactions old it automatically puts it in "frozen" state. If someone had happened to run "vacuumdb -a" anytime in the last few months, you might never have noticed the wraparound...

Hope this helps,
Re: [GENERAL] Please HELP - URGENT - transaction wraparound error
"Panic" - that's my middle name. ;) Had I known how to identify the database at fault, and that it would have had no effect on the other databases, then I would have handled this episode differently. In the event, things seem to be OK. Our old slave db is now acting as master and the old master rebuilt as the new slave ... courtesy of slon. I'd like to move to 8.1 but I'm waiting for a quiet period when there's less development/fire fighting so that I can test all the java components of our webapp and then manage the upgrade properly. Maybe suppressing other vacuums once a month, and running the "vacuumdb -a" option instead wouldn't be a bad idea... Many thanks for all your support and advice - you've been great help (and comfort). John Tom Lane wrote: John Sidney-Woollett <[EMAIL PROTECTED]> writes: Just out of curiousity would the wraparound error (for mail_lxtreme) actually have affected data in bp_live? Could I just have deleted mail_lxtreme and then continued to use bp_live as though nothing had happened? No, and yes, which is why panicking was not warranted ;-) Martijn's advice to be using "vacuumdb -a" every so often is well given, though. You could also consider switching over to autovacuum, particularly as of 8.1. (I'm not sure how much I trust the contrib version that exists in 8.0, and 7.4's is definitely pretty buggy, but I believe 8.1's can be relied on to prevent this sort of thing.) regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Please HELP - URGENT - transaction wraparound error
Lincoln Yeoh said:
> At 07:48 PM 10/30/2005 +0000, John Sidney-Woollett wrote:
>
>>"Panic" - that's my middle name. ;)
>>
>>Had I known how to identify the database at fault, and that it would have
>>had no effect on the other databases, then I would have handled this
>>episode differently.
>
> Wonder if it would be a good idea for the error messages to identify which
> databases might have lost data.
>
> However if you have a fair number of databases involved you might get a
> fair number of log messages. Still, I think I wouldn't mind 100 lines in
> the logs if I had 100 databases at risk...

Agreed!

John
Re: [GENERAL] Disappearing Records
Execute the following SQL on your pg cluster:

SELECT datname, age(datfrozenxid) FROM pg_database;
  datname  |    age
-----------+------------
 bp_live   | 1075940691
 template1 | 1130066178
 template0 |   56361936
(3 rows)

Apart from template0, which is a special case (provided it's frozen and read-only, which it is by default), you want the numbers in the age column to be less than 2 billion. This is achieved by vacuuming EACH database, including template1, regularly. You don't need to perform a full vacuum either. You just need to do it regularly.

From my understanding, if numbers in that column have gone negative then you have already experienced transaction wraparound. This may then be seen as "data loss" or missing records.

It's also worth checking your pg server log in pgdata/data/serverlog to see if you are seeing messages like this:

2005-10-27 05:55:55 WARNING: some databases have not been vacuumed in 2129225822 transactions
HINT: Better vacuum them within 18257825 transactions, or you may have a wraparound failure.

Here's an excerpt from the pg 7.4 manual:

With the standard freezing policy, the age column will start at one billion for a freshly-vacuumed database. When the age approaches two billion, the database must be vacuumed again to avoid risk of wraparound failures. Recommended practice is to vacuum each database at least once every half-a-billion (500 million) transactions, so as to provide plenty of safety margin. To help meet this rule, each database-wide VACUUM automatically delivers a warning if there are any pg_database entries showing an age of more than 1.5 billion transactions, for example:

play=# VACUUM;
WARNING: some databases have not been vacuumed in 1613770184 transactions
HINT: Better vacuum them within 533713463 transactions, or you may have a wraparound failure.
VACUUM

If you have suffered data loss for this reason, then you'll need to get help from the developers to see whether it can be recovered, or what you can do to reconstruct the data.

Good luck!

John

Venki wrote:
Hi, Has your system been used long enough that it could be subject to transaction ID wraparound? What is this? Can you give me more information on this, or some pointers to where I can get more information on this and how to solve it? Because I too have experienced this problem of disappearing records.

regards
Venki

---Original Message---
From: Tom Lane
Date: 11/01/05 20:30:51
To: Rory Browne
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Disappearing Records

Rory Browne <[EMAIL PROTECTED]> writes: What is the first thing you would do, when you find that your system has been losing information? Information is there at one stage, and later it's not.

Has your system been used long enough that it could be subject to transaction ID wraparound?

regards, tom lane
Re: [GENERAL] joining a query with a select count(*)
I think that something like this should work for you

SELECT child_pk, coalesce(cnt, 0)
FROM childtable c
left outer join (
  SELECT child_fk, count(*) as cnt
  FROM grandchildtable
  GROUP BY child_fk
) t ON (c.child_pk = t.child_fk);

IMHO, if Postgres had Oracle's (+) notation these would be a lot easier...

John

Matthew Terenzio wrote:
I want a query to return all the rows from one table along with a count of rows in another table that reference each returned row from the first table. For instance, if you had a table of children and another table of grandchildren that had a foreign key on the children table, I'd want to return all children with a count of the number of kids they had. I hope I can figure it out before I get a reply. Thanks
Re: [GENERAL] Selecting from two unrelated tables
You can use a union if the column types match {...off the top of my head...}

select id, title, added as sortcol
from table1
union
select id, headline, datetime as sortcol
from table2
order by sortcol
limit 25;

John

CSN wrote:
I have two tables:

items: id, title, added, ...
news: id, headline, datetime, ...

I'd like to select the latest 25 combined records from both tables. Is there a way to do this using just select?

Thanks, csn
Re: [GENERAL] cyclical redundancy checksum algorithm(s)?
Stepping back a bit...

Why not use an update trigger on the affected tables to record a lastupdated timestamp value when the record is changed? Surely this is simpler than computing some kind of row hash?

John

Karen Hill wrote:
Tom Lane wrote:
"Karen Hill" <[EMAIL PROTECTED]> writes: Ralph Kimball states that this is a way to check for changes. You just have an extra column for the crc checksum. When you go to update data, generate a crc checksum and compare it to the one in the crc column. If they are the same, your data has not changed.

You sure that's actually what he said? A change in CRC proves the data changed, but lack of a change does not prove it didn't.

On page 100 in the book, "The Data Warehouse Toolkit" Second Edition, Ralph Kimball writes the following: "Rather than checking each field to see if something has changed, we instead compute a checksum for the entire row all at once. A cyclic redundancy checksum (CRC) algorithm helps us quickly recognize that a wide messy row has changed without looking at each of its constituent fields."

On page 360 he writes: "To quickly determine if rows have changed, we rely on a cyclic redundancy checksum (CRC) algorithm. If the CRC is identical for the extracted record and the most recent row in the master table, then we ignore the extracted record. We don't need to check every column to be certain that the two rows match exactly."

People do sometimes use this logic in connection with much wider "summary" functions, such as an MD5 hash. I wouldn't trust it at all with a 32-bit CRC, and not much with a 64-bit CRC. Too much risk of collision.
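(A minimal sketch of that trigger approach - the table and column names here are purely hypothetical:)

-- Record when each row was last modified, instead of hashing the row.
ALTER TABLE mytable ADD COLUMN lastupdated timestamp;

CREATE OR REPLACE FUNCTION set_lastupdated() RETURNS trigger AS '
BEGIN
  NEW.lastupdated := now();
  RETURN NEW;
END' LANGUAGE plpgsql;

CREATE TRIGGER trg_mytable_lastupdated
  BEFORE INSERT OR UPDATE ON mytable
  FOR EACH ROW EXECUTE PROCEDURE set_lastupdated();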
Re: [GENERAL] cyclical redundancy checksum algorithm(s)?
Ah, good point! Missed the subtlety of what was being asked.

John

Joachim Wieland wrote:
On Thu, Sep 28, 2006 at 07:09:43AM +0100, John Sidney-Woollett wrote: Why not use an update trigger on the affected tables to record a lastupdated timestamp value when the record is changed? Surely this is simpler than computing some kind of row hash?

It depends on how you define "change". With the triggers you propose, an UPDATE table SET col = col; is a change because there was a write operation. Any hash function's output would be "no change" because the actual data did not change. An update might entail an expensive update of some external data, so you might want to make sure that data really got modified.
Re: [GENERAL] Storing images in PostgreSQL databases (again)
Merlin Moncure wrote:
i'm wondering if anybody has ever attempted to manage large collections of binary objects inside the database and has advice here.

We have designed and built an image library using Postgres and NFS servers which currently holds 1.4 million images totalling more than 250Gb. Not having the images in the database keeps the database lightweight, and allows us to use Slony to replicate - something we could not do with blob data (I believe).

If you want to scale up, I think it is worth keeping the images out of the database. Just use the database to store the filename/location and meta data associated with each image. Otherwise the images bloat the database... Backups are small (the meta data is lightweight), and we can use Slony for data redundancy. The NFS servers are rsync'd to physically back the images up.

This setup means that we have to have a layer above the database which keeps the database and images on the file system in sync. The database and java layer above, together with the NFS servers, have effectively allowed us to build a virtualized file system for storing images and meta data. We can plug in more NFS shares as our storage requirements grow, and the database keeps track of disk usage within each physical disk volume.

This setup appears to give us good performance and hasn't been too difficult to maintain or administer. For a setup this size which is growing daily, storing the images in the database was not really a sensible option. However, with a smaller setup, keeping the images in the database definitely keeps things simpler...

John
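(Purely as an illustration of what "store the filename/location and meta data" can mean - this is not the actual schema described above - the database side of such a setup could be as small as:)

-- Illustrative metadata tables: images live on NFS volumes, and the
-- database only records where each file is and how full each volume is.
CREATE TABLE volume (
    volume_id   serial PRIMARY KEY,
    mount_path  text NOT NULL,              -- e.g. an NFS mount point
    bytes_used  bigint NOT NULL DEFAULT 0   -- maintained by the application layer
);

CREATE TABLE image (
    image_id    serial PRIMARY KEY,
    volume_id   integer NOT NULL REFERENCES volume(volume_id),
    rel_path    text NOT NULL,              -- file location within the volume
    bytes       bigint NOT NULL,
    width_px    integer,
    height_px   integer,
    uploaded_at timestamp NOT NULL DEFAULT now()
);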
Re: [GENERAL] adjusting primary key
If you could alter the foreign key constraint such that the update on t1's primary key cascaded to t2, that would help. However, I'm not sure that you can alter the constraint in postgres once it's created.

Hopefully someone more knowledgeable will be able to respond.

John

Rafal Pietrak wrote:
Hi All, I have two tables:

CREATE TABLE t1 (id int not null unique, info text);
CREATE TABLE t2 (id int, grp int references t1(id), info text);

Now, at a certain point (both tables populated with thousands of records, and continuously referenced by users), I need to adjust the value of an ID field of table T1. How can I do that? On the live system? Obvious solutions like:

UPDATE t1 SET id=239840 where id=9489;

or in fact:

UPDATE t1 SET id=id+1 where id<1000;

wouldn't work, regrettably. Naturally I need to have column t2(grp) adjusted accordingly - within a single transaction.

Asking this because, currently, I've learned that I can adjust the structure of my database (add/remove columns at will, rename those, etc.), but I'm really stuck with this 'looking so simple' task. Today I dump the database and perl-edit whatever's necessary and restore the database. But that's not a solution for a live system. Is there a way to get this done? live/on-line?
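(One way to get there is to drop and recreate the constraint with ON UPDATE CASCADE; the constraint name below is only a guess - check \d t2 for the real one.)

BEGIN;
-- drop the existing foreign key (the name here is hypothetical)
ALTER TABLE t2 DROP CONSTRAINT t2_grp_fkey;
-- recreate it so that changes to t1.id propagate to t2.grp
ALTER TABLE t2 ADD CONSTRAINT t2_grp_fkey
  FOREIGN KEY (grp) REFERENCES t1(id) ON UPDATE CASCADE;
COMMIT;

-- after which adjusting the key becomes a single statement:
UPDATE t1 SET id = 239840 WHERE id = 9489;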
[GENERAL] Upgrade 7.4 to 8.1 or 8.2?
We're looking to upgrade from 7.4 - should we go to 8.1.5 or 8.2?

We have two databases, 7.4.6 and 7.4.11, in a master/slave config using Slony. Both databases use the C locale with UTF-8 encoding on unix.

We've dumped and loaded the data into an 8.1.4 database and have seen no problems with invalid UTF-8 sequences. So we're fairly happy that we can upgrade to 8.1.5 pretty easily using Slony.

We're really looking for some extra performance right now. Are the differences between 8.1.5 and 8.2 significant? Is 8.2 more about speed or new features?

John
Re: [GENERAL] Upgrade 7.4 to 8.1 or 8.2?
No, I cannot risk live data... But I (think I) read that 8.2 was expected to go final in November/December. So my question was: is it worth waiting for 8.2 final, or going with 8.1.5 now?

I guess going with 8.1.5 is what we should do.

Thanks

John

Ron Johnson wrote:
On 10/20/06 10:09, John Sidney-Woollett wrote: We're looking to upgrade from 7.4 - should we go to 8.1.5 or 8.2? We have two databases, 7.4.6 and 7.4.11, in a master/slave config using Slony. Both databases use the C locale with UTF-8 encoding on unix. We've dumped and loaded the data into an 8.1.4 database and have seen no problems with invalid UTF-8 sequences. So we're fairly happy that we can upgrade to 8.1.5 pretty easily using Slony. We're really looking for some extra performance right now. Are the differences between 8.1.5 and 8.2 significant? Is 8.2 more about speed or new features?

8.2 is not released yet. Can you risk your app on beta software?

Ron Johnson, Jr.
Jefferson LA USA
Re: [GENERAL] Upgrade 7.4 to 8.1 or 8.2?
Is there a page that describes the new features/improvements between 8.1.5 and 8.2? I couldn't find one on the postgres website.

If there aren't *big* performance gains between 8.1.5 and 8.2 then we'll go with 8.1.5 and leave the 8.2 upgrade till sometime next summer.

John

Tom Lane wrote:
John Sidney-Woollett <[EMAIL PROTECTED]> writes: Are the differences between 8.1.5 and 8.2 significant? Is 8.2 more about speed or new features?

8.2 is still all about beta testing.

regards, tom lane
Re: [GENERAL] Upgrade 7.4 to 8.1 or 8.2?
Thanks for the link - that was very useful.

John

Alvaro Herrera wrote:
John Sidney-Woollett wrote: Is there a page that describes the new features/improvements between 8.1.5 and 8.2? I couldn't find one on the postgres website. If there aren't *big* performance gains between 8.1.5 and 8.2 then we'll go with 8.1.5 and leave the 8.2 upgrade till sometime next summer.

There are a lot of performance improvements. The release notes are here: http://momjian.us/main/writings/pgsql/sgml/release-8-2.html
Re: [GENERAL] Upgrade 7.4 to 8.1 or 8.2?
Thanks for the info. I think that we'll move to 8.1.5 first, and then 8.2 when it's stable. We have to keep our test and production systems in sync (version-wise).

John

Merlin Moncure wrote:
I've been developing against 8.2 for months without a single 8.2 specific problem. I run both linux and windows in high load environments (not so much windows these days tho). I'm going to go against the grain here and say go for it: yes, it is faster. It's up to you to determine a test period long enough to build up a trust level.

possible downsides:
* catalog changes during beta period (dump/reload to update to 8.2 proper, etc).
* last minute feature change. note recent small change in insert returning as an example.
* risk of undiscovered bug

ideally, i'd develop/test vs. 8.2 beta, and switch production system at 8.2 release.

merlin
Re: [GENERAL] grouping excluding some text results
Off the top of my head (and not sure how optimized)...

select t1.id, t1.score, t1.description
from scoretable t1,
(
  select id, max(score) as score
  from scoretable
  group by id
) as t2
where t1.id = t2.id
and t1.score = t2.score
order by t1.id

If you get duplicated rows back, then try using

select distinct t1.id, t1.score, t1.description ...

instead.

Hope that helps.

John

Fip wrote:
Hi, how can I select just one result per type, excluding the rest? I have something like this, as the result of a join:

| ID  | score | description |
-----------------------------
MG01    56    "textual description1..."
MG02    47    "another text ..."
MG02    55    "textual description, text"   < note this
MG02    55    "textual description, text"   < note this
MG01    35    "this is a different text"
MG02    61    "random chars..."
(...)

I know that it is possible to select by grouping if I use an aggregate function:

select ID, max(score) from table group by ID

but I want the description too, only one description, and I have a very large dataset; I cannot cycle with more subselections, this requires too much time. Also I want to exclude duplicates: only one ID. What can I do?
Re: [GENERAL] ORDER BY
Alternative options, for what they're worth - you'd have to EXPLAIN them to see how efficient they are

select id, name
from (
  select lower(name) as sortkey, id, name
  from table
  where name != 'Other'
  union
  select 'z' as sortkey, id, name
  from table
  where name = 'Other'
) as t
order by sortkey

select id, name
from (
  select case when name='Other' then 'z' else lower(name) end as sortkey, id, name
  from table
) as t
order by sortkey

Notice that the sort will be case insensitive in these examples, which may be something that you also want.

John

George Pavlov wrote:
For larger tables, you may have to resort to a union:

select * from foo where name != 'Other' order by name
union
select * from foo where name = 'Other'

Alas, this suggestion is wrong on two counts: (a) UNION expects a single ORDER BY that applies to the whole recordset and which has to come at the end; (b) UNION re-sorts anyway (it needs to eliminate the dupes) -- maybe you are thinking UNION ALL? So, to follow your advice he may want a query like this, although it seems quite silly and there still isn't an ironclad guarantee re. the final result sorting:

select * from (select * from foo where name != 'Other' order by name) x
union all
select * from foo where name = 'Other'
Re: [GENERAL] HELP: Urgent, Vacuum problem
To recover disk space, reindex the heavily updated tables. You can do this while the database is in production. Check the REINDEX command.

John

Schwenker, Stephen wrote:
Hello, I'm having a major vacuuming problem. I used to do a full vacuum every morning on my postgres database to clean up empty space on a table, but because of its size, the locking of the database causes my application server to max out the database connections and causes database errors.

To fix that problem, I have turned off the full vacuum and am just doing a standard analyze vacuum. Now I'm getting very close to running out of space on my disks because the table keeps on growing and the database is not re-using deleted record space. I know this because I delete 99% of the records from the table after I have exported them, but the size of the database tables is not decreasing.

Now I can't shrink the size of the tables because the full vacuum takes too long to run - over 2 hours - and locks the table for too long.

Can anyone help me with fixing my problem with vacuuming and disk space? I'm using version 7.4.2 on solaris.

Thank you,
Steve.
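(For example - the table and index names here are only placeholders:)

-- Rebuild all indexes on the heavily updated table; the old, bloated
-- index files are dropped afterwards, returning their space to the OS.
REINDEX TABLE my_big_table;

-- or rebuild just one index:
REINDEX INDEX my_big_table_pkey;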
Re: [GENERAL] Command "connect by prior" in PostgreSQL ?
Take a look at contrib/tablefunc - there is a function called connectby() that should do what you're looking for. Otherwise roll your own...

http://archives.postgresql.org/pgsql-sql/2003-10/msg00374.php

John

Thomas Kellerer wrote:
On 08.01.2007 17:24 Scott Marlowe wrote:
On Mon, 2007-01-08 at 07:38, Leandro Repolho wrote: Hello everybody, in Oracle I use the command "connect by prior" and I need to use it in PostgreSQL. What is the syntax?

What does connect by prior do? Reuse a connection from one db to another? I don't think there's anything like that in postgresql. You might wanna tell us what an oracle command does next time, since many folks here aren't that familiar with Oracle. Heck, I use oracle everyday and I'm not familiar with connect by prior...

It's used to retrieve hierarchical data, e.g. a parent/child relationship. The connect by operator defines the columns which define the hierarchy.

Thomas
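(From memory, usage looks roughly like the example below - check the README in contrib/tablefunc for the exact signature and the required column definition list; the table name here comes from its documentation.)

-- connectby(relname, key_field, parent_key_field, start_with, max_depth, branch_delim)
-- returns the subtree rooted at 'row2'; a max_depth of 0 means no limit.
SELECT *
  FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'row2', 0, '~')
       AS t(keyid text, parent_keyid text, level int, branch text);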
Re: [GENERAL] Anyone know a good opensource CRM that actually installs with Posgtres?
Centric CRM works with Postgres.

John

Mario Guenterberg wrote:
On Fri, Mar 09, 2007 at 01:22:22AM +0000, Bradley Kieser wrote: I hope that someone has cracked this one because I have run into a brick wall the entire week and after 3 all-nighters with bad installations, I would appreciate hearing from others! I am looking for a decent OpenSource CRM system that will run with Postgres. SugarCRM seems to be the most popular but it's MySQL-centric and its opensource parts are very restricted.

Hi...
lxOffice runs with PostgreSQL. http://www.lx-office.org

regards
Mario
Re: [GENERAL] UNION with ORDER BY -allowed?
Wrap the whole statement in another select

select col1, col2, col5, col6
from (
  SELECT col1, col2, col5, col6 FROM table WHERE col2 = 'X'
  UNION
  SELECT col3, col4, col5, col6 FROM table WHERE col4 = 'X'
) as t
order by col1;

(After the UNION, the second query's col3 values appear under the first query's column name, col1, so the outer ORDER BY only needs col1.)

John Sidney-Woollett

Chris Green wrote:
It's not quite clear (to me at least) whether I can have a UNION and an ORDER BY in a SELECT statement. What I want to do is:

SELECT col1, col2, col5, col6 FROM table WHERE col2 = 'X'
UNION
SELECT col3, col4, col5, col6 FROM table WHERE col4 = 'X'
ORDER BY coalesce(col1, col3)

Is this valid syntax allowed by postgresql? (I'm not at the system where postgresql is installed at the moment so I can't just try it.) col1 and col3 are both DATE columns. col2 and col4 are both varchar(1). I want the ORDER BY to order the result of the UNION.
Re: [GENERAL] pgplsql SELECT INTO ... FOR UPDATE (transaction/locking
I've got a few plpgsql stored functions (in 7.4.x) that use the "select x into y from table where condition for update" syntax without any problem. Maybe there's something else going on?

John Sidney-Woollett

Eric Brown wrote:
I'm trying to write a stored procedure in plpgsql that selects a row and possibly increments one of its fields. I thought I would do SELECT INTO my_record * FROM my_table FOR UPDATE WHERE ..., but apparently plpgsql doesn't like the FOR UPDATE in a stored procedure. Does plpgsql automatically lock any rows I read until the stored procedure exits? I'm just not sure how to get the functionality I'm looking for and not have to concern myself with concurrency. Example:

create table t_test (x int, y int);

create or replace function f_test(int) returns void as '
declare
  r record;
begin
  select into r *, oid from t_test -- FOR UPDATE
  where x = $1;
  if found then
    update t_test set y=y+1 where oid = r.oid;
  end if;
  return;
end' language plpgsql;

insert into t_test values (1,1);
select f_test(1);
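(For what it's worth, the clause order matters here: FOR UPDATE has to come after the WHERE clause, not before it. A minimal sketch of the pattern, with purely illustrative names, would be:)

create or replace function f_increment(int) returns void as '
declare
  r record;
begin
  -- lock the matching row until the end of the enclosing transaction
  select into r * from t_test where x = $1 for update;
  if found then
    update t_test set y = y + 1 where x = $1;
  end if;
  return;
end' language plpgsql;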
Re: [GENERAL]
Useful to add a title to your messages before you post...

How about:

select parentid, count(*) as number_of_children
from childtable
group by parentid
order by parentid;

If there are parent records that have no children then these will be omitted. The query above totally ignores the parent table (which you may not want it to do).

John Sidney-Woollett

Joost Kraaijeveld wrote:
Hi all, is it possible to count and display the number of children of a parent in a generic query?

parent table: id
child table: id, parent_id

Example output of the query:

parentid    number_of_children
parent1     2
parent2     6
parent3     0

Groeten, Joost Kraaijeveld
Re: [GENERAL] Select number of children of a parent query
You might need to read a good SQL primer to get a full explanation of this feature.

Reading from the docs, http://www.postgresql.org/docs/7.4/interactive/sql-select.html

The optional GROUP BY clause has the general form

GROUP BY expression [, ...]

GROUP BY will condense into a single row all selected rows that share the same values for the grouped expressions. expression can be an input column name, or the name or ordinal number of an output column (SELECT list item), or an arbitrary expression formed from input-column values. In case of ambiguity, a GROUP BY name will be interpreted as an input-column name rather than an output column name.

Aggregate functions, if any are used, are computed across all rows making up each group, producing a separate value for each group (whereas without GROUP BY, an aggregate produces a single value computed across all the selected rows). When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions, since there would be more than one possible value to return for an ungrouped column.

John Sidney-Woollett

Joost Kraaijeveld wrote:
Hi John,

John Sidney-Woollett schreef: Useful to add a title to your messages before you post...

It escaped before finishing.

How about: select parentid, count(*) as number_of_children from childtable group by parentid order by parentid;

It works, but can you tell me why it works? Is the count(*) over the "group by parentid"?

Groeten, Joost Kraaijeveld
[GENERAL] [OT] Dell Perc 3/di raid monitoring for PG db server
I know that this is off topic, but I'm not getting anywhere and wondered if anyone has come across and solved this problem before.

We use Nagios to monitor our servers. Our Dell PE 1550s and 1750s are monitored using a check_megaraid nagios plugin which uses SNMP and works fine.

Our postgres databases are running on two Dell PE 2650s which use an aacraid (not megaraid) 3/di controller. This means that we need a different toolset to check the array. We currently appear to have three options:

1) Use the afacli (command line tool) to output the current array status and diff it against a previous known good state and report back accordingly

2) or setup snmp on the db server and modify the check_megaraid nagios plugin to check the response from the aacraid snmp reply (problem is that there is virtually no documentation on getting snmp configured for the aacraid controller)

3) or find someone who has already solved this problem

Does anyone have a reliable way of monitoring the state of a Perc 3/di array which works with nagios that they'd like to share?

Thanks to anyone who can help.

John Sidney-Woollett
Re: [GENERAL] [OT] found nagios plugins for monitoring Dell server
For anyone who needs nagios plugins to monitor the state of their raid controllers and disk arrays for their Dell db servers:

There is a nagios plugin for the megaraid controllers at http://www.ibiblio.org/john/megaraid/

And a plugin for the aacraid controllers at http://www.iamafreeman.com/projects/nagios_plugins/

The developers of these plugins have both been really helpful, and the plugins appear to work well. You need to have snmp installed on the server to be monitored.

The aacraid plugin gives info like:

OK: sda 52GB RAID 5 temp: 77F: 1 global hotspare, 4 ok

while the megaraid plugin gives an OK or Error message.

Hope that helps someone else in the future.

John Sidney-Woollett

John Sidney-Woollett wrote:
I know that this is off topic, but I'm not getting anywhere and wondered if anyone has come across and solved this problem before. We use Nagios to monitor our servers. Our Dell PE 1550s and 1750s are monitored using a check_megaraid nagios plugin which uses SNMP and works fine. Our postgres databases are running on two Dell PE 2650s which use an aacraid (not megaraid) 3/di controller. This means that we need a different toolset to check the array. We currently appear to have three options: 1) Use the afacli (command line tool) to output the current array status and diff it against a previous known good state and report back accordingly 2) or setup snmp on the db server and modify the check_megaraid nagios plugin to check the response from the aacraid snmp reply (problem is that there is virtually no documentation on getting snmp configured for the aacraid controller) 3) or find someone who has already solved this problem. Does anyone have a reliable way of monitoring the state of a Perc 3/di array which works with nagios that they'd like to share? Thanks to anyone who can help.

John Sidney-Woollett
Re: [GENERAL] ORDER BY in UNION query
Try select a.col1 as ename from a union select b.othercolumn as ename from b order by ename Give the columns you want to order on the same name using the "as XXX" syntax, and remove the "a." prefix from the order statement. John Sidney-Woollett Antony Paul wrote: Hi, I need to use ORDER BY clause in a UNION query and the Order BY columns are not included in the SELECT statement. I tried like this (select from a) UNION (select . from b) order by a.ename; It says that ERROR: Attribute "ename" not found How to do this. rgds Antony Paul ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
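As a concrete sketch (the tables a and b and their columns are just the placeholders from the question), the ORDER BY applies to the result of the whole UNION, so it has to name an output column of the combined result, or give its ordinal position:

SELECT a.col1 AS ename FROM a
UNION
SELECT b.othercolumn AS ename FROM b
ORDER BY ename;

-- or order by the output column's position:
SELECT a.col1 FROM a
UNION
SELECT b.othercolumn FROM b
ORDER BY 1;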
Re: [GENERAL] SELECT from multiple tables (not join though)
I hope I've read your question properly - I seem to be giving answers to un-asked questions lately! ;) How about... SELECT file_name from file_info_1 WHERE file_name='/' union SELECT file_name from file_info_2 WHERE file_parent_dir='/' union SELECT file_name from file_info_3 WHERE file_parent_dir='/' order by file_name; Does that do what you want? John Sidney-Woollett Madison Kelly wrote: Hi all, I have another question, I hope it isn't too basic. ^.^ I want to do a select from multiple tables but not join them. What I am trying to do is something like this (though this doesn't work as I need): SELECT a.file_name, b.file_name, c.file_name FROM file_info_1 a, file_info_2 b, file_info_3 c WHERE a.file_name='/' AND b.file_parent_dir='/' AND c.file_parent_dir='/'; That returns every combination of the results from the three tables which is a huge number. What I need is to return all of the matches in all of the tables in a single column. Once I have all the matches in one column the next trick is to sort the combined results (any tips there?). I hope the question was clear. Please let me know if it wasn't. Thanks! Madison ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
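One extra detail worth knowing: plain UNION removes duplicate rows across the three tables, which also forces a sort/unique step. If every match should be kept (and it is usually faster), UNION ALL does that; the same hypothetical query would be:

SELECT file_name FROM file_info_1 WHERE file_name='/'
UNION ALL
SELECT file_name FROM file_info_2 WHERE file_parent_dir='/'
UNION ALL
SELECT file_name FROM file_info_3 WHERE file_parent_dir='/'
ORDER BY file_name;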
[GENERAL] how to determine initdb locale (after the event)?
I need to create a new database v7.4.6 on a new server that will act as a copy of our current live 7.4.1 database. Ultimately replicated using slony (we hope)... I have installed 7.4.6 on the new server - no problems so far. Now I'm trying to figure out what locale options were passed to initdb for our current live server when it was created. (And whether they are in fact correct for our needs - see below?!) pg_controldata gives the following output:

pg_control version number:            72
Catalog version number:               200310211
Database cluster state:               in production
pg_control last modified:             Thu 13 Jan 2005 12:26:47 GMT
Current log file ID:                  18
Next log file segment:                133
Latest checkpoint location:           12/84411BBC
Prior checkpoint location:            12/843E0D78
Latest checkpoint's REDO location:    12/84411BBC
Latest checkpoint's UNDO location:    0/0
Latest checkpoint's StartUpID:        167
Latest checkpoint's NextXID:          392845890
Latest checkpoint's NextOID:          400628
Time of latest checkpoint:            Thu 13 Jan 2005 12:26:45 GMT
Database block size:                  8192
Blocks per segment of large relation: 131072
Maximum length of identifiers:        64
Maximum number of function arguments: 32
Date/time type storage:               floating-point numbers
Maximum length of locale name:        128
LC_COLLATE:                           C
LC_CTYPE:                             C

and locale -a offers the following (on both servers): locale -a C en_AU en_AU.utf8 en_BW en_BW.utf8 en_CA en_CA.utf8 en_DK en_DK.utf8 en_GB en_GB.iso885915 en_GB.utf8 en_HK en_HK.utf8 en_IE [EMAIL PROTECTED] en_IE.utf8 [EMAIL PROTECTED] en_IN en_NZ en_NZ.utf8 en_PH en_PH.utf8 en_SG en_SG.utf8 en_US en_US.iso885915 en_US.utf8 en_ZA en_ZA.utf8 en_ZW en_ZW.utf8 POSIX

Listing the databases (on our live server) gives the following output:

   Name    |  Owner   | Encoding
-----------+----------+----------
 db_live   | postgres | UNICODE
 template0 | postgres | UNICODE
 template1 | postgres | UNICODE

Basically our database serves a Java web application where we want to store data from several European languages (mixed in the same tables). We're happy to accept "english" sorting rules... Question 1) Should we try to determine what locale setting was used for our live database and use that for the new server, or should we be using the following for the new db (because we're in the UK): initdb --locale en_GB.utf8 Question 2) Or should we go for the C locale regardless of the current locale setting on the live server (since we're happy with "english" sorting and indexing)? And does that work with UNICODE encoding? Question 3) If the new database locale is different from the current live one, are we going to get problems when dumping and restoring between the two databases? Question 4) Because we want to make use of slony - could differences in the locale cause problems down the line - if we use a different locale for initdb for the new db? Sorry if I'm asking dumb questions, but this whole locale thing and its implications has me totally baffled. I need to get this right because I can't afford to trash our db or data. Thanks for any help anyone can give. John Sidney-Woollett ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
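For what it's worth, the per-database encodings can also be read with a plain query rather than from the psql listing; the collation locale is cluster-wide, so pg_controldata (as above) is the authoritative place to read LC_COLLATE/LC_CTYPE, though newer releases also expose it as a read-only setting. A sketch (I'm not certain SHOW lc_collate is available as far back as 7.4.1, so treat that line as optional):

-- encoding of each database
SELECT datname, pg_catalog.pg_encoding_to_char(encoding) AS encoding
FROM pg_database;

-- cluster-wide collation locale (read-only setting on newer releases)
SHOW lc_collate;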
[GENERAL] C locale + unicode
Does anyone know if it's permitted to use the 'C' locale with a UNICODE encoded database in 7.4.6? And will it work correctly? Or do you have to use a en_XX.utf8 locale if you want to use unicode encoding for your databases? John Sidney-Woollett ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] C locale + unicode
Tom, thanks for the info. Do upper() and lower() only work correctly for postgres v8 UTF-8 encoded databases? (They don't seem to work on chars > standard ascii on my 7.4.6 db). Is this locale or encoding specific issue? Is there likely to be a significant difference in speed between a database using a UTF-8 locale and the C locale (if you don't care about the small issues you detailed below)? Thanks. John Sidney-Woollett Tom Lane wrote: John Sidney-Woollett <[EMAIL PROTECTED]> writes: Does anyone know if it's permitted to use the 'C' locale with a UNICODE encoded database in 7.4.6? Yes. And will it work correctly? For suitably small values of "correctly", sure. Textual sort ordering would be by byte values, which might be a bit unintuitive for Unicode characters. And I don't think upper()/lower() would work very nicely for characters outside the basic ASCII set. But AFAIR those are the only gotchas. People in the Far East, who tend not to care about either of those points, use 'C' locale with various multibyte character sets all the time. regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] C locale + unicode
Thanks for the info - to the point and much appreciated! John Sidney-Woollett Tom Lane wrote: John Sidney-Woollett <[EMAIL PROTECTED]> writes: Do upper() and lower() only work correctly for postgres v8 UTF-8 encoded databases? (They don't seem to work on chars > standard ascii on my 7.4.6 db). Is this locale or encoding specific issue? Before 8.0, they don't work on multibyte characters, period. In 8.0 they work according to your locale setting. Is there likely to be a significant difference in speed between a database using a UTF-8 locale and the C locale (if you don't care about the small issues you detailed below)? I'd expect the C locale to be materially faster for text sorting. Don't have a number offhand. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED]) ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [GENERAL] Splitting queries across servers
We configured a slony1 cluster with a master and slave (both 7.4.6), and used the slave to serve read-only queries thereby offloading some of the work from the master database. This worked well for us. You could also take a look at pgpool to distribute your load - but I haven't actually used it, so can't give any advice... Hope that helps. John Sidney-Woollett Max wrote: Hello, Our postgresql database is getting too big to be handled by one server. We need the database to be in RAM and cannot afford swapping. At the moment, we're using only 3GB of RAM, however our business growth is going to drive this number into the double digits zone, maybe triple digits. What are our options ? I am a little bit ignorant in this part of RDBM. Can we distribute our tables across multiple servers (read server+postgres) and perform distributed SQL queries ? If so, how does that work ? I am totally resourceless! Thank you in advance Max. ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Moving from Sybase to Postgres - Stored Procedures
Have you tried looking at this section of the manual? http://www.postgresql.org/docs/7.4/interactive/plpgsql.html It details all the PL/pgSQL language constructs - I found it fine when converting from Oracle to Postgres... Just make sure you have installed the pl/pgsql language in template1 or your database before you try using it - see http://www.postgresql.org/docs/7.4/interactive/app-createlang.html or http://www.postgresql.org/docs/7.4/interactive/sql-createlanguage.html Or type /usr/local/pgsql/bin/createlang plpgsql template1 to install the language into template1, then create your database. Or install directly into your database... Hope that helps. John Sidney-Woollett Andre Schnoor wrote: "Joshua D. Drake" wrote: Andre Schnoor wrote: Hi, I am moving from Sybase to pgsql but have problems with stored procedures. The typical procedure uses a) named parameters, b) local variable declarations and assignments c) transactions d) cursors, views, etc. I can't seem to find these things in the Postgres function syntax. [...] Perhaps if you provided the actual problem? Is there a specific procedure that you are trying to port that you do not understand in the PgSQL sense? Thank you for asking, Joshua. I've put an example procedure skeleton here: CREATE PROCEDURE do_something @song_id int, @user_id int, @method int, @length int = 0, @date_exact datetime, @default_country int = null AS -- temporary variables DECLARE @artist int, @sample int, @date varchar(32), @country int BEGIN -- assign temporary variables select @date = convert(varchar(32),@date_exact,101) select @artist = user_id, @sample = is_sample from sto_song where song_id = @song_id -- perform conditional code if (@sample = 1) begin begin transaction ... do something ... commit transaction end else begin ... do something else ... end -- return results select result1 = ... some expression ..., result2 = ... another expression ... END I could not yet translate this to PgSQL, as I can't find any control structures, variable declaractions, etc. I assume this can be done through the Perl module, but I find this rather strange. I'm afraid that Perl requires to have the queries parsed and passed down each and every time, instead of having them compiled once. I also can't see the benefit of converting data objects back and forth to/from Perl while everything actually happens within Postgres. Am I missing something important? Greetings, Andre ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
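Not a full translation of Andre's procedure, but a rough PL/pgSQL sketch of the same shape to show where the Sybase constructs end up: in 7.4 parameters are positional and aliased (and there are no parameter defaults, so the caller passes every argument), variables go in a single DECLARE block, the body runs inside the caller's transaction so the explicit begin/commit transaction disappears, and a multi-column result would come back via RETURNS SETOF of a composite type. The table and column names are just the ones from his example; the branches are stubbed out with RAISE NOTICE:

CREATE OR REPLACE FUNCTION do_something(integer, integer, integer, integer, timestamp, integer)
RETURNS integer AS '
DECLARE
    pSongID         ALIAS FOR $1;
    pUserID         ALIAS FOR $2;
    pMethod         ALIAS FOR $3;
    pLength         ALIAS FOR $4;
    pDateExact      ALIAS FOR $5;
    pDefaultCountry ALIAS FOR $6;
    vArtist         integer;
    vSample         integer;
    vDate           varchar(32);
BEGIN
    -- convert(varchar(32), @date_exact, 101) becomes to_char with an explicit format
    vDate := to_char(pDateExact, ''MM/DD/YYYY'');

    SELECT user_id, is_sample INTO vArtist, vSample
      FROM sto_song WHERE song_id = pSongID;

    IF vSample = 1 THEN
        RAISE NOTICE ''song % is a sample'', pSongID;      -- ... do something ...
    ELSE
        RAISE NOTICE ''song % is not a sample'', pSongID;  -- ... do something else ...
    END IF;

    RETURN vArtist;
END;
' LANGUAGE 'plpgsql';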
Re: [GENERAL] PL/pgSQL functions and RETURN NEXT
Here's an example that I butchered to cut it down to size that should illustrate what you need to do (basically use a LOOP construct) CREATE TYPE customer.InvoiceItem AS ( WCCustOrderID varchar(16), OrderDate date, Currency varchar(3), TaxCode varchar(3), TaxRate numeric(10,3), Net numeric(10,2), Tax numeric(10,2), Gross numeric(10,2) ); CREATE OR REPLACE FUNCTION CUSTOMER.GetInvoiceStats(integer, integer) RETURNS setof customer.InvoiceItem AS ' -- generates invoice info for the year and month from the -- completed orders DECLARE vInv customer.InvoiceItem%rowtype; vCustOrderrecord; vStartDatedate; vEndDate date; BEGIN -- build the start and end dates vStartDate := to_date(''1-''||pMonth||''-''||pYear,''dd-mm-''); IF (pMonth > 11) then vEndDate := to_date(''1-1''||''-''||1+pYear,''dd-mm-''); ELSE vEndDate := to_date(''1-''||1+pMonth||''-''||pYear,''dd-mm-''); END IF; FOR vCustOrder IN SELECT * from customer.WCCustOrder WHERE OrderDate >= vStartDate AND OrderDate < vEndDate AND WCCustOrderStatusID = 9 LOOP vInv.WCCustOrderID := vCustOrder.WCCustOrderID; vInv.OrderDate := vCustOrder.OrderDate::date; vInv.Currency := vCustOrder.Currency; vInv.TaxCode := vCustOrder.WSTaxCode; vInv.TaxRate := vCustOrder.TaxRate; vInv.Gross := round(vCustOrder.Gross,2); vInv.Net := round(vCustOrder.Net,2); vInv.Tax := round(vCustOrder.Gross - vInv.Net,2); RETURN NEXT vInv; END LOOP; return; END; ' LANGUAGE 'plpgsql'; Hope that helps. John Sidney-Woollett Craig Bryden wrote: Hi Firstly, let me say that I am a newbie to PostgreSQL. I have written a PL/pgSQL function that will return a set of results. I have included the code below *** CREATE TYPE pr_SomeFunction_ReturnType as (ID smallint, TypeID smallint, Name varchar(50), Description varchar(500), TypeName varchar(20)); CREATE OR REPLACE FUNCTION pr_SomeFunction (p_TypeID smallint) RETURNS setof pr_SomeFunction_ReturnType AS $$ DECLARE r_Return pr_SomeFunction_ReturnType; BEGIN SELECT l.ID, l.TypeID, l.Name, l.Description, lt.Name as TypeName INTO r_Return FROM tb_Item l JOIN tb_ItemType lt ON l.TypeID = lt.TypeID; RETURN NEXT r_Return; RETURN; END; $$ LANGUAGE 'plpgsql'; *** When I run "select * from pr_SomeFunction(1::smallint);", I only get one record back, instead of two. In the tb_Items table, there are two records that meet the criteria, and if I run the query on it's own (ouside a function), I do get two records in the results. Any help with understanding the usage of RETURN NEXT will be greatly appreciated. Thanks Craig ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED]) ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
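Applied to Craig's function, the fix is just to iterate: wrap the query in a FOR ... LOOP and call RETURN NEXT once per row, instead of SELECT ... INTO (which only ever captures the first row). An untested sketch using his own names:

CREATE OR REPLACE FUNCTION pr_SomeFunction (p_TypeID smallint)
RETURNS setof pr_SomeFunction_ReturnType AS $$
DECLARE
    r_Return pr_SomeFunction_ReturnType;
BEGIN
    FOR r_Return IN
        SELECT l.ID, l.TypeID, l.Name, l.Description, lt.Name AS TypeName
        FROM tb_Item l
        JOIN tb_ItemType lt ON l.TypeID = lt.TypeID
        -- add: WHERE l.TypeID = p_TypeID   if the parameter is meant to filter the rows
    LOOP
        RETURN NEXT r_Return;
    END LOOP;
    RETURN;
END;
$$ LANGUAGE 'plpgsql';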
Re: [GENERAL] Dynamic SQL
Yes, it does. And it's a lot easier than DBMS_SQL too! Look at the EXECUTE command in the pl/pgsql programming language. See http://www.postgresql.org/docs/7.4/interactive/plpgsql-statements.html John Sidney-Woollett Mark Battersby wrote: Hi When looking at PostGres through the eyes of an Oracle Developer I was wondering if PostGres supports a feature called Dynamic SQL. Of course even better would be the ability to run PostGress/PL dynamically too. Dynamic SQL and Dynamic PL/SQL are useful when you don't know until runtime how your SQL or PL/SQL looks like. Any tips appreciated. /Mark ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
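A tiny sketch of what that looks like in PL/pgSQL (the table-counting function and table name are invented for illustration; note that in 7.4 an EXECUTE that returns rows is consumed with FOR ... IN EXECUTE, and quote_ident/quote_literal protect the interpolated pieces):

CREATE OR REPLACE FUNCTION count_rows(text) RETURNS bigint AS '
DECLARE
    pTable ALIAS FOR $1;
    vRow   record;
BEGIN
    -- the statement text is assembled and planned at run time
    FOR vRow IN EXECUTE ''SELECT count(*) AS n FROM '' || quote_ident(pTable) LOOP
        RETURN vRow.n;
    END LOOP;
    RETURN 0;
END;
' LANGUAGE 'plpgsql';

-- usage: SELECT count_rows('mytable');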
Re: [GENERAL] How to delete duplicate rows?
Can you not use your table's primary key value instead? If your table is created with OIDs you may be able to use those - although I don't know whether that is advisable or not since I never use OIDs... John Sidney-Woollett Clodoaldo Pinto wrote: This one must be obvious for most here. I have a 170 million rows table from which I want to eliminate duplicate "would be" keys and leave only uniques. I found a query in http://www.jlcomp.demon.co.uk/faq/duplicates.html for the oracle database but can't figure out how to refer to the row id in postgresql: delete from test where rowid not in (select min(rowid) from test group by a,b); How to refer to the row id? Any better way to do it? Regards, Clodoaldo Pinto ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] How to delete duplicate rows?
Doh, sorry - you're completely correct! Silly me... Can you not add a serial or sequence column to the table for the purposes of the de-dupe? Then create an index on that column in one operation at the end and use that in the way that you would use Oracle's rowid from the examples? John Sidney-Woollett Clodoaldo Pinto wrote: On Fri, 04 Feb 2005 07:38:26 +0000, John Sidney-Woollett <[EMAIL PROTECTED]> wrote: Can you not use your table's primary key value instead? John, If a primary key existed there would be no duplicates. The row insertion is made in 650k rows batches 8 times a day and a primary key make it very slow. Found also methods 2 and 3 in http://www.orafaq.com/faq/Server_Utilities/SQL/faq55.htm Method 3 also relies in the row id. If no one can help I will do this: Insert the distinct rows in a temporary table. Drop the index. Insert into the original from the temporary. Clodoaldo ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
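A concrete version of the copy-out-and-swap plan (a and b stand for the would-be key columns and the table names are made up; indexes and constraints have to be recreated on the new table afterwards, since CREATE TABLE AS does not copy them):

-- 1. copy one row per (a, b) combination into a new table
CREATE TABLE test_dedup AS
    SELECT DISTINCT ON (a, b) * FROM test;

-- 2. swap the tables over
BEGIN;
ALTER TABLE test RENAME TO test_old;
ALTER TABLE test_dedup RENAME TO test;
COMMIT;

-- 3. recreate the indexes on test, then: DROP TABLE test_old;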
[GENERAL] vacuum confusion
I've got a 7.4.6 db running on linux where we've been running a "full" vacuum on the database each night, and have recently switched to "vacuum analyse" instead. We keep seeing this message at the end of the vacuum run: WARNING: some databases have not been vacuumed in 2013308218 transactions HINT: Better vacuum them within 134175429 transactions, or you may have a wraparound failure. VACUUM Why are we seeing this message when the only databases in this cluster are the one we are vacuuming (each night), plus template0 and template1? Is there something that we're not doing right? John Sidney-Woollett ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] vacuum confusion
Ah.., no we're not vacuuming template0 or 1. I didn't realise that the transaction IDs would be stored there - I assumed that they'd be in our database. Do I need to (plain) vacuum, or vacuum full, template0 and template1? And is this something that can be done once a week rather than every night (like our main database)? Thanks for your help. John Sidney-Woollett Richard Huxton wrote: John Sidney-Woollett wrote: I've got a 7.4.6 db running on linux where we've been running a "full" vacuum on the database each night, and have recently switched to "vacuum analyse" instead. We keep seeing this message at the end of the vacuum run: WARNING: some databases have not been vacuumed in 2013308218 transactions HINT: Better vacuum them within 134175429 transactions, or you may have a wraparound failure. VACUUM Why are we seeing this message when the only databases in this cluster are the one we are vacuuming (each night), plus template0 and template1? Is there something that we're not doing right? Are you doing template0/1 too? The transaction IDs are shared between all databases AFAIK. Before the numbers wrap-around any "low" numbers need to be replaced by a "frozen" marker (I think it's 0 or 1). -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] vacuum confusion
Thanks Richard, I found the page too... However the implication (and my confusion) is that you need to vacuum your own databases only. It's not clear (to me) that you have to do the same for template0 and template1 as well. Perhaps when someone is updating the docs, something more explicit than this Recommended practice for most sites is to schedule a database-wide VACUUM once a day at a low-usage time of day, supplemented by more frequent vacuuming of heavily-updated tables if necessary. (If you have multiple databases in a cluster, don't forget to vacuum each one; the program vacuumdb may be helpful.) Use plain VACUUM, not VACUUM FULL, for routine vacuuming for space recovery. could be replaced by this Recommended practice for most sites is to schedule a database-wide VACUUM once a day at a low-usage time of day, supplemented by more frequent vacuuming of heavily-updated tables if necessary. (If you have multiple databases in a cluster, don't forget to vacuum each one (INCLUDING template0 and template1); the program vacuumdb may be helpful.) Use plain VACUUM, not VACUUM FULL, for routine vacuuming for space recovery. And perhaps an explicit reference to vacuuming template0/1 in the section on "Preventing transaction ID wraparound failures" would be helpful. I'll add a weekly cron job to vacuum these two template databases. Thanks for your help again. John Sidney-Woollett Richard Huxton wrote: John Sidney-Woollett wrote: Ah.., no we're not vacuuming template0 or 1. I didn't realise that the transaction ID would be stored here - I assumed that they'd be in our database. Do I need to need to (plain) vacuum, or vacuum full these template0 and template1? And is this something that can be done once a week rather than every night (like our main database)? Ah! Found the section of the manuals - see "Routine Database Maintenance Tasks" for details. AFAIK it's a simple vacuum and once a week is more than enough. The manual recommends once every 500million transactions, though you can leave it longer. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] vacuum confusion
I've just noticed that the v8 docs are MUCH better in explaining this than the 7.4.6 docs that I'm using (since I'm using 7.4.6 in production). Perhaps if the same texts play=# VACUUM; WARNING: some databases have not been vacuumed in 1613770184 transactions HINT: Better vacuum them within 533713463 transactions, or you may have a wraparound failure. VACUUM and warning To be sure of safety against transaction wraparound, it is necessary to vacuum every table, including system catalogs, in every database at least once every billion transactions. We have seen data loss situations caused by people deciding that they only needed to vacuum their active user tables, rather than issuing database-wide vacuum commands. That will appear to work fine ... for a while. were added to the 7.4.x docs that would help others too. Thanks John Sidney-Woollett Richard Huxton wrote: John Sidney-Woollett wrote: Ah.., no we're not vacuuming template0 or 1. I didn't realise that the transaction ID would be stored here - I assumed that they'd be in our database. Do I need to need to (plain) vacuum, or vacuum full these template0 and template1? And is this something that can be done once a week rather than every night (like our main database)? Ah! Found the section of the manuals - see "Routine Database Maintenance Tasks" for details. AFAIK it's a simple vacuum and once a week is more than enough. The manual recommends once every 500million transactions, though you can leave it longer. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] vacuum confusion
(Apologies if this message comes twice - my imap server and mail client had a little problem)... I've just noticed that the v8 docs are MUCH better in explaining this than the 7.4.6 docs that I'm using (since I'm using 7.4.6 in production). Perhaps if the same texts play=# VACUUM; WARNING: some databases have not been vacuumed in 1613770184 transactions HINT: Better vacuum them within 533713463 transactions, or you may have a wraparound failure. VACUUM and warning To be sure of safety against transaction wraparound, it is necessary to vacuum every table, including system catalogs, in every database at least once every billion transactions. We have seen data loss situations caused by people deciding that they only needed to vacuum their active user tables, rather than issuing database-wide vacuum commands. That will appear to work fine ... for a while. were added to the 7.4.x docs that would help others too. Thanks John Sidney-Woollett John Sidney-Woollett wrote: Thanks Richard, I found the page too... However the implication (and my confusion) is that you need to vacuum your own databases only. It's not clear (to me) that you have to do the same for template0 and template1 as well. Perhaps when someone is updating the docs, something more explicit than this Recommended practice for most sites is to schedule a database-wide VACUUM once a day at a low-usage time of day, supplemented by more frequent vacuuming of heavily-updated tables if necessary. (If you have multiple databases in a cluster, don't forget to vacuum each one; the program vacuumdb may be helpful.) Use plain VACUUM, not VACUUM FULL, for routine vacuuming for space recovery. could be replaced by this Recommended practice for most sites is to schedule a database-wide VACUUM once a day at a low-usage time of day, supplemented by more frequent vacuuming of heavily-updated tables if necessary. (If you have multiple databases in a cluster, don't forget to vacuum each one (INCLUDING template0 and template1); the program vacuumdb may be helpful.) Use plain VACUUM, not VACUUM FULL, for routine vacuuming for space recovery. And perhaps an explicit reference to vacuuming template0/1 in the section on "Preventing transaction ID wraparound failures" would be helpful. I'll add a weekly cron job to vacuum these two template databases. Thanks for your help again. John Sidney-Woollett Richard Huxton wrote: John Sidney-Woollett wrote: Ah.., no we're not vacuuming template0 or 1. I didn't realise that the transaction ID would be stored here - I assumed that they'd be in our database. Do I need to need to (plain) vacuum, or vacuum full these template0 and template1? And is this something that can be done once a week rather than every night (like our main database)? Ah! Found the section of the manuals - see "Routine Database Maintenance Tasks" for details. AFAIK it's a simple vacuum and once a week is more than enough. The manual recommends once every 500million transactions, though you can leave it longer. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] vacuum confusion
Thanks Tom, I figured it out after I saw the error message when trying to vacuum template0. I think the 7.4.x docs could do with more explanation. The 8.0 docs are better although a more explicit explanation stating that you MUST vacuum analyze template1 (periodically) would be useful. Thanks John Sidney-Woollett Tom Lane wrote: John Sidney-Woollett <[EMAIL PROTECTED]> writes: However the implication (and my confusion) is that you need to vacuum your own databases only. It's not clear (to me) that you have to do the same for template0 and template1 as well. You have to vacuum template1, but not template0 because the latter is marked not datallowconn. Not sure if this is adequately explained anywhere. The next-to-last para in section 21.1.3 does mention the datallowconn exception, but perhaps doesn't spell it out well enough. regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
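So the weekly cron job only needs to connect to template1 and issue a plain VACUUM; template0 can be left alone because it refuses connections. For reference, a couple of statements that cover it (run as the postgres superuser):

-- while connected to template1, e.g. via: psql template1
VACUUM;

-- datallowconn shows which databases accept connections (template0 reports false)
SELECT datname, datallowconn FROM pg_database;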
[GENERAL] Slony uninstall info/warning
Hopefully this will prevent data loss or problems for others using slony 1.0.5 and pg 7.4.6... We just got bitten by something we didn't foresee when completely uninstalling a slony replication cluster from the master and slave... MAKE SURE YOU STOP YOUR APPLICATION RUNNING AGAINST YOUR MASTER DATABASE WHEN REMOVING THE WHOLE SLONY CLUSTER, or at least re-cycle all your open connections after the event! The connections appear to "remember" or refer to objects which are removed by the uninstall node script. And you get lots of errors as a result (and possible data loss??)... Question: Why do our database objects still refer to removed slony objects after they are removed? John Sidney-Woollett More info... Our system is a web application which runs against a postgres 7.4.6 database. Tomcat is restarted at 5am each day. Last Friday afternoon we uninstalled the slony cluster (1 master + 1 slave) so that we could add a new second slave. (I find it easier to uninstall the cluster and then reapply a new setup, subscribe script rather than trying to add the single node.) The cluster was successfully removed, and then rebuilt with 1 master and 2 slave nodes. However, we didn't stop and start our web application which uses Tomcat connection pool and continued to run against the master (during the uninstall and rebuild). Everything appeared fine. Only today while checking something else did I notice lots of ERROR: relation with OID 1036711 does not exist errors in the postgres serverlog In our tomcat application we also saw lots of errors like org.postgresql.util.PSQLException: ERROR: relation with OID 1036711 does not exist at org.postgresql.util.PSQLException.parseServerError(PSQLException.java:139) at org.postgresql.core.QueryExecutor.executeV3(QueryExecutor.java:154) at org.postgresql.core.QueryExecutor.execute(QueryExecutor.java:101) at org.postgresql.core.QueryExecutor.execute(QueryExecutor.java:43) ... Basically these errors started when the cluster was uninstalled and continued until the web application was restarted (at 5am the following morning). I'm not sure what object had OID 1036711 - maybe the slony replication schema?? Is there a way to find out after the event? I do have daily full backups of the master each day going back 1 month - I could load one into another database and lookup the OID, if someone could explain how to do that. And if that would be useful? But the net effect is that we got bizarre failures in our application, and large gaps in primary key values (using sequences) in certain tables within the database. On the whole the app seemed to be behaving fine, with the odd error that we put down to user error... (bad assumption, I guess in hindsight) At the moment I'm trying to figure out exactly what kind of damage our data has suffered. If anyone has any ideas or suggestions as to what went wrong or what to check for, I'd appreciate hearing. Thanks John ---(end of broadcast)--- TIP 8: explain analyze is your friend
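On the question of identifying OID 1036711 after the event: a relation OID can be looked up in pg_class (and its schema via pg_namespace), but only in a cluster where the object still exists; a plain pg_dump restore assigns new OIDs, so a restored copy of the backup would not carry the original number. For what it's worth, the catalog queries are:

SELECT relname, relkind FROM pg_class WHERE oid = 1036711;

SELECT nspname FROM pg_namespace
WHERE oid = (SELECT relnamespace FROM pg_class WHERE oid = 1036711);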
Re: [GENERAL] Slony uninstall info/warning
Thanks for the info, Richard. I didn't think that it was a slony issue per se, but that a note should be added to the slony docs warning to recycle connections after making substantive changes to the schema. You're right, we use both (java) prepared statements and pl/pgsql functions. The data loss aspect is not so clear cut (for us). We definitely got records that failed to insert (missing sequence numbers in tables) while others did insert correctly into the same tables (using the same pl/pgsql functions). So we can't figure out what the pattern is. Although we do have our connection pool recycle idle connections - it could be that things worked when a new recycled connection was used by the web app, and failed when one of the "old/original" connections was used. This is probably what masked the errors for us... It would be great if postgres could "recompile" pl/pgsql functions whenever it found a missing object referenced within the function - chances are that it would compile cleanly (in this case) and then could be executed without error. something along the lines of execute function OK -> return result ERROR - OID -> recompile function, and re-execute OK -> return result ERROR - OID -> report error This would help get rid of the temp table in functions work around having to use an execute statement. Thanks for your help and feedback. John Sidney-Woollett Richard Huxton wrote: John Sidney-Woollett wrote: Hopefully this will prevent data loss or problems for others using slony 1.0.5 and pg 7.4.6... We just got bitten by something we didn't foresee when completely uninstalling a slony replication cluster from the master and slave... MAKE SURE YOU STOP YOUR APPLICATION RUNNING AGAINST YOUR MASTER DATABASE WHEN REMOVING THE WHOLE SLONY CLUSTER, or at least re-cycle all your open connections after the event! The connections appear to "remember" or refer to objects which are removed by the uninstall node script. And you get lots of errors as a result (and possible data loss??)... Question: Why do our database objects still refer to removed slony objects after they are removed? Well, there are two areas I know cache plans/OIDs: 1. Prepared statements 2. plpgsql functions At a guess, since you mention Java the first almost certainly applies to you. This isn't a slony issue so much as a cached plan issue. I'm guessing the same problems would occur if you were manually changing the database schema. Don't think you could get data loss (unless the application ignores errors). You will however get to see a wide range of OID related errors. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [GENERAL] Slony uninstall info/warning
Richard Huxton wrote: > Hmm - not sure you could do this without a savepoint to catch the > error. > However, it might be possible to add track dependencies with the > function (as with views). Then you'd have to issue a CASCADE to alter > the table. If you use Oracle and drop and recreate a table (or any other object) referred to by a function or procedure, and then execute the procedure/function, it will work fine. Oracle will figure out that it needs recompiling (including dependent objects) and will do it prior to execution. You don't get bitten by caches or OID issues... :( Maybe someone will get to this for 8.1? John Sidney-Woollett ps That having been said I still think postgres is very good, and we're only using 7.4 at the moment!... ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [Slony1-general] Re: [GENERAL] Slony uninstall info/warning
It doesn't make sense to me either. The error was always for the same OID. Like you I assumed that removing slony would not cause any problems to a running app. Hopefully someone more involved in Slony will be able to explain why my pl/pgsql functions all got broken after uninstalling slony, even if the issue is an artifact of postgres rather than slony itself. At any rate, I think that a warning is definitely needed in the slony manuals for the slonik uninstall node command. John Sidney-Woollett David Parker wrote: We recently ran into this as well, because in testing we had people leaving an application running against the database while they uninstalled slony. I'm curious, what OIDs would be missing exactly, since the application does not refer directly to any slony objects? Does the cached plan "know" about the slony trigger on a given table? I don't know the extent of information stored in plans. Thanks, because I was just beginning to be puzzled by this! - DAP ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] automating csv import
Have you tried creating a script to do the import operation, and then scheduling it to run periodically using cron (on unix) or at (on windows)? Seems like that is all you need to do... John Sidney-Woollett walker1145 wrote: I get a dump of some data in a CSV file and am able to import it manually using psql and \COPY without a problem. However I now need to automate this as it will need to be done every 2 hours 24/7. I've tried looking through the archives and nothing seems to touch upon this completely and the documentation doesn't mention this or if it does it seems to be some what obscure (at least to me). Does anyone have any suggestions? ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED]) ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
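The SQL half of such a script can be a single file that psql runs from cron; a minimal sketch with an invented staging table and file path (server-side COPY needs superuser rights and a path readable by the backend, and the CSV keyword needs 8.0 or later - on 7.4, or as a non-superuser, \copy with the same options from within psql does the job):

-- load_feed.sql, run e.g. every 2 hours with: psql -f load_feed.sql yourdb
BEGIN;
COPY import_staging FROM '/var/feeds/latest.csv' WITH CSV;
-- ... any cleanup / merge into the real tables goes here ...
COMMIT;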
Re: [GENERAL] PostgreSQL Replication
Try this for starters - it's a good introductory article http://www.onlamp.com/pub/a/onlamp/2004/11/18/slony.html Maybe someone else can provide other links. John Sidney-Woollett Dieter Schröder wrote: Hello all, I am currently migrating a few oracle servers to PostgreSQL and management wants us to have replication. We have found information about the slony replicator with blog posts and articles. We have not found any other replication systems reviewed, so is slony the best tool for the job? I am looking for some articles to share with our management before we make a choice. Sincerely, D. Schröder ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [GENERAL] Resend: Question about PostgreSQL, pgpool, and Postfix
Take a loot at postfix's proxymap feature - I believe it works for MySQL and Postgres, and may solve your problem. http://www.postfix.org/proxymap.8.html From the docs "To consolidate the number of open lookup tables by sharing one open table among multiple processes. For example, making mysql connections from every Postfix daemon process results in "too many connec- tions" errors. The solution: virtual_alias_maps = proxy:mysql:/etc/postfix/virtual_alias.cf The total number of connections is limited by the number of proxymap server processes." John Gregory Youngblood wrote: [I don't know if this message made it out before or not. If it did, please accept my apologies for the duplicate message. Thanks.] I'm running postfix 2.0.18 with a postgresql 8.0.3 database backend. I'm also using courier imap/pop servers connected to postgresql as well. All email users are stored in tables, with views providing lookup information to courier and postfix. It works very well, with one exception. Postfix likes to hang on to idle connections to the database, even if there are not that many postfix processes running. For example, with postfix: 18338 ?Ss 0:00 /usr/lib/postfix/master 18339 ?S 0:00 \_ pickup -l -t fifo -u 18340 ?S 0:00 \_ qmgr -l -t fifo -u 18344 ?S 0:00 \_ trivial-rewrite -n rewrite -t unix -u 18358 ?S 0:00 \_ smtpd -n smtp -t inet -u -s 2 18360 ?S 0:00 \_ smtpd -n smtp -t inet -u -s 2 18361 ?S 0:00 \_ smtpd -n smtp -t inet -u -s 2 18362 ?S 0:00 \_ smtpd -n smtp -t inet -u -s 2 18363 ?S 0:00 \_ cleanup -z -t unix -u 18370 ?S 0:00 \_ smtpd -n smtp -t inet -u -s 2 18371 ?S 0:00 \_ smtpd -n smtp -t inet -u -s 2 18372 ?S 0:00 \_ smtpd -n smtp -t inet -u -s 2 18373 ?S 0:00 \_ smtpd -n smtp -t inet -u -s 2 18386 ?S 0:00 \_ cleanup -z -t unix -u 18390 ?S 0:00 \_ cleanup -z -t unix -u 18397 ?S 0:00 \_ cleanup -z -t unix -u 18401 ?S 0:00 \_ cleanup -z -t unix -u 18402 ?S 0:00 \_ cleanup -z -t unix -u 18403 ?S 0:00 \_ cleanup -z -t unix -u 18427 ?S 0:00 \_ cleanup -z -t unix -u 18440 ?S 0:00 \_ smtpd -n smtp -t inet -u -s 2 18441 ?S 0:00 \_ smtpd -n smtp -t inet -u -s 2 18557 ?S 0:00 \_ cleanup -z -t unix -u 18558 ?S 0:00 \_ virtual -t unix This is what postgresql looks like: 17610 pts/1S 0:00 /usr/bin/postmaster -D /var/lib/pgsql/data 17612 pts/1S 0:00 \_ postgres: writer process 17613 pts/1S 0:00 \_ postgres: stats buffer process 17614 pts/1S 0:00 | \_ postgres: stats collector process 17916 pts/1S 0:00 \_ postgres: courier netio 127.0.0.1(3037) idle 18345 pts/1S 0:00 \_ postgres: transport netio 127.0.0.1(27587) idle 18346 pts/1S 0:00 \_ postgres: virtual netio 127.0.0.1(27588) idle 18347 pts/1S 0:00 \_ postgres: domains netio 127.0.0.1(27589) idle 18364 pts/1S 0:00 \_ postgres: transport netio 127.0.0.1(27590) idle 18365 pts/1S 0:00 \_ postgres: virtual netio 127.0.0.1(27591) idle 18366 pts/1S 0:00 \_ postgres: virtual netio 127.0.0.1(27592) idle 18367 pts/1S 0:00 \_ postgres: transport netio 127.0.0.1(27593) idle 18377 pts/1S 0:00 \_ postgres: transport netio 127.0.0.1(27596) idle 18378 pts/1S 0:00 \_ postgres: virtual netio 127.0.0.1(27597) idle 18379 pts/1S 0:00 \_ postgres: mailbox netio 127.0.0.1(27598) idle 18387 pts/1S 0:00 \_ postgres: transport netio 127.0.0.1(27601) idle 18388 pts/1S 0:00 \_ postgres: virtual netio 127.0.0.1(27602) idle 18389 pts/1S 0:00 \_ postgres: mailbox netio 127.0.0.1(27603) idle 18394 pts/1S 0:00 \_ postgres: transport netio 127.0.0.1(27604) idle 18395 pts/1S 0:00 \_ postgres: virtual netio 127.0.0.1(27605) idle 18396 pts/1S 0:00 \_ postgres: mailbox netio 
127.0.0.1(27606) idle 18398 pts/1S 0:00 \_ postgres: transport netio 127.0.0.1(27607) idle 18399 pts/1S 0:00 \_ postgres: virtual netio 127.0.0.1(27608) idle 18400 pts/1S 0:00 \_ postgres: mailbox netio 127.0.0.1(27609) idle 18404 pts/1S 0:00 \_ postgres: mailbox netio 127.0.0.1(27610) idle 18408 pts/1S 0:00 \_ postgres: transport netio 127.0.0.1(27611) idle 18409 pts/1S 0:00 \_ postgres: virtual netio 127.0.0.1(27612) idle 18410 pts/1S 0:00 \_ postgres: mailbox netio 127.0.0.1(27613) idle 18411 pts/1S 0:00 \_ postgres: transport netio 127.0.0.1(27614) idle 18412 pts/1S 0:00 \_ postgres: virtual netio 127.0.0.1(27615) idle 18413 pts/1S 0:00 \_ postgres: mailbox netio 127.0.0.1(27616) idle 18420 pts/1S 0:00 \_ postgres: transport netio 127.0.0.1(27618) idle 18
Re: [GENERAL] Memory Leakage Problem
We're seeing memory problems on one of our postgres databases. We're using 7.4.6, and I suspect the kernel version is a key factor with this problem. One running under Redhat Linux 2.4.18-14smp #1 SMP and the other Debian Linux 2.6.8.1-4-686-smp #1 SMP The second Debian server is a replicated slave using Slony. We NEVER see any problems on the "older" Redhat (our master) DB, whereas the Debian slave database requires slony and postgres to be stopped every 2-3 weeks. This server just consumes more and more memory until it goes swap crazy and the load averages start jumping through the roof. Stopping the two services restores the server to some sort of normality - the load averages drop dramatically and remain low. But the memory is only fully recovered by a server reboot. Over time memory gets used up, until you get to the point where those services require another stop and start. Just my 2 cents... John Will Glynn wrote: Mike Rylander wrote: Right, I can definitely see that happening. Some backends are upwards of 200M, some are just a few since they haven't been touched yet. Now, multiply that effect by N backends doing this at once, and you'll have a very skewed view of what's happening in your system. Absolutely ... I'd trust the totals reported by free and dstat a lot more than summing per-process numbers from ps or top. And there's the part that's confusing me: the numbers for used memory produced by free and dstat, after subtracting the buffers/cache amounts, are /larger/ than those that ps and top report. (top says the same thing as ps, on the whole.) I'm seeing the same thing on one of our 8.1 servers. Summing RSS from `ps` or RES from `top` accounts for about 1 GB, but `free` says: total used free sharedbuffers cached Mem: 40609683870328 190640 0 14788 432048 -/+ buffers/cache:3423492 637476 Swap: 2097144 1756801921464 That's 3.4 GB/170 MB in RAM/swap, up from 2.7 GB/0 last Thursday, 2.2 GB/0 last Monday, or 1.9 GB after a reboot ten days ago. Stopping Postgres brings down the number, but not all the way -- it drops to about 2.7 GB, even though the next most memory-intensive process is `ntpd` at 5 MB. (Before Postgres starts, there's less than 30 MB of stuff running.) The only way I've found to get this box back to normal is to reboot it. Now, I'm not blaming Pg for the apparent discrepancy in calculated vs. reported-by-free memory usage, but I only noticed this after upgrading to 8.1. I don't know of any reason to think that 8.1 would act differently from older PG versions in this respect. Neither can I, which is why I don't blame it. ;) I'm just reporting when/where I noticed the issue. I can't offer any explanation for why this server is starting to swap -- where'd the memory go? -- but I know it started after upgrading to PostgreSQL 8.1. I'm not saying it's something in the PostgreSQL code, but this server definitely didn't do this in the months under 7.4. Mike: is your system AMD64, by any chance? The above system is, as is another similar story I heard. --Will Glynn Freedom Healthcare ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Memory Leakage Problem
/rpc.statd root 2969 1 0 Nov09 00:01:41 /usr/sbin/xinetd -pidfile /var/r root 2980 1 0 Nov09 00:00:07 /usr/sbin/ntpd -p /var/run/ntpd. root 2991 1 0 Nov09 00:00:01 /sbin/mdadm -F -m root -s daemon3002 1 0 Nov09 00:00:00 /usr/sbin/atd root 3013 1 0 Nov09 00:00:03 /usr/sbin/cron root 3029 1 0 Nov09 00:00:00 /sbin/getty 38400 tty1 root 3031 1 0 Nov09 00:00:00 /sbin/getty 38400 tty2 root 3032 1 0 Nov09 00:00:00 /sbin/getty 38400 tty3 root 3033 1 0 Nov09 00:00:00 /sbin/getty 38400 tty4 root 3034 1 0 Nov09 00:00:00 /sbin/getty 38400 tty5 root 3035 1 0 Nov09 00:00:00 /sbin/getty 38400 tty6 postgres 27806 1 0 Dec12 00:00:00 /usr/local/pgsql/bin/postmaster postgres 27809 27806 0 Dec12 00:00:00 postgres: stats buffer process postgres 27810 27809 0 Dec12 00:00:00 postgres: stats collector proces postgres 27821 27806 0 Dec12 00:01:30 postgres: postgres bp_live postgres 27842 1 0 Dec12 00:00:00 /usr/local/pgsql/bin/slon -d 1 b postgres 27844 27842 0 Dec12 00:00:00 /usr/local/pgsql/bin/slon -d 1 b postgres 27847 27806 0 Dec12 00:00:50 postgres: postgres bp_live postgres 27852 27806 1 Dec12 00:18:00 postgres: postgres bp_live postgres 27853 27806 0 Dec12 00:00:33 postgres: postgres bp_live postgres 27854 27806 0 Dec12 00:00:18 postgres: postgres bp_live root 3273510 0 05:35 00:00:00 [pdflush] postfix 2894 2933 0 07:04 00:00:00 pickup -l -t fifo -u -c root 385310 0 07:37 00:00:00 [pdflush] All I know is that stopping postgres brings the server back to normality. Stopping slon on its own is not enough. John Tom Lane wrote: John Sidney-Woollett <[EMAIL PROTECTED]> writes: This server just consumes more and more memory until it goes swap crazy and the load averages start jumping through the roof. *What* is consuming memory, exactly --- which processes? regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] to_char() Question
Not sure if there is a numeric formatting option that allows what you want. But how about? substr(to_char(1029, '9,999'),2) John Terry Lee Tucker said: > Greetings List: > > I am using to_char to format numeric data into a string that is ultimately > displayed in an XmText widget. Much of the numeric data is always going to > be > positive. In some of the windows that display this data, space is at a > premium. Basically, I need to display something like 1,029 in an XmText > widget that is exactly 5 characters wide. In a select statement like: > SELECT to_char (1029, '9,999'), to_char always puts one extra space in the > leftmost position for a numeric sign. What I get is: ' 1,029', which > causes > me to have to have the XmText widget 6 characters wide as opposed to 5. > This > is becoming a problem on several of the application windows where, as I > said > earlier, space is at a premium. The only way that I've been able to > suppress > the extra space is by the following: SELECT to_char (1029, 'FM9,999'). > What I > get then is a string that is not right justified and I want the numbers to > be > right justified. Now, finally, the question: Is there a way to suppress > the > extra space for the sign? > > TIA > rnd=# select version(); >version > -- > PostgreSQL 7.4.6 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) > 3.2.3 > 20030502 (Red Hat Linux 3.2.3-49) > (1 row) > > ---(end of broadcast)--- > TIP 3: Have you checked our extensive FAQ? > >http://www.postgresql.org/docs/faq > ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
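If the value must also stay right-justified in exactly five characters, stripping the sign column and padding back out works too; for example:

SELECT substr(to_char(1029, '9,999'), 2);      -- '1,029' (drops the sign column)
SELECT lpad(to_char(1029, 'FM9,999'), 5);      -- '1,029'
SELECT lpad(to_char(29, 'FM9,999'), 5);        -- '   29' (right justified again)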
Re: [GENERAL] Memory Leakage Problem
Tom Lane said: > John Sidney-Woollett <[EMAIL PROTECTED]> writes: >> Tom Lane wrote: >>> *What* is consuming memory, exactly --- which processes? > >> Sorry but I don't know how to determine that. > > Try "ps auxw", or some other incantation if you prefer, so long as it > includes some statistics about process memory use. What you showed us > is certainly not helpful. At the moment not one process's VSZ is over 16Mb with the exception of one of the slon processes which is at 66Mb. I'll run this over the next few days and especially as the server starts bogging down to see if it identifies the culprit. Is it possible to grab memory outside of a process's space? Or would a leak always show up by an ever-increasing VSZ amount? Thanks John ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] Memory Leakage Problem
Martijn, thanks for the tip. Since the connections on this server are from slon, I'm hoping that they hang around for a *long* time, and long enough to take a look to see what is going on. John Martijn van Oosterhout wrote: On Tue, Dec 13, 2005 at 04:37:42PM -0000, John Sidney-Woollett wrote: I'll run this over the next few days and especially as the server starts bogging down to see if it identifies the culprit. Is it possible to grab memory outside of a process's space? Or would a leak always show up by an ever-increasing VSZ amount? The only way to know what a process can access is by looking in /proc/<pid>/maps. This lists all the memory ranges a process can access. The thing about postgres is that each backend dies when the connection closes, so only a handful of processes are going to be around long enough to cause a problem. The ones you need to look at are the number of mappings with a zero-inode excluding the shared memory segment. A diff between two days might tell you which segments are growing. Must be for exactly the same process to be meaningful. Have a nice day, ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[GENERAL] Mem usage/leak - advice needed
In trying to investigate a possible memory issue that affects only one of our servers, I have been logging the process list for postgres related items 4 times a day for the past few days. This server uses postgres 7.4.6 + slon 1.1.0 on Debian i686 (Linux server2 2.6.8.1-4-686-smp) and is a slon slave in a two server replicated cluster. Our master DB (similar setup) does not exbibit this problem at all - only the subscriber node... The load average starts to go mental once the machine has to start swapping (ie starts running out of physical RAM). The solution so far is to stop and restart both slon and postgres and things return to normal for another 2 weeks. I know other people have reported similar things but there doesn't seem to be an explanation or solution (other than stopping and starting the two processes). Can anyone suggest what else to look at on the server to see what might be going on? Appreciate any help or advice anyone can offer. I'm not a C programmer nor a unix sysadmin, so any advice needs to be simple to understand. Thanks John The first log is 14th Dec and the second is the 22nd Dec. You can see the slon process (id=27844) using more memory over time. It's memory map and the postmaster are posted below too. ~/meminfo # cat 200512141855.log 27806 1 1052 15288 0.0 0.1 /usr/local/pgsql/bin/postmaster 27809 27806 812 6024 0.0 0.0 pg: stats buffer process 27810 27809 816 5032 0.0 0.0 pg: stats collector process 27821 27806 10744 16236 0.1 1.0 pg: postgres bp_live 192.168.22.76 idle 27842 1 620 2324 0.0 0.0 /usr/local/pgsql/bin/slon -d 1 bprepl4 27844 27842 5920 66876 0.0 0.5 /usr/local/pgsql/bin/slon -d 1 bprepl4 27847 27806 10488 16020 0.0 1.0 pg: postgres bp_live [local] idle 27852 27806 12012 17020 1.1 1.1 pg: postgres bp_live [local] idle 27853 27806 11452 16868 0.0 1.1 pg: postgres bp_live [local] idle 27854 27806 10756 16240 0.0 1.0 pg: postgres bp_live [local] idle ~/meminfo # cat 200512220655.log 27806 1 940 15288 0.0 0.0 /usr/local/pgsql/bin/postmaster 27809 27806 752 6024 0.0 0.0 p: stats buffer process 27810 27809 764 5032 0.0 0.0 pg: stats collector process 27821 27806 4684 16236 0.0 0.4 pg: postgres bp_live 192.168.22.76 idle 27842 1 564 2324 0.0 0.0 /usr/local/pgsql/bin/slon -d 1 bprepl4 27844 27842 2368 70096 0.0 0.2 /usr/local/pgsql/bin/slon -d 1 bprepl4 27847 27806 4460 16020 0.0 0.4 pg: postgres bp_live [local] idle 27852 27806 11576 17020 1.0 1.1 pg: postgres bp_live [local] idle 27853 27806 11328 16868 0.0 1.0 pg: postgres bp_live [local] idle 27854 27806 4640 16240 0.0 0.4 pg: postgres bp_live [local] idle The top listing (right now is) - the key thing is the kswapd0 process. Once physical memory becomes exhausted, the server goes into rapid decline as the swap burden increases... 
top-08:27:27 up 43 days, 42 min, 1 user, load average: 0.01, 0.04, 0.00 Tasks: 85 total, 1 running, 84 sleeping, 0 stopped, 0 zombie Cpu(s): 0.1% us, 0.0% sy, 0.0% ni, 99.4% id, 0.5% wa, 0.0% hi, 0.0% si Mem: 1035612k total, 1030512k used, 5100k free,46416k buffers Swap: 497972k total, 157088k used, 340884k free,28088k cached PID USER PR NI VIRT RES SHR S %CPU %MEMTIME+ COMMAND 27821 postgres 16 0 16236 6480 14m S 0.3 0.6 14:00.34 postmaster 18939 root 16 0 2044 1040 1820 R 0.3 0.1 0:00.02 top 1 root 16 0 1492 136 1340 S 0.0 0.0 0:05.43 init 2 root RT 0 000 S 0.0 0.0 0:02.51 migration/0 3 root 34 19 000 S 0.0 0.0 0:00.02 ksoftirqd/0 4 root RT 0 000 S 0.0 0.0 0:05.35 migration/1 5 root 34 19 000 S 0.0 0.0 0:00.05 ksoftirqd/1 6 root RT 0 000 S 0.0 0.0 0:04.91 migration/2 7 root 34 19 000 S 0.0 0.0 0:00.00 ksoftirqd/2 8 root RT 0 000 S 0.0 0.0 0:21.87 migration/3 9 root 34 19 000 S 0.0 0.0 0:00.00 ksoftirqd/3 10 root 5 -10 000 S 0.0 0.0 0:00.20 events/0 11 root 5 -10 000 S 0.0 0.0 0:00.06 events/1 12 root 5 -10 000 S 0.0 0.0 0:00.01 events/2 13 root 5 -10 000 S 0.0 0.0 0:00.00 events/3 14 root 8 -10 000 S 0.0 0.0 0:00.00 khelper 15 root 7 -10 000 S 0.0 0.0 0:00.00 kacpid 67 root 5 -10 000 S 0.0 0.0 19:26.36 kblockd/0 68 root 5 -10 000 S 0.0 0.0 0:59.05 kblockd/1 69 root 5 -10 000 S 0.0 0.0 0:08.40 kblockd/2 70 root 5 -10 000 S 0.0 0.0 0:10.17 kblockd/3 82 root 15 0 000 S 0.0 0.0 624:18.25 kswapd0 [snipped] The memory map for the slon process is below. cat /proc/27844/maps 08048000-08067000 r-xp 08:0c 198200 /usr/local/pgsql/bin/
Re: [GENERAL] [Slony1-general] Mem usage/leak - advice needed
Thanks for your response. None of the data rows are wide (as far as I can remember). We don't have any blob data, and any text fields only contain several hundred bytes at most (and even those would be rare). Just stopping and starting the slon process on the slave node doesn't seem to help much. Stopping postgres on the slave itself seems to be also required. I'm wondering if this requirement is due to the continued running of the slon psocess on the master. Does it makes sense that shutting down the slave postgres db is necessary? Or would stopping and restarting ALL slon processes on all nodes mean that I wouldn't have to stop and restart the slave postgres DB? Thanks John Ian Burrell wrote: On 12/22/05, John Sidney-Woollett <[EMAIL PROTECTED]> wrote: In trying to investigate a possible memory issue that affects only one of our servers, I have been logging the process list for postgres related items 4 times a day for the past few days. This server uses postgres 7.4.6 + slon 1.1.0 on Debian i686 (Linux server2 2.6.8.1-4-686-smp) and is a slon slave in a two server replicated cluster. Our master DB (similar setup) does not exbibit this problem at all - only the subscriber node... The load average starts to go mental once the machine has to start swapping (ie starts running out of physical RAM). The solution so far is to stop and restart both slon and postgres and things return to normal for another 2 weeks. I know other people have reported similar things but there doesn't seem to be an explanation or solution (other than stopping and starting the two processes). Can anyone suggest what else to look at on the server to see what might be going on? Appreciate any help or advice anyone can offer. I'm not a C programmer nor a unix sysadmin, so any advice needs to be simple to understand. The memory usage growth is caused by the buffers in the slave slon daemon growing when long rows go through them. The buffers never shrink while the slon daemon is running. How big is the largest rows which slon replicates? One suggestion I have seen is to recompile slon to use fewer buffers. Another is to set a ulimit for memory size to automatically kill the slon daemons when they get too big. The watchdog will then restart them. Alternatively, your strategy of restarting the slon daemons each week will work (you don't need to restart postgres). I came up with a patch which shrinks the buffers when they go above a certain size. This doesn't fix the problem of lots of big rows happening at once but it fixes the gradual growth. - Ian ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Storing images in a db (for web/midlet access)
You need to create a servlet (or the equivalent in ASP, PHP etc) that sits between the client's browser and your database. In this layer you implement a cache. The first time the image is requested you retrieve it from the database and serve it normally, but you also copy the byte stream to your cache (possibly using the image URL as your cache key). The next time the image is requested, you check the cache first to see if you can serve the image directly from there, otherwise you go back to the database again.

If you're using Java, look at implementing an LRU cache using LinkedHashMap, and you'll need to read up on HttpResponseWrappers to intercept the byte stream and write it to your cache. Also keep details of the modification date of the image resource, its size and the "etag". You may find a client making a HEAD request to check that the image hasn't changed - serving this data from your cache will definitely yield a performance boost as the client won't then need to request the image.

Caching is no silver bullet: if you're serving hundreds/thousands of different images in a short time frame and your cache size is memory limited, you may find that you are not able to hold images in memory long enough to make the cache viable.

This is not a postgres issue so much as a web application design issue. You may be better off directing your questions to a list more appropriate for the web technology you are using.

John

Assad Jarrahian wrote: Hi All, So I am trying to store images in the db which will be accessed from a midlet and website. So I looked into generating temp files, such as the one done here using servlets (http://fdegrelle.over-blog.com/categorie-275139.html). What I am confused about is that this may not be very efficient. When one browses a website, the pages (and hence the images) are basically cached in the browser, but with this method, it seems that every access to an image (that has already been viewed in a web session) is going to cause a re-query to the database. Not that efficient, no? Is there another way around that will help me take advantage of caching etc, just like it would with storing images on fs? Much thanks in advance! -Assad ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
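For reference, a minimal sketch of the kind of table and lookup the cache sits in front of - the table and column names here are invented for illustration, not taken from Assad's schema:

create table images (
    image_id     serial primary key,
    url_key      varchar(255) not null unique,   -- doubles as the cache key
    content_type varchar(64),
    byte_size    integer,
    etag         varchar(64),
    modified_at  timestamp default now(),
    image_data   bytea
);

-- the servlet's cache-miss path boils down to a single lookup
select image_data, content_type, byte_size, etag, modified_at
from images
where url_key = '/photos/1234.jpg';

Keeping byte_size, etag and modified_at alongside the bytes means HEAD and conditional GET requests can be answered from the metadata (or the cache) without pulling the image data at all.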
Re: [GENERAL] Getting timestamp without milliseconds
Either date_trunc, eg

SELECT date_trunc('hour', TIMESTAMP '2001-02-16 20:38:40');
Result: 2001-02-16 20:00:00

Or format the timestamp as a string

select to_char(now(), 'YYYY-mm-dd HH24:MI:SS');

Hope that helps.

John

Tadej Kanizar wrote: Hi! I've got a timestamp field.. and it returns something like "2006-01-04 21:33:17.156". How can I get the same result, just without the .156? I've looked at date/time formatting functions, etc, but couldn't find a straight way to do this :-( Thanks. Regards, Tadej ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
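Two more variants that should drop the fractional seconds directly (a quick sketch, not tested against Tadej's table):

select date_trunc('second', now());   -- truncates the milliseconds
select now()::timestamp(0);           -- note: rounds to the nearest second rather than truncating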
Re: [GENERAL] Sequence Manipulation Functions
Select last_value from your_sequence_name;

John

MG wrote: Hello, I use PostgreSQL 8.0.3. I want to get the information of the last value of a sequence. The function 'currval' only gives the value back if a nextval has been executed beforehand: "Return the value most recently obtained by nextval for this sequence in the current session. (An error is reported if nextval has never been called for this sequence in this session.) Notice that because this is returning a session-local value, it gives a predictable answer whether or not other sessions have executed nextval since the current session did." But that is not very helpful. I noticed that phpPgAdmin has that information:

Name | Last value | Increment by | Max value | Min value | Cache value | Log count | Is cycled? | Is called?
adr_dsnr | 108 | 1 | 9223372036854775807 | 1 | 1 | 25 | No | Yes

So how can I get that information? Thanks Michaela ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
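To pull back everything phpPgAdmin shows in one go, you can select from the sequence relation itself - on 8.0 a sequence is a one-row relation whose columns match that listing (untested here; adr_dsnr is the sequence name from the question):

select last_value, increment_by, max_value, min_value,
       cache_value, log_cnt, is_cycled, is_called
from adr_dsnr;

Bear in mind that last_value read this way tells you where the counter currently is, not what your own session last allocated - for that, currval is still the right call.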
Re: [GENERAL] Distance calculation
I'm no expert on this but can't you limit the points to checking any hotel whose lat is +- 25 miles north/south of your city, AND whose longitude is also +- 25 miles of your city? It's crude but will probably eliminate lots of points you should never be checking...

If you could index an approx lat and long for each hotel you could use two indexes (or one combined) on these fields to filter the results BEFORE applying the DISTANCE function. You'll need to compute your approx upper and lower bound lat and long for filtering.

I'm not sure if this approach works, but it may help?

John

[EMAIL PROTECTED] wrote: Hi, I have a latitude and longitude for a city and a latitude and longitude for each hotel in the hotels table. I have to retrieve the 20 hotels nearest to that city within 25 miles. Below is the query I am using to check the distance, but the query is slow because of the distance calculation on the fly and the order by distance. Can anybody help me with how I can improve performance by refining the lat and long data? v_point is the city lat and long (point(citylong citylat)).

SELECT pr.property_id , pr.property_type As property_type_id , pr.property_name , round(DISTANCE( v_point:: geometry, POINTFromText('Point(' ||pr.long ||' ' || pr.lat||')')::geometry) *69.055) as CityToHotelDistance FROM property.property pr INNER JOIN place p ON (pr.place_id = p.place_id) INNER JOIN placedetail pd ON (p.place_id = pd.place_id) LEFT OUTER JOIN property.vw_property_price vwp ON (vwp.property_id = pr.property_id) WHERE DISTANCE( v_point :: geometry, POINTFromText('Point(' ||pr.long ||' ' || pr.lat||')')::geometry) < .4 AND pr.place_id != p_place_id AND (pr.status_type_id is null OR pr.status_type_id = 0) ORDER BY DISTANCE( v_point :: geometry, POINTFromText('Point(' ||pr.long ||' ' || pr.lat||')')::geometry) offset 0 LIMIT 20;

---(end of broadcast)--- TIP 6: explain analyze is your friend ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
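To make the idea concrete, a rough sketch of the pre-filter (not tested against the real schema; v_lat and v_lon are assumed to hold the city's latitude and longitude, just as v_point already does, and 0.4 degrees is only an approximation of 25 miles, widened for longitude as you move away from the equator):

-- crude bounding box first, exact distance check only on the survivors
SELECT pr.property_id, pr.property_name,
       round(DISTANCE(v_point::geometry,
             POINTFromText('Point(' || pr.long || ' ' || pr.lat || ')')::geometry) * 69.055) AS citytohoteldistance
FROM property.property pr
WHERE pr.lat  BETWEEN v_lat - 0.4 AND v_lat + 0.4
  AND pr.long BETWEEN v_lon - 0.4 / cos(radians(v_lat))
                  AND v_lon + 0.4 / cos(radians(v_lat))
  AND DISTANCE(v_point::geometry,
      POINTFromText('Point(' || pr.long || ' ' || pr.lat || ')')::geometry) < 0.4
ORDER BY 3
LIMIT 20;

-- an index on the raw coordinates is what lets the box filter pay off
CREATE INDEX property_lat_long_idx ON property.property (lat, long);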
Re: [GENERAL] Update value to "the first character is capital and
Have you tried the initcap function? select initcap('abcd efgh'); initcap --- Abcd Efgh John Emi Lu wrote: Hello all, Does anyone have available plpgsql codes to update all capital letters in a column to "the first character is capital and the rest is small" ? For example, in tableA(id, description) 001, 'ZHANG ZHE XIN' => 'Zhang Zhe Xin' 002, 'LIU, WEI-HUAI' =>'Liu, Wei-Huai' 003, 'GU & WEI. NAN (CE SHI) & TOMMORROW' => 'Gu & Wei. Nan (Ce Shi) & Tommorrow' Thanks a lot! Ying ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
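If the aim is to fix the data in place rather than just display it, the same function works in an update - a quick sketch against the example table (untested; note that initcap lowercases everything after the first letter of each word, so names like 'McDonald' would come out as 'Mcdonald'):

update tableA
set description = initcap(description)
where description <> initcap(description);   -- skip rows that are already correct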
[GENERAL] Is a high tab_reloid worrying?
I just added a new table to a slony replication set. The new table seems to have a really high tab_reloid value of 94,198,669. I'm using Slon 1.1.5 with pg 7.4.6 and 7.4.11 on unix.

On the master I see (wcprogallery is the new table added to set 4):

tab_id | tab_reloid | tab_relname  | tab_nspname | tab_set | tab_idxname
-------+------------+--------------+-------------+---------+-------------------
  4005 |      18284 | wccustomer   | customer    |       4 | wccustomer_pkey
   ...
  4035 |      18858 | wcrecommend  | customer    |       4 | wcrecommend_pkey
  4036 |   94198669 | wcprogallery | customer    |       4 | wcprogallery_pkey

On the slave I see:

tab_id | tab_reloid | tab_relname  | tab_nspname | tab_set | tab_idxname
-------+------------+--------------+-------------+---------+-------------------
  4005 |    1671239 | wccustomer   | customer    |       4 | wccustomer_pkey
   ...
  4035 |    1671608 | wcrecommend  | customer    |       4 | wcrecommend_pkey
  4036 |    5741203 | wcprogallery | customer    |       4 | wcprogallery_pkey

Is this something I should be worried about? Can I find out where all the intermediate OIDs have gone? I have around 70 tables and about 200 plpgsql functions in the database.

John

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] [Slony1-general] Is a high tab_reloid worrying?
My tables are defined "WITHOUT OID" - does that make a difference? John Hannu Krosing wrote: Ühel kenal päeval, N, 2006-04-13 kell 10:06, kirjutas John Sidney-Woollett: I just added a new table to a slony relication set. The new table seems to have a really high tab_reloid value of 94,198,669 ... Is this something I should be worried about? Can I find out where all the intermediate OIDs have gone? probably to data rows, unless you have all your tables defined using WITHOUT OID. OIDs are assigned from a global "sequence". Hannu ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Is a high tab_reloid worrying?
Thanks for the reassurance. You're right, the db has been around for a while. Does anyone know if OIDs for data and system (DDL) objects come from the same number generator?

John

Christopher Browne wrote: In an attempt to throw the authorities off his trail, [EMAIL PROTECTED] (John Sidney-Woollett) transmitted: I just added a new table to a slony replication set. The new table seems to have a really high tab_reloid value of 94,198,669

I presume the database instance has been around for a while? If so, then I wouldn't worry too much. I'm not certain comprehensively what would consume OIDs, but I expect temp tables would, so if you have applications that generate them, that would naturally lead to increases in OID values. The only time you get *low* values is if you define tables immediately after creating the database. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] Select / sub select? query... help...
Not sure if I have this the right way round, but one option is a self-join

select p2.first, p2.last, p1.first, p1.last
from people p1, people p2
where p1.person_id = p2.alias
and p2.isalias = true;

Another is to use a sub-select as a column result (haven't checked the SQL for mistakes)

select p2.first, p2.last,
       (select first||' - '||last from people p1 where p2.alias = p1.person_id) as realname
from people p2
where p2.isalias = true;

Hope that helps as a starting point.

John

Jim Fitzgerald wrote: Hello - I'm trying to figure out how to write a particular query and need some assistance. I imagine this is extremely simple. I have the table defined below with five records. This table keeps track of peoples names. Each person has a unique ID number ("person_id"). The table can also keep track of alias names for these people. Each record has a flag ("isalias") indicating whether or not this record indicates a persons real name or a persons alias name. If it is an alias name then an additional field ("alias") has the number indicating this persons real name record by person_id (ie alias field of an alias record == the person_id of the real name record). I want a query that will select all entries where "isalias" is true and will display the person_id, first, and last fields from the alias record and ALSO the first and last fields from the real name entry. Output would be something like this for the example data below

3 - Johns - Alias - John - Smith
4 - Marks - Alias - Mark - Twain

Any thoughts on how this can be accomplished easily / efficiently? Thanks -Jim

Table "public.people"
  Column   |         Type          | Modifiers
-----------+-----------------------+-----------
 person_id | integer               |
 first     | character varying(20) |
 last      | character varying(20) |
 alias     | integer               |
 isalias   | boolean               |

Containing the example data:

 person_id | first   | last  | alias | isalias
-----------+---------+-------+-------+---------
         1 | John    | Smith |     0 | f
         2 | Mark    | Twain |     0 | f
         3 | Johns   | Alias |     1 | t
         4 | Marks   | Alias |     2 | t
         5 | someone | else  |     0 | f
(5 rows)

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Can't Figure Out Where Rows Are Going
If you're using a (java) web app with a connection pool, is there a possibility that some connections are configured with auto-commit=false and that some future transaction issues a rollback which may include the insert of the order items? Perhaps some kind of persistence manager is doing this without your realising it (hibernate, JDO etc). Or that the connection pool recycled the connection without issuing a commit. Although that doesn't necessarily explain why you can see the rows outside of that particular transaction...

If this is a java app, try asking the postgres-jdbc list.

John

HH wrote: Thanks Tom. I don't use any PL language functions in my app explicitly but perhaps something implicit when using one of the command line tools or something like that? I don't have anything pointing in that direction. I double checked the definition for the 'orders' table (header) and also the 'order_lines', just to make sure there wasn't something in there that could be throwing me off. I didn't see anything in 'orders' of any interest at all. I've included both below in case I missed something in there. I appreciate everyone trying to help and any other ideas are very much appreciated. I hope to be able to trace the cause at some point.

db_production=# \d orders
Table "public.orders"
         Column         |            Type             | Modifiers
------------------------+-----------------------------+------------------------------------------------------
 id                     | integer                     | not null default nextval('orders_id_seq'::regclass)
 billing_address_1      | character varying(255)      |
 billing_address_2      | character varying(255)      |
 billing_city           | character varying(255)      |
 billing_state_province | character varying(255)      |
 billing_postal_code    | character varying(255)      |
 billing_country        | character varying(255)      |
 phone_number           | character varying(255)      |
 email_address          | character varying(255)      |
 store_name             | character varying(255)      |
 cardholders_name       | character varying(255)      |
 card_type              | character varying(255)      |
 card_number            | character varying(255)      |
 card_security_code     | character varying(255)      |
 expires_month          | character varying(255)      |
 expires_year           | character varying(255)      |
 sent_to_gateway        | timestamp without time zone |
 gateway_confirmation   | character varying(255)      |
 avs_address            | character varying(255)      |
 avs_zip_code           | character varying(255)      |
 created_at             | timestamp without time zone |
 updated_at             | timestamp without time zone |
 billing_first_name     | character varying(255)      |
 billing_last_name      | character varying(255)      |
 shipping_cost          | double precision            |
 sales_tax              | double precision            |
 order_status           | character varying(255)      |
 processor_type         | character varying(255)      |
 ipn_notification       | character varying(255)      |
 ipn_date               | timestamp without time zone |
 ipn_email              | character varying(255)      |
 ip_address             | character varying(255)      |
Indexes:
 "orders_pkey" PRIMARY KEY, btree (id)
 "orders_order_status_index" btree (order_status)

'order_lines'

fatwreck_production=# \d order_lines;
Table "public.order_lines"
      Column       |            Type             | Modifiers
-------------------+-----------------------------+-----------------------------------------------------------
 id                | integer                     | not null default nextval('order_lines_id_seq'::regclass)
 order_id          | integer                     |
 order_item        | character varying(255)      |
 order_description | character varying(255)      |
 order_quantity    | integer                     |
 unit_price        | numeric                     |
 extended_price    | numeric                     |
 created_at        | timestamp without time zone |
 updated_at        | timestamp without time zone |
 band_name         | character varying(255)      |
 catalog_number    | character varying(255)      |
 product_id        | character varying(255)      |
 line_source       | character varying(255)      |
Indexes:
 "order_lines_pkey" PRIMARY KEY, btree (id)
 "order_lines_order_id_index" btree (order_id)
Foreign-key constraints:
 "order_lines_order_id_fkey" FOREIGN KEY (order_id) REFERENCES orders(id)

From: Tom Lane <[EMAIL PROTECTED]> Date: Sat, 06 May 2006 22:04:56 -0400 To: HH <[EMAIL PROTECTED]> Cc: PostgreSQL Subject: Re: [GENERAL] Can't Figure Out Where Rows Are Going HH <[EMAIL PROTECTED]> writes: I can't figure out where this row went and why it disappeared! Is it
Re: [GENERAL] Java Triggers
I would think the answer is yes. pljava allows you to create java functions, and triggers invoke functions eg CREATE TRIGGER sometrigger BEFORE UPDATE ON schema.table FOR EACH ROW EXECUTE PROCEDURE yourfunction(); Here's a link for the pljava language. http://gborg.postgresql.org/project/pljava/genpage.php?userguide John Jimbo1 wrote: Hi there, I'm a very competent Oracle developer, but have never used Postgres. There's currently a project taking place where I'm working that is developing on an Oracle database, but could potentially be migrated to an open source database in the future; possibly Postgres. There are two questions currently being asked about Postgres: 1. Can it support triggers? 2. Can it support Java Triggers (Java in the database trigger body)? I know the answer to the first question is a definite "Yes". However, I don't know the answer to the second. Please can anybody help? Thanks in advance. James ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
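As a rough illustration of how the two pieces fit together with pljava - the Java class and method names below are invented, and the exact form of the AS string should be checked against the pljava userguide linked above:

CREATE FUNCTION audit_row_change()
  RETURNS trigger
  AS 'com.example.AuditTrigger.rowChanged'   -- hypothetical static Java method
  LANGUAGE java;

CREATE TRIGGER sometrigger
  BEFORE UPDATE ON myschema.mytable
  FOR EACH ROW EXECUTE PROCEDURE audit_row_change();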
Re: [GENERAL] Querying for strings that match after prefix
Do you mean?

select replace(address, 'mailto:', '') from people

... and if you only want to find the ones that start with "mailto:"

select replace(address, 'mailto:', '') from people where address like 'mailto:%'

John

badlydrawnbhoy wrote: Hi all, I hope this is the right forum for this, but please correct me if somewhere else is more appropriate. I need to locate all the entries in a table that match, but only after a number of characters have been ignored. I have a table of email addresses, and someone else has erroneously entered some addresses prefixed with 'mailto:', which I'd like to ignore. An example would be: [EMAIL PROTECTED] should match mailto:[EMAIL PROTECTED] I've tried the following

select address from people where address = (select replace(address, 'mailto:', '') from people);

which gives me the error

ERROR: more than one row returned by a subquery used as an expression

I'm running on PostgreSQL 7.4.7 Thanks in advance, BBB ---(end of broadcast)--- TIP 6: explain analyze is your friend ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
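And if the intention is to clean the bad prefixes out of the table rather than work around them at query time, something like this should do it (untested):

update people
set address = replace(address, 'mailto:', '')
where address like 'mailto:%';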
Re: [GENERAL] Querying for strings that match after prefix
Or something like select ltrim(substr(address, 8)) from people where address like 'mailto:%' union select address from people where address not like 'mailto:%' John John Sidney-Woollett wrote: Do you mean? select replace(address, 'mailto:', '') from people ... and if you only want to find the ones that start with "mailto:"; select replace(address, 'mailto:', '') from people where address like 'mailto:%' John badlydrawnbhoy wrote: Hi all, I hope this is the right forum for this, but please correct me if somewhere else is more appropriate. I need to locate all the entries in a table that match , but only after a number of characters have been ignored. I have a table of email addresses, and someone else has erroneously entered some addresses prefixed with 'mailto:', which I'd like to ignore. An example would be: [EMAIL PROTECTED] should match mailto:[EMAIL PROTECTED] I've tried the following select address from people where address = (select replace(address, 'mailto:', '') from people); which gives me the error ERROR: more than one row returned by a subquery used as an expression I'm running on PostgreSQL 7.4.7 Thanks in advance, BBB ---(end of broadcast)--- TIP 6: explain analyze is your friend ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Querying for strings that match after prefix
1) select ltrim(substr(address, 8)) from people where address like 'mailto:%'

gives all addresses that start with "mailto:" but first strips off the prefix, leaving only the email address

2) select address from people where address not like 'mailto:%'

produces all email addresses that don't need the prefix stripped off

The UNION of the two gives you all the unique/distinct addresses by combining the results from the first and second query.

John

brian ally wrote: John Sidney-Woollett wrote: I need to locate all the entries in a table that match, but only after a number of characters have been ignored. I have a table of email addresses, and someone else has erroneously entered some addresses prefixed with 'mailto:', which I'd like to ignore. > Or something like select ltrim(substr(address, 8)) from people where address like 'mailto:%' union select address from people where address not like 'mailto:%' Could you explain why the UNION? brian ---(end of broadcast)--- TIP 6: explain analyze is your friend ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Backwards index scan
I don't think that null values are indexed - you'll probably need to coalesce your null data value to some value if you want it indexed. You can coalesce those values back to null when you retrieve the data from the query.

John

Carlos Oliva wrote: Thanks for your response Alan. This indeed corrects the problem as long as we configure the database with enable_seqscan=false. Perhaps you can help me with a side effect of using this index: rows with null dates seem to fall off the index. When ordschdte is null, the query fails to return the rows for which ordschdte is null. We had to resort to a second query that uses a sequential scan to retrieve the rows that have a null ordschdte. Is there any kind of index that we can create that would allow us to order by ordschdte and which would retrieve rows with null dates? Thanks in advance for your response.

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Alan Hodgson Sent: Tuesday, June 06, 2006 11:05 AM To: pgsql-general@postgresql.org Subject: Re: [GENERAL] Backwards index scan

On June 6, 2006 07:59 am, "Carlos Oliva" <[EMAIL PROTECTED]> wrote: We are conducting a simple test to assess if the optimizer ever uses the index. The table has several columns and the select statement is as follows: select * from ord0007 order by prtnbr, ordschdte desc. The index that we added is "ord0007_k" btree (prtnbr, ordschdte). Prtnbr is numeric(10,0) not null, and ordschdte is date.

You have to "order by prtnbr desc, ordschdte desc" to have the index used the way you want. You can re-order in an outer query if you need to. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] Backwards index scan
Thanks for putting me straight - I thought I remembered a previous post from Tom about nulls not being indexed but it was probably referring to partial indexes not indexing values that are null... Coalescing null values might still be helpful to ensure that they are ordered in the index at a specific location (either the beginning or the end depending on your substitution value). John Greg Stark wrote: John Sidney-Woollett <[EMAIL PROTECTED]> writes: I don't think that null values are indexed - you'll probably need to coalesce your null data value to some value if you want it indexed. That is most definitely not true for Postgres. NULL values are included in the index. However NULLs sort as greater than all values in Postgres. So when you sort descending they'll appear *first*. If you sort ascending they'll appear last. If you have any clauses like 'WHERE col > foo' then it will not be true for NULL values of col regardless of what foo is and those records will be dropped. This is true regardless of whether there's an index. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
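Following on from that, one way to pin the nulls to a chosen end of the ordering is to index and sort on a coalesced expression (expression indexes are available from 7.4; the index name and the substitute date below are arbitrary - pick a low date to push nulls to the end of a descending sort, a high one to pull them to the front):

create index ord0007_sched_coalesce_k
    on ord0007 (prtnbr, (coalesce(ordschdte, '0001-01-01'::date)));

select *
from ord0007
order by prtnbr desc, coalesce(ordschdte, '0001-01-01'::date) desc;

The ORDER BY expression has to match the indexed expression exactly for the planner to consider a backward index scan.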
Re: [GENERAL] Best open source tool for database design / ERDs?
Take a look at http://www.databaseanswers.com/modelling_tools.htm I have used Case Studio 2 (fine, weak on documentation generation), and PowerDesigner by Sybase (truly excellent! but $$$) John [EMAIL PROTECTED] wrote: Bjørn T Johansen wrote: Have you tried Druid (http://druid.sourceforge.net/index.html) ? It does anything a good ERD designer do and it's free... Thanks Bjorn. I have downloaded it but not tested it yet. I will test it in the next few days. DIA doesn't seem like a good choice. Did somebody say Druid can do forward engineering for PostgreSQL? I'm a little concerned about stepping over dollars to pick-up pennies so to speak. If Druid does about as much as the commercial diagramming products do, then I will use it. However, if there is a non-open source diagrammer that is USD $200 or less that does a lot more, or does what it does a lot better - e.g. it makes me a lot more efficient, then I would rather pay for the commercial tool. What inexpensive (~USD $200 or less) ERD tools are out there, and are they a lot more feature-rich than Druid? Thanks. Dana ---(end of broadcast)--- TIP 6: explain analyze is your friend ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[GENERAL] Ever increasing OIDs - gonna run out soon?
Back in April 2006 I emailed about high OIDs in a 7.4.6 database. I've just added new tables to the database (and the slony replication set) and I'm seeing much higher OID values for the new tables.

In April I added a table and it was assigned an OID value of 94198669 (94 million). I've just added three new tables (using an interactive psql session), and the OIDs assigned are now in the 182 million range:

wcpartner         182027615
wccustomdata      182027995
wccustpartnerdata 182028076

All the tables are created without OIDs, eg

create table customer.wcpartner (
  wcpartnerid integer,
  name        varchar(32),
  wduserid    integer,
  primary key(wcpartnerid)
) without oids;

And the three tables were created in the same session within about 1 minute of each other. By way of comparison, the oids for the Slony slave (7.4.11) are

wcpartner         38220869
wccustomdata      38221080
wccustpartnerdata 38221139

This is a normal production database with slony replication 1.1.5 and is mainly accessed from a web application using JDBC. I don't use any temporary tables - so what can be using up all the OIDs (especially between two successive create table statements)? It looks like the db is using them at the rate of 1.5 million per day.

At what value will I hit a wraparound, and what options do I have to identify/fix the (impending) problem?

Thanks.

John

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
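One crude way to watch how fast the counter is moving is to create a throwaway object, note the OID it is given, and repeat the probe later - the table name here is made up, and this only samples the shared counter, it doesn't tell you which sessions are burning the OIDs:

create table oid_probe (dummy integer) without oids;
select relname, oid from pg_class where relname = 'oid_probe';
drop table oid_probe;

Run it a day apart and the difference between the two OIDs gives a rough consumption rate.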
Re: [GENERAL] Ever increasing OIDs - gonna run out soon?
Jim C. Nasby wrote: > Except IIRC the OP is running 7.4 which doesn't have checks in DDL > code to deal with OID collisions. :(

This is not good news! :(

What about other long-running 7.4.x DBs? Do you really have to dump, init and restore every once in a while? Also, do you know what is actually using the OIDs - transactions?

John

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Ever increasing OIDs - gonna run out soon?
We'll probably upgrade to 8.1.x before we hit the wraparound problem! :)

Hmm, looks like slony uses OIDs... And I found a couple of my own tables which were incorrectly created with OIDs.

select relname, relnamespace, reltype from pg_catalog.pg_class where relhasoids=true;

relname        | relnamespace |  reltype
---------------+--------------+----------
pg_attrdef     |           11 |    16385
pg_constraint  |           11 |    16387
pg_database    |           11 |       88
pg_proc        |           11 |       81
pg_rewrite     |           11 |    16411
pg_type        |           11 |       71
pg_class       |           11 |       83
pg_operator    |           11 |    16393
pg_opclass     |           11 |    16395
pg_am          |           11 |    16397
pg_language    |           11 |    16403
pg_trigger     |           11 |    16413
pg_cast        |           11 |    16419
pg_namespace   |           11 |    16596
pg_conversion  |           11 |    16598
sturllog       |        18161 |    18519  <-- MINE
stsession      |        18161 |    18504
sl_trigger     |     82061042 | 82061126  <-- SLONY
sl_table       |     82061042 | 82061113
sl_nodelock    |     82061042 | 82061082
sl_setsync     |     82061042 | 82061098
sl_sequence    |     82061042 | 82061134
sl_node        |     82061042 | 82061073
sl_listen      |     82061042 | 82061162
sl_path        |     82061042 | 82061147
sl_subscribe   |     82061042 | 82061174
sl_set         |     82061042 | 82061087
sl_event       |     82061042 | 82061186
sl_confirm     |     82061042 | 82061193
sl_seqlog      |     82061042 | 82061198
sl_log_1       |     82061042 | 82061202
sl_log_2       |     82061042 | 82061209
sl_config_lock |     82061042 | 82061229

Thanks

John

Tom Lane wrote: Martijn van Oosterhout writes: Well, you have to be using a lot of OIDs for this to be an issue. At your stated rate of 1.5 million OIDs per day it will take just under eight years before you wraparound. That's a lot of OIDs and most databases don't get anywhere near that many, which is why it's not a big deal for most people...

It should also be pointed out that OID wraparound is not a fatal condition. Pre-8.1 you might get occasional query failures due to trying to insert duplicate OIDs, but that's about it.

This gives you a list of tables that use OIDs. Maybe it can help you track down the problem. Look at pg_class.relhasoids --- easier, and more reliable. regards, tom lane

---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Ever increasing OIDs - gonna run out soon?
Slony does appear to use OIDs.

John

Jim C. Nasby wrote: On Mon, Jun 12, 2006 at 05:02:09PM +0100, John Sidney-Woollett wrote: Jim C. Nasby wrote: Except IIRC the OP is running 7.4 which doesn't have checks in DDL code to deal with OID collisions. :( This is not good news! :( What about other long-running 7.4.x DBs? Do you really have to dump, init and restore every once in a while? Also, do you know what is actually using the OIDs - transactions?

Since you're running Slony, I suspect it's using them somehow. Or maybe it doesn't create its tables WITHOUT OIDs. Also note that any time you create an object you burn through an OID. Probably your best bet is to just upgrade to 8.1, which will gracefully handle OID collisions. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Searching BLOB
Save yourself some effort and use Lucene to index a directory of your 300 Word documents. I'm pretty sure that Lucene includes an extension to read Word documents, and you can use PDFBox to read/write PDF files.

Marrying the searching and displaying of results to your web application should be trivial since you're wanting to use Java anyway. Lucene has full character set support and is blindingly fast.

If you're looking for a solution to this problem using Postgres, then you'll be creating a ton of extra work for yourself. If you're wanting to learn more about Postgres, then maybe it'll be worthwhile.

John

James Watson said:
> Hi,
> I am not 100% sure what the best solution would be, so I was hoping
> someone could point me in the right direction.
>
> I usually develop in MS tools, such as .net, ASP, SQL Server etc...,
> but I really want to expand my skillset and learn as much about
> Postgresql as possible.
>
> What I need to do is design a DB that will index and store
> approximately 300 Word docs, each with a size of no more than 1MB. They
> need to be able to search the Word documents for keywords/phrases to be
> able to identify which one to use.
>
> So, I need to write 2 web interfaces. A front end and a back end. Front
> end for the users who will search for their documents, and a back end
> for an admin person to upload new/amended documents to the DB to be
> searchable.
>
> NOW. I could do this in the usual MS tools that I work with using
> BLOBs and the built-in full-text searching that comes with SQL Server,
> but I don't have these to work with at the moment. I am working with
> Postgres & JSP pages.
>
> What I was hoping someone could help me out with was identifying the
> best possible solution to use.
>
> 1. How can I store the Word docs in the DB, would it be best to use a
> BLOB data type?
>
> 2. Does Postgres support full text searching of a Word document once it
> is loaded into the BLOB column & how would this work? Would I have to
> unload each BLOB object, convert it back to text to search, or does
> Postgres have the ability to complete the full-text search of a BLOB,
> like MS SQL Server & Oracle do?
>
> 3. Is there a way to export the Word doc from the BLOB column and dump
> it into a PDF format (I guess I am asking if someone has seen or
> written a PDF generator script/storedProc for Postgres)?
>
> If someone could help me out, it would be greatly appreciated.
>
> cheers,
> James

---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Searching BLOB - Lucene setup & problem
This is a bit off topic for the Postgres list... ;)

Make sure you explicitly include the name of the Lucene jar file in your command line invocation, and any other directories that are required (normally your current working directory), so for Windows you'd use something like

java -cp .;{pathto}\lucene-1.4.3.jar YourJavaApp

When you use Lucene in your webapp, include the Lucene jar file in {tomcat_home}\common\lib or the WEB-INF\lib directory under your webapp.

Hope that helps.

John

[EMAIL PROTECTED] wrote: Hi John, I have had a read through the Lucene website (http://lucene.apache.org/java/docs/index.html) and it sounds pretty good to me. I should be able to use this in conjunction with my JSP pages. This may sound quite dumb to anyone who develops in Java, but I need a little help setting up the demo on my Windows XP machine. I have installed JDK 1.5.0_07, I have installed Tomcat and can confirm that it is all up and running correctly, as I have already written a few simple JSP pages. I have downloaded the Lucene package, extracted the package to my C:\ and followed the steps of the demo page: http://lucene.apache.org/java/docs/demo.html But when I try to run "java org.apache.lucene.demo.IndexFiles c:\lucene-2.0.0\src" from the cmd prompt, I get the following error: "Exception in thread 'main' java.lang.NoClassDefFoundError: org/apache/lucene/analysis/Analyser" I am not sure why this is coming up. I have followed the instructions on the demo page on the web. The only thing I can think of is I may have my "CLASSPATH" incorrect. Can someone help me out with a basic description of what the classpath is and where I should point the classpath environment variable to? Once I have that correct, I think that I may be able to run the demo. Thanks for any help you can provide. James

"John Sidney-Woollett" wrote: Save yourself some effort and use Lucene to index a directory of your 300 Word documents. I'm pretty sure that Lucene includes an extension to read Word documents, and you can use PDFBox to read/write PDF files. Marrying the searching and displaying of results to your web application should be trivial since you're wanting to use Java anyway. Lucene has full character set support and is blindingly fast. If you're looking for a solution to this problem using Postgres, then you'll be creating a ton of extra work for yourself. If you're wanting to learn more about Postgres, then maybe it'll be worthwhile. John

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] limit results to one row per foreign object
Without trying it out, how about something like:

select username, maxbid
from users u,
     ( select user_id, max(amount) as maxbid
       from bids
       where auction_id = XXX
       group by user_id ) as b
where u.id = b.user_id;

John

Alan Bullock wrote: hi all, I have the following schema:

CREATE TABLE auctions ( id serial NOT NULL, user_id int4, title varchar(255), body varchar(255), CONSTRAINT auctions_pkey PRIMARY KEY (id) ) WITHOUT OIDS;

CREATE TABLE bids ( id serial NOT NULL, auction_id int4, user_id int4, amount float8, created_at timestamp, CONSTRAINT bids_pkey PRIMARY KEY (id) ) WITHOUT OIDS;

CREATE TABLE users ( id serial NOT NULL, username varchar(255), CONSTRAINT users_pkey PRIMARY KEY (id) ) WITHOUT OIDS;

I'd like to return all the bids for a given auction, but limit it to only the *latest* bid from each user. So regardless of how many bids a user has placed, only their latest is returned. I don't have a clue where to even start with this and would appreciate some pointers. thanks alan

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
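Since Alan asked for the *latest* bid rather than the highest, the same shape can be driven off created_at and joined back to bids to pick up the amount - a sketch, not tested (XXX stands in for the auction id, as above):

select u.username, b.amount, b.created_at
from bids b
join users u on u.id = b.user_id
join ( select user_id, max(created_at) as latest
       from bids
       where auction_id = XXX
       group by user_id ) lb
  on lb.user_id = b.user_id
 and lb.latest = b.created_at
where b.auction_id = XXX;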
[GENERAL] Importance of re-index
In addition to making sure databases are vacuumed regularly, it is worth running REINDEX on tables that see a lot of updates (or inserts/deletes). Running REINDEX on a regular basis will keep the indexes compacted and can noticeably improve database performance. The other benefit is that the disk space taken by your database can be significantly reduced.

This is barely mentioned in the 7.4 docs, and only alluded to in the 8.1 docs. FWIW, in my experience it is DEFINITELY worth reindexing regularly.

John

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
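For anyone who hasn't used it, the usual forms look like this (table and index names are placeholders; note that REINDEX takes an exclusive lock on the table it is rebuilding, so schedule it for a quiet period):

REINDEX TABLE yourschema.your_busy_table;   -- rebuild every index on one table
REINDEX INDEX your_busy_table_pkey;         -- or rebuild just one index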
Re: [GENERAL] Best Procedural Language?
I'd say that the biggest benefit of pl/pgsql for postgres is that it is so close to Oracle's own procedural language. This makes the job of porting from Oracle to postgres *nearly* trivial. Convincing a site to switch from Oracle to Postgres is therefore easier and a major feather in postgres's cap.

Working with both Oracle 8, 9, 10 and postgres 7.4 and 8, I find switching between the two dbs fairly easy. Oracle is richer in terms of (programming) features but for bang for buck and ease of administration/setup etc, you just can't beat postgres...

John

Merlin Moncure wrote: On 8/1/06, Christopher Browne <[EMAIL PROTECTED]> wrote: Martha Stewart called it a Good Thing when "Carlo Stonebanks" <[EMAIL PROTECTED]> wrote: > I am interested in finding out a "non-religious" answer to which > procedural language has the richest and most robust implementation > for Postgres. C is at the bottom of my list because of how much > damage runaway code can cause. I also would like a solution which is > platform-independent; we develop on Windows but may deploy on Linux.

my take:

C: you can probably get by without doing any C. Most (but not quite all) of the things you would do via C are exposed in libraries. One thing you can do with C for example is invoke a function via its oid and manually supply parameters to make callbacks for procedures. you can also dump core on your backend. good luck!

pl/pgsql: you do not know postgresql if you do not know pl/pgsql. period. ideal for data processing and all sorts of things. all queries are first class in the code (except for dynamic sql), which in my estimation cuts code size, defect rate, and development time about 75% for typical database type stuff. just be warned, after you learn it you will never want to use another database ever again, i'm not kidding.

pl/perl, etc: not much to add beyond what Chris Browne said: great for text processing or library support.

merlin

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Importance of re-index
Disagree. We only apply reindex on tables that see lots of updates... With our 7.4.x databases we vacuum each day, but we see real performance gains after re-indexing too - we see lower load averages and no decrease in responsiveness over time. Plus we have the benefit of reduced disk space usage. I think that the two things go hand in hand, although vacuum is the most important. John Jim C. Nasby wrote: And if you're vacuuming frequently enough, there shouldn't be that much need to reindex. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
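If you want some before-and-after evidence, pg_class keeps an approximate page count per relation (refreshed by VACUUM/ANALYZE), which is a cheap way to watch which indexes are bloating - a sketch that works on 7.4:

select relname, relkind, relpages
from pg_class
where relkind in ('r', 'i')
order by relpages desc
limit 20;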
Re: [GENERAL] remote duplicate rows
If you have a primary key value (or OID?) then you can delete the duplicates in situ using something like (untested)

-- should work if there is never more than 1 duplicate row for colname1, colname2
delete from yourtable
where pk_value in (
  select min(pk_value)
  from yourtable
  group by colname1, colname2
  having count(*) > 1
);

-- if you can have multiple duplicate rows for colname1, colname2
-- then you need something like this, which keeps one row per group
delete from yourtable
where pk_value not in (
  select min(pk_value)
  from yourtable
  group by colname1, colname2
);

Hope that helps.

John

A. Kretschmer wrote: On Wed, 13.09.2006 at 15:46:58 -0700, Junkone wrote the following: Hi, I have a bad situation in that I did not have a primary key. So I have a table like this

colname1 colname2
1        apple
1        apple
2        orange
2        orange

It is a very large table. How do I remove the duplicates quickly and without much change?

begin;
alter table foo rename to tmp;
create table foo as select distinct * from tmp;
commit;

You should create a primary key now to avoid duplicated entries...

HTH, Andreas

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
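To round out Andreas's rename-and-rebuild route, a fuller sketch (untested; it assumes the (colname1, colname2) pair is meant to be unique, and remember that create table as does not copy indexes, constraints or defaults, so recreate those afterwards):

begin;
alter table foo rename to foo_with_dups;
create table foo as select distinct * from foo_with_dups;
alter table foo add primary key (colname1, colname2);  -- stops the duplicates coming back
commit;

-- once you're happy with the new table:
-- drop table foo_with_dups;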