Madison Kelly wrote:
Richard Huxton wrote:

Madison Kelly wrote:

Hi all,

I am new to the list and I didn't want to seem rude at all so I wanted to ask if this was okay first.



No problem. Reading your message below, you might want to try the performance list, but general is a good place to start.


I have a program I have written in perl which uses a postgresSQL database as the backend. The program works but the performance is really bad. I have been reading as much as I can on optimizing performance but still it isn't very reasonable. At one point I had my program able to process 175,000 records in 16min 10sec on a Pentium3 650MHz, 448MB RAM test machine. Since then I got a Pentium3 1GHz, 512MB system and I have tried a lot of things to get the performance up but now it is substantially slower and I can't seem to figure out what I am doing wrong.



A few places to start: 1. VACUUM FULL This will make sure any unused space is reclaimed 2. ANALYZE This will recalculate stats for the tables 3. Basic performce tuning: http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php There's also a good guide to the postgresql.conf file on varlena.com

Would it be appropriate to ask for help on my program on this list? Full disclosure: The program won't be initially GPL'ed because it is for my company but it will be released for free to home users and the source code will be made available (similar to other split-license programs) though once my company makes it's money back I think they will fully GPL it (I am on my boss's case about it :p ).



No problem - what you licence your software under is your concern. Once you've taken the basic steps described above, try to pick out a specific query that you think is too slow and provide:


1. PostgreSQL version
2. Basic hardware info (as you have)
3. Sizes of tables.
4. Output of EXPLAIN ANALYZE <query here>

The EXPLAIN ANALYZE runs the query and shows how much work PG thought it would be and how much it actually turned out to be.

HTH


Thank you very much!! I am using Psql 7.4 on a stock install of Fedora Core 2 on my IBM thinkpad a22m (P3 1GHz, 512MB RAM, not the fastest HDD). The drive carrier I am using is connected via USB2 and uses a few different hard drives with the fastest being a couple of Barracuda 7200.7 drives (2MB cache, 7,200rpm). I described the program in my reply to Martijn so here is some of the code (code not related to psql snipped, let me know if posting it would help - sorry for the wrapping...):

I'm not clear if the database is on the local disk or attached to the USB2. Not sure it's important, since neither will be that fast.


If I understand, you scan thousands or millions of files for backup purposes and then issue a select + update/insert for each.

Once a partition is scanned, a flag is cleared on all rows.

Once all selected files have been dealt with a vaccum/analyse is issued.


Some things to look at:
1. How many files are you handling per second? Are the disks involved in the backup as well as the database?
2. What does the output of "vmstat 10" show when the system is running. Is your I/O saturated? CPU?
3. Is your main index (file_src_uuid,file_name,file_parent_dir) being used? Your best bet is to select from "pg_stat_indexes" before and after.
4. If you are updating several hundred thousand rows then you probably don't have enought vacuum memory set aside - try a vacuum full after each set of updates.
5. You might want to batch together queries into transactions of a few hundred or even few thousand updates.
--
Richard Huxton
Archonet Ltd


---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Reply via email to