[PERFORM] Windows performance again
Hello,

We have the following scenario:

- Linux box with PostgreSQL 7.2.1-1 (client)
- Windows XP with PostgreSQL 8.1.1 (server)
- Windows XP with PostgreSQL 8.1.1 (client)

All connected on a 10Mb LAN.

On the server box we have a table with 65000 rows, and using "psql" we get these results:

Linux box with psql version 7.2.1 versus Windows XP server:
  select * from ; -> approx. 7 seconds to obtain results. Network utilization: 100%

Windows XP client box with psql version 8.1.1 versus Windows XP server:
  select * from ; -> approx. 60 seconds to obtain results. Network utilization: 3%

Windows XP server box with psql version 8.1.1 versus Windows XP server:
  select * from ; -> less than 1 second to obtain results. Network utilization: 0%

This is a real problem, because we are migrating an old server to the 8.0 version on a Windows box, and our application runs really slowly.

Thanks in advance,

Josep Maria

--
Josep Maria Pinyol i Fontseca
Responsable àrea de programació
ENDEPRO - Enginyeria de programari
Passeig Anselm Clavé, 19 Bx. 08263 Callús (Barcelona)
Tel. +34 936930018 - Mob. +34 600310755 - Fax. +34 938361994
[EMAIL PROTECTED] - http://www.endepro.com
Re: [PERFORM] Windows performance again
Josep Maria Pinyol Fontseca wrote:
> Linux box with psql version 7.2.1 versus Windows XP server:
>   select * from ; -> approx. 7 seconds to obtain results. Network utilization: 100%
>
> Windows XP client box with psql version 8.1.1 versus Windows XP server:
>   select * from ; -> approx. 60 seconds to obtain results. Network utilization: 3%
>
> Windows XP server box with psql version 8.1.1 versus Windows XP server:
>   select * from ; -> less than 1 second to obtain results. Network utilization: 0%

It's *got* to be the network configuration on the client machine. I'd be tempted to install ethereal on the Linux box and watch for the difference between the two networked sessions. I'm guessing it might be something to do with TCP/IP window/buffer sizes and delays on ACKs - this certainly used to be an issue on older MS systems, but I must admit I thought they'd fixed it for XP.

If not that, could some firewall/security system be slowing network traffic?

--
Richard Huxton
Archonet Ltd
Re: [PERFORM] SAN/NAS options
I hope this isn't too far off topic for this list. Postgres is the main application that I'm looking to accommodate. Anything else I can do with whatever solution we find is just gravy...

You've given me a lot to go on... Now I'm going to have to do some research as to real-world RAID controller performance. It's vexing (to say the least) that most vendors don't supply any raw throughput or TPS stats on this stuff...

One word of advice: stay away from Dell kit. The PERC 4 controllers they use don't implement RAID 10 properly - it's really a RAID 1 plus a JBOD array. It also has generally dismal IOPS performance. You might get away with running software RAID, either in conjunction with the card, or avoiding it entirely.
Re: [PERFORM] What's the best hardver for PostgreSQL 8.1?
Jim C. Nasby wrote:
> How much space does that equate to? Have you optimized the queries?
>
> Items that generally have the biggest impact on performance, in decreasing order:
> 1. System architecture
> 2. Database design
> 3. (for long-running/problem queries) Query plans
> 4. Disk I/O
> 5. Memory
> 6. CPU
>
> So, I'd make sure that the queries have been optimized (and that includes
> tuning postgresql.conf) before assuming you need more hardware. Based on
> what you've told us (very little parallelization), your biggest priority
> is probably either disk I/O or memory (or both). Without knowing the size
> of your database/working set it's difficult to provide more specific advice.

Hi!

We have 3 Compaq ProLiant ML530 servers with dual Xeon 2.8GHz processors, 3 GB DDR RAM, Ultra Wide SCSI RAID5 and Gigabit ethernet. We partitioned our databases among these machines, but there are cross references among the machines. The data is now about 100-110GB in size. We have used these servers for 3 years with Debian Linux.

We have already optimized the given queries and postgresql.conf. We have tried many tricks and ideas, and we have read and asked on mailing lists. There is nothing more we can do; we should buy a new server for the databases, because we are developing our system for new services, so the size will keep growing. Beyond that we need better responsiveness and shorter execution times for the big queries (these queries are too complicated to discuss here, and in many cases we have optimized them with PL/pgSQL stored procedures).

PostgreSQL 8.1 solved several parallelization and overload problems; the average load has decreased significantly on our servers. But the big queries aren't fast enough. We think the hardware is the limit. Generally two queries run in parallel during working hours; we make backups at night.

Regards, Atesz
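For reference, a hedged sketch of the postgresql.conf parameters usually examined first on a dedicated 3 GB machine like these. The values are illustrative assumptions, not recommendations from the thread; in 8.1, shared_buffers and effective_cache_size are expressed in 8 KB pages and work_mem in KB:

# postgresql.conf (PostgreSQL 8.1) -- illustrative starting points only
shared_buffers = 32768          # 8 KB pages, ~256 MB
work_mem = 16384                # KB per sort/hash operation, ~16 MB
effective_cache_size = 262144   # 8 KB pages, ~2 GB of expected OS cache
checkpoint_segments = 16        # more WAL between checkpoints for write-heavy loads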
Re: [PERFORM] SAN/NAS options
Charles,

On 12/20/05 9:58 PM, "Charles Sprickman" <[EMAIL PROTECTED]> wrote:

> You've given me a lot to go on... Now I'm going to have to do some
> research as to real-world RAID controller performance. It's vexing (to
> say the least) that most vendors don't supply any raw throughput or TPS
> stats on this stuff...

Take a look at this: http://www.wlug.org.nz/HarddiskBenchmarks

> Anyhow, thanks again. You'll probably see me back here in the coming
> months as I try to shake some mysql info out of my brain as our pgsql DBA
> gets me up to speed on pgsql and what specifically he's doing to stress
> things.

Cool!

BTW - based on the above benchmark page, I just ordered two of the Areca 1220 SATA controllers (http://www.areca.com.tw/products/html/pcie-sata.htm) so that we can compare them to the 3Ware 9550SX that we've been using. The 3Ware controllers have been super fast on sequential access, but I'm concerned about their random IOPS. The Arecas aren't as popular, and there's consequently less volume of them, but people who use them rave about them.

- Luke
Re: [PERFORM] Speed of different procedural language
On Sun, Dec 18, 2005 at 01:10:21AM -, Ben Trewern wrote:
> I know I should be writing these in C but that's a bit beyond me. I was
> going to try PL/Python or PL/Perl or even PL/Ruby. Has anyone any idea
> which language is fastest, or is the data access going to swamp the overhead
> of small functions?

I'm not sure if it's what you ask for, but there _is_ a clear difference between the procedural languages -- I've had a 10x speed increase from rewriting PL/PgSQL stuff into PL/Perl, for instance. I'm not sure which ones would be faster, though -- I believe Ruby is slower than Perl or Python generally, but I don't know how it all works out in a PL/* setting.

/* Steinar */
--
Homepage: http://www.sesse.net/
Re: [PERFORM] [GENERAL] need help
Try to execute your query (in psql) prefixed by EXPLAIN ANALYZE and send us the result:

db=# EXPLAIN ANALYZE UPDATE s_apotik SET stock = 100 WHERE obat_id='A';

regards

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tino Wildenhain
Sent: mardi 6 décembre 2005 09:55
To: Jenny
Cc: pgsql-general@postgresql.org; pgsql-sql@postgresql.org; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] [GENERAL] need help

Jenny schrieb:
> I'm running PostgreSQL 8.0.3 on i686-pc-linux-gnu (Fedora Core 2).
> I've been dealing with Psql for over 2 years now, but I've never
> had this case before.
>
> I have a table that has about 20 rows in it.
>
>               Table "public.s_apotik"
>     Column    |            Type             | Modifiers
> --------------+-----------------------------+-----------
>  obat_id      | character varying(10)       | not null
>  stock        | numeric                     | not null
>  s_min        | numeric                     | not null
>  s_jual       | numeric                     |
>  s_r_jual     | numeric                     |
>  s_order      | numeric                     |
>  s_r_order    | numeric                     |
>  s_bs         | numeric                     |
>  last_receive | timestamp without time zone |
> Indexes:
>     "s_apotik_pkey" PRIMARY KEY, btree (obat_id)
>
> When I try to UPDATE one of the rows, nothing happens for a very long time.
> First, I ran it in PgAdmin III; I could see the milliseconds growing
> as I waited. Then I stopped the query, because the time needed for it was
> unbelievably wrong.
>
> Then I tried to run the query from the psql shell. For example, the
> table has obat_id: A, B, C, D.
>
> db=# UPDATE s_apotik SET stock = 100 WHERE obat_id='A';
> (nothing happens... I press Ctrl-C to stop it. This is what comes out:)
> Cancel request sent
> ERROR: canceling query due to user request
>
> (If I try another obat_id:)
> db=# UPDATE s_apotik SET stock = 100 WHERE obat_id='B';
> (Less than a second, this is what comes out:)
> UPDATE 1
>
> I can't do anything to that row. I can't DELETE it. I can't DROP the table.
> I want this data out of my database.
> What should I do? It's like there's a falsely pointed index here.
> Any help would be very much appreciated.

1) let's hope you do regular backups - and have actually tested restore
1a) if not, do it right now
2) reindex the table
3) try again to modify

Q: are there any foreign keys involved? If so, reindex those tables too, just in case. Did you vacuum regularly?

HTH
Tino
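Concretely, a minimal sketch of steps 2 and 3 as psql commands, using the table from the thread (plain REINDEX and VACUUM syntax; the final UPDATE is the one that originally hung):

db=# REINDEX TABLE s_apotik;
db=# VACUUM ANALYZE s_apotik;
db=# UPDATE s_apotik SET stock = 100 WHERE obat_id = 'A';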
Re: [PERFORM] 2 phase commit: performance implications?
> Why not just query adjacent databases, rather than copying the data around?

The reasons I didn't choose this way were:

1) I didn't think there was a way to write a query that could act on the data in two databases as though it were all in one, and I didn't want to get into merging multiple database query results on the application side. I'd rather just have all the needed data sitting in a single database so that I can perform whatever query I like without complication.

2) Most database accesses are reads, and I didn't want a big network overhead for these, especially since I'm aiming for each database to be entirely RAM resident.

> If you really wanted to do this, do you need 2PC? Once data has been uploaded to the database for region A, then asynchronously copy the data to B, C, D and E later, using a queue.

I've always assumed that my data needed to be consistent. I guess there are some circumstances where it isn't really a problem, but each would need to be carefully evaluated. The easy answer is to say 'yes, it must be consistent'.

> If you try to commit to all at once, then if one fails, then none has the data.

Yes, I'd prefer things to be that way in any event.

Regards,
Andy
Re: [PERFORM] Windows performance again
Richard Huxton writes:
> Josep Maria Pinyol Fontseca wrote:
> > Windows XP client box with psql version 8.1.1 versus Windows XP server:
> >   select * from ; -> approx. 60 seconds to obtain results.
> >   Network utilization: 3%

The 60 seconds sounds suspiciously like a DNS problem.

--
greg
Re: [PERFORM] Windows performance again
Richard Huxton writes:
> Josep Maria Pinyol Fontseca wrote:
>> Network utilization: 0%

> It's *got* to be the network configuration on the client machine.

We've seen gripes of this sort before --- check the list archives for possible fixes. I seem to recall something about a "QoS patch", as well as suggestions to get rid of third-party packages that might be interfering with the TCP stack.

regards, tom lane
[PERFORM] Wrong index used when ORDER BY LIMIT 1
Dear Gurus,

Version: 7.4.6

I use a query on a heavily indexed table which unexpectedly picks the wrong index. Since this query is used in response to certain user interactions thousands of times in succession (with different constants), 500ms is not affordable for us. I can easily work around this, but I'd like to understand the root of the problem.

Basically, there are two relevant indexes:
- muvelet_vonalkod_muvelet btree (muvelet, ..., idopont)
- muvelet_vonalkod_pk3 btree (idopont, ...)

The query is:

  SELECT idopont FROM muvelet_vonalkod WHERE muvelet = x ORDER BY idopont LIMIT 1;

I expected the planner to choose the index on muvelet, then sort by idopont. Instead, it took the other index. I think there is heavy correlation, since muvelet references a sequence-generated pkey and idopont is a timestamp (both increase with passing time). Might that be the cause?

See the full table description and EXPLAIN ANALYZE results at the end of the email.

TIA,
--
G.

       Table "public.muvelet_vonalkod"
   Column   |           Type           | Modifiers
------------+--------------------------+------------------------------
 az         | integer                  | not null def. nextval('...')
 olvaso_nev | character varying        | not null
 vonalkod   | character varying        | not null
 mozgasnem  | integer                  | not null
 idopont    | timestamp with time zone | not null
 muvelet    | integer                  |
 minoseg    | integer                  | not null
 cikk       | integer                  |
 muszakhely | integer                  |
 muszakkod  | integer                  |
 muszaknap  | date                     |
 repre      | boolean                  | not null default false
 hiba       | integer                  | not null default 0
Indexes:
    "muvelet_vonalkod_pkey" primary key, btree (az)
    "muvelet_vonalkod_pk2" unique, btree (olvaso_nev, idopont)
    "muvelet_vonalkod_muvelet" btree (muvelet, mozgasnem, vonalkod, olvaso_nev, idopont)
    "muvelet_vonalkod_pk3" btree (idopont, olvaso_nev)
    "muvelet_vonalkod_vonalkod" btree (vonalkod, mozgasnem, olvaso_nev, idopont)
Foreign-key constraints:
    "$1" FOREIGN KEY (mozgasnem) REFERENCES mozgasnem(az)
    "$2" FOREIGN KEY (muvelet) REFERENCES muvelet(az)
    "$3" FOREIGN KEY (minoseg) REFERENCES minoseg(az)
    "$4" FOREIGN KEY (cikk) REFERENCES cikk(az)
    "$5" FOREIGN KEY (muszakhely) REFERENCES hely(az)
    "$6" FOREIGN KEY (muszakkod) REFERENCES muszakkod(az)
    "muvelet_vonalkod_muszak_fk" FOREIGN KEY (muszakhely, muszaknap, muszakkod) REFERENCES muszak(hely, nap, muszakkod)
Triggers:
    muvelet_vonalkod_aiud AFTER INSERT OR DELETE OR UPDATE ON muvelet_vonalkod FOR EACH ROW EXECUTE PROCEDURE muvelet_vonalkod_aiud()
    muvelet_vonalkod_biu BEFORE INSERT OR UPDATE ON muvelet_vonalkod FOR EACH ROW EXECUTE PROCEDURE muvelet_vonalkod_biu()
    muvelet_vonalkod_noty AFTER INSERT OR DELETE OR UPDATE ON muvelet_vonalkod FOR EACH ROW EXECUTE PROCEDURE muvelet_vonalkod_noty()

-- original query, with limit:
# explain analyze select idopont from muvelet_vonalkod where muvelet=6859 order by idopont limit 1;
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Limit  (cost=0.00..25.71 rows=1 width=8) (actual time=579.528..579.529 rows=1 loops=1)
   ->  Index Scan using muvelet_vonalkod_pk3 on muvelet_vonalkod  (cost=0.00..8304.42 rows=323 width=8) (actual time=579.522..579.522 rows=1 loops=1)
         Filter: (muvelet = 6859)
 Total runtime: 579.606 ms
(4 rows)

-- however, if I omit the limit clause:
# explain analyze select idopont from muvelet_vonalkod where muvelet=6859 order by idopont;
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Sort  (cost=405.41..405.73 rows=323 width=8) (actual time=1.295..1.395 rows=360 loops=1)
   Sort Key: idopont
   ->  Index Scan using muvelet_vonalkod_muvelet on muvelet_vonalkod  (cost=0.00..400.03 rows=323 width=8) (actual time=0.049..0.855 rows=360 loops=1)
         Index Cond: (muvelet = 6859)
 Total runtime: 1.566 ms
(5 rows)

-- workaround 1: the planner is hard to trick...
# explain analyze select idopont from (select idopont from muvelet_vonalkod where muvelet=6859) foo order by idopont limit 1;
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Limit  (cost=0.00..25.71 rows=1 width=8) (actual time=584.403..584.404 rows=1 loops=1)
   ->  Index Scan using muvelet_vonalkod_pk3 on muvelet_vonalkod  (cost=0.00..8304.42 rows=323 width=8) (actual time=584.397..584.397 rows=1 loops=1)
         Filter: (muvelet = 6859)
 Total runtime: 584.482 ms
(4 rows)

-- workaround 2: quite ugly but seems to work (at least for this
-- one test case):
# explain analyze select idopont from (select idopont from muvelet_vonalkod where muvelet=6859 order by idopont) foo order by idopont limit 1;
[PERFORM] ORDER BY costs
Hi,

We have a SELECT that, even without ORDER BY, returns the rows in the order we want, but when we add the ORDER BY clause the runtime and cost are much higher. We have to use ORDER BY, because some future PostgreSQL version will probably no longer return the rows in the right order without it. But with ORDER BY it's too expensive... Is there a way to get the same cost and runtime but with the ORDER BY clause? Why doesn't the planner use the access plan built for the "without ORDER BY" select when we add the ORDER BY clause? The results are the same either way...

PostgreSQL version: 8.0.3

Without ORDER BY:

explain analyze
SELECT * FROM iparq.ARRIPT
 where (ANOCALC = 2005 and CADASTRO = 19 and CODVENCTO = 00 and PARCELA >= 00)
    or (ANOCALC = 2005 and CADASTRO = 19 and CODVENCTO > 00)
    or (ANOCALC = 2005 and CADASTRO > 19)
    or (ANOCALC > 2005);

 Index Scan using pk_arript, pk_arript, pk_arript, pk_arript on arript  (cost=0.00..122255.35 rows=146602 width=897) (actual time=9.303..1609.987 rows=167710 loops=1)
   Index Cond: (((anocalc = 2005::numeric) AND (cadastro = 19::numeric) AND (codvencto = 0::numeric) AND (parcela >= 0::numeric)) OR ((anocalc = 2005::numeric) AND (cadastro = 19::numeric) AND (codvencto > 0::numeric)) OR ((anocalc = 2005::numeric) AND (cadastro > 19::numeric)) OR (anocalc > 2005::numeric))
 Total runtime: 1712.456 ms
(3 rows)

With ORDER BY:

explain analyze
SELECT * FROM iparq.ARRIPT
 where (ANOCALC = 2005 and CADASTRO = 19 and CODVENCTO = 00 and PARCELA >= 00)
    or (ANOCALC = 2005 and CADASTRO = 19 and CODVENCTO > 00)
    or (ANOCALC = 2005 and CADASTRO > 19)
    or (ANOCALC > 2005)
 order by ANOCALC asc, CADASTRO asc, CODVENCTO asc, PARCELA asc;

 Sort  (cost=201296.59..201663.10 rows=146602 width=897) (actual time=9752.555..10342.363 rows=167710 loops=1)
   Sort Key: anocalc, cadastro, codvencto, parcela
   ->  Index Scan using pk_arript, pk_arript, pk_arript, pk_arript on arript  (cost=0.00..122255.35 rows=146602 width=897) (actual time=0.402..1425.085 rows=167710 loops=1)
         Index Cond: (((anocalc = 2005::numeric) AND (cadastro = 19::numeric) AND (codvencto = 0::numeric) AND (parcela >= 0::numeric)) OR ((anocalc = 2005::numeric) AND (cadastro = 19::numeric) AND (codvencto > 0::numeric)) OR ((anocalc = 2005::numeric) AND (cadastro > 19::numeric)) OR (anocalc > 2005::numeric))
 Total runtime: 10568.290 ms
(5 rows)

Table definition:

        Table "iparq.arript"
   Column   |         Type          | Modifiers
------------+-----------------------+-----------
 anocalc    | numeric(4,0)          | not null
 cadastro   | numeric(8,0)          | not null
 codvencto  | numeric(2,0)          | not null
 parcela    | numeric(2,0)          | not null
 inscimob   | character varying(18) | not null
 codvencto2 | numeric(2,0)          | not null
 parcela2   | numeric(2,0)          | not null
 codpropr   | numeric(10,0)         | not null
 dtaven     | numeric(8,0)          | not null
 anocalc2   | numeric(4,0)          |
 ...
Indexes:
    "pk_arript" PRIMARY KEY, btree (anocalc, cadastro, codvencto, parcela)
    "iarchave04" UNIQUE, btree (cadastro, anocalc, codvencto, parcela)
    "iarchave02" btree (inscimob, anocalc, codvencto2, parcela2)
    "iarchave03" btree (codpropr, dtaven)
    "iarchave05" btree (anocalc, inscimob, codvencto2, parcela2)

Best regards and thank you very much in advance,

Carlos Benkendorf
Re: [PERFORM] Windows performance again
> > It's *got* to be the network configuration on the client machine.
>
> We've seen gripes of this sort before --- check the list archives for
> possible fixes. I seem to recall something about a "QoS patch", as
> well as suggestions to get rid of third-party packages that might be
> interfering with the TCP stack.

I personally checked out the last report, from a poster who got the issue on win2k but not on winxp. I ran his exact dump into my 2k server with no problems. This is definitely some type of local issue.

Josep: does your table have any large (> 1k) fields in it?

Merlin
Re: [PERFORM] Wrong index used when ORDER BY LIMIT 1
On Wed, Dec 21, 2005 at 07:03:00PM +0100, Szűcs Gábor wrote:
> Version: 7.4.6
[...]
> The query is:
>   SELECT idopont FROM muvelet_vonalkod WHERE muvelet = x ORDER BY idopont LIMIT 1;
>
> I expected the planner to choose the index on muvelet, then sort by idopont.
> Instead, it took the other index.

I think the planner is guessing that since you're ordering on idopont, scanning the idopont index will find the first matching row faster than using the muvelet index would. In many cases that's a good bet, but in this case the guess is wrong and you end up with a suboptimal plan.

I just ran some tests with 8.1.1 and it chose the better plan for a query similar to what you're doing. One of the developers could probably explain why; maybe it's because of the changes that allow better use of multicolumn indexes. Try 8.1.1 if you can and see if you get better results.

> -- workaround 2: quite ugly but seems to work (at least for this
> -- one test case):
> # explain analyze
> select idopont from
>   (select idopont from muvelet_vonalkod
>    where muvelet=6859 order by idopont) foo
> order by idopont limit 1;

Another workaround is to use OFFSET 0 in the subquery.

--
Michael Fuhr
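For concreteness, a minimal sketch of the OFFSET 0 variant (the same query as workaround 2 above; OFFSET 0 acts as an optimization fence that keeps the subquery from being flattened away):

# explain analyze
select idopont from
  (select idopont from muvelet_vonalkod
   where muvelet=6859 order by idopont offset 0) foo
order by idopont limit 1;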
Re: [PERFORM] Wrong index used when ORDER BY LIMIT 1
Szűcs Gábor <[EMAIL PROTECTED]> writes:
> The query is:
>   SELECT idopont FROM muvelet_vonalkod WHERE muvelet = x ORDER BY idopont LIMIT 1;

Much the best solution for this would be to have an index on (muvelet, idopont) --- perhaps you can reorder the columns of "muvelet_vonalkod_muvelet" instead of making a whole new index --- and then say

  SELECT idopont FROM muvelet_vonalkod WHERE muvelet = x ORDER BY muvelet, idopont LIMIT 1;

PG 8.1 can apply such an index to your original query, but older versions will need the help of the modified ORDER BY to recognize that the index is usable.

regards, tom lane
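Spelled out as a minimal sketch (the index name is hypothetical; the two-column index stands in for the column reordering Tom describes, and the constant is the one from the thread):

CREATE INDEX muvelet_vonalkod_muvelet_idopont
    ON muvelet_vonalkod (muvelet, idopont);

SELECT idopont FROM muvelet_vonalkod
 WHERE muvelet = 6859
 ORDER BY muvelet, idopont
 LIMIT 1;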
Re: [PERFORM] ORDER BY costs
Carlos Benkendorf <[EMAIL PROTECTED]> writes:
>         Table "iparq.arript"
>    Column   |         Type          | Modifiers
> ------------+-----------------------+-----------
>  anocalc    | numeric(4,0)          | not null
>  cadastro   | numeric(8,0)          | not null
>  codvencto  | numeric(2,0)          | not null
>  parcela    | numeric(2,0)          | not null
>  inscimob   | character varying(18) | not null
>  codvencto2 | numeric(2,0)          | not null
>  parcela2   | numeric(2,0)          | not null
>  codpropr   | numeric(10,0)         | not null
>  dtaven     | numeric(8,0)          | not null
>  anocalc2   | numeric(4,0)          |

I suspect you'd find a significant performance improvement from changing the NUMERIC columns to int or bigint as needed. Numeric comparisons are pretty slow.

regards, tom lane
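As a sketch, one way to make that change in place, assuming 8.0 or later (which added ALTER COLUMN ... TYPE); the target types here simply follow the column widths above, and the statement rewrites the table, so expect it to take a while on 167k+ rows:

ALTER TABLE iparq.arript
    ALTER COLUMN anocalc   TYPE smallint,
    ALTER COLUMN cadastro  TYPE integer,
    ALTER COLUMN codvencto TYPE smallint,
    ALTER COLUMN parcela   TYPE smallint,
    ALTER COLUMN codpropr  TYPE bigint,
    ALTER COLUMN dtaven    TYPE integer;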
Re: [PERFORM] Speed of different procedural language
> On Sun, Dec 18, 2005 at 01:10:21AM -, Ben Trewern wrote:
> > I know I should be writing these in C but that's a bit beyond me. I was
> > going to try PL/Python or PL/Perl or even PL/Ruby. Has anyone any idea
> > which language is fastest, or is the data access going to swamp the overhead
> > of small functions?
>
> I'm not sure if it's what you ask for, but there _is_ a clear difference
> between the procedural languages -- I've had a 10x speed increase from
> rewriting PL/PgSQL stuff into PL/Perl, for instance. I'm not sure which ones
> would be faster, though -- I believe Ruby is slower than Perl or Python
> generally, but I don't know how it all works out in a PL/* setting.

So far, I use plpgsql for everything... queries being first class and all... I don't have any performance problems with it. I have cut the occasional C routine, but for flexibility, not for speed. PL/Perl routines cannot directly execute each other, meaning you can't pass high level objects between them like refcursors. YMMV.

Since most database apps are bound by the server one way or another, I would imagine you should be choosing a language for reasons other than performance. Maybe, Ben, you could provide an example of what you are trying to do that is not fast enough?

Merlin
Re: [PERFORM] Speed of different procedural language
On Wed, Dec 21, 2005 at 12:06:47PM +0100, Steinar H. Gunderson wrote:
> On Sun, Dec 18, 2005 at 01:10:21AM -, Ben Trewern wrote:
> > I know I should be writing these in C but that's a bit beyond me. I was
> > going to try PL/Python or PL/Perl or even PL/Ruby. Has anyone any idea
> > which language is fastest, or is the data access going to swamp the overhead
> > of small functions?
>
> I'm not sure if it's what you ask for, but there _is_ a clear difference
> between the procedural languages -- I've had a 10x speed increase from
> rewriting PL/PgSQL stuff into PL/Perl, for instance.

The difference is clear only in specific cases; just because you saw a 10x increase in some cases doesn't mean you can expect that kind of increase, or indeed any increase, in others. I've seen PL/pgSQL beat all other PL/* challengers handily many times, especially when the function does a lot of querying and looping through large result sets.

I tend to use PL/pgSQL except in cases where PL/pgSQL can't do what I want or the job would be much easier in another language (e.g., string manipulation, for which I'd use PL/Perl or PL/Ruby). Even then I might use the other language only to write small functions that a PL/pgSQL function could call.

As Merlin suggested, maybe Ben could tell us what he wants to do that he thinks should be written in C or a language other than PL/pgSQL. Without knowing what problem is to be solved it's near impossible to recommend an appropriate tool.

--
Michael Fuhr
Re: [PERFORM] Speed of different procedural language
On Wed, Dec 21, 2005 at 02:24:42PM -0700, Michael Fuhr wrote:
> The difference is clear only in specific cases; just because you
> saw a 10x increase in some cases doesn't mean you can expect that
> kind of increase, or indeed any increase, in others. I've seen
> PL/pgSQL beat all other PL/* challengers handily many times,
> especially when the function does a lot of querying and looping
> through large result sets.

That's funny; my biggest problems with PL/PgSQL have been (among others) exactly with large result sets...

Anyhow, the general idea is: it _does_ matter which one you use, so you'd better test if it matters to you :-)

/* Steinar */
--
Homepage: http://www.sesse.net/
Re: [PERFORM] Speed of different procedural language
On Wed, Dec 21, 2005 at 10:38:10PM +0100, Steinar H. Gunderson wrote:
> On Wed, Dec 21, 2005 at 02:24:42PM -0700, Michael Fuhr wrote:
> > The difference is clear only in specific cases; just because you
> > saw a 10x increase in some cases doesn't mean you can expect that
> > kind of increase, or indeed any increase, in others. I've seen
> > PL/pgSQL beat all other PL/* challengers handily many times,
> > especially when the function does a lot of querying and looping
> > through large result sets.
>
> That's funny, my biggest problems with PL/PgSQL have been (among others)
> exactly with large result sets...

Out of curiosity, do you have a simple test case? I'd be interested in seeing what you're doing in PL/pgSQL that's contradicting what I'm seeing.

--
Michael Fuhr
Re: [PERFORM] What's the best hardver for PostgreSQL 8.1?
Juan Casero wrote:
> Can you elaborate on the reasons the opteron is better than the Xeon when
> it comes to disk io? I have a PostgreSQL 7.4.8 box running a DSS. One of our

Opterons have a 64-bit IOMMU -- Xeons don't. That means in 64-bit mode, for transfers to memory above 4GB, the OS must allocate the memory below 4GB, DMA to that block, and then the CPU must do extra work copying the memory above 4GB. Versus on the Opteron, it's done by the IO adaptor using DMA in the background.
Re: [PERFORM] ORDER BY costs
I restored the table in another database and repeated the analyze again with the original column definitions (numeric):

With ORDER BY:

 Sort  (cost=212634.30..213032.73 rows=159374 width=897) (actual time=9286.817..9865.030 rows=167710 loops=1)
   Sort Key: anocalc, cadastro, codvencto, parcela
   ->  Index Scan using pk_arript, pk_arript, pk_arript, pk_arript on arript  (cost=0.00..126604.64 rows=159374 width=897) (actual time=0.152..1062.664 rows=167710 loops=1)
         Index Cond: (((anocalc = 2005::numeric) AND (cadastro = 19::numeric) AND (codvencto = 0::numeric) AND (parcela >= 0::numeric)) OR ((anocalc = 2005::numeric) AND (cadastro = 19::numeric) AND (codvencto > 0::numeric)) OR ((anocalc = 2005::numeric) AND (cadastro > 19::numeric)) OR (anocalc > 2005::numeric))
 Total runtime: 10086.884 ms
(5 rows)

Without ORDER BY:

 Index Scan using pk_arript, pk_arript, pk_arript, pk_arript on arript  (cost=0.00..126604.64 rows=159374 width=897) (actual time=0.154..809.566 rows=167710 loops=1)
   Index Cond: (((anocalc = 2005::numeric) AND (cadastro = 19::numeric) AND (codvencto = 0::numeric) AND (parcela >= 0::numeric)) OR ((anocalc = 2005::numeric) AND (cadastro = 19::numeric) AND (codvencto > 0::numeric)) OR ((anocalc = 2005::numeric) AND (cadastro > 19::numeric)) OR (anocalc > 2005::numeric))
 Total runtime: 894.218 ms
(3 rows)

Then I recreated the table and changed the primary key column type definitions to smallint, integer and bigint:

CREATE TABLE arript (
  anocalc    smallint NOT NULL,
  cadastro   integer NOT NULL,
  codvencto  smallint NOT NULL,
  parcela    smallint NOT NULL,
  inscimob   character varying(18) NOT NULL,
  codvencto2 smallint NOT NULL,
  parcela2   smallint NOT NULL,
  codpropr   bigint NOT NULL,
  dtaven     integer NOT NULL,
  anocalc2   smallint,
  dtabase    integer,
  vvt        numeric(14,2),
  vvp        numeric(14,2),
  ...

Now the new analyze:

With ORDER BY:

 Sort  (cost=180430.98..180775.10 rows=137649 width=826) (actual time=4461.524..5000.707 rows=167710 loops=1)
   Sort Key: anocalc, cadastro, codvencto, parcela
   ->  Index Scan using pk_arript, pk_arript, pk_arript, pk_arript on arript  (cost=0.00..26.93 rows=137649 width=826) (actual time=0.142..763.255 rows=167710 loops=1)
         Index Cond: (((anocalc = 2005) AND (cadastro = 19) AND (codvencto = 0) AND (parcela >= 0)) OR ((anocalc = 2005) AND (cadastro = 19) AND (codvencto > 0)) OR ((anocalc = 2005) AND (cadastro > 19)) OR (anocalc > 2005))
 Total runtime: 5222.729 ms
(5 rows)

Without ORDER BY:

 Index Scan using pk_arript, pk_arript, pk_arript, pk_arript on arript  (cost=0.00..26.93 rows=137649 width=826) (actual time=0.135..505.250 rows=167710 loops=1)
   Index Cond: (((anocalc = 2005) AND (cadastro = 19) AND (codvencto = 0) AND (parcela >= 0)) OR ((anocalc = 2005) AND (cadastro = 19) AND (codvencto > 0)) OR ((anocalc = 2005) AND (cadastro > 19)) OR (anocalc > 2005))
 Total runtime: 589.528 ms
(3 rows)

Total runtime summary:
- integer/smallint/bigint primary key columns, with ORDER BY: 5222.729 ms
- integer/smallint/bigint primary key columns, without ORDER BY: 589.528 ms
- numeric primary key columns, with ORDER BY: 10086.884 ms
- numeric primary key columns, without ORDER BY: 894.218 ms

Using ORDER BY with integer/smallint/bigint columns (5222.729 ms) takes almost half the runtime of the select over numeric columns (10086.884 ms), but it is still about 6x more than the original select without ORDER BY over numeric columns (894.218 ms).

Is there something more that could be done? Planner cost constants?

Thanks very much in advance!

Benkendorf
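A hedged sketch of experimenting with the planner cost constants in a single session; the values are illustrative guesses, not tuned recommendations, and whether they help at all depends on what the plan is bottlenecked on (here, the sort):

SET random_page_cost = 2;           -- default is 4; lower favors index scans
SET effective_cache_size = 131072;  -- in 8 KB pages, i.e. ~1 GB of OS cache
EXPLAIN ANALYZE
SELECT * FROM iparq.arript
 WHERE anocalc > 2005
 ORDER BY anocalc, cadastro, codvencto, parcela;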
Re: [PERFORM] Speed of different procedural language
On Wed, Dec 21, 2005 at 03:10:28PM -0700, Michael Fuhr wrote:
> > That's funny, my biggest problems with PL/PgSQL have been (among others)
> > exactly with large result sets...
>
> Out of curiosity, do you have a simple test case? I'd be interested
> in seeing what you're doing in PL/pgSQL that's contradicting what
> I'm seeing.

I'm not sure if I have the code anymore (it was under 7.4 or 8.0), but it was largely scanning through ~2 million rows once, noting differences from the previous rows as it went.

In that case, I didn't benchmark against any of the other PL/* languages, but it was pretty clear that even on a pretty speedy Opteron, it was CPU bound, which it really shouldn't have been.

/* Steinar */
--
Homepage: http://www.sesse.net/
[PERFORM] CPU and RAM
I am currently using a dual Opteron (248) single core system (RAM PC3200) and, for a change, I am finding that the bottleneck is not disk I/O but CPU/RAM (not sure which). The reason for this is that the most frequently accessed tables/indexes are all held in RAM, and when querying the database there is almost no disk activity, which is great most of the time.

However, the database is growing, and it is supporting an OLTP system where retrieval of the data is an order of magnitude more important than insertion and general upkeep of the data. It supports a search engine[0] and contains a reverse index, lexicon and the actual data table (currently just under 2GB for the three tables and associated indexes).

At the moment everything is working OK, but I am noticing an almost linear increase in the time to retrieve data from the database as the data set increases in size. Clustering knocks the access times down by 25%, but it also knocks users off the website and can take up to 30 minutes, which is hardly an ideal scenario. I have also considered partitioning the tables up, using extendible hashing and tries to allocate the terms in the index to the correct table, but after some testing I noticed no appreciable gain from this method, which surprised me a bit.

The actual size of the database is not that big (4GB), but I am expecting this to increase to at least 20GB over the next year or so. This means that search times are going to jump dramatically, which also means the site becomes completely unusable. It also means that although disk access is currently low, I am eventually going to run out of RAM and require a decent disk subsystem.

Do people have any recommendations as to what hardware would alleviate my current CPU/RAM problem but, with a mind to the future, would still be able to cope with heavy disk access? My budget is about £2300/$4000, which is not a lot of money when talking databases, so suggestions of a Sun Fire T2000 or similar systems will be treated with the utmost disdain ;) unless you are about to give me one to keep.

--
Harry
http://www.hjackson.org
http://www.uklug.co.uk

[0] Before anyone asks, I have considered using tsearch2.
[PERFORM] MySQL is faster than PgSQL by a large margin in my program... any ideas why?
Hi all,

On a user's request, I recently added MySQL support to my backup program, which had been written for PostgreSQL exclusively until now. What surprises me is that MySQL is about 20%(ish) faster than PostgreSQL. Now, I love PostgreSQL and I want to continue recommending it as the database engine of choice, but it is hard to ignore a performance difference like that.

My program is a perl backup app that scans the content of a given mounted partition, 'stat's each file and then stores that data in the database. To maintain certain data (the backup, restore and display values for each file) I first read all the data from a given table (one table per partition) into a hash, drop and re-create the table, then start (in PostgreSQL) a bulk 'COPY...' call through the 'psql' shell app.

In MySQL there is no 'COPY...' equivalent, so instead I generate a large 'INSERT INTO file_info_X (col1, col2, ... coln) VALUES (...), (blah) ... (blah);'. This doesn't support automatic quoting, obviously, so I manually quote my values before adding the value to the INSERT statement. I suspect this might be part of the performance difference?

I take the total time needed to update a partition (load old data into hash + scan all files and prepare COPY/INSERT + commit new data) and divide by the number of seconds needed, to get a score I call a 'U.Rate'. On average, on my Pentium3 1GHz laptop, I get a U.Rate of ~400-500. With MySQL though I usually get a U.Rate of ~700-800.

If the performance difference comes from the 'COPY...' command being slower because of the automatic quoting, can I somehow tell PostgreSQL that the data is pre-quoted? Could the performance difference be something else?

If it would help I can provide code samples. I haven't done so yet because it's a little convoluted. ^_^;

Thanks as always!

Madison

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Madison Kelly (Digimer)
TLE-BU; The Linux Experience, Back Up
Main Project Page:  http://tle-bu.org
Community Forum:    http://forum.tle-bu.org
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
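For reference, a minimal sketch of the COPY text format being generated here (the sample row is invented, and the column list follows the MySQL schema shown later in the thread). Notably, COPY's text format is not quoted at all: fields are separated by literal tab characters, and only backslash, tab and newline need backslash-escaping, so no pre-quoting is required:

COPY file_info_1 (file_group_name, file_group_uid, file_mod_time, file_name,
                  file_parent_dir, file_perm, file_size, file_type,
                  file_user_name, file_user_uid, file_backup, file_display,
                  file_restore) FROM STDIN;
root	0	1135123200	hosts	/etc	420	245	f	root	0	i	i	i
\.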
Re: [PERFORM] MySQL is faster than PgSQL by a large margin in my program... any ideas why?
* Madison Kelly ([EMAIL PROTECTED]) wrote:
> If the performance difference comes from the 'COPY...' command being
> slower because of the automatic quoting, can I somehow tell PostgreSQL
> that the data is pre-quoted? Could the performance difference be
> something else?

I doubt the issue is with the COPY command being slower than INSERTs (I'd expect the opposite generally, actually...). What's the table type of the MySQL tables? Is it MyISAM or InnoDB (I think those are the main alternatives)? IIRC, MyISAM doesn't do ACID and isn't transaction safe, and has problems with data reliability (aiui, equivalent to doing 'fsync = false' for Postgres). InnoDB, again iirc, is transaction safe and whatnot, and more akin to the default PostgreSQL setup.

I expect some others will comment along these lines too, if my response isn't entirely clear. :)

Stephen
Re: [PERFORM] Speed of different procedural language
On Thu, Dec 22, 2005 at 02:08:23AM +0100, Steinar H. Gunderson wrote:
> On Wed, Dec 21, 2005 at 03:10:28PM -0700, Michael Fuhr wrote:
> > Out of curiosity, do you have a simple test case? I'd be interested
> > in seeing what you're doing in PL/pgSQL that's contradicting what
> > I'm seeing.
>
> I'm not sure if I have the code anymore (it was under 7.4 or 8.0), but it was
> largely scanning through ~2 million rows once, noting differences from the
> previous rows as it went.
>
> In that case, I didn't benchmark against any of the other PL/* languages, but
> it was pretty clear that even on a pretty speedy Opteron, it was CPU bound,
> which it really shouldn't have been.

Try looping through two million rows with PL/Perl or PL/Tcl and you'll probably see significantly worse performance than with PL/pgSQL -- so much worse that I'd be surprised to see those languages make up the difference with whatever processing they'd be doing for each row, unless it was something they're particularly good at and PL/pgSQL is particularly bad at.

In 8.1 PL/Perl has a couple of ways to fetch query results: spi_exec_query to fetch all the rows at once into a single data structure, and spi_query/spi_fetchrow to fetch the rows one at a time. In my tests with one million rows, spi_exec_query was around 8 times slower than a loop in PL/pgSQL, not to mention requiring a lot of memory. spi_query/spi_fetchrow was about 25 times slower but didn't require the amount of memory that spi_exec_query did. A PL/Tcl function that used spi_exec was about 10 times slower than PL/pgSQL, or only slightly slower than PL/Perl and spi_exec_query.

If you didn't benchmark the two million row query, do you have an example that you did benchmark? I don't doubt that PL/Perl and other languages can do some things faster than PL/pgSQL, but looping through large result sets doesn't seem to be one of them.

--
Michael Fuhr
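For concreteness, a minimal sketch of the 8.1 PL/Perl row-at-a-time style being measured here (the table and column are hypothetical):

CREATE OR REPLACE FUNCTION count_big_table() RETURNS integer AS $$
    # fetch rows one at a time to keep memory use flat
    my $count = 0;
    my $sth = spi_query("SELECT id FROM big_table");
    while (defined(my $row = spi_fetchrow($sth))) {
        $count++;   # per-row work would go here, e.g. using $row->{id}
    }
    return $count;
$$ LANGUAGE plperl;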
Re: [PERFORM] MySQL is faster than PgSQL by a large margin in my program... any ideas why?
On Wednesday 21 December 2005 20:14, Stephen Frost wrote:
> * Madison Kelly ([EMAIL PROTECTED]) wrote:
> > If the performance difference comes from the 'COPY...' command being
> > slower because of the automatic quoting, can I somehow tell PostgreSQL
> > that the data is pre-quoted? Could the performance difference be
> > something else?
>
> I doubt the issue is with the COPY command being slower than INSERTs
> (I'd expect the opposite generally, actually...). What's the table type
> of the MySQL tables? Is it MyISAM or InnoDB (I think those are the main
> alternatives)? [...]

Is fsync() on in your postgres config? If so, that's why you're slower. The default is to have it on for stability (writes are forced to disk). It is quite a bit slower than just allowing the write caches to do their job, but more stable. MySQL does not force writes to disk.
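A hedged sketch of the setting being described, for a throwaway benchmark only; running production with fsync off risks data loss or corruption on a crash or power failure:

# postgresql.conf
fsync = off    # test only: WAL writes are no longer flushed to disk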
Re: [PERFORM] What's the best hardver for PostgreSQL 8.1?
I just sent my boss an email asking him for a Sun v20z with dual 2.2 GHz Opterons, 2 Gigs of RAM and RAID 1. I would have liked a better server capable of RAID, but that seems to be out of his budget right now. Ok, so I assume I get this Sun box. Most likely I will go with Linux, since it is a fair bet he doesn't want to pay for the Solaris 10 x86 license, although I kind of like the idea of using Solaris 10 x86 for this. I will assume I need to install the x64 kernel that comes with, say, Fedora Core 4. Should I run the PostgreSQL 8.x binaries in 32 bit mode or 64 bit mode? My instinct tells me 64 bit mode is most efficient for our database size: about 20 gigs right now, but it may grow to 100 gigs in a year or so.

I just finished loading a 20 gig database on a dual 900 MHz UltraSPARC III system with 2 gigs of RAM and about 768 megs of shared memory available for the postgresql server, running Solaris 10. The load has smoked a P4 3.2 GHz system I am also using, with 2 gigs of RAM, running postgresql 8.0.3. I mean, I started the sparc load after the P4 load. The sparc load has already finished rebuilding the database from a pg_dump file, but the P4 system is still going. The P4 has 1.3 gigs of shared memory allocated to postgresql. How about them apples?

Thanks,
Juan

On Wednesday 21 December 2005 18:57, William Yu wrote:
> Opterons have a 64-bit IOMMU -- Xeons don't. That means in 64-bit mode,
> for transfers to memory above 4GB, the OS must allocate the memory below
> 4GB, DMA to that block, and then the CPU must do extra work copying the
> memory above 4GB. Versus on the Opteron, it's done by the IO adaptor
> using DMA in the background.
Re: [PERFORM] MySQL is faster than PgSQL by a large margin in my program... any ideas why?
What version of postgres? Copy has been substantially improved in bizgres and also in 8.1.

- Luke

--
Sent from my BlackBerry Wireless Device
Re: [PERFORM] What's the best hardver for PostgreSQL 8.1?
AFAIK there are no licensing costs for Solaris, unless you are talking about a software support agreement, which is not required.

Juan Casero wrote:
> I just sent my boss an email asking him for a Sun v20z with dual 2.2 GHz
> Opterons, 2 Gigs of RAM and RAID 1. [...]
Re: [PERFORM] Speed of different procedural language
Michael Fuhr <[EMAIL PROTECTED]> writes:
> Try looping through two million rows with PL/Perl or PL/Tcl and
> you'll probably see significantly worse performance than with
> PL/pgSQL -- so much worse that I'd be surprised to see those languages
> make up the difference with whatever processing they'd be doing for
> each row unless it was something they're particularly good at and
> PL/pgSQL is particularly bad at.

I'd expect plpgsql to suck at purely computational tasks, compared to the other PLs, but to win at tasks involving database access. These are two sides of the same coin really --- plpgsql is tightly tied to the PG query execution engine, to the extent of using it even for simply adding 2 and 2, but that also gives it relatively low overhead for invoking a database query. Perl, Tcl, et al have their own computational engines and can easily beat the PG SQL engine for simple arithmetic and string-pushing. But they pay a high overhead for calling back into the database engine.

regards, tom lane
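As a minimal sketch of the two sides of that coin (table and column are hypothetical): the FOR scan is where PL/pgSQL's tight coupling to the executor pays off, while the arithmetic in the loop body is the part where Perl or Tcl would win:

CREATE OR REPLACE FUNCTION total_amount() RETURNS numeric AS $$
DECLARE
    r     record;
    total numeric := 0;
BEGIN
    -- database access: low overhead in PL/pgSQL
    FOR r IN SELECT amount FROM payments LOOP
        -- pure computation: goes through the PG executor,
        -- which is where the other PLs are faster
        total := total + r.amount;
    END LOOP;
    RETURN total;
END;
$$ LANGUAGE plpgsql;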
Re: [PERFORM] CPU and RAM
On Thu, 22 Dec 2005, Harry Jackson wrote:

> I am currently using a dual Opteron (248) single core system (RAM
> PC3200) and for a change I am finding that the bottleneck is not disk
> I/O but CPU/RAM (not sure which). The reason for this is that the most
> frequently accessed tables/indexes are all held in RAM and when
> querying the database there is almost no disk activity which is great,
> most of the time.
>
> At the moment everything is working OK but I am noticing an almost
> linear increase in time to retrieve data from the database as the data
> set increases in size. Clustering knocks the access times down by 25%

Let's find out what's going on first. Can you identify the most expensive query? Also, there is a tension in what you said: (1) execution time grows linearly with the data set size, and (2) there is no disk I/O - so why would clustering improve things by 25%?

Regards,
Qingqing
Re: [PERFORM] What's the best hardver for PostgreSQL 8.1?
Hi Juan,

The Solaris 10 license is free. In fact, I believe you receive the media with a Sun Fire V20z. If you want support, there are various "pay" plans depending on the level of support. If not, your license grants you Right to Use anyway, for free.

That said, I haven't done much testing with 32/64 bit differences. However, for long term purposes, 64-bit always seems to be the safe bet. As for your load performance, a lot of it depends on your filesystem layout as well.

Regards,
Jignesh

Juan Casero wrote:
> I just sent my boss an email asking him for a Sun v20z with dual 2.2 GHz
> Opterons, 2 Gigs of RAM and RAID 1. [...]
Re: [PERFORM] MySQL is faster than PgSQL by a large margin in my program... any ideas why?
Stephen Frost wrote:
> What's the table type of the MySQL tables? Is it MyISAM or InnoDB (I
> think those are the main alternatives)? IIRC, MyISAM doesn't do ACID
> and isn't transaction safe, and has problems with data reliability
> (aiui, equivalent to doing 'fsync = false' for Postgres). [...]

Ah, that makes a lot of sense (I read about the 'fsync' issue before, now that you mention it). I am not too familiar with MySQL, but IIRC MyISAM is their open-source DB and InnoDB is their commercial one, ne? If so, then I am running MyISAM.

Here is the MySQL table. The main difference from the PostgreSQL table is that the 'varchar(255)' columns are 'text' columns in PostgreSQL.

mysql> DESCRIBE file_info_1;
+-----------------+--------------+------+-----+---------+-------+
| Field           | Type         | Null | Key | Default | Extra |
+-----------------+--------------+------+-----+---------+-------+
| file_group_name | varchar(255) | YES  |     | NULL    |       |
| file_group_uid  | int(11)      |      |     | 0       |       |
| file_mod_time   | bigint(20)   |      |     | 0       |       |
| file_name       | varchar(255) |      |     |         |       |
| file_parent_dir | varchar(255) |      | MUL |         |       |
| file_perm       | int(11)      |      |     | 0       |       |
| file_size       | bigint(20)   |      |     | 0       |       |
| file_type       | char(1)      |      |     |         |       |
| file_user_name  | varchar(255) | YES  |     | NULL    |       |
| file_user_uid   | int(11)      |      |     | 0       |       |
| file_backup     | char(1)      |      | MUL | i       |       |
| file_display    | char(1)      |      |     | i       |       |
| file_restore    | char(1)      |      |     | i       |       |
+-----------------+--------------+------+-----+---------+-------+

I will try turning off 'fsync' on my test box to see how much of a performance gain I get and whether it is close to what I am getting out of MySQL. If that does turn out to be the case, though, I will be able to comfortably continue recommending PostgreSQL from a stability point of view.

Thanks!!

Madison

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Madison Kelly (Digimer)
TLE-BU; The Linux Experience, Back Up
Main Project Page:  http://tle-bu.org
Community Forum:    http://forum.tle-bu.org
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Re: [PERFORM] MySQL is faster than PgSQL by a large margin in my program... any ideas why?
Luke Lonergan wrote:
> What version of postgres?
>
> Copy has been substantially improved in bizgres and also in 8.1.

Currently 7.4 (what comes with Debian Sarge). I have run my program on 8.0, but not since I added MySQL support. I should run the tests on the newer versions of both DBs (using v4.1 for MySQL, which is also mature at this point).

As others mentioned though, so far the most likely explanation is 'fsync' being enabled on PostgreSQL.

Thanks for the reply!

Madison

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Madison Kelly (Digimer)
TLE-BU; The Linux Experience, Back Up
Main Project Page:  http://tle-bu.org
Community Forum:    http://forum.tle-bu.org
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Re: [PERFORM] MySQL is faster than PgSQL by a large margin in my program... any ideas why?
Madison,

On 12/21/05 10:58 PM, "Madison Kelly" <[EMAIL PROTECTED]> wrote:

> Ah, that makes a lot of sense (I read about the 'fsync' issue before,
> now that you mention it). I am not too familiar with MySQL, but IIRC
> MyISAM is their open-source DB and InnoDB is their commercial one, ne?
> If so, then I am running MyISAM.

You can run either storage method with MySQL; I expect the default is MyISAM.

COPY performance, with or without fsync, was recently sped up to nearly double in PostgreSQL. The Bizgres version (www.bizgres.org, www.greenplum.com) is the fastest, and Postgres 8.1.1 is close, depending on how fast your disk I/O is (as I/O speed increases, Bizgres gets faster).

fsync isn't really an "issue" and I'd suggest you not run without it! We've found that "fdatasync" as the wal sync method is actually a bit faster than fsync if you want a bit better speed.

So, I'd recommend you upgrade to either Bizgres or Postgres 8.1.1 to get the maximum COPY speed.

> Here is the MySQL table. The main difference from the PostgreSQL
> table is that the 'varchar(255)' columns are 'text' columns in PostgreSQL.

Shouldn't matter.

> [table definition quoted above]

What's a bigint(20)? Are you using "numeric" in Postgresql?

> I will try turning off 'fsync' on my test box to see how much of a
> performance gain I get and whether it is close to what I am getting
> out of MySQL.

Again - fsync is a small part of the performance - you will need to run either Postgres 8.1.1 or Bizgres to get good COPY speed.

- Luke
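A sketch of the wal_sync_method setting Luke mentions; whether fdatasync is available, and whether it is faster, depends on the platform, and unlike fsync = off this keeps full crash safety:

# postgresql.conf
wal_sync_method = fdatasync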
Re: [PERFORM] MySQL is faster than PgSQL by a large margin in my program... any ideas why?
Madison,

On 12/21/05 11:02 PM, "Madison Kelly" <[EMAIL PROTECTED]> wrote:

> Currently 7.4 (what comes with Debian Sarge). I have run my program on
> 8.0, but not since I added MySQL support. I should run the tests on
> the newer versions of both DBs (using v4.1 for MySQL, which is also
> mature at this point).

Yes, this is *definitely* your problem. Upgrade to Postgres 8.1.1 or Bizgres 0_8_1 and your COPY speed could double without even changing fsync (depending on your disk speed). We typically get 12-14MB/s from Bizgres on Opteron CPUs and disk subsystems that can write at least 60MB/s. This means you can load 100GB in 2 hours.

Note that indexes will also slow down loading.

- Luke