Re: [PERFORM] Comparative performance
It's more understandable if the table names are in front of the column names:

SELECT relationship.topic_id1, relationship.topic_id2, topic.topic_name, topic.categ_id,
       topic.list_name, topic.title, topic.url, topic.page_type, relationship.rel_type,
       entry_type.inverse_id, relationship.description AS rel_descrip,
       relationship.created, relationship.updated
  FROM relationship, topic, entry_type
 WHERE ((relationship.topic_id1 = topic.topic_id AND relationship.topic_id2 = 1252)
     OR (relationship.topic_id2 = topic.topic_id AND relationship.topic_id1 = 1252))
   AND relationship.rel_type = entry_type.type_id
   AND entry_type.class_id = 2
 ORDER BY rel_type, list_name;

I see a few problems in your schema.

- topic_id1 and topic_id2 play the same role; there is no constraint to determine which is which, hence it is possible to define the same relation twice.
- As you search on two columns with OR, you need a UNION to use indexes.
- Lack of indexes.
- I don't understand why the planner doesn't pick up hash joins...
- If you use a version before 8, the type mismatch will prevent use of the indexes.

I'd suggest rewriting the query like this:

SELECT topic.*, foo.*
  FROM topic,
       (SELECT topic_id2 AS fetch_id, topic_id1, topic_id2, rel_type,
               description AS rel_descrip, created, updated
          FROM relationship
         WHERE rel_type IN (SELECT type_id FROM entry_type WHERE class_id = 2)
           AND topic_id1 = 1252
        UNION
        SELECT topic_id1 AS fetch_id, topic_id1, topic_id2, rel_type,
               description AS rel_descrip, created, updated
          FROM relationship
         WHERE rel_type IN (SELECT type_id FROM entry_type WHERE class_id = 2)
           AND topic_id2 = 1252) AS foo
 WHERE topic.topic_id = foo.fetch_id

CREATE INDEXes on:
  entry_type( class_id )
  relationship( topic_id1, rel_type, topic_id2 )   which becomes your new PRIMARY KEY
  relationship( topic_id2, rel_type, topic_id1 )

Of course, this doesn't explain how MySQL manages to execute the query in about 9 msec. The only minor differences in the schema are: entry_type.title and rel_title are char(32) in MySQL, entry_type.class_id is a tinyint, and topic.categ_id, page_type and dark_ind are also tinyints. MySQL also doesn't have the REFERENCES. Can you post the result from MySQL EXPLAIN?

You might be interested in the following code. Just replace mysql_ by pg_; it's quite useful.

$global_queries_log = array();

function _getmicrotime()
{
    list($u,$s) = explode(' ', microtime());
    return $u + $s;
}

/* Formats query, with given arguments, escaping all strings as needed.
   db_quote_query( 'UPDATE junk SET a=%s WHERE b=%s', array( 1, "po'po" ) )
   => UPDATE junk SET a='1' WHERE b='po\'po'
*/
function db_quote_query( $sql, $params=false )
{
    // if no params, send query raw
    if( !$params )
        return $sql;
    // quote params
    foreach( $params as $key => $val )
    {
        if( is_array( $val ))
            $val = implode( ',', $val );
        $params[$key] = "'".mysql_real_escape_string( $val )."'";
    }
    return vsprintf( $sql, $params );
}

/* Formats query, with given arguments, escaping all strings as needed.
   Runs query, logging its execution time.
   Returns the query, or dies with error.
*/
function db_query( $sql, $params=false )
{
    // it's already a query
    if( is_resource( $sql ))
        return $sql;
    $sql = db_quote_query( $sql, $params );
    $t = _getmicrotime();
    $r = mysql_query( $sql );
    if( !$r )
    {
        echo "MySQL error: ".mysql_error()." Query: ".$sql." Traceback: ";
        foreach( debug_backtrace() as $t )
            xdump( $t );
        echo "";
        die();
    }
    global $global_queries_log;
    $global_queries_log[] = array( _getmicrotime()-$t, $sql );
    return $r;
}

At the end of your page, display the contents of $global_queries_log.

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
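Spelled out as DDL, the index suggestions above would look roughly like this. This is only a sketch: the index names are invented, and turning (topic_id1, rel_type, topic_id2) into the primary key would assume the relationship table's current primary key is dropped first.

CREATE INDEX entry_type_class_id ON entry_type (class_id);

-- Suggested as the new PRIMARY KEY of relationship; shown here as a plain index.
CREATE INDEX relationship_t1_rel_t2 ON relationship (topic_id1, rel_type, topic_id2);

-- Mirror-image index so lookups starting from topic_id2 can also use an index.
CREATE INDEX relationship_t2_rel_t1 ON relationship (topic_id2, rel_type, topic_id1);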
Re: [PERFORM] Is There Any Way ....
On 2005-09-30 01:21, Lane Van Ingen wrote: > (3) Assure that a disk-based table is always in memory (outside of keeping > it in > memory buffers as a result of frequent activity which would prevent > LRU > operations from taking it out) ? I was wondering about this too. IMO it would be useful to have a way to tell PG that some tables were needed frequently, and should be cached if possible. This would allow application developers to consider joins with these tables as "cheap", even when querying on columns that are not indexed. I'm thinking about smallish tables like users, groups, *types, etc which would be needed every 2-3 queries, but might be swept out of RAM by one large query in between. Keeping a table like "users" on a RAM fs would not be an option, because the information is not volatile. cheers, stefan ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Is There Any Way ....
Yes, Stefan, the kind of usage you are mentioning is exactly why I was asking. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Stefan Weiss Sent: Tuesday, October 04, 2005 6:32 AM To: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Is There Any Way On 2005-09-30 01:21, Lane Van Ingen wrote: > (3) Assure that a disk-based table is always in memory (outside of keeping > it in > memory buffers as a result of frequent activity which would prevent > LRU > operations from taking it out) ? I was wondering about this too. IMO it would be useful to have a way to tell PG that some tables were needed frequently, and should be cached if possible. This would allow application developers to consider joins with these tables as "cheap", even when querying on columns that are not indexed. I'm thinking about smallish tables like users, groups, *types, etc which would be needed every 2-3 queries, but might be swept out of RAM by one large query in between. Keeping a table like "users" on a RAM fs would not be an option, because the information is not volatile. cheers, stefan ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Slow concurrent update of same row in a given table
On Thu, Sep 29, 2005 at 07:59:34AM +0530, Rajesh Kumar Mallah wrote:
> > I see. These problems regularly come up in database design. The best thing
> > you can do is modify your database design/application such that instead of
> > incrementing a count in a single row, you insert a row into a table,
> > recording the 'dispatch_id'. Counting the number of rows for a given
> > dispatch id will give you your count.
>
> sorry i will be accumulating huge amount of rows in separate table
> with no extra info when i really want just the count. Do you have
> a better database design in mind?
>
> Also i encounter same problem in implementing read count of
> articles in sites and in counting banner impressions where same
> row get updated by multiple processes frequently.

Databases like to work on *sets* of data, not individual rows. Something like this would probably perform much better than what you've got now, and would prevent having a huge table lying around:

INSERT INTO holding_table ...  -- Done for every incoming connection/what-have-you

CREATE OR REPLACE FUNCTION summarize() RETURNS void AS $$
DECLARE
    v_rows int;
BEGIN
    DELETE FROM holding_table;
    GET DIAGNOSTICS v_rows = ROW_COUNT;
    UPDATE count_table SET count = count + v_rows;
END;
$$ LANGUAGE plpgsql;

Periodically (say, once a minute):

SELECT summarize();
VACUUM holding_table;
VACUUM count_table;

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
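The function above relies on two tables that are not shown in the message. A minimal sketch of what they might look like (column names are illustrative, not from the thread):

-- Holding table: one cheap INSERT per event, never UPDATEd, so no row contention.
CREATE TABLE holding_table (
    dispatch_id integer NOT NULL
);

-- Small table carrying the running total that summarize() increments.
CREATE TABLE count_table (
    count bigint NOT NULL DEFAULT 0
);
INSERT INTO count_table (count) VALUES (0);

-- Done for every incoming connection/impression/what-have-you:
INSERT INTO holding_table (dispatch_id) VALUES (42);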
Re: [PERFORM] Logarithmic change (decrease) in performance
On Wed, Sep 28, 2005 at 06:03:03PM -0400, Ron Peacetree wrote:
> 1= keep more of the data set in RAM
> 2= increase the size of your HD IO buffers
> 3= make your RAID sets wider (more parallel vs sequential IO)
> 4= reduce the atomic latency of your RAID sets
>    (time for Fibre Channel 15Krpm HD's vs 7.2Krpm SATA ones?)
> 5= make sure your data is as unfragmented as possible
> 6= change your DB schema to minimize the problem
>    a= overall good schema design
>    b= partitioning the data so that the system only has to manipulate a
>       reasonable chunk of it at a time.

Note that 6 can easily swamp the rest of these tweaks. A poor schema design will absolutely kill any system. Also of great importance is how you're using the database. IE: are you doing any row-by-row operations?

> In many cases, there's a number of ways to accomplish the above.
> Unfortunately, most of them require CapEx.
>
> Also, ITRW such systems tend to have this as a chronic
> problem. This is not a "fix it once and it goes away forever". This
> is a part of the regular maintenance and upgrade plan(s).

And why DBAs typically make more money than other IT folks. :)

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
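Point 6b in the quoted list (partitioning the data so the system only manipulates a reasonable chunk at a time) can be sketched with plain table inheritance in the PostgreSQL of that era; the table and column names below are made up purely for illustration:

-- Parent table holds no data of its own; children carry one month each.
CREATE TABLE measurement (
    logdate   date NOT NULL,
    device_id integer NOT NULL,
    value     numeric
);

CREATE TABLE measurement_2005_09 (
    CHECK (logdate >= DATE '2005-09-01' AND logdate < DATE '2005-10-01')
) INHERITS (measurement);

CREATE TABLE measurement_2005_10 (
    CHECK (logdate >= DATE '2005-10-01' AND logdate < DATE '2005-11-01')
) INHERITS (measurement);

-- A query that only needs one month can be pointed at that child table,
-- so only a reasonable chunk of the data is touched.
SELECT device_id, avg(value)
  FROM measurement_2005_10
 GROUP BY device_id;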
Re: [PERFORM] Comparative performance
On Thu, Sep 29, 2005 at 08:44:16AM -0400, Joe wrote:
> CREATE TABLE entry (
>   entry_id serial PRIMARY KEY,
>   title VARCHAR(128) NOT NULL,
>   subtitle VARCHAR(128),
>   subject_type SMALLINT,
>   subject_id INTEGER REFERENCES topic,
>   actor_type SMALLINT,
>   actor_id INTEGER REFERENCES topic,
>   actor VARCHAR(64),
>   actor_role VARCHAR(64),
>   rel_entry_id INTEGER,
>   rel_entry VARCHAR(64),
>   description VARCHAR(255),
>   quote text,
>   url VARCHAR(255),
>   entry_date CHAR(10),
>   created DATE NOT NULL DEFAULT CURRENT_DATE,
>   updated TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP)
> WITHOUT OIDS;
> CREATE INDEX entry_actor_id ON entry (actor_id);
> CREATE INDEX entry_subject_id ON entry (subject_id);

A few tips...

Fields in PostgreSQL have alignment requirements, so the smallints aren't saving you anything right now. If you put both of them together though, you'll save 4 bytes on most hardware. You'll also get some minor gains from putting all the variable-length fields at the end, as well as nullable fields. If you search the archives for 'field order' you should be able to find some useful info.

Make sure these indexes exist if you'll be updating or inserting into entry:

CREATE INDEX topic__subject_id ON topic(subject_id);
CREATE INDEX topic__actor_id ON topic(actor_id);

Also, the fact that subject and actor both point to topic along with subject_type and actor_type makes me suspect that your design is de-normalized. Of course there's no way to know without more info.

FWIW, I usually use timestamptz for both created and updated fields.

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

---(end of broadcast)--- TIP 6: explain analyze is your friend
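As a concrete illustration of the alignment and field-order advice, here is the same table with the columns reordered: the two smallints adjacent, fixed-width columns first, variable-length and nullable columns last. Whether the saved padding matters depends entirely on row counts; this is only a sketch.

CREATE TABLE entry (
    entry_id      serial PRIMARY KEY,
    subject_type  SMALLINT,
    actor_type    SMALLINT,     -- adjacent smallints avoid 4 bytes of alignment padding
    subject_id    INTEGER REFERENCES topic,
    actor_id      INTEGER REFERENCES topic,
    rel_entry_id  INTEGER,
    created       DATE NOT NULL DEFAULT CURRENT_DATE,
    updated       TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,
    title         VARCHAR(128) NOT NULL,
    subtitle      VARCHAR(128),
    actor         VARCHAR(64),
    actor_role    VARCHAR(64),
    rel_entry     VARCHAR(64),
    description   VARCHAR(255),
    quote         text,
    url           VARCHAR(255),
    entry_date    CHAR(10)
) WITHOUT OIDS;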
Re: [PERFORM] [HACKERS] Query in SQL statement
On Sat, Oct 01, 2005 at 12:51:08PM -0700, Roger Hand wrote:
> > -Original Message-
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] Behalf Of Jim C. Nasby
> > Sent: Friday, September 30, 2005 4:49 PM
> > Subject: Re: [PERFORM] [HACKERS] Query in SQL statement
>
> > I suggest ditching the CamelCase and going with underline_separators.
> > I'd also not use the bareword id, instead using bad_user_id. And I'd
> > name the table bad_user. But that's just me. :)
>
> I converted a db from MS SQL, where tables and fields were CamelCase, and
> just lowercased the ddl to create the tables.
>
> So table and field names were all created in lowercase, but I didn't have to
> change any of the application code: the SELECT statements worked fine with
> mixed case.
>
> -- sample DDL
> CREATE TABLE testtable
> (
>   fieldone int4
> );
> insert into TestTable (fieldone) values (11);

That will usually work (see Tom's reply), but fieldone is a heck of a lot harder to read than field_one. But like I said, these are the coding conventions I've found work well; YMMV.

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
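The "worked fine with mixed case" behaviour comes from unquoted identifiers being folded to lower case, while quoted identifiers are not folded. A quick sketch of the difference:

-- Unquoted identifiers fold to lower case, so these all refer to the same table:
CREATE TABLE testtable (fieldone int4);
INSERT INTO TestTable (FieldOne) VALUES (11);
SELECT fieldone FROM TESTTABLE;

-- Quoted identifiers keep their case; this is a different table, and it must be
-- quoted the same way in every statement that touches it:
CREATE TABLE "TestTable" ("FieldOne" int4);
SELECT "FieldOne" FROM "TestTable";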
Re: [PERFORM] Comparative performance
On Thu, Sep 29, 2005 at 04:39:36PM -0400, Joe wrote: > Andreas Pflug wrote: > >Hm, if you only have 4 tables, why do you need 12 queries? > >To reduce queries, join them in the query; no need to merge them > >physically. If you have only two main tables, I'd bet you only need 1-2 > >queries for the whole page. > > There are more than four tables and the queries are not functionally > overlapping. As an example, allow me to refer to the page > www.freedomcircle.com/topic.php/Economists. > > The top row of navigation buttons (Life, Liberty, etc.) is created from a > query of the 'topic' table. It could've been hard-coded as a PHP array, > but with less flexibility. The alphabetical links are from a SELECT > DISTINCT substring from topic. It could've been generated by a PHP for > loop (originally implemented that way) but again with less flexibility. > The listing of economists is another SELECT from topic. The subheadings > (Articles, Books) come from a SELECT of an entry_type table --which > currently has 70 rows-- and is read into a PHP array since we don't know > what headings will be used in a given page. The detail of the entries I suspect this might be something better done in a join. > comes from that query that I posted earlier, but there are three additional > queries that are used for specialized entry types (relationships between > topics --e.g., Prof. Williams teaches at George Mason, events, and > multi-author or multi-subject articles and books). And there's yet another Likewise... > table for the specific book information. Once the data is retrieved it's > sorted internally with PHP, at the heading level, before display. It's often better to let the database sort and/or aggregate data. > Maybe there is some way to merge all the queries (some already fairly > complex) that fetch the data for the entries box but I believe it would be > a monstrosity with over 100 lines of SQL. Also, just because no one else has mentioned it, remember that it's very easy to get MySQL into a mode where you have no data integrity. If that's the case it's going to be faster than PostgreSQL (though I'm not sure how much that affects the performance of SELECTs). -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Ultra-cheap NVRAM device
There was a discussion about this about 2 months ago. See the archives. On Mon, Oct 03, 2005 at 01:02:26PM +0200, Steinar H. Gunderson wrote: > I thought this might be interesting, not the least due to the extremely low > price ($150 + the price of regular DIMMs): > > http://www.tomshardware.com/storage/20050907/index.html > > Anybody know a good reason why you can't put a WAL on this, and enjoy a hefty > speed boost for a fraction of the price of a traditional SSD? (Yes, it's > SATA, not PCI, so the throughput is not all that impressive -- but still, > it's got close to zero seek time.) > > /* Steinar */ > -- > Homepage: http://www.sesse.net/ > > ---(end of broadcast)--- > TIP 3: Have you checked our extensive FAQ? > >http://www.postgresql.org/docs/faq > -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Lists or external TABLE?
On Fri, Sep 30, 2005 at 10:34:35AM +0100, Richard Huxton wrote: > xchris wrote: > > > >Let's suppose i need to add an info about addresses (which includes > >country,city,capetc etc). > >Addresses can vary from 1 to 20 entries.. > > > >Talking about performance is it better to include a list of addresses in > >TABLE A or is it better to create an external TABLE B? > > Don't optimise before you have to. > > Do the addresses belong in "A"? If so, put them there. On the other > hand, if you want items in "A" to have more than one address, or to > share addresses then clearly you will want a separate address table. > It's difficult to say more without a clear example of your requirements. > > Even if you choose to alter your design for performance reasons, you > should make sure you run tests with realistic workloads and hardware. > But first, trust PG to do its job and design your database according to > the problem requirements. On top of what Richard said, 5000 rows is pretty tiny. Even if each row was 1K wide, that's still only 5MB. Also, if from a data-model standpoint it doesn't matter which way you go, I suggest looking at what it will take to write queries against both versions before deciding. I tend to stay away from arrays because they tend to be harder to query against. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 6: explain analyze is your friend
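If the data model does call for items in "A" having several addresses, the separate-table version is straightforward; the names below are invented, since the post only calls the tables A and B:

CREATE TABLE item (
    item_id serial PRIMARY KEY,
    name    text NOT NULL
);

CREATE TABLE address (
    address_id serial PRIMARY KEY,
    item_id    integer NOT NULL REFERENCES item,
    country    text,
    city       text,
    cap        text            -- postal code, as in the original post
);
CREATE INDEX address_item_id ON address (item_id);

-- Fetching an item together with its 1..20 addresses is then a plain join:
SELECT i.name, a.country, a.city, a.cap
  FROM item i
  LEFT JOIN address a ON a.item_id = i.item_id
 WHERE i.item_id = 42;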
Re: [PERFORM] Comparative performance
Hi Jim, Jim C. Nasby wrote: Also, just because no one else has mentioned it, remember that it's very easy to get MySQL into a mode where you have no data integrity. If that's the case it's going to be faster than PostgreSQL (though I'm not sure how much that affects the performance of SELECTs). Yes indeed. When I added the REFERENCES to the schema and reran the conversion scripts, aside from having to reorder the table creation and loading (they used to be in alphabetical order), I also found a few referential integrity errors in the MySQL data. Joe ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Is There Any Way ....
On Tue, Oct 04, 2005 at 12:31:42PM +0200, Stefan Weiss wrote: > On 2005-09-30 01:21, Lane Van Ingen wrote: > > (3) Assure that a disk-based table is always in memory (outside of keeping > > it in > > memory buffers as a result of frequent activity which would prevent > > LRU > > operations from taking it out) ? > > I was wondering about this too. IMO it would be useful to have a way to tell > PG that some tables were needed frequently, and should be cached if > possible. This would allow application developers to consider joins with > these tables as "cheap", even when querying on columns that are not indexed. > I'm thinking about smallish tables like users, groups, *types, etc which > would be needed every 2-3 queries, but might be swept out of RAM by one > large query in between. Keeping a table like "users" on a RAM fs would not > be an option, because the information is not volatile. Why do you think you'll know better than the database how frequently something is used? At best, your guess will be correct and PostgreSQL (or the kernel) will keep the table in memory. Or, your guess is wrong and you end up wasting memory that could have been used for something else. It would probably be better if you describe why you want to force this table (or tables) into memory, so we can point you at more appropriate solutions. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Comparative performance
PFC wrote:
> - if you use a version before 8, type mismatch will prevent use of the indexes.

I'm using 8.0.3, but the type mismatch between relationship.rel_type and entry_type.type_id was unintended. The current databases use SMALLINT for both. The PostgreSQL schema was derived from an export script stored in Subversion, apparently before the column datatypes were changed.

> CREATE INDEX'es ON
>   entry_type( class_id )
>   relationship( topic_id1, rel_type, topic_id2 )  which becomes your new PRIMARY KEY
>   relationship( topic_id2, rel_type, topic_id1 )

Creating the second relationship index was sufficient to modify the query plan to cut down runtime to zero:

 Sort  (cost=75.94..75.95 rows=2 width=381) (actual time=0.000..0.000 rows=0 loops=1)
   Sort Key: r.rel_type, t.list_name
   ->  Nested Loop  (cost=16.00..75.93 rows=2 width=381) (actual time=0.000..0.000 rows=0 loops=1)
         Join Filter: ((("outer".topic_id1 = "inner".topic_id) AND ("outer".topic_id2 = 1252)) OR (("outer".topic_id2 = "inner".topic_id) AND ("outer".topic_id1 = 1252)))
         ->  Nested Loop  (cost=16.00..35.11 rows=1 width=169) (actual time=0.000..0.000 rows=0 loops=1)
               Join Filter: ("inner".rel_type = "outer".type_id)
               ->  Seq Scan on entry_type e  (cost=0.00..18.75 rows=4 width=4) (actual time=0.000..0.000 rows=15 loops=1)
                     Filter: (class_id = 2)
               ->  Materialize  (cost=16.00..16.04 rows=4 width=167) (actual time=0.000..0.000 rows=0 loops=15)
                     ->  Seq Scan on relationship r  (cost=0.00..16.00 rows=4 width=167) (actual time=0.000..0.000 rows=0 loops=1)
                           Filter: ((topic_id2 = 1252) OR (topic_id1 = 1252))
         ->  Seq Scan on topic t  (cost=0.00..30.94 rows=494 width=216) (never executed)
 Total runtime: 0.000 ms
(13 rows)

The overall execution time for the Economists page for PostgreSQL is within 4% of the MySQL time, so for the time being I'll leave the query in its current form.

Thanks for your help.

Joe

---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] SQL Function performance
On Thu, Sep 29, 2005 at 10:54:58PM +0300, [EMAIL PROTECTED] wrote:
> Hi All,
>
> I have a SQL function like:
>
> CREATE OR REPLACE FUNCTION
> fn_get_yetkili_inisyer_listesi(int4, int4)
>   RETURNS SETOF kod_adi_liste_type AS
> $BODY$
>   SELECT Y.KOD, Y.ADI
>     FROM T_YER Y
>    WHERE EXISTS (SELECT 1
>                    FROM T_GUZER G
>                   WHERE (G.BIN_YER_KOD = $1 OR COALESCE($1,0)=0)
>                     AND FN_FIRMA_ISVISIBLE(G.FIRMA_NO,$2) = 1
>                     AND G.IN_YER_KOD = Y.KOD)
>      AND Y.IPTAL = 'H';
> $BODY$
>   LANGUAGE 'sql' VOLATILE;
>
> When i use it like "SELECT * FROM fn_get_yetkili_inisyer_listesi(1, 3474)",
> the planner result is "Function Scan on fn_get_yetkili_inisyer_listesi
> (cost=0.00..12.50 rows=1000 width=36) (1 row)" and it runs very slow.
>
> But when i use
>
> "SELECT Y.KOD, Y.ADI
>    FROM T_YER Y
>   WHERE EXISTS (SELECT 1
>                   FROM T_GUZER G
>                  WHERE (G.BIN_YER_KOD = 1 OR COALESCE(1,0)=0)
>                    AND FN_FIRMA_ISVISIBLE(G.FIRMA_NO,3474) = 1
>                    AND G.IN_YER_KOD = Y.KOD)
>     AND Y.IPTAL = 'H';"
>
> the planner result is:
>
>  Seq Scan on t_yer y  (cost=0.00..3307.79 rows=58 width=14)
>    Filter: (((iptal)::text = 'H'::text) AND (subplan))
>    SubPlan
>      ->  Index Scan using t_guzer_ucret_giris_performans_idx on t_guzer g  (cost=0.00..28.73 rows=1 width=0)
>            Index Cond: ((bin_yer_kod = 1) AND (in_yer_kod = $0))
>            Filter: (fn_firma_isvisible(firma_no, 3474) = 1)
> (6 rows)
>
> and it runs very fast.
>
> Any idea ?

Need EXPLAIN ANALYZE.

I suspect this is due to a cached query plan. PostgreSQL will cache a query plan for the SELECT the first time you run the function and that plan will be re-used. Depending on what data you call the function with, you could get a very different plan.

Also, you might do better with a JOIN instead of using EXISTS. You can also make this function STABLE instead of VOLATILE. Likewise, if FN_FIRMA_ISVISIBLE can't change any data, you can also make it STABLE, which would likely improve the performance of the query. But neither of these ideas would account for the difference between function performance and raw query performance.

On a side note, if OR $1 IS NULL works, that will be more readable (and probably faster) than the OR COALESCE($1,0)=0.

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
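A sketch of how the STABLE and "$1 IS NULL" suggestions could be applied to the function quoted above (untested, and note that $1 IS NULL is not exactly equivalent to COALESCE($1,0)=0 if 0 is ever passed deliberately):

CREATE OR REPLACE FUNCTION fn_get_yetkili_inisyer_listesi(int4, int4)
  RETURNS SETOF kod_adi_liste_type AS
$BODY$
    SELECT Y.KOD, Y.ADI
      FROM T_YER Y
     WHERE EXISTS (SELECT 1
                     FROM T_GUZER G
                    WHERE (G.BIN_YER_KOD = $1 OR $1 IS NULL)
                      AND FN_FIRMA_ISVISIBLE(G.FIRMA_NO, $2) = 1
                      AND G.IN_YER_KOD = Y.KOD)
       AND Y.IPTAL = 'H';
$BODY$
  LANGUAGE 'sql' STABLE;    -- STABLE: the function only reads data within one statement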
Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue
On Fri, Sep 23, 2005 at 08:17:03PM +0800, K C Lau wrote:
> esdt=> create or replace function player_max_atdate (varchar(32)) returns
> varchar(32) as $$
> esdt$>  select distinct on (PlayerID) AtDate from player where PlayerID= $1
>         order by PlayerID desc, AtDate desc limit 1;
> esdt$> $$ language sql immutable;
> CREATE FUNCTION

That function is not immutable, it should be defined as stable.

> esdt=> create or replace view VCurPlayer3 as select * from Player where
> AtDate = player_max_atdate(PlayerID);
> CREATE VIEW
> esdt=> explain analyze select PlayerID,AtDate from VCurPlayer3 where
> PlayerID='0';
>
>  Index Scan using pk_player on player  (cost=0.00..1331.83 rows=9 width=23) (actual time=76.660..76.664 rows=1 loops=1)
>    Index Cond: ((playerid)::text = '0'::text)
>    Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
>  Total runtime: 76.716 ms
>
> Why wouldn't the function get the row as quickly as the direct sql does?

PostgreSQL doesn't pre-compile functions, at least not until 8.1 (and I'm not sure how much those are pre-compiled, though they are syntax-checked at creation). Do you get the same result time when you run it a second time? What time do you get from running just the function versus the SQL in the function?

Also, remember that every layer you add to the cake means more work for the database. If speed is that highly critical, you'll probably want to not wrap things in functions, and possibly not use views either.

Also, keep in mind that getting below 1ms doesn't automatically mean you'll be able to scale to 1000TPS. Things will definitely change when you load the system down, so if performance is that critical you should start testing with the system under load if you're not already.

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Comparative performance
Jim C. Nasby wrote: Make sure these indexes exist if you'll be updating or inserting into entry: CREATE INDEX topic__subject_id ON topic(subject_id); CREATE INDEX topic__actor_id ON topic(actor_id); Actually, topic's primary key is topic_id. Also, the fact that subject and actor both point to topic along with subject_type and actor_type make me suspect that your design is de-normalized. Of course there's no way to know without more info. Yes, the design is denormalized. The reason is that a book or article is usually by a single author (an "actor" topic) and it will be listed under one main topic (a "subject" topic). There's a topic_entry table where additional actors and subjects can be added. It's somewhat ironic because I used to teach and/or preach normalization and the "goodness" of a 3NF+ design (also about having the database do aggregation and sorting as you mentioned in your other email). FWIW, I usually use timestamptz for both created and updated fields. IIRC 'created' ended up as a DATE because MySQL 4 has a restriction about a single TIMESTAMP column per table taking the default value of current_timestamp. Joe ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue
On Fri, Sep 23, 2005 at 04:53:55PM +0800, K C Lau wrote:
> Thank you all for your suggestions. I've tried, with some variations too, but
> still no success. The times given are the best of a few repeated tries on
> an 8.1 beta 2 db freshly migrated from 8.0.3 on Windows.
>
> For reference, only the following gets the record quickly:
>
> esdt=> explain analyze select PlayerID,AtDate from Player a
>        where PlayerID='0' and AtDate = (select b.AtDate from Player b
>        where b.PlayerID = '0' order by b.PlayerID desc, b.AtDate desc LIMIT 1);
>
>  Index Scan using pk_player on player a  (cost=0.75..4.26 rows=1 width=23) (actual time=0.054..0.057 rows=1 loops=1)
>    Index Cond: (((playerid)::text = '0'::text) AND ((atdate)::text = ($0)::text))
>    InitPlan
>      ->  Limit  (cost=0.00..0.75 rows=1 width=23) (actual time=0.027..0.028 rows=1 loops=1)
>            ->  Index Scan Backward using pk_player on player b  (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.023..0.023 rows=1 loops=1)
>                  Index Cond: ((playerid)::text = '0'::text)
>  Total runtime: 0.132 ms

If you're doing that, you should try something like the following:

decibel=# explain analyze select * from t where ctid=(select ctid from rrs order by rrs_id desc limit 1);
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Tid Scan on t  (cost=0.44..4.45 rows=1 width=42) (actual time=0.750..0.754 rows=1 loops=1)
   Filter: (ctid = $0)
   InitPlan
     ->  Limit  (cost=0.00..0.44 rows=1 width=10) (actual time=0.548..0.549 rows=1 loops=1)
           ->  Index Scan Backward using rrs_rrs__rrs_id on rrs  (cost=0.00..3.08 rows=7 width=10) (actual time=0.541..0.541 rows=1 loops=1)
 Total runtime: 1.061 ms
(6 rows)

decibel=# select count(*) from t;
 count
--------
 458752

Note that that's on my nice slow laptop to boot (the count took like 10 seconds).

Just remember that ctid *is not safe outside of a transaction*!! So you can't do something like

SELECT ctid FROM ...
store that in some variable...
SELECT * FROM table WHERE ctid = variable

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Comparative performance
On Tue, Oct 04, 2005 at 05:11:19PM -0400, Joe wrote:
> Hi Jim,
>
> Jim C. Nasby wrote:
> > Also, just because no one else has mentioned it, remember that it's very
> > easy to get MySQL into a mode where you have no data integrity. If
> > that's the case it's going to be faster than PostgreSQL (though I'm not
> > sure how much that affects the performance of SELECTs).
>
> Yes indeed. When I added the REFERENCES to the schema and reran the
> conversion scripts, aside from having to reorder the table creation and
> loading (they used to be in alphabetical order), I also found a few
> referential integrity errors in the MySQL data.

Data integrity != referential integrity. :) It's very easy to accidentally get MyISAM tables in MySQL, which means you are nowhere near ACID, which also means you can't get anything close to an apples-to-apples comparison to PostgreSQL.

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
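For anyone re-running the comparison: whether a MySQL 4.x table is anywhere near ACID depends on its storage engine, and the default is MyISAM. A quick, hedged sketch of how to check and convert (MySQL syntax; older 4.0 servers spell the ALTER with TYPE = InnoDB instead):

SHOW TABLE STATUS LIKE 'entry';       -- the Engine/Type column shows MyISAM or InnoDB
ALTER TABLE entry ENGINE = InnoDB;    -- only InnoDB provides transactions and FK checks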
Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue
On Tue, Oct 04, 2005 at 04:15:41PM -0500, Jim C. Nasby wrote: > >Index Cond: ((playerid)::text = '0'::text) Also, why is playerid a text field? Comparing ints will certainly be faster... -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
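If playerid really only ever holds numbers, converting the column is a one-liner on 8.0 and later. This is a sketch: it assumes every existing value casts cleanly to integer, and it rewrites the table and rebuilds dependent indexes, so it needs a quiet moment on a large table.

ALTER TABLE player ALTER COLUMN playerid TYPE integer USING playerid::integer;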
Re: [PERFORM] Is There Any Way ....
Which version of PG are you using? One of the new features for 8.0 was an improved caching algorithm that was smart enough to avoid letting a single big query sweep everything else out of cache. -- Mark Lewis On Tue, 2005-10-04 at 10:45 -0400, Lane Van Ingen wrote: > Yes, Stefan, the kind of usage you are mentioning is exactly why I was > asking. > > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] Behalf Of Stefan Weiss > Sent: Tuesday, October 04, 2005 6:32 AM > To: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] Is There Any Way > > > On 2005-09-30 01:21, Lane Van Ingen wrote: > > (3) Assure that a disk-based table is always in memory (outside of > keeping > > it in > > memory buffers as a result of frequent activity which would prevent > > LRU > > operations from taking it out) ? > > I was wondering about this too. IMO it would be useful to have a way to tell > PG that some tables were needed frequently, and should be cached if > possible. This would allow application developers to consider joins with > these tables as "cheap", even when querying on columns that are not indexed. > I'm thinking about smallish tables like users, groups, *types, etc which > would be needed every 2-3 queries, but might be swept out of RAM by one > large query in between. Keeping a table like "users" on a RAM fs would not > be an option, because the information is not volatile. > > > cheers, > stefan > > ---(end of broadcast)--- > TIP 5: don't forget to increase your free space map settings > > > > ---(end of broadcast)--- > TIP 6: explain analyze is your friend ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Comparative performance
PostgreSQL uses MVCC to ensure data integrity. The server must choose the right version of a tuple according to the transaction ID of the statement. Even for a SELECT (of the ACID features of PostgreSQL, I think C and I apply here), it must do some extra work.

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Joe
Sent: Tuesday, October 04, 2005 18:11
To: Jim C. Nasby
CC: Andreas Pflug; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Comparative performance

Hi Jim,

Jim C. Nasby wrote:
> Also, just because no one else has mentioned it, remember that it's very
> easy to get MySQL into a mode where you have no data integrity. If
> that's the case it's going to be faster than PostgreSQL (though I'm not
> sure how much that affects the performance of SELECTs).

Yes indeed. When I added the REFERENCES to the schema and reran the conversion scripts, aside from having to reorder the table creation and loading (they used to be in alphabetical order), I also found a few referential integrity errors in the MySQL data.

Joe

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
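The version bookkeeping described above is visible from SQL, which makes the idea concrete (the table name here is just an example from the thread's schema):

-- xmin is the transaction that created this row version, xmax the one that deleted
-- or updated it; a statement only sees versions whose xmin/xmax fit its snapshot,
-- and checking that is the extra work mentioned above.
SELECT xmin, xmax, ctid, * FROM topic LIMIT 5;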
Re: [PERFORM] Is There Any Way ....
pg is _very_ stupid about caching. Almost all of the caching is left to the OS, and it's that way by design (as post after post by TL has pointed out).

That means pg has almost no ability to take application domain specific knowledge into account when deciding what to cache. There's plenty of papers on caching out there that show that context dependent knowledge leads to more effective caching algorithms than context independent ones are capable of.

(Which means said design choice is a Mistake, but unfortunately one with too much inertia behind it currently to change easily.)

Under these circumstances, it is quite possible that an expert class human could optimize memory usage better than the OS + pg.

If one is _sure_ they know what they are doing, I'd suggest using tmpfs or the equivalent for critical read-only tables. For "hot" tables that are rarely written to and where data loss would not be a disaster, "tmpfs" can be combined with an asynchronous writer process that pushes updates to HD. Just remember that a power hit means that anything not yet pushed to HD is gone.

The (much) more expensive alternative is to buy SSD(s) and put the critical tables on it at load time.

Ron

-Original Message-
From: "Jim C. Nasby" <[EMAIL PROTECTED]>
Sent: Oct 4, 2005 4:57 PM
To: Stefan Weiss <[EMAIL PROTECTED]>
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Is There Any Way

On Tue, Oct 04, 2005 at 12:31:42PM +0200, Stefan Weiss wrote:
> On 2005-09-30 01:21, Lane Van Ingen wrote:
> > (3) Assure that a disk-based table is always in memory (outside of keeping
> > it in memory buffers as a result of frequent activity which would prevent
> > LRU operations from taking it out) ?
>
> I was wondering about this too. IMO it would be useful to have a way to tell
> PG that some tables were needed frequently, and should be cached if
> possible. This would allow application developers to consider joins with
> these tables as "cheap", even when querying on columns that are not indexed.
> I'm thinking about smallish tables like users, groups, *types, etc which
> would be needed every 2-3 queries, but might be swept out of RAM by one
> large query in between. Keeping a table like "users" on a RAM fs would not
> be an option, because the information is not volatile.

Why do you think you'll know better than the database how frequently something is used? At best, your guess will be correct and PostgreSQL (or the kernel) will keep the table in memory. Or, your guess is wrong and you end up wasting memory that could have been used for something else.

It would probably be better if you describe why you want to force this table (or tables) into memory, so we can point you at more appropriate solutions.

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Is There Any Way ....
On Tue, Oct 04, 2005 at 07:33:47PM -0400, Ron Peacetree wrote: > pg is _very_ stupid about caching. Almost all of the caching is left > to the OS, and it's that way by design (as post after post by TL has > pointed out). > > That means pg has almost no ability to take application domain > specific knowledge into account when deciding what to cache. > There's plenty of papers on caching out there that show that > context dependent knowledge leads to more effective caching > algorithms than context independent ones are capable of. > > (Which means said design choice is a Mistake, but unfortunately > one with too much inertia behind it currentyl to change easily.) > > Under these circumstances, it is quite possible that an expert class > human could optimize memory usage better than the OS + pg. Do you have any examples where this has actually happened? Especially with 8.x, which isn't all that 'stupid' about how it handles buffers? -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Is There Any Way ....
Ron Peacetree sounds like someone talking out of his _AZZ_. He can save his unreferenced flapdoodle for his SQL Server clients. Maybe he will post references so that we may all learn at the feet of Master Peacetree. :-) douglas On Oct 4, 2005, at 7:33 PM, Ron Peacetree wrote: pg is _very_ stupid about caching. Almost all of the caching is left to the OS, and it's that way by design (as post after post by TL has pointed out). That means pg has almost no ability to take application domain specific knowledge into account when deciding what to cache. There's plenty of papers on caching out there that show that context dependent knowledge leads to more effective caching algorithms than context independent ones are capable of. (Which means said design choice is a Mistake, but unfortunately one with too much inertia behind it currentyl to change easily.) Under these circumstances, it is quite possible that an expert class human could optimize memory usage better than the OS + pg. If one is _sure_ they know what they are doing, I'd suggest using tmpfs or the equivalent for critical read-only tables. For "hot" tables that are rarely written to and where data loss would not be a disaster, "tmpfs" can be combined with an asyncronous writer process push updates to HD. Just remember that a power hit means that The (much) more expensive alternative is to buy SSD(s) and put the critical tables on it at load time. Ron -Original Message- From: "Jim C. Nasby" <[EMAIL PROTECTED]> Sent: Oct 4, 2005 4:57 PM To: Stefan Weiss <[EMAIL PROTECTED]> Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Is There Any Way On Tue, Oct 04, 2005 at 12:31:42PM +0200, Stefan Weiss wrote: On 2005-09-30 01:21, Lane Van Ingen wrote: (3) Assure that a disk-based table is always in memory (outside of keeping it in memory buffers as a result of frequent activity which would prevent LRU operations from taking it out) ? I was wondering about this too. IMO it would be useful to have a way to tell PG that some tables were needed frequently, and should be cached if possible. This would allow application developers to consider joins with these tables as "cheap", even when querying on columns that are not indexed. I'm thinking about smallish tables like users, groups, *types, etc which would be needed every 2-3 queries, but might be swept out of RAM by one large query in between. Keeping a table like "users" on a RAM fs would not be an option, because the information is not volatile. Why do you think you'll know better than the database how frequently something is used? At best, your guess will be correct and PostgreSQL (or the kernel) will keep the table in memory. Or, your guess is wrong and you end up wasting memory that could have been used for something else. It would probably be better if you describe why you want to force this table (or tables) into memory, so we can point you at more appropriate solutions. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PERFORM] Indexes on ramdisk
I have an application that has a table that is both read and write intensive. Data from iostat indicates that the write speed of the system is the factor that is limiting performance. The table has around 20 columns and most of the columns are indexed. The data and the indices for the table are distributed over several mirrored disk partitions and pg_xlog is on another. I'm looking at ways to improve performance and besides the obvious one of getting an SSD I thought about putting the indices on a ramdisk. That means that after a power failure or shutdown I would have to recreate them but that is acceptable for this application. What I am wondering though is whether or not I would see much performance benefit and if there would be any startup problems after a power down event due to the indices not being present. Any insight would be appreciated. Emil ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
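One way to run the experiment described above on 8.0 or later, assuming a ramdisk/tmpfs is already mounted, empty, and owned by the postgres user (all names and paths below are illustrative, not from the post):

CREATE TABLESPACE ramspace LOCATION '/mnt/ramdisk/pg_indexes';

-- Build (or rebuild) the hot indexes in the RAM-backed tablespace.
CREATE INDEX the_table_col_idx ON the_table (col) TABLESPACE ramspace;

-- After a power loss the tablespace contents are gone, so these indexes have to be
-- dropped and re-created before the table is fully usable again, which is the
-- trade-off described in the question.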
Re: [PERFORM] Is There Any Way ....
Douglas J. Trainor wrote: Ron Peacetree sounds like someone talking out of his _AZZ_. He can save his unreferenced flapdoodle for his SQL Server clients. Maybe he will post references so that we may all learn at the feet of Master Peacetree. :-) Although I agree that I would definitely like to see some test cases for what Ron is talking about, I don't think that resorting to insults is going to help the situation. Ron, if you would please -- provide some test cases for what you are describing I am sure that anyone would love to see them. We are all for improving PostgreSQL. Sincerely, Joshua D. Drake -- Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240 PostgreSQL Replication, Consulting, Custom Programming, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Is There Any Way ....
Unfortunately, no matter what I say or do, I'm not going to please or convince anyone who has already have made their minds up to the extent that they post comments like Mr Trainor's below. His response style pretty much proves my earlier point that this is presently a religious issue within the pg community. The absolute best proof would be to build a version of pg that does what Oracle and DB2 have done and implement it's own DB specific memory manager and then compare the performance between the two versions on the same HW, OS, and schema. The second best proof would be to set up either DB2 or Oracle so that they _don't_ use their memory managers and compare their performance to a set up that _does_ use said memory managers on the same HW, OS, and schema. I don't currently have the resources for either experiment. Some might even argue that IBM (where Codd and Date worked) and Oracle just _might_ have had justification for the huge effort they put into developing such infrastructure. Then there's the large library of research on caching strategies in just about every HW and SW domain, including DB theory, that points put that the more context dependent, ie application or domain specific awareness, caching strategies are the better they are. Maybe after we do all we can about physical IO and sorting performance I'll take on the religious fanatics on this one. One problem set at a time. Ron -Original Message- From: "Joshua D. Drake" <[EMAIL PROTECTED]> Sent: Oct 4, 2005 9:32 PM To: "Douglas J. Trainor" <[EMAIL PROTECTED]> Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Is There Any Way Douglas J. Trainor wrote: > > Ron Peacetree sounds like someone talking out of his _AZZ_. > He can save his unreferenced flapdoodle for his SQL Server > clients. Maybe he will post references so that we may all > learn at the feet of Master Peacetree. :-) Although I agree that I would definitely like to see some test cases for what Ron is talking about, I don't think that resorting to insults is going to help the situation. Ron, if you would please -- provide some test cases for what you are describing I am sure that anyone would love to see them. We are all for improving PostgreSQL. Sincerely, Joshua D. Drake -- Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240 PostgreSQL Replication, Consulting, Custom Programming, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/ ---(end of broadcast)--- TIP 6: explain analyze is your friend ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Indexes on ramdisk
Talk about your IO system a bit. There might be obvious ways to improve.

What System/Motherboard are you using?
What Controller Cards are you using?
What kind of Disks do you have (SATA, SCSI 7.6k 10k 15k)?
What denominations (9, 18, 36, 72, 143, 80, 160, 200, 240 Gig)?
What kind of RAIDs do you have set up (how many drives, what stripe sizes, how many used for what)?
What levels of RAID are you using (0, 1, 10, 5, 50)?

With a good setup, a dual PCI-X bus motherboard can hit 2GB/sec and thousands of transactions to disk if you have a controller/disks that can keep up. That is typically enough for most people without resorting to SSD.

Alex Turner
NetEconomist

On 10/4/05, Emil Briggs <[EMAIL PROTECTED]> wrote:
> I have an application that has a table that is both read and write intensive.
> Data from iostat indicates that the write speed of the system is the factor
> that is limiting performance. The table has around 20 columns and most of the
> columns are indexed. The data and the indices for the table are distributed
> over several mirrored disk partitions and pg_xlog is on another. I'm looking
> at ways to improve performance and besides the obvious one of getting an SSD
> I thought about putting the indices on a ramdisk. That means that after a
> power failure or shutdown I would have to recreate them but that is
> acceptable for this application. What I am wondering though is whether or not
> I would see much performance benefit and if there would be any startup
> problems after a power down event due to the indices not being present. Any
> insight would be appreciated.
>
> Emil

---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Indexes on ramdisk
> Talk about your IO system a bit. There might be obvious ways to improve.
>
> What System/Motherboard are you using?
> What Controller Cards are you using?
> What kind of Disks do you have (SATA, SCSI 7.6k 10k 15k)?
> What denominations (9, 18, 36, 72, 143, 80, 160, 200, 240 Gig)?
> What kind of RAIDs do you have set up (how many drives, what stripe sizes,
> how many used for what)?
> What levels of RAID are you using (0, 1, 10, 5, 50)?

It's a quad opteron system. RAID controller is a 4 channel LSILogic Megaraid 320 connected to 10 15k 36.7G SCSI disks. The disks are configured in 5 mirrored partitions. The pg_xlog is on one mirror and the data and indexes are spread over the other 4 using tablespaces.

These numbers from pg_stat_user_tables are from about 2 hours earlier today on this one table.

idx_scan        20578690
idx_tup_fetch   35866104841
n_tup_ins       1940081
n_tup_upd       1604041
n_tup_del       1880424

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Is There Any Way ....
On Tue, Oct 04, 2005 at 11:06:54PM -0400, Ron Peacetree wrote:
> Some might even argue that IBM (where Codd and Date worked)
> and Oracle just _might_ have had justification for the huge effort
> they put into developing such infrastructure.

The OS and FS world is very, very different now than it was when the Oracle and DB2 architectures were being crafted. What may have been an excellent development effort then may not provide such good ROI now.

> Then there's the large library of research on caching strategies
> in just about every HW and SW domain, including DB theory,
> that points put that the more context dependent, ie application
> or domain specific awareness, caching strategies are the better
> they are.
>
> Maybe after we do all we can about physical IO and sorting
> performance I'll take on the religious fanatics on this one.

Actually, the main "religious fanatic" I've seen recently is yourself. While I have a gut feel that some of the issues you raise could certainly do with further investigation, I'm not seeing that much from you other than statements that much of what postgresql does is wrong (not "wrong for Ron's use case", but "wrong in every respect").

A little less arrogance and a little more "here are some possibilities for improvement", "here is an estimate of the amount of effort that might be needed" and "here are some rough benchmarks showing the potential return on that investment" would, at the very least, make the threads far less grating to read.

Cheers,
  Steve

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Is There Any Way ....
On Tue, Oct 04, 2005 at 11:06:54PM -0400, Ron Peacetree wrote: > Unfortunately, no matter what I say or do, I'm not going to please > or convince anyone who has already have made their minds up > to the extent that they post comments like Mr Trainor's below. > His response style pretty much proves my earlier point that this > is presently a religious issue within the pg community. Religious for some. Conservative for others. Sometimes people need to see the way, before they are willing to accept it merely on the say so of another person. In some circles, it is called the scientific method... :-) Also, there is a cost to complicated specific optimizations. They can be a real maintenance and portability head-ache. What is the value ratio of performance to maintenance or portability? > The absolute best proof would be to build a version of pg that does > what Oracle and DB2 have done and implement it's own DB > specific memory manager and then compare the performance > between the two versions on the same HW, OS, and schema. Not necessarily. Even if a version of PostgreSQL were to be written to function in this new model, there would be no guarantee that it was written in the most efficient manner possible. Performance could show PostgreSQL using its own caching, and disk space management implementation, and performing poorly. The only true, and accurate way would be to implement, and then invest time by those most competent to test, and optimize the implementation. At this point, it would become a moving target, as those who believe otherwise, would be free to pursue using more efficient file systems, or modifications to the operating system to better co-operate with PostgreSQL. I don't think there can be a true answer to this one. The more innovative, and clever people, will always be able to make their solution work better. If the difference in performance was really so obvious, there wouldn't be doubters on either side. It would be clear to all. The fact is, there is reason to doubt. Perhaps not doubt that the final solution would be more efficient, but rather, the reason to doubt that the difference in efficiency would be significant. > The second best proof would be to set up either DB2 or Oracle so > that they _don't_ use their memory managers and compare their > performance to a set up that _does_ use said memory managers > on the same HW, OS, and schema. Same as above. If Oracle was designed to work with the functionality, then disabling the functionality, wouldn't prove that an efficient design would perform equally poorly, or even, poorly at all. I think it would be obvious that Oracle would have invested most of their dollars into the common execution paths, with the expected functionality present. > I don't currently have the resources for either experiment. This is the real problem. :-) > Some might even argue that IBM (where Codd and Date worked) > and Oracle just _might_ have had justification for the huge effort > they put into developing such infrastructure. Or, not. They might just have more money to throw at the problem, and be entrenched into their solution to the point that they need to innovate to ensure that their solution appears to be the best. > Then there's the large library of research on caching strategies > in just about every HW and SW domain, including DB theory, > that points put that the more context dependent, ie application > or domain specific awareness, caching strategies are the better > they are. A lot of this is theory. 
It may be good theory, but there is no guarantee that the variables listed in these theories match, or properly estimate the issues that would be found in a real implementation. > Maybe after we do all we can about physical IO and sorting > performance I'll take on the religious fanatics on this one. > One problem set at a time. In any case, I'm on your side - in theory. Intuitively, I don't understand how anybody could claim that a general solution could ever be faster than a specific solution. Anybody who claimed this, would go down in my books as a fool. It should be obvious to these people that, as an extreme, the entire operating system caching layer, and the entire file system layer could be inlined into PostgreSQL, avoiding many of the expenses involved in switching back and forth between user space and system space, leaving a more efficient, although significantly more complicated solution. Whether by luck, or by experience of those involved, I haven't seen any senior PostgreSQL developers actually stating that it couldn't be faster. Instead, I've seen it claimed that the PostgreSQL developers don't have the resources to attack this problem, as there are other far more pressing features, product defects, and more obviously beneficial optimization opportunities to work on. Memory management, or disk management, is "good enough" as provided by decent operating systems, and the itch just isn't bad enough to scratch yet. They rema
Re: [HACKERS] [PERFORM] A Better External Sort?
On Mon, 2005-10-03 at 14:16 -0700, Josh Berkus wrote:
> Jeff,
>
> > > Nope, LOTS of testing, at OSDL, GreenPlum and Sun. For comparison, A
> > > Big-Name Proprietary Database doesn't get much more than that either.
> >
> > I find this claim very suspicious. I get single-threaded reads in
> > excess of 1GB/sec with XFS and > 250MB/sec with ext3.
>
> Database reads? Or raw FS reads? It's not the same thing.

Just FYI, I ran a count(*) on a 15.6GB table on a lightly loaded db and it ran in 163 sec. (Dual opteron 2.6GHz, 6GB RAM, 6 x 74GB 15k disks in RAID10, reiserfs). A little less than 100MB/sec.

After this I ran count(*) over a 2.4GB table from another tablespace on another device (4x142GB 10k disks in RAID10) and it ran in 22.5 sec on the first run and 12.5 on the second.

db=# show shared_buffers ;
 shared_buffers
----------------
 196608
(1 row)

db=# select version();
                                          version
--------------------------------------------------------------------------------------------
 PostgreSQL 8.0.3 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC) 3.3.6 (Debian 1:3.3.6-7)
(1 row)

--
Hannu Krosing <[EMAIL PROTECTED]>

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PERFORM] A Better External Sort?
On 10/3/05, Ron Peacetree <[EMAIL PROTECTED]> wrote: [snip]

> Just how bad is this CPU bound condition? How powerful a CPU is needed to attain a DB IO rate of 25MBps?
>
> If we replace said CPU with one 2x, 10x, etc faster than that, do we see any performance increase?
>
> If a modest CPU can drive a DB IO rate of 25MBps, but that rate does not go up regardless of how much extra CPU we throw at it...

Single-threaded was mentioned. And even if it is purely CPU bound, it's seldom as trivial as throwing CPU at it: consider the locking in the application, in the filesystem, and elsewhere in the kernel.

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] index on custom function; explain
Some additional thoughts: what appears to take the most time (i.e. accounts for the highest cost in the EXPLAIN) is _not_ running the function itself (cost=0.00..0.01), but comparing the result of that function with the name1 column in the mappings table (cost=0.00..35935.05). Am I right? (See the EXPLAIN in the previous post.) If so, that's pretty strange, because the name1 column in the mappings table is indexed... jan. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PERFORM] A Better External Sort?
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:pgsql-hackers-[EMAIL PROTECTED]] On Behalf Of PFC
> Sent: Thursday, September 29, 2005 9:10 AM
> To: [EMAIL PROTECTED]
> Cc: Pg Hackers; pgsql-performance@postgresql.org
> Subject: Re: [HACKERS] [PERFORM] A Better External Sort?
>
> Just to add a little anarchy in your nice debate...
>
> Who really needs all the results of a sort on your terabyte table ?

Reports with ORDER BY/GROUP BY, and many other possibilities. 40% of mainframe CPU cycles are spent sorting. That is because huge volumes of data require lots of energy to be meaningfully categorized. Let's suppose that instead of a terabyte of data (or a petabyte or whatever) we have 10% of it. That's still a lot of data.

> I guess not many people do a SELECT from such a table and want all the results.

What happens when they do? The cases where it is already fast are not very important. The cases where things go into the crapper are the ones that need attention.

> So, this leaves :
> - Really wanting all the results, to fetch using a cursor,
> - CLUSTER type things, where you really want everything in order,
> - Aggregates (Sort->GroupAggregate), which might really need to sort the whole table.
> - Complex queries where the whole dataset needs to be examined, in order to return a few values
> - Joins (again, the whole table is probably not going to be selected)
> - And the ones I forgot.
>
> However,
>
> Most likely you only want to SELECT N rows, in some ordering :
> - the first N (ORDER BY x LIMIT N)
> - last N (ORDER BY x DESC LIMIT N)

For these, the QuickSelect algorithm is what is wanted. For example:

#include <stdlib.h>

typedef double Etype;

extern Etype RandomSelect(Etype * A, size_t p, size_t r, size_t i);
extern size_t RandRange(size_t a, size_t b);
extern size_t RandomPartition(Etype * A, size_t p, size_t r);
extern size_t Partition(Etype * A, size_t p, size_t r);

/*
** In the following code, every reference to CLR means:
**
** "Introduction to Algorithms"
** By Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest
** ISBN 0-07-013143-0
*/

/*
** CLR, page 187
*/
Etype
RandomSelect(Etype A[], size_t p, size_t r, size_t i)
{
    size_t q, k;
    if (p == r)
        return A[p];
    q = RandomPartition(A, p, r);
    k = q - p + 1;

    if (i <= k)
        return RandomSelect(A, p, q, i);
    else
        return RandomSelect(A, q + 1, r, i - k);
}

size_t
RandRange(size_t a, size_t b)
{
    size_t c = (size_t) ((double) rand() / ((double) RAND_MAX + 1) * (b - a));
    return c + a;
}

/*
** CLR, page 162
*/
size_t
RandomPartition(Etype A[], size_t p, size_t r)
{
    size_t i = RandRange(p, r);
    Etype Temp;
    Temp = A[p];
    A[p] = A[i];
    A[i] = Temp;
    return Partition(A, p, r);
}

/*
** CLR, page 154
*/
size_t
Partition(Etype A[], size_t p, size_t r)
{
    Etype x, temp;
    size_t i, j;

    x = A[p];
    i = p - 1;
    j = r + 1;

    for (;;) {
        do {
            j--;
        } while (!(A[j] <= x));
        do {
            i++;
        } while (!(A[i] >= x));
        if (i < j) {
            temp = A[i];
            A[i] = A[j];
            A[j] = temp;
        }
        else
            return j;
    }
}

> - WHERE x>value ORDER BY x LIMIT N
> - WHERE x - and other variants
>
> Or, you are doing a Merge JOIN against some other table ; in that case, yes, you might need the whole sorted terabyte table, but most likely there are WHERE clauses in the query that restrict the set, and thus, maybe we can get some conditions or limit values on the column to sort.

Where clause filters are to be applied AFTER the join operations, according to the SQL standard.

> Also the new, optimized hash join, which is more memory efficient, might cover this case.

For == joins. Not every ORDER BY is applied to joins, and not every join is an equijoin.

> Point is, sometimes, you only need part of the results of your sort. And the bigger the sort, the more likely it becomes that you only want part of the results.

That is an assumption that will sometimes be true, and sometimes not. It is not possible to predict usage patterns for a general purpose database system.

> So, while we're in the fun hand-waving, new algorithm trying mode, why not consider this right from the start ? (I know I'm totally in hand-waving mode right now, so slap me if needed).
>
> I'd say your new, fancy sort algorithm needs a few more input values :
>
> - Range of values that must appear in the final result of the sort : none, minimum, maximum, both, or even a set of values from the other side of the join, hashed, or sorted. That will already
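As a footnote to the QuickSelect routines above, here is a minimal, hypothetical driver (my own illustration, not part of the original post) showing how the selection routine answers an "ORDER BY x LIMIT N" style request in expected O(n) time, without sorting the whole array. The array contents and the LIMIT value are made up for the example, and the routines above are assumed to be compiled into the same program.

// Hypothetical driver for the RandomSelect() routine above.
// Finds the value that would appear in row 'limit' of
// ORDER BY x ASC LIMIT limit, in expected O(n) time.
#include <cstdio>
#include <cstdlib>
#include <vector>

int main()
{
    const size_t n = 1000000;
    const size_t limit = 10;            // the "LIMIT N" of the query
    std::vector<Etype> x(n);

    for (size_t i = 0; i < n; ++i)      // stand-in for the column being sorted
        x[i] = static_cast<Etype>(std::rand());

    // i is 1-based: RandomSelect(A, p, r, i) returns the i-th smallest
    // element of A[p..r], rearranging the array as a side effect.
    Etype cutoff = RandomSelect(&x[0], 0, n - 1, limit);

    std::printf("value of row %zu in ORDER BY x LIMIT %zu: %f\n",
                limit, limit, static_cast<double>(cutoff));
    return 0;
}

A second linear pass collecting everything at or below the cutoff (watching for duplicates of the cutoff value) then yields the LIMIT N rows themselves, still without a full sort.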
[PERFORM] index on custom function; explain
Hi, I'm trying to include a custom function in my SQL queries, which unfortunately leaves the server hanging... I basically search through two tables:

* TABLE_MAPPING: lists that 'abc' is mapped to 'def'

 id1 | name1 | id2 | name2
-----+-------+-----+-------
   1 | abc   |   2 | def
   3 | uvw   |   4 | xyz

This data means that 'abc' is mapped_to 'def', and 'uvw' is mapped_to 'xyz'. About 1,500,000 records in total.

* TABLE ALIASES: lists different aliases of the same thing

 id1 | name1 | id2 | name2
-----+-------+-----+-------
   3 | uvw   |   2 | def

This data means that 'uvw' and 'def' are essentially the same thing. About 820,000 records in total.

I have indexes on all columns of the above tables. Based on the two tables above, 'abc' is indirectly mapped_to 'xyz' as well (through 'def', also known as 'uvw'). I wrote this little function, aliases_of:

CREATE FUNCTION aliases_of(INTEGER) RETURNS SETOF integer
AS 'SELECT $1
    UNION SELECT id1 FROM aliases WHERE id2 = $1
    UNION SELECT id2 FROM aliases WHERE id1 = $1'
LANGUAGE SQL STABLE;

A simple SELECT aliases_of(2) shows:

 aliases_of
------------
          2
          3

Now, when I want to traverse the aliases, I would write a query as follows:

SELECT m1.name1, m1.name2, m2.name1, m2.name2
FROM mappings m1, mappings m2
WHERE m1.name1 = 'abc'
  AND m2.name1 IN (SELECT aliases_of(m1.name2));

Unfortunately, this query seems to keep running and never stop... An EXPLAIN of the above query shows:

 QUERY PLAN
------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..118379.45 rows=1384837 width=80)
   Join Filter: (subplan)
   ->  Index Scan using ind_cmappings_object1_id on c_mappings m1  (cost=0.00..7.08 rows=2 width=40)
         Index Cond: (name1 = 'abc')
   ->  Seq Scan on c_mappings m2  (cost=0.00..35935.05 rows=1423805 width=40)
   SubPlan
     ->  Result  (cost=0.00..0.01 rows=1 width=0)
(7 rows)

Strangely enough, I _do_ get output when I type the following query:

SELECT m1.name1, m1.name2, m2.name1, m2.name2
FROM mappings m1, mappings m2
WHERE m1.name1 = 'abc'
  AND m2.name1 IN (
        SELECT m1.name2
        UNION SELECT name2 FROM aliases WHERE name1 = m1.name2
        UNION SELECT name1 FROM aliases WHERE name2 = m2.name1
      );

The EXPLAIN for this query is:

 QUERY PLAN
------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..36712030.90 rows=1384837 width=80)
   Join Filter: (subplan)
   ->  Index Scan using ind_cmappings_object1_id on c_mappings m1  (cost=0.00..7.08 rows=2 width=40)
         Index Cond: (object1_id = 16575564)
   ->  Seq Scan on c_mappings m2  (cost=0.00..35935.05 rows=1423805 width=40)
   SubPlan
     ->  Unique  (cost=13.21..13.23 rows=1 width=4)
           ->  Sort  (cost=13.21..13.22 rows=3 width=4)
                 Sort Key: object2_id
                 ->  Append  (cost=0.00..13.18 rows=3 width=4)
                       ->  Subquery Scan "*SELECT* 1"  (cost=0.00..0.01 rows=1 width=0)
                             ->  Result  (cost=0.00..0.01 rows=1 width=0)
                       ->  Subquery Scan "*SELECT* 2"  (cost=0.00..5.92 rows=1 width=4)
                             ->  Index Scan using ind_caliases_object2_id on c_aliases  (cost=0.00..5.92 rows=1 width=4)
                                   Index Cond: (object2_id = $0)
                       ->  Subquery Scan "*SELECT* 3"  (cost=0.00..7.25 rows=1 width=4)
                             ->  Index Scan using ind_caliases_object1_id on c_aliases  (cost=0.00..7.25 rows=1 width=4)
                                   Index Cond: (object1_id = $0)
(18 rows)

So my questions are:
* Does anyone have any idea how I can integrate a function that lists all aliases for a given name into such a mapping query?
* Does the STABLE keyword in the function definition make the function read all its data into memory?
* Is there a way to let postgres use an "Index Scan" on that function instead of a "Seq Scan"?

Any help very much appreciated, Jan Aerts

---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Is There Any Way ....
First off, Mr. Trainor's response proves nothing about anyone or anything except Mr. Trainor. I'm going to offer an opinion on the caching topic. I don't have any benchmarks; I'm offering a general sense of the issue based on decades of experience, so I'll give a short summary of that. I've been earning my living by working with computers since 1972, and am the architect and primary author of a little-known database product (developed in 1984) which saw tens of thousands of installations in various vertical markets. (Last I checked, a couple years ago, it was still being used statewide by one state government after a multi-million dollar attempt to replace it with a popular commercial database product failed.) I've installed and tuned many other database products over the years. I'm just getting to know PostgreSQL, and am pretty excited about it. Now on to the meat of it. My experience is that a DBMS can improve performance by caching certain types of data. In the product I developed, we had a fairly small cache which used a weighting algorithm for what to keep (rather than simply relying on LRU). Index pages got higher weight than data pages; the higher in the index, the higher the weight. Recent access got higher weight than older access, although it took quite a while for the older access to age out entirely. This improved performance quite a bit over a generalized caching product alone. However, there was a point of diminishing return. My sense is that every 10% you add to a "smart" cache yields less benefit at a higher cost, so beyond a certain point, taking RAM from the general cache to expand the smart cache degrades performance. Clever programming techniques can shift the break-even point, but I very much doubt it can be eliminated entirely, unless the ratio of performance between CPU+RAM and persistent storage is much more extreme than I've ever seen. There is another issue, which has been raised from time to time in these lists, but not enunciated clearly enough in my view. These discussions about caching generally address maximum throughput, while there are times when it is important that certain tables can be queried very quickly, even if it hurts overall throughput. As an example, there can be tables which are accessed as a user types in a window and tabs around from one GUI control to another. The user perception of the application performance is going to depend PRIMARILY on how quickly the GUI renders the results of these queries; if the run time for a large report goes up by 10%, they will probably not notice. This is a situation where removing RAM from a generalized cache, or one based on database internals, to create an "application specific" cache can yield big rewards. One client has addressed this in a commercial product by defining a named cache large enough to hold these tables, and binding those tables to the cache. One side benefit is that such caches can be defined as "relaxed LRU" -- meaning that they eliminate the overhead of tracking accesses, since they can assume that data will rarely, if ever, be discarded from the cache. It seems to me that in the PostgreSQL world, this would currently be addressed by binding the tables to a tablespace where the file system, controller, or drive(s) would cache the data, although this is somewhat less flexible than the "named cache" approach -- unless there is a file system that can layer a cache on top of a reference to some other file system's space. 
(And let's not forget the many OS environments in which people use PostgreSQL.) So I do see that there would be benefit to adding a feature to PostgreSQL to define caches and bind tables or indexes to them.

So I do think that it is SMART of PostgreSQL to rely on the increasingly sophisticated file systems to provide the MAIN cache. I suspect that a couple of types of smaller "smart" caches in front of this could boost performance, and it might be a significant boost. I'm not sure what the current shared memory is used for; perhaps this is already caching specific types of structures for the DBMS. I'm pretty sure that programmers of GUI apps would appreciate the named cache feature, so they could tune the database for snappy GUI response, even under heavy load.

I realize this is short on specifics -- I'm shooting for perspective. For the record, I don't consider myself particularly religious on the topic, but I do pull back a little at arguments which sound strictly academic -- I've found that most of what I've drawn from those circles has needed adjustment in solving real-world problems. (Particularly when algorithms optimize for best worst-case performance. I've found users are much happier with best typical-case performance as long as the product of worst-case performance and worst-case frequency is low.) Like many others who have posted on the topic, I am quite prepared to alter my views in the face of relevant evidence. Feel free to laugh at the old far
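To make the weighted-cache idea described earlier in this post concrete, here is a minimal sketch; the page weights and the decay constant are assumptions invented for the illustration, not values from the product described above.

// Minimal sketch of weighted cache eviction: index pages outweigh data
// pages, higher index levels weigh more, and recency decays gradually
// instead of being tracked strictly LRU. All constants are assumed.
#include <limits>
#include <string>
#include <unordered_map>

struct Page {
    bool is_index;      // index pages are worth more than data pages
    int  index_level;   // higher levels of the index are worth more still
    long last_access;   // logical clock value of the most recent access
};

// Higher score = more worth keeping.
static double keep_score(const Page &p, long now)
{
    double weight = p.is_index ? 4.0 + p.index_level : 1.0;  // assumed weights
    double age = static_cast<double>(now - p.last_access);
    return weight / (1.0 + age / 1000.0);                    // assumed decay
}

// When the cache is full, evict the page with the lowest keep_score.
std::string pick_victim(const std::unordered_map<std::string, Page> &cache, long now)
{
    std::string victim;
    double worst = std::numeric_limits<double>::max();
    for (const auto &kv : cache) {
        double s = keep_score(kv.second, now);
        if (s < worst) { worst = s; victim = kv.first; }
    }
    return victim;
}

A "relaxed LRU" named cache, by contrast, would skip the recency bookkeeping entirely on the assumption that its bound tables are rarely, if ever, evicted.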
Re: [HACKERS] [PERFORM] A Better External Sort?
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:pgsql-hackers-[EMAIL PROTECTED]] On Behalf Of Tom Lane
> Sent: Friday, September 30, 2005 11:02 PM
> To: Jeffrey W. Baker
> Cc: Luke Lonergan; Josh Berkus; Ron Peacetree; pgsql-[EMAIL PROTECTED]; pgsql-performance@postgresql.org
> Subject: Re: [HACKERS] [PERFORM] A Better External Sort?
>
> "Jeffrey W. Baker" <[EMAIL PROTECTED]> writes:
> > I think the largest speedup will be to dump the multiphase merge and merge all tapes in one pass, no matter how large M. Currently M is capped at 6, so a sort of 60GB with 1GB sort memory needs 13 passes over the tape. It could be done in a single pass heap merge with N*log(M) comparisons, and, more importantly, far less input and output.
>
> I had more or less despaired of this thread yielding any usable ideas :-( but I think you have one here.

I believe I made the exact same suggestion several days ago.

> The reason the current code uses a six-way merge is that Knuth's figure 70 (p. 273 of volume 3, first edition) shows that there's not much incremental gain from using more tapes ... if you are in the regime where number of runs is much greater than number of tape drives. But if you can stay in the regime where only one merge pass is needed, that is obviously a win.
>
> I don't believe we can simply legislate that there be only one merge pass. That would mean that, if we end up with N runs after the initial run-forming phase, we need to fit N tuples in memory --- no matter how large N is, or how small work_mem is. But it seems like a good idea to try to use an N-way merge where N is as large as work_mem will allow. We'd not have to decide on the value of N until after we've completed the run-forming phase, at which time we've already seen every tuple once, and so we can compute a safe value for N as work_mem divided by largest_tuple_size. (Tape I/O buffers would have to be counted too of course.)

You only need to hold the sort column(s) in memory, except for the queue you are exhausting at the time. [And of those columns, only the values for the smallest one in a sub-list.] Of course, the more data from each list that you can hold at once, the fewer the disk reads and seeks.

Another idea (not sure if it is pertinent): instead of having a fixed size for the sort buffers, size it to the query. Given a total pool of size M, give a percentage according to the difficulty of the work to perform. So a query with 3 small columns and a cardinality of 1000 gets a small percentage and a query with 10 GB of data gets a big percentage of available sort mem.

> It's been a good while since I looked at the sort code, and so I don't recall if there are any fundamental reasons for having a compile-time-constant value of the merge order rather than choosing it at runtime. My guess is that any inefficiencies added by making it variable would be well repaid by the potential savings in I/O.
>
> regards, tom lane
>
> ---(end of broadcast)--- TIP 6: explain analyze is your friend

---(end of broadcast)--- TIP 6: explain analyze is your friend
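For illustration only (a sketch of the single-pass idea under discussion, not the tuplesort.c code): an N-way merge driven by a binary min-heap reads each presorted run sequentially exactly once, and with M total tuples performs on the order of M*log2(N) comparisons, which is why avoiding extra merge passes matters more than the merge order itself.

// Illustrative single-pass N-way merge using a binary heap.
#include <queue>
#include <vector>

struct HeapItem {
    long   key;    // the sort key (stand-in for the tuple's sort column)
    size_t run;    // which sorted run this key came from
    size_t pos;    // position within that run
};

struct GreaterKey {
    bool operator()(const HeapItem &a, const HeapItem &b) const {
        return a.key > b.key;   // makes priority_queue behave as a min-heap
    }
};

// Merge N presorted runs in one pass: each element is pushed and popped
// exactly once, so the comparison count is roughly M * log2(N).
std::vector<long> merge_runs(const std::vector<std::vector<long>> &runs)
{
    std::priority_queue<HeapItem, std::vector<HeapItem>, GreaterKey> heap;
    for (size_t r = 0; r < runs.size(); ++r)
        if (!runs[r].empty())
            heap.push({runs[r][0], r, 0});

    std::vector<long> out;
    while (!heap.empty()) {
        HeapItem top = heap.top();
        heap.pop();
        out.push_back(top.key);
        if (top.pos + 1 < runs[top.run].size())
            heap.push({runs[top.run][top.pos + 1], top.run, top.pos + 1});
    }
    return out;
}

In the real case each run would be a read pointer into a spill file rather than an in-memory vector, and work_mem would bound how many per-run buffers, and hence how large an N, can be afforded.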
Re: [HACKERS] [PERFORM] A Better External Sort?
I have perused the tuple sort stuff.

The good: The documentation of the sort algorithm from Knuth's TAOCP was beautifully done. Everyone who writes an algorithm should credit the original source like this, and also note where it deviates. That was done very nicely.

The bad: With random access, tape-style merging is not necessary. A priority queue based merge will be a lot faster.

The UGLY: Please, someone, tell me I was hallucinating. Is that code really READING AND WRITING THE WHOLE TUPLE with every sort exchange?! Maybe there is a layer of abstraction that I am somehow missing. I just can't imagine that is what it is really doing. If (somehow) it really is doing that, a pointer based sort which forms a permutation based upon the keys would be a lot better (a minimal sketch of this idea appears after this post).

The fundamental algorithm itself could also be improved somewhat. Here is a {public domain} outline for an introspective quick sort that Pete Filander and I wrote some time ago and contributed to FastDB. It is written as a C++ template, but it will take no effort to make it a simple C routine. It assumes that e_type has comparison operators, so in C you would use a compare function instead.

/*
** Sorting stuff by Dann Corbit and Pete Filandr.
** ([EMAIL PROTECTED] and [EMAIL PROTECTED])
** Use it however you like.
*/

//
// The insertion sort template is used for small partitions.
//
template < class e_type >
void insertion_sort(e_type * array, size_t nmemb)
{
    e_type temp, *last, *first, *middle;
    if (nmemb > 1) {
        first = middle = 1 + array;
        last = nmemb - 1 + array;
        while (first != last) {
            ++first;
            if ((*(middle) > *(first))) {
                middle = first;
            }
        }
        if ((*(array) > *(middle))) {
            ((void) ((temp) = *(array), *(array) = *(middle), *(middle) = (temp)));
        }
        ++array;
        while (array != last) {
            first = array++;
            if ((*(first) > *(array))) {
                middle = array;
                temp = *middle;
                do {
                    *middle-- = *first--;
                } while ((*(first) > *(&temp)));
                *middle = temp;
            }
        }
    }
}

//
// The median estimate is used to choose pivots for the quicksort algorithm
//
template < class e_type >
void median_estimate(e_type * array, size_t n)
{
    e_type temp;
    long unsigned lu_seed = 123456789LU;
    const size_t k = ((lu_seed) = 69069 * (lu_seed) + 362437) % --n;
    ((void) ((temp) = *(array), *(array) = *(array + k), *(array + k) = (temp)));
    if ((*((array + 1)) > *((array)))) {
        (temp) = *(array + 1);
        if ((*((array + n)) > *((array)))) {
            *(array + 1) = *(array);
            if ((*(&(temp)) > *((array + n)))) {
                *(array) = *(array + n);
                *(array + n) = (temp);
            } else {
                *(array) = (temp);
            }
        } else {
            *(array + 1) = *(array + n);
            *(array + n) = (temp);
        }
    } else {
        if ((*((array)) > *((array + n)))) {
            if ((*((array + 1)) > *((array + n)))) {
                (temp) = *(array + 1);
                *(array + 1) = *(array + n);
                *(array + n) = *(array);
                *(array) = (temp);
            } else {
                ((void) (((temp)) = *((array)), *((array)) = *((array + n)), *((array + n)) = ((temp))));
            }
        }
    }
}

//
// This is the heart of the quick sort algorithm used here.
// If the sort is going quadratic, we switch to heap sort.
// If the partition is small, we switch to insertion sort.
//
template < class e_type >
void qloop(e_type * array, size_t nmemb, size_t d)
{
    e_type temp, *first, *last;
    while (nmemb > 50) {
        if (sorted(array, nmemb)) {
            return;
        }
        if (!d--) {
            heapsort(array, nmemb);
            return;
        }
        median_estimate(array, nmemb);
        first = 1 + array;
        last = nmemb - 1 + array;
        do {
            ++first;
        } while ((*(array) > *(first)));
        do {
            --last;
        } while ((*(last) > *(array)));
        while (last > first) {
            ((void) ((temp) = *(last), *(last) = *(first), *(first) = (temp)));
            do {
                ++first;
            } while ((*(array) > *(first)));
            do {
                --last;
            } while ((*(last) > *(array)));
        }
        ((void) ((temp) = *(array), *(array) = *(last), *(last) = (temp)));
        qloop(last + 1, nmemb - 1 + array - last, d);
        nmemb = last - array;
    }
    insertion_sort(array, nmemb);
}

//
// This heap sort is better than average because it uses Lamont's heap.
//
template < class e_type >
void heapsort(e_type * array, size_t nmemb)
{
    size_t i, child, parent;
    e_type temp;
    if (nmemb > 1)
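On the "pointer based sort which forms a permutation based upon the keys" remark above, here is a minimal sketch of the idea; it is my illustration rather than anything from the tuplesort code. The sort exchanges only small (key, row number) entries; the wide tuples themselves are never copied, and the resulting permutation can be applied or streamed out afterwards.

// Illustrative permutation sort: sort small (key, row) entries instead of
// whole tuples, so each exchange moves a few bytes rather than a full row.
#include <algorithm>
#include <string>
#include <vector>

struct WideTuple {            // stand-in for a wide table row
    long        sort_key;
    std::string payload;      // imagine many more columns here
};

struct SortEntry {            // what actually gets exchanged during the sort
    long   key;               // copy of the sort key
    size_t row;               // index of the row it came from
};

std::vector<size_t> sorted_permutation(const std::vector<WideTuple> &rows)
{
    std::vector<SortEntry> entries(rows.size());
    for (size_t i = 0; i < rows.size(); ++i)
        entries[i] = SortEntry{rows[i].sort_key, i};

    std::sort(entries.begin(), entries.end(),
              [](const SortEntry &a, const SortEntry &b) { return a.key < b.key; });

    std::vector<size_t> perm(entries.size());
    for (size_t i = 0; i < entries.size(); ++i)
        perm[i] = entries[i].row;   // perm[i] = which row comes i-th in sorted order
    return perm;
}

For keys that are expensive to compare (multi-column or collation-aware), the entry could carry an abbreviated key and fall back to the full row only on ties.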
Re: [HACKERS] [PERFORM] A Better External Sort?
Judy definitely rates a WOW!! > -Original Message- > From: [EMAIL PROTECTED] [mailto:pgsql-hackers- > [EMAIL PROTECTED] On Behalf Of Gregory Maxwell > Sent: Friday, September 30, 2005 7:07 PM > To: Ron Peacetree > Cc: Jeffrey W. Baker; pgsql-hackers@postgresql.org; pgsql- > [EMAIL PROTECTED] > Subject: Re: [HACKERS] [PERFORM] A Better External Sort? > > On 9/28/05, Ron Peacetree <[EMAIL PROTECTED]> wrote: > > 2= We use my method to sort two different tables. We now have these > > very efficient representations of a specific ordering on these tables. > A > > join operation can now be done using these Btrees rather than the > > original data tables that involves less overhead than many current > > methods. > > If we want to make joins very fast we should implement them using RD > trees. For the example cases where a join against a very large table > will produce a much smaller output, a RD tree will provide pretty much > the optimal behavior at a very low memory cost. > > On the subject of high speed tree code for in-core applications, you > should check out http://judy.sourceforge.net/ . The performance > (insert, remove, lookup, AND storage) is really quite impressive. > Producing cache friendly code is harder than one might expect, and it > appears the judy library has already done a lot of the hard work. > Though it is *L*GPLed, so perhaps that might scare some here away from > it. :) and good luck directly doing joins with a LC-TRIE. ;) > > ---(end of broadcast)--- > TIP 6: explain analyze is your friend ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PERFORM] A Better External Sort?
I see the following routines that seem to be related to sorting. If I were to examine these routines to consider ways to improve them, which routines should I key in on? I am guessing that tuplesort.c is the hub of activity for database sorting.

Directory of U:\postgresql-snapshot\src\backend\access\nbtree
08/11/2005  06:22 AM    24,968  nbtsort.c
1 File(s)  24,968 bytes

Directory of U:\postgresql-snapshot\src\backend\executor
03/16/2005  01:38 PM     7,418  nodeSort.c
1 File(s)  7,418 bytes

Directory of U:\postgresql-snapshot\src\backend\utils\sort
09/23/2005  08:36 AM    67,585  tuplesort.c
1 File(s)  67,585 bytes

Directory of U:\postgresql-snapshot\src\bin\pg_dump
06/29/2005  08:03 PM    31,620  pg_dump_sort.c
1 File(s)  31,620 bytes

Directory of U:\postgresql-snapshot\src\port
07/27/2005  09:03 PM     5,077  qsort.c
1 File(s)  5,077 bytes

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PERFORM] A Better External Sort?
On R, 2005-09-30 at 13:38 -0700, Luke Lonergan wrote: > > Bulk loading speed is irrelevant here - that is dominated by parsing, which > we have covered copiously (har har) previously and have sped up by 500%, > which still makes Postgres < 1/2 the loading speed of MySQL. Is this < 1/2 of MySQL with WAL on different spindle and/or WAL disabled ? -- Hannu Krosing <[EMAIL PROTECTED]> ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PERFORM] A Better External Sort?
On Sat, Oct 01, 2005 at 10:22:40AM -0400, Ron Peacetree wrote:

> Assuming we get the abysmal physical IO performance fixed... (because until we do, _nothing_ is going to help us as much)

I'm still not convinced this is the major problem. For example, in my totally unscientific tests on an oldish machine I have here:

Direct filesystem copy to /dev/null    21MB/s    10% user 50% system (dual CPU, so the system is using a whole CPU)
COPY TO /dev/null WITH binary          13MB/s    55% user 45% system (ergo, CPU bound)
COPY TO /dev/null                      4.4MB/s   60% user 40% system
\copy to /dev/null in psql             6.5MB/s   60% user 40% system

This machine has a bit of a strange setup; I'm not sure why the fs copy is so slow. As to why \copy is faster than COPY, I have no idea, but it is repeatable. And actually turning the tuples into a printable format is the most expensive part. But it does point out that the whole process is probably CPU bound more than anything else.

So, I don't think physical I/O is the problem. It's something further up the call tree. I wouldn't be surprised at all if it had to do with the creation and destruction of tuples. The cost of comparing tuples should not be underestimated.

-- Martijn van Oosterhout http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them.
Re: [HACKERS] [PERFORM] A Better External Sort?
On 9/28/05, Ron Peacetree <[EMAIL PROTECTED]> wrote: > 2= We use my method to sort two different tables. We now have these > very efficient representations of a specific ordering on these tables. A > join operation can now be done using these Btrees rather than the > original data tables that involves less overhead than many current > methods. If we want to make joins very fast we should implement them using RD trees. For the example cases where a join against a very large table will produce a much smaller output, a RD tree will provide pretty much the optimal behavior at a very low memory cost. On the subject of high speed tree code for in-core applications, you should check out http://judy.sourceforge.net/ . The performance (insert, remove, lookup, AND storage) is really quite impressive. Producing cache friendly code is harder than one might expect, and it appears the judy library has already done a lot of the hard work. Though it is *L*GPLed, so perhaps that might scare some here away from it. :) and good luck directly doing joins with a LC-TRIE. ;) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[PERFORM] Which one FreeBSD or Linux
FreeBSD or Linux: which system has better performance for PostgreSQL? ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PERFORM] A Better External Sort?
Ron Peacetree wrote:

> The good news is all this means it's easy to demonstrate that we can improve the performance of our sorting functionality. Assuming we get the abysmal physical IO performance fixed... (because until we do, _nothing_ is going to help us as much)

I for one would be paying more attention if such a demonstration were forthcoming, in the form of a viable patch and some benchmark results.

cheers

andrew

---(end of broadcast)--- TIP 6: explain analyze is your friend