Re: [PERFORM] Performance of views
Svenne Krap wrote: Hi there. I am currently building a system where it would be nice to use multiple levels of views on top of each other (it is a statistics system, where traceability is important). Is there any significant performance reduction in, say, 10 levels of views instead of one giant, nested SQL statement? I am thinking especially of the execution planner.

The planner tries to push conditions "inside" views where it can. It's not perfect though, and if you're writing a big query by hand you might be able to do better than it. In short, I'd test if you can. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Performance of views
What do you mean exactly by "pushing conditions inside"? I don't think I will have the option of testing on the full queries, as these take many days to write (the current ones they are replacing on MSSQL take up more than 5 KB of query). The current ones are nightmares from a maintenance standpoint. Basically what the application is doing is selecting some base data from the "large" table for a point in time (usually a quarter) and selecting all matching auxiliary data from the other tables. They are made in a time-travel-like manner with a first and last usable date.

The ways I have considered were:
1) write a big query by hand (not preferred, as it gets hard to manage)
2) write layers of views (still not preferred, as I still have to remember to put the right conditions on everywhere)
3) write layers of SQL functions (returning the right sets of rows from the underlying tables) - which I prefer from a development angle .. it gets very clean and I can't forget a parameter anywhere. But I seem to remember (and I have used PGSQL in production since 7.0) that the planner has some problems with solution 3 (i.e. estimating the cost and rearranging the query), but frankly that would be the way I would like to go.

Based on the current (non-optimal) design and hardware constraints, I still have to make sure the query runs fairly optimally - that means the planner must use indexes intelligently and so on, as if it were (well-)written using solution 1. What do you think of the three solutions? And is there some resource about the planner's capabilities for someone like me (who is quite used to writing reasonably fast and complex SQL and can read C code, but does not really want to dig into the source code)? Regards Svenne

Richard Huxton wrote: Svenne Krap wrote: Hi there. I am currently building a system where it would be nice to use multiple levels of views on top of each other (it is a statistics system, where traceability is important). Is there any significant performance reduction in, say, 10 levels of views instead of one giant, nested SQL statement? I am thinking especially of the execution planner. The planner tries to push conditions "inside" views where it can. It's not perfect though, and if you're writing a big query by hand you might be able to do better than it. In short, I'd test if you can. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
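To make the "time travel" structure Svenne describes concrete, here is a minimal sketch of option 2 (layers of views). All table and column names are hypothetical, and the point-in-time condition still has to be supplied by the caller, which is exactly the bookkeeping he wants to avoid:

-- Hypothetical schema: a fact table plus auxiliary data, both carrying
-- a first/last usable date.
CREATE VIEW facts_with_aux AS
    SELECT f.fact_id,
           f.first_usable_date,
           f.last_usable_date,
           a.some_attribute
    FROM   facts f
    JOIN   aux_data a ON a.fact_id = f.fact_id;

-- The caller must still remember the quarter condition at every level:
SELECT *
FROM   facts_with_aux
WHERE  first_usable_date <= '2005-09-30'
  AND (last_usable_date  >= '2005-09-30' OR last_usable_date IS NULL);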
Re: [PERFORM] Performance of views
Don't forget to CC the list.

Svenne Krap wrote: What do you mean exactly by "pushing conditions inside"?

If I have something like "SELECT * FROM complicated_view WHERE foo = 7" then the planner can look "inside" complicated_view and see where it can attach the condition "foo=7", rather than running the query and applying the condition at the end. There are cases where it is safe for the planner to do this, but it isn't smart enough to do so.

I don't think I will have the option of testing on the full queries, as these take many days to write (the current ones they are replacing on MSSQL take up more than 5 KB of query). The current ones are nightmares from a maintenance standpoint.

Hmm - it sounds like they would be.

Basically what the application is doing is selecting some base data from the "large" table for a point in time (usually a quarter) and selecting all matching auxiliary data from the other tables. They are made in a time-travel-like manner with a first and last usable date. The ways I have considered were: 1) write a big query by hand (not preferred, as it gets hard to manage)

Agreed.

2) write layers of views (still not preferred, as I still have to remember to put the right conditions on everywhere)

This is what I'd probably do, but of course I don't have full information about your situation.

3) write layers of SQL functions (returning the right sets of rows from the underlying tables) - which I prefer from a development angle .. it gets very clean and I can't forget a parameter anywhere. But I seem to remember (and I have used PGSQL in production since 7.0) that the planner has some problems with solution 3 (i.e. estimating the cost and rearranging the query), but frankly that would be the way I would like to go.

Well, 8.x can "inline" a simple SQL function into a larger query, but it doesn't sound like that will be enough in your case. Once a function becomes a "black box" then there's not much the planner can do to figure out what to do.

Based on the current (non-optimal) design and hardware constraints, I still have to make sure the query runs fairly optimally - that means the planner must use indexes intelligently and so on, as if it were (well-)written using solution 1.

Well, #1 and #2 are likely to be the most efficient, but you won't know for sure about #2 until you test it. There are a couple of other options though:

#4 - Write a set-returning function that breaks the query into steps and executes each in turn. So - fetch IDs from the main table in step 1 and store them in a temporary table, join other tables in later steps.

#5 - Write a function that writes your big query for you and either returns the SQL to your application, or runs it and returns the results.

What do you think of the three solutions? And is there some resource about the planner's capabilities for someone like me (who is quite used to writing reasonably fast and complex SQL and can read C code, but does not really want to dig into the source code)?

There is some stuff in the "Internals" section of the manuals and it might be worth rummaging around on http://techdocs.postgresql.org -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
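A rough sketch of option #4 at the SQL level (all names are hypothetical; the same steps could be wrapped in a set-returning plpgsql function):

-- Step 1: fetch the ids from the main table for the point in time.
CREATE TEMP TABLE tmp_ids AS
    SELECT fact_id
    FROM   facts
    WHERE  first_usable_date <= '2005-09-30'
      AND (last_usable_date  >= '2005-09-30' OR last_usable_date IS NULL);

-- Give the planner row counts for the small id set.
ANALYZE tmp_ids;

-- Step 2 and later: join the auxiliary tables against the id set only.
SELECT f.*, a.some_attribute
FROM   facts    f
JOIN   tmp_ids  t ON t.fact_id = f.fact_id
JOIN   aux_data a ON a.fact_id = f.fact_id;

The point of the intermediate ANALYZE is that each later step is planned against known, small row counts instead of whatever the planner guesses for the full query.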
Re: [PERFORM] browsing table with 2 million records
Christopher > > - Present a nifty date selector to choose the records from any day, > > hour, minute, second > > - show them, with "next day" and "previous day" buttons > > > > - It's more useful to the user (most likely he wants to know what > > happened on 01/05/2005 rather than view page 2857) > > - It's faster (no more limit/offset ! just "date BETWEEN a AND b", > > indexed of course) > > - no more new items pushing old ones to the next page while you > browse > > - you can pretend to your boss it's just like a paginated list > > All very well and good, but now do it generically...

I've done it... First of all I totally agree with PFC's rant regarding absolute positioning while browsing datasets. Among other things, it has serious problems if you have multiple users updating your table. Also it's kind of silly to be doing this in a set-based data paradigm. The 'SQL' way to browse a dataset is by key. If your key has multiple parts or you are trying to sort on two or more fields, you are supposed to use the row constructor:

select * from t where (x, y) > (xc, yc) order by x, y;

Unfortunately, this gives the wrong answer in PostgreSQL :(. The alternative is to use Boolean logic. Here is a log snippet from my ISAM driver (in ISAM, you are *always* browsing datasets):

prepare system_read_next_menu_item_favorite_file_0 (character varying, int4, int4, int4) as
    select * from system.menu_item_favorite_file
    where mif_user_id >= $1
      and (mif_user_id > $1 or mif_menu_item_id >= $2)
      and (mif_user_id > $1 or mif_menu_item_id > $2 or mif_sequence_no > $3)
    order by mif_user_id, mif_menu_item_id, mif_sequence_no
    limit $4

This is a Boolean-based 'get next record' over a 3-part key plus a parameterized limit. You can do this without using prepared statements of course, but with the prepared version you can at least do

execute system_read_next_menu_item_favorite_file_0('abc', 1, 2, 1);

Merlin ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] insertion of bytea
> On Tue, Oct 25, 2005 at 03:44:36PM +0200, Chris Mair wrote: >>Is there a better, faster way to do these inserts? > > COPY is generally the fastest way to do bulk inserts (see > PQputCopyData). Hi, I've rewritten the test client now to use COPY, but I'm getting the exact same results as when doing bundled, prepared inserts. I'm CPU-bound with I/O well below what my disks could do :( Bye, Chris. PS1: someone off-list suggested using oprofile, which I will do. PS2: in case somebody is interested, the test client is here: http://www.1006.org/tmp/20051027/ pgclient-1.1.c is prepared inserts, 2.0 is binary copy. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
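For reference, a minimal illustration of the COPY path at the SQL level (the table and data are made up; the linked test client drives the binary variant of this through libpq's PQputCopyData):

CREATE TABLE blobs (id integer, data bytea);

COPY blobs (id, data) FROM STDIN WITH DELIMITER '|';
1|hello
2|\\000\\001\\377
\.

In text-format COPY each "\\" in the input becomes a single backslash, which the bytea type then interprets as an octal escape; the binary COPY format used by the test client skips both layers of escaping, which is part of why it is usually cheaper on the CPU.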
Re: [PERFORM] Performance of views
On 10/27/2005 7:29 AM, Richard Huxton wrote: Don't forget to CC the list. Svenne Krap wrote: What do you mean exactly by "pushing conditions inside"? If I have something like "SELECT * FROM complicated_view WHERE foo = 7" then the planner can look "inside" complicated_view and see where it can attach the condition "foo=7", rather than running the query and applying the condition at the end.

Sorry, but the planner doesn't attach the condition anywhere. It is the rewriter that takes the actual query, replaces the view's rangetable and expression entries with the actual underlying objects, and adds the view's condition with an AND to the query's condition. Simple example: Given a view

create view v1 as select a1, b1, c2 from t1, t2 where a1 = a2;

the statement

select * from v1 where b1 = 'foo';

will result in a parsetree equivalent to what you would get if the original query was

select a1, b1, c2 from t1, t2 where (b1 = 'foo') and (a1 = a2);

It is the planner's and optimizer's job to recognize where in the execution plan it can push qualifications down into filters or even scankeys. The planner should be able to realize that

select * from v1 where a1 = 42;

is in fact equivalent to

select a1, b1, c2 from t1, t2 where a1 = 42 and a1 = a2;

as well as

select a1, b1, c2 from t1, t2 where a1 = 42 and a1 = a2 and a2 = 42;

This very last addition of "a2 = 42" because of "a2 = a1 = 42" allows it to put a constant scankey onto the scan of t2. The 8.0 planner does that, so the resulting query plans for the last three selects above are absolutely identical.

There are cases where it is safe for the planner to do this, but it isn't smart enough to do so.

Example? Jan

I don't think I will have the option of testing on the full queries, as these take many days to write (the current ones they are replacing on MSSQL take up more than 5 KB of query). The current ones are nightmares from a maintenance standpoint. Hmm - it sounds like they would be. Basically what the application is doing is selecting some base data from the "large" table for a point in time (usually a quarter) and selecting all matching auxiliary data from the other tables. They are made in a time-travel-like manner with a first and last usable date. The ways I have considered were: 1) write a big query by hand (not preferred, as it gets hard to manage) Agreed. 2) write layers of views (still not preferred, as I still have to remember to put the right conditions on everywhere) This is what I'd probably do, but of course I don't have full information about your situation. 3) write layers of SQL functions (returning the right sets of rows from the underlying tables) - which I prefer from a development angle .. it gets very clean and I can't forget a parameter anywhere. But I seem to remember (and I have used PGSQL in production since 7.0) that the planner has some problems with solution 3 (i.e. estimating the cost and rearranging the query), but frankly that would be the way I would like to go. Well, 8.x can "inline" a simple SQL function into a larger query, but it doesn't sound like that will be enough in your case. Once a function becomes a "black box" then there's not much the planner can do to figure out what to do. Based on the current (non-optimal) design and hardware constraints, I still have to make sure the query runs fairly optimally - that means the planner must use indexes intelligently and so on, as if it were (well-)written using solution 1.
Well, #1 and #2 are likely to be the most efficient, but you won't know for sure about #2 until you test it. There are a couple of other options though: #4 - Write a set-returning function that breaks the query into steps and executes each in turn. So - fetch IDs from the main table in step 1 and store them in a temporary table, join other tables in later steps. #5 - Write a function that writes your big query for you and either returns the SQL to your application, or runs it and returns the results. What do you think of the three solutions? And is there some resource about the planner's capabilities for someone like me (who is quite used to writing reasonably fast and complex SQL and can read C code, but does not really want to dig into the source code)? There is some stuff in the "Internals" section of the manuals and it might be worth rummaging around on http://techdocs.postgresql.org -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] #
[PERFORM] how postgresql requests the computer resources
Is there something that tells postgres to take the resorces from computer (RAM, HDD, SWAP on linux) as it need, not modifying variables on postgresql.conf and other operating system things? A days ago i am trying to show that postgres is better than mssql but when execute a simple query like: (1) select count(*) from ( select archivo from fotos except select archivo from archivos ) x; Aggregate (cost=182162.83..182162.84 rows=1 width=0) (actual time=133974.495..133974.498 rows=1 loops=1) -> Subquery Scan x (cost=173857.98..181830.63 rows=132878 width=0) (actual time=109148.158..15.279 rows=169672 loops=1) -> SetOp Except (cost=173857.98..180501.86 rows=132878 width=58) (actual time=109148.144..132094.382 rows=169672 loops=1) -> Sort (cost=173857.98..177179.92 rows=1328775 width=58) (actual time=109147.656..113870.975 rows=1328775 loops=1) Sort Key: archivo -> Append (cost=0.00..38710.50 rows=1328775 width=58) (actual time=27.062..29891.075 rows=1328775 loops=1) -> Subquery Scan "*SELECT* 1" (cost=0.00..17515.62 rows=523431 width=58) (actual time=27.052..9560.719 rows=523431 loops=1) -> Seq Scan on fotos (cost=0.00..12281.31 rows=523431 width=58) (actual time=27.038..5390.238 rows=523431 loops=1) -> Subquery Scan "*SELECT* 2" (cost=0.00..21194.88 rows=805344 width=58) (actual time=10.803..12117.788 rows=805344 loops=1) -> Seq Scan on archivos (cost=0.00..13141.44 rows=805344 width=58) (actual time=10.784..5420.164 rows=805344 loops=1) Total runtime: 134552.325 ms (2) select count(*) from fotos where archivo not in (select archivo from archivos) Aggregate (cost=29398.98..29398.99 rows=1 width=0) (actual time=26660.565..26660.569 rows=1 loops=1) -> Seq Scan on fotos (cost=15154.80..28744.69 rows=261716 width=0) (actual time=13930.060..25859.340 rows=169799 loops=1) Filter: (NOT (hashed subplan)) SubPlan -> Seq Scan on archivos (cost=0.00..13141.44 rows=805344 width=58) (actual time=0.319..5647.043 rows=805344 loops=1) Total runtime: 26747.236 ms (3) select count(1) from fotos f where not exists (select a.archivo from archivos a where a.archivo=f.archivo) Aggregate (cost=1761354.08..1761354.09 rows=1 width=0) (actual time=89765.384..89765.387 rows=1 loops=1) -> Seq Scan on fotos f (cost=0.00..1760699.79 rows=261716 width=0) (actual time=75.556..0.234 rows=169799 loops=1) Filter: (NOT (subplan)) SubPlan -> Index Scan using archivos_archivo_idx on archivos a (cost=0.00..13451.40 rows=4027 width=58) (actual time=0.147..0.147 rows=1 loops=523431) Index Cond: ((archivo)::text = ($0)::text) Total runtime: 89765.714 ms (4) SELECT count(*) FROM fotos f LEFT JOIN archivos a USING(archivo) WHERE a.archivo IS NULL Aggregate (cost=31798758.40..31798758.41 rows=1 width=0) (actual time=114267.337..114267.341 rows=1 loops=1) -> Merge Left Join (cost=154143.73..31772412.02 rows=10538550 width=0) (actual time=85472.696..113392.399 rows=169799 loops=1) Merge Cond: ("outer"."?column2?" 
= "inner"."?column2?") Filter: ("inner".archivo IS NULL) -> Sort (cost=62001.08..63309.66 rows=523431 width=58) (actual time=38018.343..39998.201 rows=523431 loops=1) Sort Key: (f.archivo)::text -> Seq Scan on fotos f (cost=0.00..12281.31 rows=523431 width=58) (actual time=0.158..4904.410 rows=523431 loops=1) -> Sort (cost=92142.65..94156.01 rows=805344 width=58) (actual time=47453.790..50811.216 rows=805701 loops=1) Sort Key: (a.archivo)::text -> Seq Scan on archivos a (cost=0.00..13141.44 rows=805344 width=58) (actual time=0.206..7160.148 rows=805344 loops=1) Total runtime: 114893.116 ms With any of these queries MSSQL takes no more than 7 seconds. Please help me. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Performance issues with custom functions
Tom Lane wrote: This is fairly hard to read ... it would help a lot if you had shown the view definitions that the query relies on, so that we could match up the plan elements with the query a bit better. I wasn't sure how helpful it would be. Here they are: create view development.network as select * from odbc_select('amsterdam', 'bob.dbo.network') as ( network_id varchar , status_cd varchar , name varchar , network_action varchar , physical_type_cd varchar , service_type_cd varchar , parent_network_id varchar , commission_network_id varchar , rep_id varchar , tax_id varchar , url varchar , entry_method_cd varchar , entry_individual_type_cd varchar , entry_individual_id varchar , service varchar (30), cost_routine varchar (150), commission_rate numeric(5, 5) , directory_number varchar (11), search_url varchar (200), member_rate numeric(15, 2) , free_months numeric(18, 0) , eligibility_hound varchar (60) ) create view development.network_state as select * from odbc_select('amsterdam', 'bob.dbo.network_state') as ( network_id varchar, state_cd varchar, product varchar (100) , status_cd varchar, entry_method_cd varchar, entry_individual_type_cd varchar, entry_individual_id varchar, logo_id int , from_date timestamp , thru_date timestamp ) create view development.xlat_tbl as select * from odbc_select('amsterdam', 'xlat_tbl') as ( field_name varchar , field_value varchar , status_cd varchar , descr varchar , descrshort varchar , entry_method_cd varchar , entry_individual_type_cd varchar , entry_individual_id varchar ) However, I'm thinking the problem is with this IN clause: where pl.network_id in (select ns.network_id from development.network_state ns where ns.from_date < current_time and (ns.thru_date > current_time or ns.thru_date is null) and (ns.state_cd = pl.state_cd or ns.state_cd='') ) Because the sub-SELECT references pl.state_cd (an outer variable reference), there's no chance of optimizing this into a join-style IN. So the sub-SELECT has to be re-executed for each row of the outer query. BTW, it's not apparent to me that your "flattened" query gives the same answers as the original. What if a pl row can join to more than one row of the ns output? Well, I guess you are right. As far as the database can tell, the queries aren't the same. In practice, they are. network_state is essentially tracking our contract dates with different discount healthcare networks. from_date and thru_date track the timeframe we use that network, with thru_date being null for the current networks. Some networks cover all states, in which case state_cd is an empty string. Otherwise, there will be a row per state covered. I can't think of any way to enforce data integrity on this other than maybe via triggers. Is there any way to make things more clear to the database (both in general and on the postgres end of this) ? At the moment, the SQL Server table has the primary key defined as (network_id, state_cd, product), which is ok for now, but I'm realizing going forward could be an issue if we ever stopped using a network in a state and then went back to it. I guess the next question is, is there any way I can give postgres hints about what constraints exist on the data in these views? Ed
Re: [PERFORM] browsing table with 2 million records
I've done it... First of all I totally agree with PFC's rant regarding absolute positioning while browsing datasets. Among other things, it has serious problems if you have multiple users updating your table. Also it's kind of silly to be doing this in a set-based data paradigm.

Recently I was browsing a site and it had this problem: as users kept adding new entries while I was browsing the list page after page, when I hit "next page" I got on the next page half of what I had already seen on the previous page. Of course the webmaster had set the "visited links" color the same as "unvisited links", so I couldn't tell, and had to use my brain, which was quite upsetting XDDD And bookmarking a page to resume browsing at some later time does not work either, because if I bookmark page 15, then when I come back, users have added 10 pages of content and what I bookmarked is now on page 25...

All very well and good, but now do it generically...

Hehe. I like ranting... It is not possible to do it in a generic way that works in all cases. For instance, the forum topic case: posts are added at the bottom and not at the top, and the page number is relevant and meaningful. However, in most cases you can use a multipart key and get it right. Suppose that, for instance, you have a base of several million records, organized according to: date (like the original poster), or: country, region, city, customer last name, first name. You could ask for the first three, but then you'll get 5 Smiths in New York and 1 Van Bliezinsky. Or you could precalculate, once a week, a key interval distribution that creates reasonably sized intervals (for instance, 100 values in each), maybe asking that each interval should only contain one city. So you would get:

Country  Region  City  LastName, FirstName
USA      NYC     NY    Smith, ''
USA      NYC     NY    Smith, Albert
USA      NYC     NY    Smith, Bernard
...
USA      NYC     NY    Smith, William
...
USA      NYC     NY    Von Braun
...

So you'd predetermine your "page breaks" ahead of time, and recompute them once in a while. You won't get identically sized pages, but if the statistical distribution of the data plays nice, you should get evenly sized pages. The interesting part is that you can present the user with a selector which presents meaningful and useful data, AND is fast to compute, AND is fast to use. In this case, it would amount to "Select country, region, city", then display a list like this:

Smith, ...Albert
Smith, Albus...Bernard
...
Smith, William...
...
Von Braun...Von Schwarts
...

So Jeannette Smith would be easy to find, being in the link "Smith, Jean...John" for instance. If the aim is to quickly locate a particular record, I like javascript-powered autocompletion better; but for browsing, this pagination method is cool. Regards ! ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
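A rough sketch of the precomputation PFC describes, written as a plpgsql function (all table and column names are hypothetical, and it is written against the 8.x-era feature set, so no window functions):

CREATE TABLE page_breaks (
    page_no    serial PRIMARY KEY,
    country    text,
    region     text,
    city       text,
    last_name  text,
    first_name text
);

CREATE OR REPLACE FUNCTION refresh_page_breaks(page_size integer)
RETURNS integer AS '
DECLARE
    r     record;
    i     integer := 0;
    pages integer := 0;
BEGIN
    DELETE FROM page_breaks;
    -- Walk the key order once and keep every page_size-th key as a boundary.
    FOR r IN SELECT country, region, city, last_name, first_name
             FROM people
             ORDER BY country, region, city, last_name, first_name
    LOOP
        IF i % page_size = 0 THEN
            INSERT INTO page_breaks (country, region, city, last_name, first_name)
            VALUES (r.country, r.region, r.city, r.last_name, r.first_name);
            pages := pages + 1;
        END IF;
        i := i + 1;
    END LOOP;
    RETURN pages;
END;
' LANGUAGE plpgsql;

-- e.g. from cron, once a week:
-- SELECT refresh_page_breaks(100);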
Re: [PERFORM] Performance of views
Svenne Krap <[EMAIL PROTECTED]> writes: > The ways I have considered was : > 1) write a big query in hand (not preferred as it gets hard to manage) > 2) write layers of views (still not prefered as I still have to remember > to put on the right conditions everywhere) > 3) write layers of sql-functions (returning the right sets of rows from > the underlying tables) - which I prefer from a development angel .. it > gets very clean and I cant forget a parameter anywhere. #1 and #2 should behave pretty similarly, assuming that the "one big query" would have been structured the same way as the nest of views is. #3 unfortunately will pretty much suck, because there's no chance for cross-level optimization. There's been some discussion of inline-expanding SQL functions that return sets when they are called in FROM, which would make a SQL function that contains just a SELECT effectively equivalent to a view as far as the planner's powers of optimization go. No one's tried to make it happen yet though. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
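For concreteness, a hedged sketch of what such a set-returning SQL function (option #3, a kind of parameterized view) might look like. The names are invented, and as Tom notes the planner currently treats the call as a black box rather than inlining it:

CREATE FUNCTION facts_at(date) RETURNS SETOF facts AS '
    SELECT * FROM facts
    WHERE first_usable_date <= $1
      AND (last_usable_date >= $1 OR last_usable_date IS NULL);
' LANGUAGE sql STABLE;

-- Called in FROM like a view, but with a parameter:
SELECT f.*, a.some_attribute
FROM   facts_at('2005-09-30') f
JOIN   aux_data a ON a.fact_id = f.fact_id;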
Re: [PERFORM] Performance of views
Tom Lane wrote: There's been some discussion of inline-expanding SQL functions that return sets when they are called in FROM, which would make a SQL function that contains just a SELECT effectively equivalent to a view as far as the planner's powers of optimization go. No one's tried to make it happen yet though.

This is exactly what would be brilliant in my case. Use the functions as a kind of strict, parameterized view that in the planner (or wherever) gets replaced down to a simple (?!?!) SQL statement. This would IMHO be highly valuable for almost any kind of complex time-travel application (and surely dozens of other applications). And before anyone suggests it, I don't code C well enough (*cough* rusty) to try to do it myself. I would appreciate it if it went on the TODO for 8.2 though. (I might even be willing to sponsor some money (a single or perhaps two thousand US dollars) for getting it done, and release it immediately under the standard PostgreSQL license (BSD).) By the way, I also support the idea of a way to force a table into a PgSQL-managed cache, as discussed a while ago. Sometimes overall speed for the system is less important than the speed of a single query. I must also say that I am very impressed with the performance enhancements of 8.1 beta; the bitmap index scans are amazing! Good job, guys - PgSQL has come a long way from 7.0 (where I started) and the future looks bright ;) Svenne ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PERFORM] Update using primary key slow
The following update was captured in the database log and the elapsed time was 1058.956 ms. A later explain analyze shows total run time of 730 ms. Although isn't the actual time to update the row 183 ms. Where is the other 547 ms coming from? Updating the two secondary indexes?? Oct 27 08:09:48 ybcdrdwdb01 postgres[12315]: [145-1] LOG: duration: 1058.956 ms statement: UPDATE CONTRACT SET CUSTOMER_KEY = 143567 ,SOURCE_CODE_KEY = null ,PRIOR_CONTRACT = Oct 27 08:09:48 ybcdrdwdb01 postgres[12315]: [145-2] '265985' ,PRIOR_CONTRACT_ELITE = null ,CONTRACT_NEW = 'N' ,RENEWAL_CONTRACT = '1373990' ,RENEWAL_CONTRACT_ELITE = null Oct 27 08:09:48 ybcdrdwdb01 postgres[12315]: [145-3] ,CONTRACT_DROPPED = 'N' ,TOTAL_SALE_DOLLARS = 3492.00 ,TARGET_DOLLARS = 3576 ,ASSIGN_DOLLARS_OVERRIDE = null ,BOOK_KEY = 160 Oct 27 08:09:48 ybcdrdwdb01 postgres[12315]: [145-4] ,PUB_SEQUENCE = 25 ,DROP_DATE = null ,OUT_OF_BUSINESS = 'N' ,RENEWAL_SALESREP_KEY = 3639 ,SALESREP_KEY = 3639 ,NEW_CATEGORY = Oct 27 08:09:48 ybcdrdwdb01 postgres[12315]: [145-5] 'NEW_INSERT' ,PENDING_DELETE_DATE = null ,CLIENT_NAME = null ,DATE_SOLD = '2004-08-30' ,DATE_RECEIVED = '2004-09-03' Oct 27 08:09:48 ybcdrdwdb01 postgres[12315]: [145-6] ,DATE_ENTERED = '2004-09-07' ,DATE_SHELLED = null ,DATE_APPROVED = '2004-09-09' WHERE REGION_KEY = 14 AND CONTRACT_KEY = Oct 27 08:09:48 ybcdrdwdb01 postgres[12315]: [145-7] 1070135 The explain for this update is as follows. dw=# begin; BEGIN dw=# dw=# explain analyze UPDATE CONTRACT SET CUSTOMER_KEY = 143567, SOURCE_CODE_KEY = null , dw-# PRIOR_CONTRACT = '265985' ,PRIOR_CONTRACT_ELITE = null ,CONTRACT_NEW = 'N' ,RENEWAL_CONTRACT = '1373990' ,RENEWAL_CONTRACT_ELITE = null dw-# ,CONTRACT_DROPPED = 'N' ,TOTAL_SALE_DOLLARS = 3492.00 ,TARGET_DOLLARS = 3576 ,ASSIGN_DOLLARS_OVERRIDE = null ,BOOK_KEY = 160 dw-# ,PUB_SEQUENCE = 25 ,DROP_DATE = null ,OUT_OF_BUSINESS = 'N' ,RENEWAL_SALESREP_KEY = 3639 ,SALESREP_KEY = 3639 dw-# ,NEW_CATEGORY = 'NEW_INSERT' ,PENDING_DELETE_DATE = null ,CLIENT_NAME = null ,DATE_SOLD = '2004-08-30' ,DATE_RECEIVED = '2004-09-03' dw-# ,DATE_ENTERED = '2004-09-07' ,DATE_SHELLED = null ,DATE_APPROVED = '2004-09-09' dw-# WHERE REGION_KEY = 14 AND CONTRACT_KEY = 1070135; QUERY PLAN --- Index Scan using contract_pkey on contract (cost=0.00..10.61 rows=3 width=115) (actual time=0.181..0.183 rows=1 loops=1) Index Cond: ((contract_key = 1070135) AND (region_key = 14)) Total runtime: 0.730 ms (3 rows) dw=# rollback; ROLLBACK Here is the table and index definitions dw=# \d contract Table "ods.contract" Column |Type | Modifiers -+-+--- contract_key| integer | not null customer_key| integer | not null source_code_key | integer | prior_contract | character varying(20) | prior_contract_elite| character varying(20) | renewal_contract| character varying(20) | contract_dropped| character varying(1)| not null renewal_contract_elite | character varying(20) | total_sale_dollars | numeric(9,2)| not null assign_dollars_override | numeric(9,2)| target_dollars | numeric(9,2)| book_key| integer | not null pub_sequence| integer | drop_date | timestamp without time zone | out_of_business | character varying(1)| not null salesrep_key| integer | renewal_salesrep_key| integer | new_category| character varying(20) | region_key | integer | not null contract_new| character varying(1)| not null pending_delete_date | timestamp without time zone | client_name | character varying(150) | fuzzy_client_name | character varying(150) | last_update_date| timestamp without time zone | default now() date_sold | date| date_received 
| date| date_entered| date| date_shelled| date| date_approved | date| Indexes: "contract_pkey" primary key, btree (contract_key, region_key) "XIE1_Contract" btree (region_key, book_key, pub_sequence) "XIE2_Contract" btree (customer_key, region_key) The table contains approximately 5 million rows Postgres version is PostgreSQL 7.4.3 on i686-pc-linux-gnu, compile
Re: [PERFORM] Performance of views
Jan Wieck wrote: On 10/27/2005 7:29 AM, Richard Huxton wrote: Svenne Krap wrote: What do you mean exactly but "pushing conditions inside" ? If I have something like "SELECT * FROM complicated_view WHERE foo = 7" then the planner can look "inside" complicated_view and see where it can attach the condition "foo=7", rather than running the query and applying the condition at the end. Sorry, but the planner doesn't attach the condition anywhere. It is the rewriter that takes the actual query, replaces the views rangetable and expression entries with the actual underlying objects and adds the views condition with an AND to the queries condition. Simply example: Thanks for the correction Jan. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] how postgresql requests the computer resources
Sidar López Cruz wrote: Is there something that tells postgres to take the resorces from computer (RAM, HDD, SWAP on linux) as it need, not modifying variables on postgresql.conf and other operating system things? Ah, and how is it to know what to share with other processes? A days ago i am trying to show that postgres is better than mssql but when execute a simple query like: (1) select count(*) from Total runtime: 134552.325 ms (2) select count(*) from fotos where archivo not in (select archivo from Total runtime: 26747.236 ms (3) select count(1) from fotos f where not exists (select a.archivo from Total runtime: 89765.714 ms (4) SELECT count(*) Total runtime: 114893.116 ms WITH ANY OF THIS QUERIES MSSQL TAKES NOT MUCH OF 7 SECONDS In which case they make a bad choice for showing PostgreSQL is faster than MSSQL. Is this the only query you have, or are others giving you problems too? I think count(*) is about the weakest point in PG, but I don't think there'll be a general solution available soon. As I'm sure someone has mentioned, whatever else, PG needs to check the row for its visibility information. From the start of your email, you seem to suspect your configuration needs some work. Once you are happy that your settings in general are good, you can override some by issuing set statements before your query. For example: SET work_mem = 1; might well improve example #2 where you had a hash. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Update using primary key slow
Denis <[EMAIL PROTECTED]> writes: > The following update was captured in the database log and the elapsed time > was 1058.956 ms. A later explain analyze shows total run time of 730 ms. > Although isn't the actual time to update the row 183 ms. Where is the > other 547 ms coming from? Updating the two secondary indexes?? The 183 msec is the time needed to *fetch* the row, not the time to update it. So it could well be that the other time is just the time needed to update the table and indexes. If this seems slower than your hardware ought to be able to handle, I'd wonder about how recently the table has been vacuumed. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Update using primary key slow
[EMAIL PROTECTED] (Tom Lane) wrote in news:[EMAIL PROTECTED]: > Denis <[EMAIL PROTECTED]> writes: >> The following update was captured in the database log and the elapsed >> time was 1058.956 ms. A later explain analyze shows total run time >> of 730 ms. Although isn't the actual time to update the row 183 ms. >> Where is the other 547 ms coming from? Updating the two secondary >> indexes?? > > The 183 msec is the time needed to *fetch* the row, not the time to > update it. So it could well be that the other time is just the time > needed to update the table and indexes. If this seems slower than > your hardware ought to be able to handle, I'd wonder about how > recently the table has been vacuumed. > >regards, tom lane > > ---(end of > broadcast)--- TIP 1: if posting/reading > through Usenet, please send an appropriate >subscribe-nomail command to [EMAIL PROTECTED] so that >your message can get through to the mailing list cleanly > > There is a vacuumdb done at 6 AM and 5PM In addition this table is vacuumed at 2AM, 8AM, 10AM, 12PM, 2PM, and 4PM This is the vacuum from last night at 5PM INFO: vacuuming "ods.contract" INFO: index "XIE1_Contract" now contains 5105322 row versions in 27710 pages DETAIL: 2174 index row versions were removed. 893 index pages have been deleted, 893 are currently reusable. CPU 1.91s/1.58u sec elapsed 34.14 sec. INFO: index "XIE2_Contract" now contains 5105331 row versions in 21701 pages DETAIL: 2174 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 1.40s/1.42u sec elapsed 22.73 sec. INFO: index "contract_pkey" now contains 5105337 row versions in 21480 pages DETAIL: 2174 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 1.80s/1.52u sec elapsed 18.59 sec. INFO: "contract": removed 2174 row versions in 893 pages DETAIL: CPU 0.42s/0.08u sec elapsed 1.22 sec. INFO: "contract": found 2174 removable, 5105321 nonremovable row versions in 129154 pages DETAIL: 1357 dead row versions cannot be removed yet. There were 1967941 unused item pointers. 0 pages are entirely empty. CPU 11.38s/5.09u sec elapsed 85.48 sec. INFO: analyzing "ods.contract" INFO: "contract": 129154 pages, 3000 rows sampled, 5277622 estimated total rows Here is the latest vacuum today. INFO: vacuuming "ods.contract" INFO: index "XIE1_Contract" now contains 5106346 row versions in 28233 pages DETAIL: 64146 index row versions were removed. 706 index pages have been deleted, 669 are currently reusable. CPU 2.03s/2.33u sec elapsed 20.08 sec. INFO: index "XIE2_Contract" now contains 5106347 row versions in 21951 pages DETAIL: 64146 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 2.16s/3.39u sec elapsed 12.23 sec. INFO: index "contract_pkey" now contains 5106347 row versions in 21516 pages DETAIL: 64146 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 1.76s/2.47u sec elapsed 11.80 sec. INFO: "contract": removed 64146 row versions in 26115 pages DETAIL: CPU 1.94s/2.55u sec elapsed 7.78 sec. INFO: "contract": found 64146 removable, 5106307 nonremovable row versions in 129154 pages DETAIL: 890 dead row versions cannot be removed yet. There were 1905028 unused item pointers. 0 pages are entirely empty. CPU 14.83s/11.48u sec elapsed 60.96 sec. INFO: analyzing "ods.contract" INFO: "contract": 129154 pages, 3000 rows sampled, 5236929 estimated total rows I would think this should be very fast. I already described the CPU and memory. 
The disk is backed by an EMC DMX2000. This particular server has one physical volume group of 500 GB which is split over two logical volumes, one for $PGDATA and the other for PG_XLOG. This split was not really done for performance, since it comes from the same physical volume group, but more for space manageability. The physical volume group consists of 11 GB stripes from across the EMC SAN. So that would be about 50 stripes, which is really coming from dozens of backend disk drives. Typical I/O response times for these are 3-5 ms. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PERFORM] How much memory?
Is there a rule-of-thumb for determining the amount of system memory a database requires (other than "all you can afford")? ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Update using primary key slow
Denis <[EMAIL PROTECTED]> writes: > [EMAIL PROTECTED] (Tom Lane) wrote in > news:[EMAIL PROTECTED]: >> The 183 msec is the time needed to *fetch* the row, not the time to >> update it. So it could well be that the other time is just the time >> needed to update the table and indexes. If this seems slower than >> your hardware ought to be able to handle, I'd wonder about how >> recently the table has been vacuumed. > Here is the latest vacuum today. > INFO: "contract": removed 64146 row versions in 26115 pages > DETAIL: CPU 1.94s/2.55u sec elapsed 7.78 sec. > INFO: "contract": found 64146 removable, 5106307 nonremovable row > versions in 129154 pages > DETAIL: 890 dead row versions cannot be removed yet. > There were 1905028 unused item pointers. The "unused item pointers" number seems a bit high, but otherwise that looks pretty reasonable. Is it possible that the particular row you were updating has been updated quite a lot of times since the last vacuum? Or even quite a few times within a single transaction? The only thing I can think of that would explain such a slow fetch is if the code has to reject a bunch of recently-dead versions of the row. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] What gets cached?
Thank each of you for your replies. I'm just beginning to understand the scope of my opportunities. Someone (I apologize, I forgot who) recently posted this query: SELECT oid::regclass, reltuples, relpages FROM pg_class ORDER BY 3 DESC Though the application is a relatively low-volume TP system, it is structured a lot like a data warehouse with one primary table that everything else hangs off. What the query above shows is that my largest table, at 34 million rows, takes almost 1.4 million pages or 10+ Gb if my math is good. The same table has 14 indexes, totaling another 12Gb. All this is running on a box with 4Gb of memory. So what I believe I see happening is that almost every query is clearing out memory to load the particular index it needs. Hence my "first queries are the fastest" observation at the beginning of this thread. There are certainly design improvements to be done, but I've already started the process of getting the memory increased on our production db server. We are btw running 8.1 beta 3. ""Steinar H. Gunderson"" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > On Mon, Oct 24, 2005 at 11:09:55AM -0400, Alex Turner wrote: >> Just to play devils advocate here for as second, but if we have an >> algorithm >> that is substational better than just plain old LRU, which is what I >> believe >> the kernel is going to use to cache pages (I'm no kernel hacker), then >> why >> don't we apply that and have a significantly larger page cache a la >> Oracle? > > There have (AFAIK) been reports of setting huge amounts of shared_buffers > (close to the total amount of RAM) performing much better in 8.1 than in > earlier versions, so this might actually be okay these days. > > I haven't heard of anybody reporting increase setting such values, though. > > /* Steinar */ > -- > Homepage: http://www.sesse.net/ > > > ---(end of broadcast)--- > TIP 2: Don't 'kill -9' the postmaster > ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
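For anyone checking the arithmetic behind those figures: pages are 8 KB each, so 1.4 million pages is roughly 1,400,000 * 8 KB, or about 11 GB for the heap alone, with the 14 indexes adding their own relpages on top. A query along these lines (standard pg_class columns) turns page counts into approximate sizes directly:

SELECT oid::regclass,
       relpages,
       relpages * 8.0 / 1024 / 1024 AS approx_gb
FROM   pg_class
ORDER  BY relpages DESC
LIMIT  20;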
Re: [PERFORM] Reasons and drawbacks for unused item pointers (was: Update using primary key slow)
Tom Lane <[EMAIL PROTECTED]> writes: > Denis <[EMAIL PROTECTED]> writes: >> There were 1905028 unused item pointers. > The "unused item pointers" number seems a bit high, but otherwise that > looks pretty reasonable. > > Is it possible that the particular row you were updating has been > updated quite a lot of times since the last vacuum? Or even quite > a few times within a single transaction? What causes these "unused item pointers" and what impact do they have on performance? If I understood your last posting correctly, more than one update on a single row between two vacuums would result in one or more "unused item pointers". Does this slow down the vacuum process and/or other processes? So far I have not been able to find an answer as to what this number implies. Regards Martin ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Reasons and drawbacks for unused item pointers (was: Update using primary key slow)
Martin Lesser <[EMAIL PROTECTED]> writes: > What causes this "unused item pointers" and which impact do they have > regarding performance? Those are item pointer slots that were once used but aren't used at the moment. VACUUM leaves an empty slot behind when it removes a dead tuple, and the slot is then available for re-use next time a tuple is created on that page. See http://developer.postgresql.org/docs/postgres/storage-page-layout.html The direct performance impact is really pretty minimal (and none at all on indexscans, AFAIR). The reason Denis' number drew my attention was that it implied that the table had gone un-vacuumed for awhile at some time in the past. His stats were showing about 64000 tuples deleted per vacuum pass, which would have created 64000 unused item pointers --- but in a steady-state situation those would be eaten up again by the time of the next vacuum. To have 1905028 unused pointers in a table with only 5106307 live entries suggests that at some point there were 1.9 million (or so) dead but not-yet-vacuumed tuples, which suggests insufficient vacuuming. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] how postgresql requests the computer resources
Richard Huxton wrote: With any of these queries MSSQL takes no more than 7 seconds. In which case they make a bad choice for showing PostgreSQL is faster than MSSQL. Is this the only query you have, or are others giving you problems too? I think count(*) is about the weakest point in PG, but I don't think there'll be a general solution available soon. As I'm sure someone has mentioned, whatever else, PG needs to check the row for its visibility information. From the start of your email, you seem to suspect your configuration needs some work. Once you are happy that your settings in general are good, you can override some by issuing SET statements before your query. For example: SET work_mem = 1; might well improve example #2 where you had a hash. -- Richard Huxton Archonet Ltd

Someone had suggested keeping a vector table with +1 and -1 for row insertion and deletion, and then running a cron job to sum the vectors and update a table so that you could select from that table to get the row count. Perhaps some sort of SUM() on a column function. Since this seems like a reasonable approach (or perhaps there is yet another, better mechanism), couldn't someone add this sort of functionality to PostgreSQL behind the scenes? -Mike ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
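A minimal sketch of that delta-table idea (names are hypothetical and concurrency details are glossed over): triggers record +1/-1 rows, reading the count is a cheap sum, and a periodic job collapses the deltas back to one row.

CREATE TABLE fotos_rowcount (delta bigint NOT NULL);

CREATE FUNCTION fotos_rowcount_trig() RETURNS trigger AS '
BEGIN
    IF TG_OP = ''INSERT'' THEN
        INSERT INTO fotos_rowcount VALUES (1);
    ELSE
        INSERT INTO fotos_rowcount VALUES (-1);
    END IF;
    RETURN NULL;
END;
' LANGUAGE plpgsql;

CREATE TRIGGER fotos_rowcount_t AFTER INSERT OR DELETE ON fotos
    FOR EACH ROW EXECUTE PROCEDURE fotos_rowcount_trig();

-- Seed once, then the cheap replacement for count(*) is:
INSERT INTO fotos_rowcount SELECT count(*) FROM fotos;
SELECT sum(delta) FROM fotos_rowcount;

-- Periodic job (cron): collapse the deltas so the table stays small.
BEGIN;
LOCK TABLE fotos_rowcount IN EXCLUSIVE MODE;
CREATE TEMP TABLE rc AS SELECT sum(delta) AS total FROM fotos_rowcount;
DELETE FROM fotos_rowcount;
INSERT INTO fotos_rowcount SELECT total FROM rc;
DROP TABLE rc;
COMMIT;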
Re: [PERFORM] Reasons and drawbacks for unused item pointers
Tom Lane <[EMAIL PROTECTED]> writes: > Martin Lesser <[EMAIL PROTECTED]> writes: >> What causes these "unused item pointers" and what impact do they have >> on performance? > The direct performance impact is really pretty minimal (and none at > all on indexscans, AFAIR). The reason Denis' number drew my attention > was that it implied that the table had gone un-vacuumed for awhile at > some time in the past. [...] To have 1905028 unused pointers in a > table with only 5106307 live entries suggests that at some point there > were 1.9 million (or so) dead but not-yet-vacuumed tuples, which > suggests insufficient vacuuming. Does each update of a single row result in an "unused item pointer"? I.e. if I update one row 10 times between VACUUMing the table, is the result 10 unused pointers? Some rows in some of my tables are updated much more frequently than others, so I'm not sure whether the number of unused pointers implies that I should VACUUM more often than every 24 hours. Martin ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] How much memory?
Databases basically come in 4 sizes:
1= The entire DB fits into memory.
2= The performance-critical table(s) fit(s) into memory.
3= The indexes of the performance-critical table(s) fit into memory.
4= Neither the performance-critical tables nor their indexes fit into memory.
Performance decreases (exponentially), and development + maintenance cost/difficulty/pain increases (exponentially), as you go down the list. While it is often not possible to be in class "1" above, do everything you can to be in at least class "3" and do everything you can to avoid class "4". At ~$75-$150 per GB as of this post, RAM is the cheapest investment you can make in a high-performance, low-hassle DBMS. IWill's and Tyan's 16 DIMM slot mainboards are worth every penny. ron -Original Message- From: PostgreSQL <[EMAIL PROTECTED]> Sent: Oct 27, 2005 3:31 PM To: pgsql-performance@postgresql.org Subject: [PERFORM] How much memory? Is there a rule-of-thumb for determining the amount of system memory a database requires (other than "all you can afford")? ---(end of broadcast)--- TIP 6: explain analyze is your friend
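A hedged way to see which of these classes a database falls into is to compare table and index sizes (relpages are 8 KB each) with available RAM, for example:

SELECT relkind,
       sum(relpages) * 8 / 1024 AS total_mb
FROM   pg_class
WHERE  relkind IN ('r', 'i')   -- 'r' = tables, 'i' = indexes
GROUP  BY relkind;

Restricting the same query to the performance-critical tables and their indexes gives the numbers relevant to classes 2 and 3.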
Re: [PERFORM] What gets cached?
Did the patch that allows multiple seqscans to piggyback on each other make it into 8.1? It might help in this situation. BTW, if a query requires loading more than a few percent of an index PostgreSQL will usually go with a sequential scan instead. You should check explain/explain analyze on your queries and see what's actually happening. If you've got stats turned on you can also look at pg_stat_user_indexes to get a better idea of what indexes are and aren't being used. On Thu, Oct 27, 2005 at 03:41:10PM -0500, PostgreSQL wrote: > Thank each of you for your replies. I'm just beginning to understand the > scope of my opportunities. > > Someone (I apologize, I forgot who) recently posted this query: > SELECT oid::regclass, reltuples, relpages > FROM pg_class > ORDER BY 3 DESC > > Though the application is a relatively low-volume TP system, it is > structured a lot like a data warehouse with one primary table that > everything else hangs off. What the query above shows is that my largest > table, at 34 million rows, takes almost 1.4 million pages or 10+ Gb if my > math is good. The same table has 14 indexes, totaling another 12Gb. All > this is running on a box with 4Gb of memory. > > So what I believe I see happening is that almost every query is clearing out > memory to load the particular index it needs. Hence my "first queries are > the fastest" observation at the beginning of this thread. > > There are certainly design improvements to be done, but I've already started > the process of getting the memory increased on our production db server. We > are btw running 8.1 beta 3. > > ""Steinar H. Gunderson"" <[EMAIL PROTECTED]> wrote in message > news:[EMAIL PROTECTED] > > On Mon, Oct 24, 2005 at 11:09:55AM -0400, Alex Turner wrote: > >> Just to play devils advocate here for as second, but if we have an > >> algorithm > >> that is substational better than just plain old LRU, which is what I > >> believe > >> the kernel is going to use to cache pages (I'm no kernel hacker), then > >> why > >> don't we apply that and have a significantly larger page cache a la > >> Oracle? > > > > There have (AFAIK) been reports of setting huge amounts of shared_buffers > > (close to the total amount of RAM) performing much better in 8.1 than in > > earlier versions, so this might actually be okay these days. > > > > I haven't heard of anybody reporting increase setting such values, though. > > > > /* Steinar */ > > -- > > Homepage: http://www.sesse.net/ > > > > > > ---(end of broadcast)--- > > TIP 2: Don't 'kill -9' the postmaster > > > > > > ---(end of broadcast)--- > TIP 1: if posting/reading through Usenet, please send an appropriate >subscribe-nomail command to [EMAIL PROTECTED] so that your >message can get through to the mailing list cleanly > -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
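For reference, a hedged example of what to look at in pg_stat_user_indexes (the stats collector has to be enabled for these counters to be populated):

SELECT relname, indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM   pg_stat_user_indexes
ORDER  BY idx_scan;

Indexes whose idx_scan stays at or near zero are candidates for dropping, which would also shrink the 12 GB of index space competing for the 4 GB of RAM mentioned above.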
Re: [PERFORM] how postgresql requests the computer resources
On Thu, Oct 27, 2005 at 03:58:55PM -0600, Michael Best wrote: > Richard Huxton wrote: > >>WITH ANY OF THIS QUERIES MSSQL TAKES NOT MUCH OF 7 SECONDS > > > > > >In which case they make a bad choice for showing PostgreSQL is faster > >than MSSQL. Is this the only query you have, or are others giving you > >problems too? > > > >I think count(*) is about the weakest point in PG, but I don't think > >there'll be a general solution available soon. As I'm sure someone has > >mentioned, whatever else, PG needs to check the row for its visibility > >information. > > > > From the start of your email, you seem to suspect your configuration > >needs some work. Once you are happy that your settings in general are > >good, you can override some by issuing set statements before your query. > >For example: > >SET work_mem = 1; > >might well improve example #2 where you had a hash. > > > >-- > > Richard Huxton > > Archonet Ltd > > Someone had suggested keeping a vector table with +1 and -1 for row > insertion and deletion and then running a cron to sum the vectors and > update a table so that you could select from that table to get the row > count. Perhaps some sort of SUM() on a column function. > > Since this seems like a reasonable approach (or perhaps there may be yet > another better mechanism), cannot someone add this sort of functionality > to Postgresql to do behind the scenes? There's all kinds of things that could be added; the issue is ascertaining what the performance trade-offs are (there's no such thing as a free lunch) and if the additional code complexity is worth it. Note that your suggestion probably wouldn't work in this case because the user isn't doing a simple SELECT count(*) FROM table;. I'd bet that MSSQL is using index covering to answer his queries so quickly, something that currently just isn't possible with PostgreSQL. But if you search the -hackers archives, you'll find a discussion on adding limited heap tuple visibility information to indexes. That would allow for partial index covering in many cases, which would probably be a huge win for the queries the user was asking about. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] How much memory?
On Thu, Oct 27, 2005 at 06:39:33PM -0400, Ron Peacetree wrote: > Databases basically come in 4 sizes: > > 1= The entire DB fits into memory. > 2= The performance critical table(s) fit(s) into memory > 3= The indexes of the performance critical table(s) fit into memory. > 4= Neither the performance critical tables nor their indexes fit into memory. > > Performance decreases (exponentially), and development + maintenance > cost/difficulty/pain increases (exponentially), as you go down the list. > > While it is often not possible to be in class "1" above, do everything you > can to be in at least class "3" and do everything you can to avoid class "4". > > At ~$75-$150 per GB as of this post, RAM is the cheapest investment you can > make in a high perfomance, low hassle DBMS. IWill's and Tyan's 16 DIMM slot > mainboards are worth every penny. And note that your next investment after RAM should be better disk IO. More CPUs *generally* don't buy you much (if anything). My rule of thumb: the only time your database should be CPU-bound is if you've got a bad design*. *NOTE: before everyone goes off about query parallelism and big in-memory sorts and what-not, keep in mind I said "rule of thumb". :) -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster