Re: [PERFORM] Help with EXPLAIN ANALYZE runtimes
In case anyone is interested, I finally found what I believe to be the cause ... Or at least, I found the solution, and think I understand why. Having read "Section 10.3 Controlling the Planner with Explicit JOIN Clauses" (http://postgresql.org/docs/7.3/interactive/explicit-joins.html), I modified the query to use INNER JOINS with the table datastatus, instead of the implicit cross joins. The INNER JOINS now seem to reduce the choices the planner has to make. The clue was the high number of tables involved, and the repeated reference to the same table. All's well that ends well ... with or without the Karnak headgear. Martin -Original Message- From: Guenzl, Martin [mailto:[EMAIL PROTECTED] Sent: Sunday, 9 January 2005 3:45 PM To: pgsql-performance@postgresql.org Subject: [PERFORM] Help with EXPLAIN ANALYZE runtimes Hi, I am a recent convert to Postgresql, and am trying to tune a very slow query across ten tables all with only a few rows at this stage (<20), and was looking for some help to get me out of a dead-end. It runs very slowly both on a hosted Postgresql 7.3.4 server running on FreeBSD UNIX box, and also on a Postgresql 8.0.0.0-rc2 server running on a Win XP box. On the latter, the EXPLAIN ANALYZE returned what I thought was a strange result - here is the excerpt ... (Start) SQL: Query Results QUERY PLAN Unique (cost=7.16..7.32 rows=3 width=188) (actual time=51.000..51.000 rows=16 loops=1) -> Sort (cost=7.16..7.16 rows=3 width=188) (actual time=51.000..51.000 rows=16 loops=1) Sort Key: am.id_assessment, c.id_claim, c.nm_claim, p.id_provider, p.nm_title, p.nm_first, p.nm_last, ad.id_address, ad.nm_address_1, ad.nm_address_2, ad.nm_address_3, ad.nm_suburb, ad.nm_city, s.nm_state_short, ad.nm_postcode, am.dt_assessment, am.dt_booking, ast.nm_assessmentstatus, ast.b_offer_report, asn.id_assessmentstatus, asn.nm_assessmentstatus -> Merge Join (cost=4.60..7.13 rows=3 width=188) (actual time=41.000..51.000 rows=16 loops=1) Merge Cond: ("outer".id_datastatus = "inner".id_datastatus) Join Filter: (("inner".id_claim = "outer".id_claim) AND ("inner".id_assessment = "outer".id_assessment)) : : : -> Index Scan using address_pkey on address ad (cost=0.00..14.14 rows=376 width=76) (actual time=10.000..10.000 rows=82 loops=1) -> Sort (cost=1.05..1.06 rows=3 width=36) (actual time=0.000..0.000 rows=3 loops=1) Sort Key: am.id_address -> Seq Scan on assessment am (cost=0.00..1.03 rows=3 width=36) (actual time=0.000..0.000 rows=3 loops=1) Total runtime: 51.000 ms 44 row(s) Total runtime: 11,452.979 ms (End) It's the bit at the bottom that throws me - I can't work out why one Total runtime says 51ms, and yet the next Total runtime would be 11,452ms. (I'm assuming that the clue to getting the query time down is to solve this puzzle.) I've done vacuum analyze on all tables, but that didn't help. This query stands out among others as being very slow. Any ideas or suggestions? Thanks in advance, Martin ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match ---(end of broadcast)--- TIP 8: explain analyze is your friend
[PERFORM] PostgreSQL vs. Oracle vs. Microsoft
I'm looking for recent performance statistics on PostgreSQL vs. Oracle vs. Microsoft SQL Server. Recently someone has been trying to convince my client to switch from SyBASE to Microsoft SQL Server (they originally wanted to go with Oracle but have since fallen in love with Microsoft). All this time I've been recommending PostgreSQL for cost and stability (my own testing has shown it to be better at handling abnormal shutdowns and using fewer system resources) in addition to true cross-platform compatibility. If I can show my client some statistics that PostgreSQL outperforms these (I'm more concerned about it beating Oracle because I know that Microsoft's stuff is always slower, but I need the information anyway to protect my client from falling victim to a 'sales job'), then PostgreSQL will be the solution of choice as the client has always believed that they need a high-performance solution. I've already convinced them on the usual price, cross-platform compatibility, open source, long history, etc. points, and I've been assured that if the performance is the same or better than Oracle's and Microsoft's solutions that PostgreSQL is what they'll choose. Thanks in advance. ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] Some Performance Advice Needed
"[EMAIL PROTECTED] ("Joshua D. Drake")" wrote in pgsql.performance: > Jeff wrote: > >> >> On Dec 23, 2004, at 9:27 AM, Alex wrote: >> >> >>> Running hdparm reported >>> A) 920mb/s (SCSI 10k) >>> B) 270mb/s (SCSI 10k) >>> C) 1750mb/s (IDE 7.2k) >> >> >> >> IDE disks lie about write completion (This can be disabled on some >> drives) whereas SCSI drives wait for the data to actually be written >> before they report success. It is quite >> easy to corrupt a PG (Or most any db really) on an IDE drive. Check >> the archives for more info. > > Do we have any real info on this? Specifically which drives? Is SATA the > same way? What about SATA-II? > I am not saying it isn't true (I know it is) but this is a blanket > statement that may or may not be > true with newer tech. The name hasn't changed, but don't let that give you the wrong impression because SCSI continues to improve. I only use SCSI drives in all my servers, and that's because they always seem to outperform SATA and IDE when there's a multi-user[1] requirement (of course, the choice of OS [2] is an important factor here too). Disk fragmentation also plays a role, but can actually become a hinderance when in a multi-user environment. I find that the caching algorithm in the OS that I usually choose [2] actually performs extremely well when more users are accessing data on volumes where the data is fragmented. I'm told that this is very similar in the Unix environment as well. Defragmentation makes more sense in a single-user environment because there are generally a very small number of files being loaded at one time, and so a user can benefit hugely from defragmentation. Here's an interesting article (it comes complete with anonymous non- logical emotion-based reader comments too): SCSI vs. IDE: Which is really faster? http://hardware.devchannel.org/hardwarechannel/03/10/20/1953249.shtml? tid=20&tid=38&tid=49 [1] A somewhat busy web and/or eMail server certainly counts as a multi- user requirement. Put a database on it where the data isn't being accessed sequentially, and that can certainly meet the requirements too. [2] Nearly all my servers run Novell NetWare. -- Randolf Richardson, pro-active spam fighter - [EMAIL PROTECTED] Vancouver, British Columbia, Canada Sending eMail to other SMTP servers is a privilege. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
[PERFORM] Null integer columns
I have an integer column that is not needed for some rows in the table (whether it is necessary is a factor of other row attributes, and it isn't worth putting in a separate table). What are the performance tradeoffs (storage space, query speed) of using NULL versus a sentinel integer value? Not that it matters, but in the event where the column values matter, the numberic value is a foreign key. Advice on that welcome too. Thanks! ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] PostgreSQL vs. Oracle vs. Microsoft
On Thu, 6 Jan 2005 19:01:38 + (UTC) Randolf Richardson <[EMAIL PROTECTED]> wrote: > I'm looking for recent performance statistics on PostgreSQL vs. > Oracle > vs. Microsoft SQL Server. Recently someone has been trying to > convince my client to switch from SyBASE to Microsoft SQL Server (they > originally wanted to go with Oracle but have since fallen in love with > Microsoft). All this time I've been recommending PostgreSQL for cost > and stability (my own testing has shown it to be better at handling > abnormal shutdowns and using fewer system resources) in addition to > true cross-platform compatibility. > > If I can show my client some statistics that PostgreSQL > outperforms > these (I'm more concerned about it beating Oracle because I know that > Microsoft's stuff is always slower, but I need the information anyway > to protect my client from falling victim to a 'sales job'), then > PostgreSQL will be the solution of choice as the client has always > believed that they need a high-performance solution. > > I've already convinced them on the usual price, cross-platform > compatibility, open source, long history, etc. points, and I've been > assured that if the performance is the same or better than Oracle's > and Microsoft's solutions that PostgreSQL is what they'll choose. While this doesn't exactly answer your question, I use this little tidbit of information when "selling" people on PostgreSQL. PostgreSQL was chosen over Oracle as the database to handle all of the .org TLDs information. While I don't believe the company that won was chosen solely because they used PostgreSQL vs Oracle ( vs anything else ), it does go to show that PostgreSQL can be used in a large scale environment. Another tidbit you can use in this particular case: I was involved in moving www.ljworld.com, www.lawrence.com, and www.kusports.com from a Sybase backend to a PostgreSQL backend back in 2000-2001. We got roughly a 200% speed improvement at that time and PostgreSQL has only improved since then. I would be more than happy to elaborate on this migration off list if you would like. kusports.com gets a TON of hits especially during "March Madness" and PostgreSQL has never been an issue in the performance of the site. - Frank Wiles <[EMAIL PROTECTED]> http://www.wiles.org - ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] PostgreSQL vs. Oracle vs. Microsoft
Quoting Randolf Richardson <[EMAIL PROTECTED]>: > I'm looking for recent performance statistics on PostgreSQL vs. Oracle > > vs. Microsoft SQL Server. Recently someone has been trying to convince my I don't know anything about your customer's requirements other than that they have a DB currently and somebody(ies) is(are) trying to get them to switch to another. I don't think you'll find meaningful numbers unless you do your own benchmarks. DB performance is very largely determined by how the application functions, hardware, OS and the DBA's familiarity with the platform. I would suspect that for any given workload on relatively similar hardware that just about any of the DB's you mention would perform similarly if tuned appropriately. > client to switch from SyBASE to Microsoft SQL Server (they originally wanted > > to go with Oracle but have since fallen in love with Microsoft). All this > time I've been recommending PostgreSQL for cost and stability (my own testing > > has shown it to be better at handling abnormal shutdowns and using fewer > system resources) in addition to true cross-platform compatibility. Right for the customer? How about "Don't fix it if it ain't broke"? Replacing a DB backend isn't always trivial (understatement). I suppose if their application is very simple and uses few if any proprietary features of Sybase then changing the DB would be simple. That depends heavily on the application. In general, though, you probably shouldn't rip and replace DB platforms unless there's a very good strategic reason. I don't know about MSSQL, but I know that, if managed properly, Sybase and Oracle can be pretty rock-solid and high performing. If *you* have found FooDB to be the most stable and highest performing, then that probably means that FooDB is the one you're most familiar with rather than FooDB being the best in all circumstances. PostgreSQL is great. I love it. In the right hands and under the right circumstances, it is the best DB. So is Sybase. And Oracle. And MSSQL. > > If I can show my client some statistics that PostgreSQL outperforms > these (I'm more concerned about it beating Oracle because I know that > Microsoft's stuff is always slower, but I need the information anyway to > protect my client from falling victim to a 'sales job'), then PostgreSQL will > > be the solution of choice as the client has always believed that they need a > > high-performance solution. > Unless there's a really compelling reason to switch, optimizing what they already have is probably the best thing for them. They've already paid for it. They've already written their own application and have some familiarity with managing the DB. According to Sybase, Sybase is the fastest thing going. :-) Which is probably pretty close to the truth if the application and DB are tuned appropriately. > I've already convinced them on the usual price, cross-platform > compatibility, open source, long history, etc. points, and I've been assured > > that if the performance is the same or better than Oracle's and Microsoft's > > solutions that PostgreSQL is what they'll choose. Are you telling me that they're willing to pay $40K per CPU for Oracle if it performs 1% better than PostgreSQL, which is $0? Not to mention throw away Sybase, which is a highly scalable platform in and of itself. The best DB platform is what they currently have, regardless of what they have, unless there is a very compelling reason to switch. > > Thanks in advance. > > ---(end of broadcast)--- > TIP 8: explain analyze is your friend > ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] PostgreSQL vs. Oracle vs. Microsoft
Randolf Richardson wrote: I'm looking for recent performance statistics on PostgreSQL vs. Oracle vs. Microsoft SQL Server. Recently someone has been trying to convince my client to switch from SyBASE to Microsoft SQL Server (they originally wanted to go with Oracle but have since fallen in love with Microsoft). All this time I've been recommending PostgreSQL for cost and stability (my own testing has shown it to be better at handling abnormal shutdowns and using fewer system resources) in addition to true cross-platform compatibility. I'm not sure that you are going to get a simple answer to this one. It really depends on what you are trying to do. The only way you will know for sure what the performance of PostgreSQL is is to try it with samples of your common queries, updates etc. I have recently ported a moderately complex database from MS SQLServer to Postgres with reasonable success. 70% selects, 20% updates, 10% insert/deletes. I had to do a fair bit of work to get the best performance out of Postgres, but most of the SQL has as good or better performance then SQLServer. There are still areas where SQLServer outperforms Postgres. For me these tend to be the larger SQL Statements with correlated subqueries. SQLServer tends to optimise them better a lot of the time. Updates tend to be a fair bit faster on SQLServer too, this may be MS taking advantage of Windows specific optimisations in the filesystem. I did give Oracle a try out of curiosity. I never considered it seriously because of the cost. The majority of my SQL was *slower* under Oracle than SQLServer. I spent some time with it and did get good performance, but it took a *lot* of work tuning to Oracle specific ways of doing things. My Summary: SQLServer: A good all round database, fast, stable. Moderately expensive to buy, cheap and easy to work with and program for (on Windows) PostgreSQL: A good all rounder, fast most of the time, stable. Free to acquire, more expensive to work with and program for. Client drivers may be problematic depending on platform and programming language. Needs more work than SQLServer to get the best out of it. Improving all the time and worth serious consideration. Oracle: A bit of a monstrosity. Can be very fast with a lot of work, can't comment on stability but I guess it's pretty good. Very expensive to acquire and work with. Well supported server and clients. Cheers, Gary. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster