Re: [PERFORM] Help with EXPLAIN ANALYZE runtimes

2005-01-09 Thread Guenzl, Martin
In case anyone is interested, I finally found what I believe to be the cause
... Or at least, I found the solution, and think I understand why.

Having read "Section 10.3 Controlling the Planner with Explicit JOIN
Clauses" (http://postgresql.org/docs/7.3/interactive/explicit-joins.html), I
modified the query to use INNER JOINS with the table datastatus, instead of
the implicit cross joins.

The INNER JOINS now seem to reduce the choices the planner has to make.  The
clue was the high number of tables involved, and the repeated reference to
the same table.

All's well that ends well ... with or without the Karnak headgear.

Martin

-Original Message-
From: Guenzl, Martin [mailto:[EMAIL PROTECTED] 
Sent: Sunday, 9 January 2005 3:45 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Help with EXPLAIN ANALYZE runtimes

Hi,
 
I am a recent convert to Postgresql, and am trying to tune a very slow query
across ten tables all with only a few rows at this stage (<20), and was
looking for some help to get me out of a dead-end.
 
It runs very slowly both on a hosted Postgresql 7.3.4 server running on
FreeBSD UNIX box, and also on a Postgresql 8.0.0.0-rc2 server running on a
Win XP box.
 
On the latter, the EXPLAIN ANALYZE returned what I thought was a strange
result - here is the excerpt ...

(Start)
 
SQL: Query Results
QUERY PLAN
Unique  (cost=7.16..7.32 rows=3 width=188) (actual time=51.000..51.000
rows=16 loops=1)
  ->  Sort  (cost=7.16..7.16 rows=3 width=188) (actual time=51.000..51.000
rows=16 loops=1)
Sort Key: am.id_assessment, c.id_claim, c.nm_claim, p.id_provider,
p.nm_title, p.nm_first, p.nm_last, ad.id_address, ad.nm_address_1,
ad.nm_address_2, ad.nm_address_3, ad.nm_suburb, ad.nm_city,
s.nm_state_short, ad.nm_postcode, am.dt_assessment, am.dt_booking,
ast.nm_assessmentstatus, ast.b_offer_report, asn.id_assessmentstatus,
asn.nm_assessmentstatus
->  Merge Join  (cost=4.60..7.13 rows=3 width=188) (actual
time=41.000..51.000 rows=16 loops=1)
  Merge Cond: ("outer".id_datastatus = "inner".id_datastatus)
  Join Filter: (("inner".id_claim = "outer".id_claim) AND
("inner".id_assessment = "outer".id_assessment))

:
:
:

  ->  Index Scan using address_pkey on
address ad  (cost=0.00..14.14 rows=376 width=76) (actual time=10.000..10.000
rows=82 loops=1)
  ->  Sort  (cost=1.05..1.06 rows=3
width=36) (actual time=0.000..0.000 rows=3 loops=1)
Sort Key: am.id_address
->  Seq Scan on assessment am
(cost=0.00..1.03 rows=3 width=36) (actual time=0.000..0.000 rows=3 loops=1)
Total runtime: 51.000 ms

44 row(s)

Total runtime: 11,452.979 ms

(End)

It's the bit at the bottom that throws me - I can't work out why one Total
runtime says 51ms, and yet the next Total runtime would be 11,452ms.  (I'm
assuming that the clue to getting the query time down is to solve this
puzzle.)

I've done vacuum analyze on all tables, but that didn't help.  This query
stands out among others as being very slow.

Any ideas or suggestions?  

Thanks in advance,

Martin



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match




---(end of broadcast)---
TIP 8: explain analyze is your friend


[PERFORM] PostgreSQL vs. Oracle vs. Microsoft

2005-01-09 Thread Randolf Richardson
I'm looking for recent performance statistics on PostgreSQL vs. Oracle 
vs. Microsoft SQL Server.  Recently someone has been trying to convince my 
client to switch from SyBASE to Microsoft SQL Server (they originally wanted 
to go with Oracle but have since fallen in love with Microsoft).  All this 
time I've been recommending PostgreSQL for cost and stability (my own testing 
has shown it to be better at handling abnormal shutdowns and using fewer 
system resources) in addition to true cross-platform compatibility.

If I can show my client some statistics that PostgreSQL outperforms 
these (I'm more concerned about it beating Oracle because I know that 
Microsoft's stuff is always slower, but I need the information anyway to 
protect my client from falling victim to a 'sales job'), then PostgreSQL will 
be the solution of choice as the client has always believed that they need a 
high-performance solution.

I've already convinced them on the usual price, cross-platform 
compatibility, open source, long history, etc. points, and I've been assured 
that if the performance is the same or better than Oracle's and Microsoft's 
solutions that PostgreSQL is what they'll choose.

Thanks in advance.

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Some Performance Advice Needed

2005-01-09 Thread Randolf Richardson
"[EMAIL PROTECTED] ("Joshua D. Drake")" wrote in pgsql.performance:

> Jeff wrote:
> 
>>
>> On Dec 23, 2004, at 9:27 AM, Alex wrote:
>>
>>
>>> Running hdparm reported
>>> A) 920mb/s   (SCSI 10k)
>>> B) 270mb/s   (SCSI 10k)
>>> C) 1750mb/s  (IDE  7.2k)
>>
>>
>>
>> IDE disks lie about write completion (This can be disabled on some 
>> drives) whereas SCSI drives wait for the data to actually be written 
>> before they report success.  It is quite
>> easy to corrupt a PG (Or most any db really) on an IDE drive.  Check 
>> the archives for more info.
> 
> Do we have any real info on this? Specifically which drives? Is SATA the 
> same way? What about SATA-II?
> I am not saying it isn't true (I know it is) but this is a blanket 
> statement that may or may not be
> true with newer tech.

The name hasn't changed, but don't let that give you the wrong 
impression because SCSI continues to improve.  I only use SCSI drives in 
all my servers, and that's because they always seem to outperform SATA and 
IDE when there's a multi-user[1] requirement (of course, the choice of OS
[2] is an important factor here too).

Disk fragmentation also plays a role, but can actually become a 
hinderance when in a multi-user environment.  I find that the caching 
algorithm in the OS that I usually choose [2] actually performs extremely 
well when more users are accessing data on volumes where the data is 
fragmented.  I'm told that this is very similar in the Unix environment as 
well.  Defragmentation makes more sense in a single-user environment 
because there are generally a very small number of files being loaded at 
one time, and so a user can benefit hugely from defragmentation.

Here's an interesting article (it comes complete with anonymous non-
logical emotion-based reader comments too):

SCSI vs. IDE: Which is really faster?

http://hardware.devchannel.org/hardwarechannel/03/10/20/1953249.shtml?
tid=20&tid=38&tid=49

[1] A somewhat busy web and/or eMail server certainly counts as a multi-
user requirement.  Put a database on it where the data isn't being accessed 
sequentially, and that can certainly meet the requirements too.
[2] Nearly all my servers run Novell NetWare.

-- 
Randolf Richardson, pro-active spam fighter - [EMAIL PROTECTED]
Vancouver, British Columbia, Canada

Sending eMail to other SMTP servers is a privilege.

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[PERFORM] Null integer columns

2005-01-09 Thread Jeffrey Tenny
I have an integer column that is not needed for some rows in the table
(whether it is necessary is a factor of other row attributes, and it 
isn't worth putting in a separate table).

What are the performance tradeoffs (storage space, query speed) of using 
NULL versus a sentinel integer value?

Not that it matters, but in the event where the column values matter,
the numberic value is a foreign key.  Advice on that welcome too.
Thanks!
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] PostgreSQL vs. Oracle vs. Microsoft

2005-01-09 Thread Frank Wiles
On Thu, 6 Jan 2005 19:01:38 + (UTC)
Randolf Richardson <[EMAIL PROTECTED]> wrote:

>   I'm looking for recent performance statistics on PostgreSQL vs.
>   Oracle 
> vs. Microsoft SQL Server.  Recently someone has been trying to
> convince my client to switch from SyBASE to Microsoft SQL Server (they
> originally wanted to go with Oracle but have since fallen in love with
> Microsoft).  All this time I've been recommending PostgreSQL for cost
> and stability (my own testing has shown it to be better at handling
> abnormal shutdowns and using fewer system resources) in addition to
> true cross-platform compatibility.
> 
>   If I can show my client some statistics that PostgreSQL
>   outperforms 
> these (I'm more concerned about it beating Oracle because I know that 
> Microsoft's stuff is always slower, but I need the information anyway
> to protect my client from falling victim to a 'sales job'), then
> PostgreSQL will be the solution of choice as the client has always
> believed that they need a high-performance solution.
> 
>   I've already convinced them on the usual price, cross-platform 
> compatibility, open source, long history, etc. points, and I've been
> assured that if the performance is the same or better than Oracle's
> and Microsoft's solutions that PostgreSQL is what they'll choose.

  While this doesn't exactly answer your question, I use this little
  tidbit of information when "selling" people on PostgreSQL.  PostgreSQL
  was chosen over Oracle as the database to handle all of the .org TLDs
  information.  While I don't believe the company that won was chosen 
  solely because they used PostgreSQL vs Oracle ( vs anything else ),
  it does go to show that PostgreSQL can be used in a large scale
  environment.  

  Another tidbit you can use in this particular case: I was involved
  in moving www.ljworld.com, www.lawrence.com, and www.kusports.com from
  a Sybase backend to a PostgreSQL backend back in 2000-2001.  We got
  roughly a 200% speed improvement at that time and PostgreSQL has only
  improved since then.  I would be more than happy to elaborate on this
  migration off list if you would like.  kusports.com gets a TON of 
  hits especially during "March Madness" and PostgreSQL has never been
  an issue in the performance of the site. 

 -
   Frank Wiles <[EMAIL PROTECTED]>
   http://www.wiles.org
 -


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] PostgreSQL vs. Oracle vs. Microsoft

2005-01-09 Thread mudfoot
Quoting Randolf Richardson <[EMAIL PROTECTED]>:

>   I'm looking for recent performance statistics on PostgreSQL vs. Oracle
> 
> vs. Microsoft SQL Server.  Recently someone has been trying to convince my 

I don't know anything about your customer's requirements other than that they
have a DB currently and somebody(ies) is(are) trying to get them to switch to
another.

I don't think you'll find meaningful numbers unless you do your own benchmarks.

 DB performance is very largely determined by how the application functions,
hardware, OS and the DBA's familiarity with the platform.  I would suspect that
for any given workload on relatively similar hardware that just about any of the
DB's you mention would perform similarly if tuned appropriately.

> client to switch from SyBASE to Microsoft SQL Server (they originally wanted
> 
> to go with Oracle but have since fallen in love with Microsoft).  All this 
> time I've been recommending PostgreSQL for cost and stability (my own testing
> 
> has shown it to be better at handling abnormal shutdowns and using fewer 
> system resources) in addition to true cross-platform compatibility.

Right for the customer?  How about "Don't fix it if it ain't broke"?  Replacing
a DB backend isn't always trivial (understatement).  I suppose if their
application is very simple and uses few if any proprietary features of Sybase
then changing the DB would be simple.  That depends heavily on the application.
In general, though, you probably shouldn't rip and replace DB platforms unless
there's a very good strategic reason.

I don't know about MSSQL, but I know that, if managed properly, Sybase and
Oracle can be pretty rock-solid and high performing.  If *you* have found FooDB
to be the most stable and highest performing, then that probably means that
FooDB is the one you're most familiar with rather than FooDB being the best in
all circumstances.  PostgreSQL is great.  I love it.  In the right hands and
under the right circumstances, it is the best DB.  So is Sybase.  And Oracle. 
And MSSQL.

> 
>   If I can show my client some statistics that PostgreSQL outperforms 
> these (I'm more concerned about it beating Oracle because I know that 
> Microsoft's stuff is always slower, but I need the information anyway to 
> protect my client from falling victim to a 'sales job'), then PostgreSQL will
> 
> be the solution of choice as the client has always believed that they need a
> 
> high-performance solution.
> 

Unless there's a really compelling reason to switch, optimizing what they
already have is probably the best thing for them.  They've already paid for it.
 They've already written their own application and have some familiarity with
managing the DB.  According to Sybase, Sybase is the fastest thing going. :-)
Which is probably pretty close to the truth if the application and DB are tuned
appropriately.

>   I've already convinced them on the usual price, cross-platform 
> compatibility, open source, long history, etc. points, and I've been assured
> 
> that if the performance is the same or better than Oracle's and Microsoft's
> 
> solutions that PostgreSQL is what they'll choose.

Are you telling me that they're willing to pay $40K per CPU for Oracle if it
performs 1% better than PostgreSQL, which is $0?  Not to mention throw away
Sybase, which is a highly scalable platform in and of itself.

The best DB platform is what they currently have, regardless of what they have,
unless there is a very compelling reason to switch.

> 
>   Thanks in advance.
> 
> ---(end of broadcast)---
> TIP 8: explain analyze is your friend
> 



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] PostgreSQL vs. Oracle vs. Microsoft

2005-01-09 Thread Gary Doades
Randolf Richardson wrote:
	I'm looking for recent performance statistics on PostgreSQL vs. Oracle 
vs. Microsoft SQL Server.  Recently someone has been trying to convince my 
client to switch from SyBASE to Microsoft SQL Server (they originally wanted 
to go with Oracle but have since fallen in love with Microsoft).  All this 
time I've been recommending PostgreSQL for cost and stability (my own testing 
has shown it to be better at handling abnormal shutdowns and using fewer 
system resources) in addition to true cross-platform compatibility.

I'm not sure that you are going to get a simple answer to this one. It
really depends on what you are trying to do. The only way you will know
for sure what the performance of PostgreSQL is is to try it with samples
of your common queries, updates etc.
I have recently ported a moderately complex database from MS SQLServer
to Postgres with reasonable success. 70% selects, 20% updates, 10%
insert/deletes. I had to do a fair bit of work to get the best
performance out of Postgres, but most of the SQL has as good or better
performance then SQLServer. There are still areas where SQLServer
outperforms Postgres. For me these tend to be the larger SQL Statements
with correlated subqueries. SQLServer tends to optimise them better a
lot of the time. Updates tend to be a fair bit faster on SQLServer too,
this may be MS taking advantage of Windows specific optimisations in the
filesystem.
I did give Oracle a try out of curiosity. I never considered it
seriously because of the cost. The majority of my SQL was *slower* under
Oracle than SQLServer. I spent some time with it and did get good
performance, but it took a *lot* of work tuning to Oracle specific ways
of doing things.
My Summary:
SQLServer: A good all round database, fast, stable. Moderately expensive
to buy, cheap and easy to work with and program for (on Windows)
PostgreSQL: A good all rounder, fast most of the time, stable. Free to
acquire, more expensive to work with and program for. Client drivers may
be problematic depending on platform and programming language. Needs
more work than SQLServer to get the best out of it. Improving all the
time and worth serious consideration.
Oracle: A bit of a monstrosity. Can be very fast with a lot of work,
can't comment on stability but I guess it's pretty good. Very expensive
to acquire and work with. Well supported server and clients.
Cheers,
Gary.
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster