Re: [PERFORM] Benchmark
Tom Lane wrote: Great Bridge did essentially that years ago, but I think we only got away with it because we didn't say which DBs "Commercial Database A" and "Commercial Database B" actually were. Even off the record, we were only allowed to tell people that the commercial DBs were Oracle and SQL Server ... but not which was which. IMHO clues like: "What versions of the databases did you use? - PostgreSQL - 7.0 release version - Proprietary 1 - 8.1.5 - Proprietary 2 - 7.0 - MySQL - 3.22.32 - Interbase - 6.0 " and "PostgreSQL" and "Proprietary 1" was running "red hat linux 6.1" and "Proprietary 2" was running Windows NT server 4 - service pack 4" in articles like this one: http://www.xperts.com/news/press1.htm helped some people narrow it down a bit. :) ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Benchmark
Christopher Browne <[EMAIL PROTECTED]> writes: > After all, the cost of a computer system to run the transactions is > likely to be comprised of some combination of software licenses and > hardware costs. Even if the software is free, the hardware isn't. And labour costs. -- greg ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Benchmark
Dear, We are using PostgreSQL for 4 Years now, one can say it is a blessing to maintain. Our previous database was number one (;-), it was much harder to maintain so labor is a pro for PostgreSQL ... Kind Regards Patrick Meylemans IT Manager WTCM-CRIF Celestijnenlaan 300C 3001 Helerlee At 11:34 13/02/2005 -0500, Greg Stark wrote: Christopher Browne <[EMAIL PROTECTED]> writes: > After all, the cost of a computer system to run the transactions is > likely to be comprised of some combination of software licenses and > hardware costs. Even if the software is free, the hardware isn't. And labour costs. -- greg ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] estimated rows vs. actual rows
Hi, in the #postgresql-es channel someone shows me this: pgsql-7.4.5 + postgis --- begin context --- CREATE TABLE calles ( gid int4 NOT NULL DEFAULT nextval('public.callesstgo_gid_seq'::text), nombre varchar, inicio int4, termino int4, comuna varchar, ciudad varchar, region numeric, pais varchar, the_geom geometry, id_comuna numeric, CONSTRAINT callesstgo_pkey PRIMARY KEY (gid), CONSTRAINT enforce_geotype_the_geom CHECK (geometrytype(the_geom) = 'MULTILINESTRING'::text OR the_geom IS NULL), CONSTRAINT enforce_srid_the_geom CHECK (srid(the_geom) = -1) ) WITH OIDS; CREATE INDEX idx_region_comunas ON calles USING btree (id_comuna, region); select count(*) from calles; 143902 --- end context --- Ok . here is the problem (BTW, the database has been analyzed just before this query was execured) explain analyze select * from calles where id_comuna = 92 and region=13; QUERY PLAN Seq Scan on calles (cost=0.00..7876.53 rows=2610 width=279) (actual time=182.590..454.195 rows=4612 loops=1) Filter: ((id_comuna = 92::numeric) AND (region = 13::numeric)) Total runtime: 456.876 ms Why is this query using a seq scan rather than a index scan? i notice the diff between the estimated rows and actual rows (almost 2000). Can this affect the query plan? i think this is a problem of statistics, am i right? if so, what can be done? regards, Jaime Casanova ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] estimated rows vs. actual rows
Jaime, > Why is this query using a seq scan rather than a index scan? Because it thinks a seq scan will be faster. > i notice > the diff between the estimated rows and actual rows (almost 2000). Yes, ANALYZE, and possibly increasing the column stats, should help that. > Can this affect the query plan? i think this is a problem of > statistics, am i right? if so, what can be done? Well, if the estimate was accurate, PG would be even *more* likely to use a seq scan (more rows). I think maybe you should establish whether a seq scan actually *is* faster? Perhaps do SET enable_seqscan = false and then re-run the query a few times? -- Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] estimated rows vs. actual rows
On Sun, 13 Feb 2005 13:41:09 -0800, Josh Berkus wrote: > Jaime, > > > Why is this query using a seq scan rather than a index scan? > > Because it thinks a seq scan will be faster. > I will suggest him to probe with seq scans disabled. But, IMHO, if the table has 143902 and it thinks will retrieve 2610 (almost 1.81% of the total). it won't be faster with an index? i know, i will suggest him to probe to be sure. just an opinion. regards, Jaime Casanova ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] estimated rows vs. actual rows
Jaime Casanova <[EMAIL PROTECTED]> writes: > But, IMHO, if the table has 143902 and it thinks will retrieve 2610 > (almost 1.81% of the total). it won't be faster with an index? That's almost one row in fifty. We don't know how wide the table is, but it's certainly possible that there are order-of-a-hundred rows on each page; in which case the indexscan is likely to hit every page. Twice. Not in sequence. Only if the selected rows are pretty well clustered in a small part of the table is this going to be a win over a seqscan. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] estimated rows vs. actual rows
Jaime Casanova wrote: But, IMHO, if the table has 143902 and it thinks will retrieve 2610 (almost 1.81% of the total). it won't be faster with an index? Depends on how those 2610 rows are distributed amongst the 143902. The worst case scenario is each one of them in its own page. In that case you have to read 2610 *pages*, which is probably a significant percentage of the table. You can find out this information from the pg_stats view (particularly the correlation column). Mark ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings