Re: **SPAM** [SQL] Faster count(*)?
On Tue, Aug 09, 2005 at 10:49:14PM -0400, Tom Lane wrote: > > Current best practice is to run the explain and parse out the "rows" > figure using a perl (or axe-of-choice) regexp, though we could be > persuaded to supply a simpler API if there's enough demand for it. FWIW, this was another one of those things I must have heard a dozen times at OSCON. I suspect the simpler API would be popular, particularly since post-8.0 the estimates are more reliable than they used to be. A -- Andrew Sullivan | [EMAIL PROTECTED] The whole tendency of modern prose is away from concreteness. --George Orwell ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: **SPAM** [SQL] Faster count(*)?
On Tue, Aug 09, 2005 at 09:29:13PM -0600, Michael Fuhr wrote:
> On Tue, Aug 09, 2005 at 10:49:14PM -0400, Tom Lane wrote:
> > Current best practice is to run the explain and parse out the "rows"
> > figure using a perl (or axe-of-choice) regexp, though we could be
> > persuaded to supply a simpler API if there's enough demand for it.
>
> Somebody else requested a row-count-estimate function a couple of
> weeks ago:
>
> http://archives.postgresql.org/pgsql-admin/2005-07/msg00256.php
Here's a simple example that parses EXPLAIN output. It should work
in 8.0.2 and later:
CREATE FUNCTION count_estimate(query text) RETURNS integer AS $$
DECLARE
rec record;
rows integer;
BEGIN
FOR rec IN EXECUTE 'EXPLAIN ' || query LOOP
rows := substring(rec."QUERY PLAN" FROM ' rows=([[:digit:]]+)');
EXIT WHEN rows IS NOT NULL;
END LOOP;
RETURN rows;
END;
$$ LANGUAGE plpgsql VOLATILE STRICT;
CREATE TABLE foo (r double precision);
INSERT INTO foo SELECT random() FROM generate_series(1, 1000);
ANALYZE foo;
SELECT count_estimate('SELECT * FROM foo WHERE r < 0.1');
count_estimate
97
(1 row)
EXPLAIN SELECT * FROM foo WHERE r < 0.1;
QUERY PLAN
-
Seq Scan on foo (cost=0.00..17.50 rows=97 width=8)
Filter: (r < 0.1::double precision)
(2 rows)
--
Michael Fuhr
---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
Re: [SQL] Faster count(*)?
Owen Jacobson wrote: Salve. I understand from various web searches and so on that PostgreSQL's MVCC mechanism makes it very hard to use indices or table metadata to optimise count(*). Is there a better way to guess the "approximate size" of a table? Plenty of good answers on how to estimate table-size, but it sounds like you just want to run your maintenance function "every so often". 1. Create a sequence "my_table_tracker_seq" 2. On insert, call nextval(my_table_tracker_seq) 3. If value modulo 1000 = 0, run the maintenance routine That's about as fast as you can get, and might meet your needs. Of course you'll need to be more complex if you insert multiple rows at a time. If you do a lot of deletion on the table and want to take that into account, have a second sequence you increment on deletion and subtract the one from the other. Not always accurate enough, but it is quick. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: **SPAM** [SQL] Faster count(*)?
Tom Lane wrote: > [EMAIL PROTECTED] writes: > > I believe running count(*) means fulltable scan, and there's no way > > to do it without it. But what about some "intermediate" table, with > > the necessary counts? > > There's a fairly complete discussion in the PG list archives of a > reasonably-efficient scheme for maintaining such counts via triggers. > It wasn't efficient enough that we were willing to impose the overhead > on every application ... but if you really NEED a fast count(*) you > could implement it. I'd like to see someone actually do it and put > up working code on pgfoundry; AFAIK it's only a paper design so far. > > If you only want a very-approximate count, the best bet is to rely on > the planner's estimates, eg > > regression=# explain select * from tenk1; > QUERY PLAN > - > Seq Scan on tenk1 (cost=0.00..458.00 rows=1 width=244) > ^ > > Current best practice is to run the explain and parse out the "rows" > figure using a perl (or axe-of-choice) regexp, though we could be > persuaded to supply a simpler API if there's enough demand for it. Yick. Ok, given all of that, I've rewritten the trigger in question to fire on different, indexable criteria (difference between "earliest" and "latest" rows in the table). Thanks, everyone. Owen ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [SQL] Breakdown results by month
On Tue, Aug 02, 2005 at 14:34:46 -0400, Henry Ortega <[EMAIL PROTECTED]> wrote: > I have the ff table: > > id |total| effective|end_date > john 6 01-01-200502-28-2005 > john 8 03-01-200506-30-2005 > > How can I return: > id |total| effective|end_date > john 6 01-01-200501-31-2005 > john 6 02-01-200502-28-2005 > john 8 03-01-200503-31-2005 > john 8 04-01-200504-30-2005 > john 8 05-01-200505-31-2005 > john 8 06-01-200506-30-2005 > > Any help would be appreciated. Thanks One approach would be to generate the monthly dates using generate_series and some date math and join those rows to your ff table where the generated dates are covered by the effective and end dates in the ff table. You may need some more trickery if some of the dates aren't on month boundries. ---(end of broadcast)--- TIP 6: explain analyze is your friend
[SQL] How to alias table columns in result?
Hi, If I have: 1.) table car with columns index and name 2.) table driver with columns index, name and car_index and query: SELECT d.*, c.* FROM driver as d LEFT OUTER JOIN car AS c ON d.car_index=c.index; How can I get results that have distinct names for columns (ex. d.name, d.index, c.name, c.index,...)? tnx boris ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [SQL] How to alias table columns in result?
Quoting nori <[EMAIL PROTECTED]>: > Hi, > > If I have: > 1.) table car with columns index and name > 2.) table driver with columns index, name and car_index > > and query: > SELECT d.*, c.* FROM driver as d LEFT OUTER JOIN car AS c ON > d.car_index=c.index; > > How can I get results that have distinct names for columns (ex. > d.name, d.index, c.name, c.index,...)? Here's where you have to get explicit; d.* won't work. If you want to have names with (.) in them, they have to be quoted, too. Me, I'd use "d_name, d_index, c_name, ..." SELECT d.name as "d.name", d.index as "d.index", c.name as "c.name", ... ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
