date:20050810

Re: SPAM [SQL] Faster count(*)?

2005-08-10 Thread Andrew Sullivan

On Tue, Aug 09, 2005 at 10:49:14PM -0400, Tom Lane wrote:
> 
> Current best practice is to run the explain and parse out the "rows"
> figure using a perl (or axe-of-choice) regexp, though we could be
> persuaded to supply a simpler API if there's enough demand for it.

FWIW, this was another one of those things I must have heard a dozen
times at OSCON.  I suspect the simpler API would be popular,
particularly since post-8.0 the estimates are more reliable than they
used to be.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The whole tendency of modern prose is away from concreteness.
--George Orwell

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: SPAM [SQL] Faster count(*)?

2005-08-10 Thread Michael Fuhr

On Tue, Aug 09, 2005 at 09:29:13PM -0600, Michael Fuhr wrote:
> On Tue, Aug 09, 2005 at 10:49:14PM -0400, Tom Lane wrote:
> > Current best practice is to run the explain and parse out the "rows"
> > figure using a perl (or axe-of-choice) regexp, though we could be
> > persuaded to supply a simpler API if there's enough demand for it.
> 
> Somebody else requested a row-count-estimate function a couple of
> weeks ago:
> 
> http://archives.postgresql.org/pgsql-admin/2005-07/msg00256.php

Here's a simple example that parses EXPLAIN output.  It should work
in 8.0.2 and later:

CREATE FUNCTION count_estimate(query text) RETURNS integer AS $$
DECLARE
rec   record;
rows  integer;
BEGIN
FOR rec IN EXECUTE 'EXPLAIN ' || query LOOP
rows := substring(rec."QUERY PLAN" FROM ' rows=([[:digit:]]+)');
EXIT WHEN rows IS NOT NULL;
END LOOP;

RETURN rows;
END;
$$ LANGUAGE plpgsql VOLATILE STRICT;

CREATE TABLE foo (r double precision);
INSERT INTO foo SELECT random() FROM generate_series(1, 1000);
ANALYZE foo;

SELECT count_estimate('SELECT * FROM foo WHERE r < 0.1');
 count_estimate 

 97
(1 row)

EXPLAIN SELECT * FROM foo WHERE r < 0.1;
 QUERY PLAN  
-
 Seq Scan on foo  (cost=0.00..17.50 rows=97 width=8)
   Filter: (r < 0.1::double precision)
(2 rows)

-- 
Michael Fuhr

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [SQL] Faster count(*)?

2005-08-10 Thread Richard Huxton


Owen Jacobson wrote:

Salve.

I understand from various web searches and so on that PostgreSQL's MVCC
mechanism makes it very hard to use indices or table metadata to optimise
count(*).  Is there a better way to guess the "approximate size" of a table?


Plenty of good answers on how to estimate table-size, but it sounds like 
you just want to run your maintenance function "every so often".


1. Create a sequence "my_table_tracker_seq"
2. On insert, call nextval(my_table_tracker_seq)
3. If value modulo 1000 = 0, run the maintenance routine

That's about as fast as you can get, and might meet your needs. Of 
course you'll need to be more complex if you insert multiple rows at a time.


If you do a lot of deletion on the table and want to take that into 
account, have a second sequence you increment on deletion and subtract 
the one from the other.


Not always accurate enough, but it is quick.
--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: SPAM [SQL] Faster count(*)?

2005-08-10 Thread Owen Jacobson

Tom Lane wrote:

> [EMAIL PROTECTED] writes:
> > I believe running count(*) means fulltable scan, and there's no way
> > to do it without it. But what about some "intermediate" table, with
> > the necessary counts?
>
> There's a fairly complete discussion in the PG list archives of a
> reasonably-efficient scheme for maintaining such counts via triggers.
> It wasn't efficient enough that we were willing to impose the overhead
> on every application ... but if you really NEED a fast count(*) you
> could implement it.  I'd like to see someone actually do it and put
> up working code on pgfoundry; AFAIK it's only a paper design so far.
>
> If you only want a very-approximate count, the best bet is to rely on
> the planner's estimates, eg
>
> regression=# explain select * from tenk1;
>  QUERY PLAN
> -
>  Seq Scan on tenk1  (cost=0.00..458.00 rows=1 width=244)
> ^
>
> Current best practice is to run the explain and parse out the "rows"
> figure using a perl (or axe-of-choice) regexp, though we could be
> persuaded to supply a simpler API if there's enough demand for it.

Yick.  Ok, given all of that, I've rewritten the trigger in question to fire
on different, indexable criteria (difference between "earliest" and "latest"
rows in the table).

Thanks, everyone.

Owen


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [SQL] Breakdown results by month

2005-08-10 Thread Bruno Wolff III

On Tue, Aug 02, 2005 at 14:34:46 -0400,
  Henry Ortega <[EMAIL PROTECTED]> wrote:
> I have the ff table:
> 
> id   |total| effective|end_date
> john  6  01-01-200502-28-2005
> john  8  03-01-200506-30-2005
> 
> How can I return:
> id   |total| effective|end_date
> john  6  01-01-200501-31-2005
> john  6  02-01-200502-28-2005
> john  8  03-01-200503-31-2005
> john  8  04-01-200504-30-2005
> john  8  05-01-200505-31-2005
> john  8  06-01-200506-30-2005
> 
> Any help would be appreciated. Thanks

One approach would be to generate the monthly dates using generate_series
and some date math and join those rows to your ff table where the generated
dates are covered by the effective and end dates in the ff table. You may
need some more trickery if some of the dates aren't on month boundries.

---(end of broadcast)---
TIP 6: explain analyze is your friend

[SQL] How to alias table columns in result?

2005-08-10 Thread nori

Hi,

If I have:
1.) table car with columns index and name 
2.) table driver with columns index, name and car_index

and query:
SELECT d.*, c.* FROM driver as d LEFT OUTER JOIN car AS c ON
d.car_index=c.index;

How can I get results that have distinct names for columns (ex.
d.name, d.index, c.name, c.index,...)?

tnx
boris

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [SQL] How to alias table columns in result?

2005-08-10 Thread Mischa Sandberg

Quoting nori <[EMAIL PROTECTED]>:

> Hi,
> 
> If I have:
> 1.) table car with columns index and name 
> 2.) table driver with columns index, name and car_index
> 
> and query:
> SELECT d.*, c.* FROM driver as d LEFT OUTER JOIN car AS c ON
> d.car_index=c.index;
> 
> How can I get results that have distinct names for columns (ex.
> d.name, d.index, c.name, c.index,...)?

Here's where you have to get explicit; d.* won't work.
If you want to have names with (.) in them, they have to be quoted, too.
Me, I'd use "d_name, d_index, c_name, ..."

SELECT d.name as "d.name",
   d.index as "d.index",
   c.name as "c.name",
   ...


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: SPAM [SQL] Faster count(*)?

Re: SPAM [SQL] Faster count(*)?

Re: [SQL] Faster count(*)?

Re: SPAM [SQL] Faster count(*)?

Re: [SQL] Breakdown results by month

[SQL] How to alias table columns in result?

Re: [SQL] How to alias table columns in result?

7 matches

Site Navigation

Mail list logo

Footer information

Re: **SPAM** [SQL] Faster count(*)?

Re: **SPAM** [SQL] Faster count(*)?

Re: [SQL] Faster count(*)?

Re: **SPAM** [SQL] Faster count(*)?

Re: [SQL] Breakdown results by month

[SQL] How to alias table columns in result?

Re: [SQL] How to alias table columns in result?

7 matches

Mail list logo

Re: SPAM [SQL] Faster count(*)?

Re: SPAM [SQL] Faster count(*)?

Re: SPAM [SQL] Faster count(*)?