Re: [PERFORM] Are folks running 8.4 in production environments? and 8.4 and slon 1.2?

2009-10-13 Thread Scott Marlowe
On Mon, Oct 12, 2009 at 1:06 PM, Tory M Blue  wrote:
> Any issues, has it baked long enough, is it time for us 8.3 folks to deal
> with the pain and upgrade?

I am running 8.4.1 for my stats and search databases, and it's working fine.

> Anymore updates regarding 8.4 and slon 1.2 as well, since I usually
> build/upgrade both at the same time.

I don't think 1.2 supports 8.4 just yet, and 2.0.3  or so is still not
stable enough for production (I had major unexplained outages with it)
so for now, no 8.4 with slony.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] index on two tables or Howto speedup max/aggregate-function

2009-10-13 Thread Jacques Caron

Hi,

CREATE INDEX mail_id_sent_idx ON mail(id,sent)

should do the trick? Of course you can only replace one of the two 
scans by an index scan since there are no other conditions...


Jacques.

At 09:59 13/10/2009, Michael Schwipps wrote:

Hi,

I want to select the last contact of person via mail.
My sample database is build with the following shell-commands

| createdb -U postgres test2
| psql -U postgres test2 < mail_db.sql
| mailtest.sh | psql -U postgres

I call to get the answer

| SELECT address, max(sent) from mail inner join
| tomail on (mail.id=tomail.mail) group by address;

The result is ok, but it's to slow.
The query plan, see below,  tells that there two seq scans.
Howto transforms them into index scans?

postgres ignores simple indexes on column sent.
An Index on two tables is not possible (if I understand the manual
correctly).

Any other idea howto speed up?

Ciao

Michael

===

test2=# explain analyze SELECT address, max(sent) from mail inner join
tomail on (mail.id=tomail.mail) group by address;
  QUERY PLAN
---
 HashAggregate  (cost=36337.00..36964.32 rows=50186 width=20) (actual
time=3562.136..3610.238 rows=5 loops=1)
   ->  Hash Join  (cost=14191.00..33837.00 rows=50 width=20) 
(actual time=1043.537..2856.933 rows=50 loops=1)

 Hash Cond: (tomail.mail = mail.id)
 ->  Seq Scan on tomail  (cost=0.00..8396.00 rows=50 
width=20) (actual time=0.014..230.264 rows=50 loops=1)
 ->  Hash  (cost=7941.00..7941.00 rows=50 width=8) 
(actual time=1042.996..1042.996 rows=50 loops=1)
   ->  Seq Scan on mail  (cost=0.00..7941.00 
rows=50 width=8) (actual time=0.018..362.101 rows=50 loops=1)

 Total runtime: 3629.449 ms
(7 rows)




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] index on two tables or Howto speedup max/aggregate-function

2009-10-13 Thread Grzegorz Jaśkiewicz
On Tue, Oct 13, 2009 at 9:59 AM, Michael Schwipps wrote:

> Hi,
>
> I want to select the last contact of person via mail.
> My sample database is build with the following shell-commands
>
> | createdb -U postgres test2
> | psql -U postgres test2 < mail_db.sql
> | mailtest.sh | psql -U postgres
>
> I call to get the answer
>
> | SELECT address, max(sent) from mail inner join
> | tomail on (mail.id=tomail.mail) group by address;
>
> you are missing vacuumdb -z test2
after mailtest.sh ..




-- 
GJ


[PERFORM] index on two tables or Howto speedup max/aggregate-function

2009-10-13 Thread Michael Schwipps
Hi,

I want to select the last contact of person via mail.
My sample database is build with the following shell-commands 

| createdb -U postgres test2
| psql -U postgres test2 < mail_db.sql
| mailtest.sh | psql -U postgres

I call to get the answer

| SELECT address, max(sent) from mail inner join
| tomail on (mail.id=tomail.mail) group by address;

The result is ok, but it's to slow.
The query plan, see below,  tells that there two seq scans.
Howto transforms them into index scans?

postgres ignores simple indexes on column sent.
An Index on two tables is not possible (if I understand the manual
correctly).

Any other idea howto speed up?

Ciao

Michael

===

test2=# explain analyze SELECT address, max(sent) from mail inner join
tomail on (mail.id=tomail.mail) group by address;
  QUERY PLAN
---
 HashAggregate  (cost=36337.00..36964.32 rows=50186 width=20) (actual
time=3562.136..3610.238 rows=5 loops=1)
   ->  Hash Join  (cost=14191.00..33837.00 rows=50 width=20) (actual 
time=1043.537..2856.933 rows=50 loops=1)
 Hash Cond: (tomail.mail = mail.id)
 ->  Seq Scan on tomail  (cost=0.00..8396.00 rows=50 width=20) 
(actual time=0.014..230.264 rows=50 loops=1)
 ->  Hash  (cost=7941.00..7941.00 rows=50 width=8) (actual 
time=1042.996..1042.996 rows=50 loops=1)
   ->  Seq Scan on mail  (cost=0.00..7941.00 rows=50 width=8) 
(actual time=0.018..362.101 rows=50 loops=1)
 Total runtime: 3629.449 ms
(7 rows)


CREATE TABLE mail (
id serial primary key,
sent date,
subject character varying(256)
);


CREATE TABLE tomail (
id serial primary key,
mail integer references mail,
address character varying(256)
);



mailtest.sh
Description: Bourne shell script

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Are folks running 8.4 in production environments? and 8.4 and slon 1.2?

2009-10-13 Thread Kenneth Marshall
On Tue, Oct 13, 2009 at 01:03:10AM -0600, Scott Marlowe wrote:
> On Mon, Oct 12, 2009 at 1:06 PM, Tory M Blue  wrote:
> > Any issues, has it baked long enough, is it time for us 8.3 folks to deal
> > with the pain and upgrade?
> 
> I am running 8.4.1 for my stats and search databases, and it's working fine.
> 
> > Anymore updates regarding 8.4 and slon 1.2 as well, since I usually
> > build/upgrade both at the same time.
> 
> I don't think 1.2 supports 8.4 just yet, and 2.0.3  or so is still not
> stable enough for production (I had major unexplained outages with it)
> so for now, no 8.4 with slony.
> 
slony-1.2.17-rc2 works fine with version 8.4 in my limited testing.
I have not been able to get replication to work reliably with any
current release of slony-2.x. There was a recent comment that the
latest version in CVS has the 2.x bug fixed but I have not had a
chance to try.

Regards,
Ken


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Getting a random row

2009-10-13 Thread Shaul Dar
Hi,

I am running performance simulation against a DB. I want to randomly pull
different records from a large table. However the table has no columns that
hold sequential integer values (1..MAX), i.e. the columns all have "holes"
(due to earlier filtering). Also PG does not have a concept of an
auto-increment pseudo-column like Oracle's "rownum". Any suggestions?

Thanks,

-- Shaul


Re: [PERFORM] Getting a random row

2009-10-13 Thread Grzegorz Jaśkiewicz
On Tue, Oct 13, 2009 at 4:17 PM, Shaul Dar  wrote:

> Hi,
>
> I am running performance simulation against a DB. I want to randomly pull
> different records from a large table. However the table has no columns that
> hold sequential integer values (1..MAX), i.e. the columns all have "holes"
> (due to earlier filtering).
>
what do yo umean ? you can restrict range of integer column (or any other
type) with constraints, for instance CHECK foo( a between 1 and 100);



> Also PG does not have a concept of an auto-increment pseudo-column like
> Oracle's "rownum". Any suggestions?
>
not true - it has sequences, and pseudo type serial. Rtfm!.



-- 
GJ


Re: [PERFORM] Getting a random row

2009-10-13 Thread Thomas Kellerer

Shaul Dar, 13.10.2009 17:17:

Also PG does not have a concept of an auto-increment pseudo-column
like Oracle's "rownum". Any suggestions?


Yes it does (at least 8.4)

SELECT row_number() over(), the_other_columns...
FROM your_table

So you could do something like:

SELECT * 
FROM (
 SELECT row_number() over() as rownum, 
the_other_columns...

 FROM your_table
) t 
WHERE t.rownum = a_random_integer_value_lower_than_rowcount;


Thomas




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Getting a random row

2009-10-13 Thread Shaul Dar
Sorry, I guess I wasn't clear.
I have an existing table in my DB, and it doesn't have a column with serial
values (actually it did originally, but due to later deletions of about 2/3
of the rows the column now has "holes"). I realize I could add a new serial
column, but prefer not to change table + the new column would also become
nonconsecutive after further deletions. The nice thing about Oracle's
"rownum" is that it' a pseudo-column", not a real one, and AFAIK is always
valid.

Suggestions?

-- Shaul

2009/10/13 Grzegorz Jaśkiewicz 

>
>
> On Tue, Oct 13, 2009 at 4:17 PM, Shaul Dar  wrote:
>
>> Hi,
>>
>> I am running performance simulation against a DB. I want to randomly pull
>> different records from a large table. However the table has no columns that
>> hold sequential integer values (1..MAX), i.e. the columns all have "holes"
>> (due to earlier filtering).
>>
> what do yo umean ? you can restrict range of integer column (or any other
> type) with constraints, for instance CHECK foo( a between 1 and 100);
>
>
>> Also PG does not have a concept of an auto-increment pseudo-column like
>> Oracle's "rownum". Any suggestions?
>>
> not true - it has sequences, and pseudo type serial. Rtfm!.
>
>
>
> --
> GJ
>


Re: [PERFORM] Getting a random row

2009-10-13 Thread Michael Glaesemann


On Oct 13, 2009, at 11:19 , Grzegorz Jaśkiewicz wrote:


On Tue, Oct 13, 2009 at 4:17 PM, Shaul Dar  wrote:


Also PG does not have a concept of an auto-increment pseudo-column  
like

Oracle's "rownum". Any suggestions?


not true - it has sequences, and pseudo type serial. Rtfm!.


AIUI, rownum applies numbering to output rows in a SELECT statement,  
rather than some actual column of the table, which is likely what the  
OP is getting at.


http://www.oracle.com/technology/oramag/oracle/06-sep/o56asktom.html

Michael Glaesemann
grzm seespotcode net




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Getting a random row

2009-10-13 Thread Grzegorz Jaśkiewicz
2009/10/13 Shaul Dar 

> Sorry, I guess I wasn't clear.
> I have an existing table in my DB, and it doesn't have a column with serial
> values (actually it did originally, but due to later deletions of about 2/3
> of the rows the column now has "holes"). I realize I could add a new serial
> column, but prefer not to change table + the new column would also become
> nonconsecutive after further deletions. The nice thing about Oracle's
> "rownum" is that it' a pseudo-column", not a real one, and AFAIK is always
> valid.
>
change the default of that column to use sequence.
For instance, lookup CREATE SEQUENCE in manual, and ALTER TABLE .. SET
DEFAULT ..

for example of how it looks, just create table foo(a serial), and check its
definition with \d+ foo



-- 
GJ


Re: [PERFORM] Getting a random row

2009-10-13 Thread Shaul Dar
Michael,

You are right. I didn't remember the semantics, and Oracle's rownum would
not have been helpful. But the new row_number() in 8.4 would probably do the
trick (though I use 8.3.7 :-( )

-- Shaul

2009/10/13 Michael Glaesemann 

>
> On Oct 13, 2009, at 11:19 , Grzegorz Jaśkiewicz wrote:
>
>  On Tue, Oct 13, 2009 at 4:17 PM, Shaul Dar  wrote:
>>
>>
>>  Also PG does not have a concept of an auto-increment pseudo-column like
>>> Oracle's "rownum". Any suggestions?
>>>
>>>  not true - it has sequences, and pseudo type serial. Rtfm!.
>>
>
> AIUI, rownum applies numbering to output rows in a SELECT statement, rather
> than some actual column of the table, which is likely what the OP is getting
> at.
>
> http://www.oracle.com/technology/oramag/oracle/06-sep/o56asktom.html
>
> Michael Glaesemann
> grzm seespotcode net
>
>
>
>


Re: [PERFORM] Getting a random row

2009-10-13 Thread Robert Haas
2009/10/13 Grzegorz Jaśkiewicz :
>
>
> 2009/10/13 Shaul Dar 
>>
>> Sorry, I guess I wasn't clear.
>> I have an existing table in my DB, and it doesn't have a column with
>> serial values (actually it did originally, but due to later deletions of
>> about 2/3 of the rows the column now has "holes"). I realize I could add a
>> new serial column, but prefer not to change table + the new column would
>> also become nonconsecutive after further deletions. The nice thing about
>> Oracle's "rownum" is that it' a pseudo-column", not a real one, and AFAIK is
>> always valid.
>
> change the default of that column to use sequence.
> For instance, lookup CREATE SEQUENCE in manual, and ALTER TABLE .. SET
> DEFAULT ..
>
> for example of how it looks, just create table foo(a serial), and check its
> definition with \d+ foo

This is not really what he's trying to do.  Oracle's rownum has
completely different semantics than this.

But, on 8.4, a window function should do it.

...Robert

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Getting a random row

2009-10-13 Thread bricklen
2009/10/13 Grzegorz Jaśkiewicz :
>
>
> 2009/10/13 Shaul Dar 
>>
>> Sorry, I guess I wasn't clear.
>> I have an existing table in my DB, and it doesn't have a column with
>> serial values (actually it did originally, but due to later deletions of
>> about 2/3 of the rows the column now has "holes"). I realize I could add a
>> new serial column, but prefer not to change table + the new column would
>> also become nonconsecutive after further deletions. The nice thing about
>> Oracle's "rownum" is that it' a pseudo-column", not a real one, and AFAIK is
>> always valid.
>
> change the default of that column to use sequence.
> For instance, lookup CREATE SEQUENCE in manual, and ALTER TABLE .. SET
> DEFAULT ..
>
> for example of how it looks, just create table foo(a serial), and check its
> definition with \d+ foo
>
>
>
> --
> GJ
>


You could emulate rownum (aka rank) using a TEMPORARY sequence applied
to your result set.

http://www.postgresql.org/docs/8.3/interactive/sql-createsequence.html

Not sure if this is what you're after though?

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Getting a random row

2009-10-13 Thread Scott Marlowe
On Tue, Oct 13, 2009 at 9:17 AM, Shaul Dar  wrote:
> Hi,
>
> I am running performance simulation against a DB. I want to randomly pull
> different records from a large table. However the table has no columns that
> hold sequential integer values (1..MAX), i.e. the columns all have "holes"
> (due to earlier filtering). Also PG does not have a concept of an
> auto-increment pseudo-column like Oracle's "rownum". Any suggestions?

If what you're trying to do is emulate a real world app which randomly
grabs rows, then you want to setup something ahead of time that has a
pseudo random order and not rely on using anything like order by
random() limit 1 or anything like that.  Easiest way is to do
something like:

select id into randomizer from maintable order by random();

then use a cursor to fetch from the table to get "random" rows from
the real table.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Getting a random row

2009-10-13 Thread Thom Brown
2009/10/14 Scott Marlowe :
>
> If what you're trying to do is emulate a real world app which randomly
> grabs rows, then you want to setup something ahead of time that has a
> pseudo random order and not rely on using anything like order by
> random() limit 1 or anything like that.  Easiest way is to do
> something like:
>
> select id into randomizer from maintable order by random();
>
> then use a cursor to fetch from the table to get "random" rows from
> the real table.
>
>

Why not just do something like:

SELECT thisfield, thatfield
FROM my_table
WHERE thisfield IS NOT NULL
ORDER BY RANDOM()
LIMIT 1;

Thom

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance