Re: [PERFORM] copy and postgresql.conf

2006-02-15 Thread FERREIRA, William (VALTECH)

I'm using Solaris 8.
I tried changing only the PostgreSQL parameters, and the time increased by 10 min.

I'll keep your idea in mind; we will soon be upgrading to Solaris 10.

regards

Will

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, February 14, 2006 22:47
To: FERREIRA, William (VALTECH)
Cc: Albert Cervera Areny; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] copy and postgresql.conf



What version of Solaris are you using?

Have you seen the recommendations for using COPY on Solaris?
http://blogs.sun.com/roller/page/jkshah?entry=postgresql_on_solaris_better_use

wal_sync_method = fsync
wal_buffers = 128
checkpoint_segments = 128
bgwriter_percent = 0
bgwriter_maxpages = 0


Also, for /etc/system on Solaris 10 and 9 (SPARC), use the following:

set maxphys=1048576
set md:md_maxphys=1048576
set segmap_percent=50
set ufs:freebehind=0
set msgsys:msginfo_msgmni = 3584
set semsys:seminfo_semmni = 4096
set shmsys:shminfo_shmmax = 15392386252
set shmsys:shminfo_shmmni = 4096


Can you try one run with these values and report back whether they help
your workload?

At least I saw improvements with the above settings using COPY on
Postgres 8.0 and Postgres 8.1 on Solaris.

Regards,
Jignesh




FERREIRA, William (VALTECH) wrote:

>30% faster!!! I will test this new version ...
>
>Thanks a lot.
>
>-----Original Message-----
>From: [EMAIL PROTECTED]
>[mailto:[EMAIL PROTECTED]] On Behalf Of Albert
>Cervera Areny
>Sent: Tuesday, February 14, 2006 17:07
>To: pgsql-performance@postgresql.org
>Subject: Re: [PERFORM] copy and postgresql.conf
>
>
>
>Sorry, COPY improvements came with 8.1
>
>(http://www.postgresql.org/docs/whatsnew)
>
>On Tuesday 14 February 2006 14:26, FERREIRA, William (VALTECH) wrote:
> 
>
>>Thanks,
>>
>>I'm using PostgreSQL 8.0.3.
>>There is no primary key and no index on my tables.
>>
>>Regards
>>
>>-----Original Message-----
>>From: [EMAIL PROTECTED]
>>[mailto:[EMAIL PROTECTED]] On Behalf Of Albert
>>Cervera Areny
>>Sent: Tuesday, February 14, 2006 12:38
>>To: pgsql-performance@postgresql.org
>>Subject: Re: [PERFORM] copy and postgresql.conf
>>
>>
>>
>>Hi William,
>>  which PostgreSQL version are you using? Newer (8.0+) versions have some
>>important performance improvements for the COPY command.
>>
>>  Also, you'll notice significant improvements by creating primary and
>>foreign keys after the COPY command (see the sketch at the end of this
>>message). I think config tweaking can improve key and index creation,
>>but I don't think you can improve the COPY command itself.
>>
>>  There are also many threads in this list commenting on this issue;
>>you'll find them easily in the archives.
>>
>>On Tuesday 14 February 2006 10:44, FERREIRA, William (VALTECH) wrote:
>>   
>>
>>>Hi,
>>>
>>>I load data from files using the COPY method.
>>>The files contain between 2 and 7 million rows, spread across 5 tables.
>>>
>>>Loading all the data takes 40 min, and the same processing takes
>>>17 min with Oracle. I think this time can be improved by changing the
>>>postgresql configuration file. But which parameters do I need to change,
>>>and to which values?
>>>
>>>Here are the specifications of my system:
>>>V250, sun4u architecture
>>>2 x UltraSPARC IIIi 1.28 GHz CPUs
>>>8 GB RAM
>>>
>>>Regards,
>>>
>>> Will
>>>
>>>
>>--
>>
>>Albert Cervera Areny
>>Dept. Informàtica Sedifa, S.L.
>>
>>Av. Can Bordoll, 149
>>08202 - Sabadell (Barcelona)
>>Tel. 93 715 51 11
>>Fax. 93 715 51 12
>>
>>
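
A minimal sketch of the load-then-index pattern Albert describes above
(table, column, and file names are hypothetical):

  CREATE TABLE items (id integer, payload text);      -- no keys or indexes yet
  COPY items FROM '/tmp/items.dat';                   -- bulk load first
  CREATE INDEX idx_items_payload ON items (payload);  -- build indexes afterwards
  ALTER TABLE items ADD PRIMARY KEY (id);             -- add constraints last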

Re: [PERFORM] could not send data to client: Broken pipe

2006-02-15 Thread Richard Huxton

Pradeep Parmar wrote:

Hi,

I'm using Postgres 7.4. I have a web application built with PHP4 on
Postgres 7.4.

I was going through /var/log/messages on my Linux box (SLES 9) and
encountered the following messages quite a few times.


postgres[20199]: [4-1] ERROR:  could not send data to client: Broken pipe
postgres[30391]: [6-1] LOG:  could not send data to client: Broken pipe
postgres[30570]: [6-1] LOG:  could not send data to client: Broken pipe


Can anyone help me interpret these messages?
What is causing this error message? How severe is it?


Not really a performance question, but at a guess your client went away.
Is there anything to indicate this in your php/apache logs? Can you
reproduce it by hitting cancel in your web browser?


--
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] copy and postgresql.conf

2006-02-15 Thread Markus Schaber
Hi, Ferreira,

FERREIRA, William (VALTECH) wrote:

> I load data from files using the COPY method.
> The files contain between 2 and 7 million rows, spread across 5 tables.
>
> Loading all the data takes 40 min, and the same processing takes 17 min
> with Oracle.
> I think this time can be improved by changing the postgresql configuration
> file.
> But which parameters do I need to change, and to which values?

Increase the size of the WAL (e.g. raise checkpoint_segments).

If it's just a development environment, or you don't mind data
inconsistency in case of a crash, disable fsync.
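
A sketch of what that might look like in postgresql.conf (values are
illustrative, not from this message; fsync = off risks data loss on a
crash, so use it only for data you can reload):

  checkpoint_segments = 64   # more WAL between checkpoints
  wal_buffers = 64           # larger WAL buffer for bulk writes
  fsync = off                # dev / reloadable data ONLY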

HTH,
Markus


-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org



Re: [PERFORM] copy and postgresql.conf

2006-02-15 Thread FERREIRA, William (VALTECH)

I tested the latest version of PostgreSQL, and for the same test:
before: 40 min
now: 12 min :)
Faster than Oracle (exactly what I wanted :p )

Thanks to everybody.

Will



Re: [PERFORM] copy and postgresql.conf

2006-02-15 Thread Jignesh K. Shah
What are your postgresql.conf settings for the equivalent parameters I
suggested? I believe your wal_buffers and checkpoint_segments could be
bigger. If that's the case, then yes, you are fine.


As for the background writer, I am still seeing mixed results, so I am
not sure about that.


But thanks for the feedback.

-Jignesh


FERREIRA, William (VALTECH) wrote:


I tested the latest version of PostgreSQL, and for the same test:
before: 40 min
now: 12 min :)
Faster than Oracle (exactly what I wanted :p )

Thanks to everybody.

Will



Re: [PERFORM] Postgres slower than MS ACCESS

2006-02-15 Thread Jeff Trout


On Feb 14, 2006, at 3:56 PM, Jay Greenfield wrote:

How do you get 4,000+ lines of explain analyze for one update query in a
database with only one table? Something a bit fishy there. Perhaps you
mean explain verbose, though I don't really see how that'd be so long
either, but it'd be closer. Could you provide some more sane information?


My mistake - there were 4,000 lines in the EXPLAIN ANALYZE VERBOSE output.

Here is the output of EXPLAIN ANALYZE:

QUERY PLAN
"Seq Scan on ntdn  (cost=0.00..3471884.39 rows=1221391 width=1592) (actual time=57292.580..1531300.003 rows=1221391 loops=1)"
"Total runtime: 4472646.988 ms"



Have you been vacuuming or running autovacuum?
If you keep running queries like this you're certainly going to have
a ton of dead tuples, which would definitely explain these times too.
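
A sketch of the suggested maintenance (plain VACUUM is usually enough
when run regularly; VACUUM FULL locks the table but shrinks one that is
already badly bloated):

  VACUUM ANALYZE ntdn;        -- reclaim dead tuples, refresh statistics
  VACUUM FULL ANALYZE ntdn;   -- or: aggressive variant for a bloated table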


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/





Re: [PERFORM] copy and postgresql.conf

2006-02-15 Thread FERREIRA, William (VALTECH)

With PostgreSQL 8.1.3, here are my parameters (the default configuration):

wal_sync_method = fsync
wal_buffers = 8
checkpoint_segments = 3
bgwriter_lru_percent = 1.0
bgwriter_lru_maxpages = 5
bgwriter_all_percent = 0.333
bgwriter_all_maxpages = 5

Do you think the times can be improved further?


Re: [PERFORM] copy and postgresql.conf

2006-02-15 Thread Jignesh K. Shah

Actually, fsync is not the default on Solaris (verify using "show all;").

(If you look closely at postgresql.conf it is commented out and mentioned
as the default, but "show all" tells a different story.)


In all my cases I saw the default as
wal_sync_method | open_datasync

Also, I have seen quite an improvement by changing the default
checkpoint_segments from 3 to 64 or 128, and by increasing wal_buffers
to 64, depending on how heavy your load is.


Also, open_datasync-type operations benefit from forcedirectio on
Solaris, so either move the WAL to a file system mounted with
forcedirectio, or try changing the default sync method to fsync (the
*claimed* default).


Now, if you use fsync, you need a bigger file system cache, since by
default Solaris's segmap mechanism only maps 12% of your physical RAM
for use as file system buffer cache. Increasing segmap_percent to 50 on
SPARC allows 50% of your RAM to be mapped for that purpose. (NOTE: it
does not reserve the memory; it just allows it to be mapped for use as
file system buffer cache.)


Changing maxphys allows the file system buffer cache to coalesce the 8K
writes that PostgreSQL issues into bigger writes/reads. Also, since you
are now exploiting the file system buffer cache, file system logging is
very much recommended (available from a later update of Solaris 8, I
believe).
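
As a sketch, a WAL file system mounted with forcedirectio might look like
this in /etc/vfstab (device names and mount point are hypothetical):

  #device            device to fsck      mount point  FS   pass  boot  options
  /dev/dsk/c1t0d0s0  /dev/rdsk/c1t0d0s0  /pg_xlog     ufs  2     yes   forcedirectio,logging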



Regards,
Jignesh






Re: [PERFORM] copy and postgresql.conf

2006-02-15 Thread Tom Lane
"FERREIRA, William (VALTECH)" <[EMAIL PROTECTED]> writes:
> with PostgreSQL 8.1.3, here are my parameters (it's the default configuration)

> wal_sync_method = fsync
> wal_buffers = 8
> checkpoint_segments = 3
> bgwriter_lru_percent = 1.0
> bgwriter_lru_maxpages = 5
> bgwriter_all_percent = 0.333
> bgwriter_all_maxpages = 5

> and you think times can be improved again ?

Increasing checkpoint_segments will definitely help for any write-intensive
situation.  It costs you in disk space of course, as well as the time
needed for post-crash recovery.
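
As a sketch of the disk-space tradeoff (the sizing rule here is the usual
8.x rule of thumb, not something stated in this message): each WAL
segment file is 16 MB, and pg_xlog will normally hold at most about
2 * checkpoint_segments + 1 segments, so for example:

  checkpoint_segments = 64   # up to ~ (2*64 + 1) * 16 MB = ~2 GB in pg_xlog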

regards, tom lane



[PERFORM] Stored proc and optimizer question

2006-02-15 Thread Antal Attila

Hi!

I have a question about the query optimizer and function scans. Consider
the following case:

CREATE TABLE a (id SERIAL PRIMARY KEY, userid INT4, col TEXT);
CREATE TABLE b (id SERIAL PRIMARY KEY, userid INT4,
                a_id INT4 REFERENCES a (id), col TEXT);
CREATE INDEX idx_a_uid ON a(userid);
CREATE INDEX idx_b_uid ON b(userid);
CREATE INDEX idx_a_col ON a(col);
CREATE INDEX idx_b_col ON b(col);

First solution:

   CREATE VIEW ab_view AS
   SELECT a.id AS id,
  a.userid AS userid_a, b.userid AS userid_b,
  a.col AS col_a, b.col AS col_b
   FROM a LEFT JOIN b ON (a.id = b.a_id);

   EXPLAIN ANALYSE SELECT * FROM ab_view
   WHERE userid_a = 23 AND userid_b = 23 AND col_a LIKE 's%'
   ORDER BY col_b
   LIMIT 10 OFFSET 10;

   QUERY PLAN
------------------------------------------------------------------------------------------
 Limit  (cost=15.70..15.70 rows=1 width=76) (actual time=0.108..0.108 rows=0 loops=1)
   ->  Sort  (cost=15.69..15.70 rows=1 width=76) (actual time=0.104..0.104 rows=0 loops=1)
         Sort Key: b.col
         ->  Nested Loop  (cost=3.32..15.68 rows=1 width=76) (actual time=0.085..0.085 rows=0 loops=1)
               Join Filter: ("outer".id = "inner".a_id)
               ->  Bitmap Heap Scan on a  (cost=2.30..6.13 rows=1 width=40) (actual time=0.082..0.082 rows=0 loops=1)
                     Recheck Cond: (userid = 23)
                     Filter: (col ~~ 's%'::text)
                     ->  BitmapAnd  (cost=2.30..2.30 rows=1 width=0) (actual time=0.077..0.077 rows=0 loops=1)
                           ->  Bitmap Index Scan on idx_a_uid  (cost=0.00..1.02 rows=6 width=0) (actual time=0.075..0.075 rows=0 loops=1)
                                 Index Cond: (userid = 23)
                           ->  Bitmap Index Scan on idx_a_col  (cost=0.00..1.03 rows=6 width=0) (never executed)
                                 Index Cond: ((col >= 's'::text) AND (col < 't'::text))
               ->  Bitmap Heap Scan on b  (cost=1.02..9.49 rows=5 width=40) (never executed)
                     Recheck Cond: (userid = 23)
                     ->  Bitmap Index Scan on idx_b_uid  (cost=0.00..1.02 rows=5 width=0) (never executed)
                           Index Cond: (userid = 23)
 Total runtime: 0.311 ms


With the first solution the query optimizer can see through the view, and
the execution of the query is optimal. But I have to repeat the condition
on the userid fields (userid_a = 23 AND userid_b = 23). If I want to
eliminate the duplication, I can try to use a stored function.

Second solution:
   CREATE FUNCTION ab_select(INT4)  RETURNS setof ab_view AS $$
   SELECT a.id AS id,
  a.userid AS userid_a, b.userid AS userid_b,
  a.col AS col_a, b.col AS col_b
   FROM a LEFT JOIN b ON (a.id = b.a_id AND b.userid = $1)
   WHERE a.userid = $1;
   $$ LANGUAGE SQL STABLE;

   EXPLAIN ANALYSE SELECT * FROM ab_select(23)
   WHERE col_a LIKE 's%'
   ORDER BY col_b
   LIMIT 10 OFFSET 10;

 QUERY PLAN
------------------------------------------------------------------------------------------
 Limit  (cost=15.07..15.07 rows=1 width=76) (actual time=1.034..1.034 rows=0 loops=1)
   ->  Sort  (cost=15.06..15.07 rows=5 width=76) (actual time=1.030..1.030 rows=0 loops=1)
         Sort Key: col_b
         ->  Function Scan on ab_select  (cost=0.00..15.00 rows=5 width=76) (actual time=1.004..1.004 rows=0 loops=1)
               Filter: (col_a ~~ 's%'::text)
 Total runtime: 1.103 ms

The second solution has two advantages:
 1. The second query is cleaner and shorter.
 2. You can rewrite the stored function more easily, without modifying
    the query.

But it pains me that the optimizer gives up the game. It cannot optimize
the query globally (inside and outside the stored function) in spite of
the STABLE keyword. It uses a function scan on the result of the stored
function.

How can I eliminate the function scan while keeping the advantages?

In my opinion the optimizer cannot currently replace the function scan
with a more optimal plan, but this feature may be implemented in future
versions of PostgreSQL. I would like to suggest it.

I built this case as a theoretical example, but I have more stored
procedures that perform badly for this reason.

Regards,
Antal Attila




[PERFORM] out of memory

2006-02-15 Thread martial . bizel

Good morning,




I've increased sort_mem all the way to 2 GB, and the "out of memory"
error still appears.

Here is the explain plan of the query I am trying to run:

 Nested Loop  (cost=2451676.23..2454714.73 rows=1001 width=34)
   ->  Subquery Scan "day"  (cost=2451676.23..2451688.73 rows=1000 width=16)
         ->  Limit  (cost=2451676.23..2451678.73 rows=1000 width=12)
               ->  Sort  (cost=2451676.23..2451684.63 rows=3357 width=12)
                     Sort Key: sum(occurence)
                     ->  HashAggregate  (cost=2451471.24..2451479.63 rows=3357 width=12)
                           ->  Index Scan using test_date on queries_detail_statistics  (cost=0.00..2449570.55 rows=380138 width=12)
                                 Index Cond: ((date >= '2006-01-01'::date) AND (date <= '2006-01-30'::date))
                                 Filter: (((portal)::text = '1'::text) OR ((portal)::text = '2'::text))
   ->  Index Scan using query_string_pkey on query_string  (cost=0.00..3.01 rows=1 width=34)
         Index Cond: ("outer".query = query_string.id)
(11 rows)

Any new ideas?
Thanks

MB.




> On Tue, 2006-02-14 at 10:32, [EMAIL PROTECTED] wrote:
> > command explain analyze crash with the "out of memory" error
> >
> > I precise that I've tried a lot of values from parameters shared_buffer and
> > sort_mem
> >
> > now, in config file, values are :
> > sort_mem=32768
> > and shared_buffer=3
>
> OK, on the command line, try increasing the sort_mem until hash_agg can
> work.  With a 4 gig machine, you should be able to go as high as needed
> here, I'd think.  Try as high as 50 or so or more.  Then when
> explain analyze works, compare the actual versus estimated number of
> rows.
>
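
A sketch of the per-session override being suggested (7.4 parameter
names; the value is illustrative and is in kilobytes):

  SET sort_mem = 512000;   -- ~500 MB, for this session only
  SHOW sort_mem;           -- confirm before re-running EXPLAIN ANALYZE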




Re: [PERFORM] explain hashAggregate

2006-02-15 Thread martial . bizel
Good morning,

I am trying to understand how the optimizer chooses HashAggregate instead
of GroupAggregate, and what exactly these two strategies are
(benefits/drawbacks).

In my case, I have this explain plan:
---
 Nested Loop  (cost=2451676.23..2454714.73 rows=1001 width=34)
   ->  Subquery Scan "day"  (cost=2451676.23..2451688.73 rows=1000 width=16)
         ->  Limit  (cost=2451676.23..2451678.73 rows=1000 width=12)
               ->  Sort  (cost=2451676.23..2451684.63 rows=3357 width=12)
                     Sort Key: sum(occurence)
                     ->  HashAggregate  (cost=2451471.24..2451479.63 rows=3357 width=12)
                           ->  Index Scan using test_date on queries_detail_statistics  (cost=0.00..2449570.55 rows=380138 width=12)
                                 Index Cond: ((date >= '2006-01-01'::date) AND (date <= '2006-01-30'::date))
                                 Filter: (((portal)::text = '1'::text) OR ((portal)::text = '2'::text))
   ->  Index Scan using query_string_pkey on query_string  (cost=0.00..3.01 rows=1 width=34)
         Index Cond: ("outer".query = query_string.id)

How can I estimate the RAM needed for this plan?
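
A rough sizing sketch (an illustration, not from this thread): a
HashAggregate keeps one in-memory entry per group, so its memory use
scales with the number of distinct groups rather than with the input
rows. Comparing the planner's estimate against the actual group count
reported elsewhere in this thread:

  SELECT 3357 * 12             AS planner_estimate_bytes,  -- trivial
         19422774::bigint * 12 AS actual_key_bytes;        -- ~233 MB before per-entry overhead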



thanks a lot




[PERFORM] Reliability recommendations

2006-02-15 Thread Jeremy Haile
We are a small company looking to put together the most cost effective
solution for our production database environment.  Currently in
production Postgres 8.1 is running on this machine:

Dell 2850
2 x 3.0 Ghz Xeon 800Mhz FSB 2MB Cache
4 GB DDR2 400 Mhz
2 x 73 GB 10K SCSI RAID 1 (for xlog and OS)
4 x 146 GB 10K SCSI RAID 10 (for postgres data)
Perc4ei controller

The above is a standard Dell box with nothing added or modified beyond
the options available directly through Dell. We had a bad processor last
week that effectively put us down for an entire weekend. Though it was
the web server that failed, the experience has caused us to step back
and spend time coming up with a more reliable/fail-safe solution that
can reduce downtime.

Our load won't be substantial so extreme performance and load balancing
are not huge concerns. We are looking for good performance, at a good
price, configured in the most redundant, high availability manner
possible. Availability is the biggest priority.

I sent our scenario to our sales team at Dell and they came back with
all manner of SAN, DAS, and configuration costing as much as $50k.

We have the budget to purchase 2-3 additional machines along the lines
of the one listed above. As a startup with a limited budget, what would
this list suggest as options for clustering/replication or setting our
database up well in general?



Re: [PERFORM] out of memory

2006-02-15 Thread martial . bizel
You're right, the release is 7.4.7.

There are twenty million distinct values of "query".

> On Tue, 2006-02-14 at 11:36, Tom Lane wrote:
> > [EMAIL PROTECTED] writes:
> > > Yes, I've launched ANALYZE command before sending request.
> > > I precise that's postgres version is 7.3.4
> >
> > Can't possibly be 7.3.4, that version didn't have HashAggregate.
> >
> > How many distinct values of "query" actually exist in the table?
>
> I thought that looked odd.
>
> ---(end of broadcast)---
> TIP 5: don't forget to increase your free space map settings
>





Re: [PERFORM] out of memory

2006-02-15 Thread Scott Marlowe
On Wed, 2006-02-15 at 09:55, [EMAIL PROTECTED] wrote:

OK, so it looks like something is horribly wrong here. Try running the
explain analyze query after running the following:

 set enable_hashagg=off;

and see what you get then.



Re: [PERFORM] out of memory

2006-02-15 Thread martial . bizel
Here is the result with enable_hashagg set to false:

 Nested Loop  (cost=2487858.08..2490896.58 rows=1001 width=34) (actual time=1028044.781..1030251.260 rows=1000 loops=1)
   ->  Subquery Scan "day"  (cost=2487858.08..2487870.58 rows=1000 width=16) (actual time=1027996.748..1028000.969 rows=1000 loops=1)
         ->  Limit  (cost=2487858.08..2487860.58 rows=1000 width=12) (actual time=1027996.737..1027999.199 rows=1000 loops=1)
               ->  Sort  (cost=2487858.08..2487866.47 rows=3357 width=12) (actual time=1027996.731..1027998.066 rows=1000 loops=1)
                     Sort Key: sum(occurence)
                     ->  GroupAggregate  (cost=2484802.05..2487661.48 rows=3357 width=12) (actual time=810623.035..914550.262 rows=19422774 loops=1)
                           ->  Sort  (cost=2484802.05..2485752.39 rows=380138 width=12) (actual time=810612.248..845427.013 rows=36724340 loops=1)
                                 Sort Key: query
                                 ->  Index Scan using test_date on queries_detail_statistics  (cost=0.00..2449570.55 rows=380138 width=12) (actual time=25.393..182029.205 rows=36724340 loops=1)
                                       Index Cond: ((date >= '2006-01-01'::date) AND (date <= '2006-01-30'::date))
                                       Filter: (((portal)::text = '1'::text) OR ((portal)::text = '2'::text))
   ->  Index Scan using query_string_pkey on query_string  (cost=0.00..3.01 rows=1 width=34) (actual time=2.244..2.246 rows=1 loops=1000)
         Index Cond: ("outer".query = query_string.id)
 Total runtime: 1034357.390 ms
(14 rows)


Thanks.

The daily table has 250 million records; the query field (bigint) has
about 2 million distinct values, and occurence is an int.

The query is fine with HashAggregate when the date range is restricted
to about 15 days, like this:

  SELECT query_string, DAY.ocu FROM search_data.query_string,
    (SELECT SUM(occurence) AS ocu, query
     FROM daily.queries_detail_statistics
     WHERE date >= '2006-01-01' AND date <= '2006-01-15'
       AND portal IN (1,2)
     GROUP BY query
     ORDER BY ocu DESC
     LIMIT 1000) AS DAY
  WHERE DAY.query = id;






Re: [PERFORM] Reliability recommendations

2006-02-15 Thread Craig A. James

Jeremy Haile wrote:

We are a small company looking to put together the most cost effective
solution for our production database environment.  Currently in
production Postgres 8.1 is running on this machine:

Dell 2850
2 x 3.0 Ghz Xeon 800Mhz FSB 2MB Cache
4 GB DDR2 400 Mhz
2 x 73 GB 10K SCSI RAID 1 (for xlog and OS)
4 x 146 GB 10K SCSI RAID 10 (for postgres data)
Perc4ei controller

... I sent our scenario to our sales team at Dell and they came back with
all manner of SAN, DAS, and configuration costing as much as $50k.


Given what you've told us, a $50K machine is not appropriate.

Instead, think about a simple system with several clones of the database and a 
load-balancing web server, even if one machine could handle your load.  If a 
machine goes down, the load balancer automatically switches to the other.

Look at the MTBF figures of two hypothetical machines:

Machine 1: Costs $2,000, MTBF of 2 years, takes two days to fix on average.
Machine 2: Costs $50,000, MTBF of 100 years (!), takes one hour to fix on 
average.

Now go out and buy three of the $2,000 machines.  Use a load-balancer front end web 
server that can send requests round-robin fashion to a "server farm".  Clone 
your database.  In fact, clone the load-balancer too so that all three machines have all 
software and databases installed.  Call these A, B, and C machines.

At any given time, your Machine A is your web front end, serving requests to 
databases on A, B and C.  If B or C goes down, no problem - the system keeps 
running.  If A goes down, you switch the IP address of B or C and make it your 
web front end, and you're back in business in a few minutes.

Now compare the reliability: in order for this system to be disabled, you'd
have to have ALL THREE computers fail at the same time. With that MTBF and
a repair time of two days, each machine has 99.726% uptime. The "MTBF" of
the whole system, that is, the expected time until all three machines fail
simultaneously, is well over 100,000 years! Of course, this is silly,
machines don't last that long, but it illustrates the point: redundancy
beats reliability (which is why RAID is so useful).
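
A back-of-envelope check of those numbers (an illustration, not part of
the original message; two days of downtime per two-year failure cycle):

  SELECT round(((1 - 2/730.0) * 100)::numeric, 3) AS uptime_pct,   -- ~99.726
         round(power(2/730.0, 3)::numeric, 10)    AS p_all_down;   -- ~0.0000000206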


All for $6,000.

Craig



Re: [PERFORM] Reliability recommendations

2006-02-15 Thread Mark Lewis
Machine 1: $2000
Machine 2: $2000
Machine 3: $2000

Knowing how to rig them together and maintain them in a fully fault-
tolerant way: priceless.


(Sorry for the off-topic post, I couldn't resist).

-- Mark Lewis



Re: [PERFORM] out of memory

2006-02-15 Thread Scott Marlowe
On Wed, 2006-02-15 at 11:18, [EMAIL PROTECTED] wrote:

OK, in the index scan using test_date, you get 36,724,340 rows when the
planner expects 380,138. That's off by a factor of nearly 100, so I'm
guessing that your statistics aren't reflecting what's really in your
db. You said before you'd run ANALYZE, so I'd try increasing the stats
target on that column and rerunning ANALYZE to see if things get any
better.
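
A sketch of that suggestion (the statistics target value is illustrative):

  ALTER TABLE queries_detail_statistics
    ALTER COLUMN date SET STATISTICS 100;   -- default is 10
  ANALYZE queries_detail_statistics;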




Re: [PERFORM] Reliability recommendations

2006-02-15 Thread Joshua D. Drake

Jeremy Haile wrote:

We are a small company looking to put together the most cost effective
solution for our production database environment.  Currently in
production Postgres 8.1 is running on this machine:

Dell 2850
2 x 3.0 Ghz Xeon 800Mhz FSB 2MB Cache
4 GB DDR2 400 Mhz
2 x 73 GB 10K SCSI RAID 1 (for xlog and OS)
4 x 146 GB 10K SCSI RAID 10 (for postgres data)
Perc4ei controller

The above is a standard Dell box with nothing added or modified beyond
the options available directly through Dell.

You should probably review the archives for PostgreSQL users'
experiences with Dells before you purchase one.


I sent our scenario to our sales team at Dell and they came back with
all manner of SAN, DAS, and configuration costing as much as $50k.
  

HAHAHAHAHA Don't do that. Dell is making the assumption that
you won't do your homework. Make sure you cross-quote with IBM,
Compaq, and Penguin Computing...

Sincerely,

Joshua D. Drake

--
The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: PLphp, PLperl - http://www.commandprompt.com/




Re: [PERFORM] Reliability recommendations

2006-02-15 Thread Greg Stark

"Joshua D. Drake" <[EMAIL PROTECTED]> writes:

> Jeremy Haile wrote:
> > We are a small company looking to put together the most cost effective
> > solution for our production database environment.  Currently in
> > production Postgres 8.1 is running on this machine:
> >
> > Dell 2850
> > 2 x 3.0 Ghz Xeon 800Mhz FSB 2MB Cache
> > 4 GB DDR2 400 Mhz
> > 2 x 73 GB 10K SCSI RAID 1 (for xlog and OS)
> > 4 x 146 GB 10K SCSI RAID 10 (for postgres data)
> > Perc4ei controller

You don't say how this box is performing. There's no way to give
recommendations in a vacuum. Some users really need $50k boxes and others
(most) don't. This looks like a pretty good setup for Postgres and you would
have to be pushing things pretty hard to need much more.

That said some users have reported problems with Dell's raid controllers even
when the same brand's regular controllers worked well. That's what Joshua is
referring to.

> > The above is a standard Dell box with nothing added or modified beyond
> > the options available directly through Dell.
>
> You should probably review the archives for PostgreSQL user
> experience with Dell's before you purchase one.


-- 
greg




Re: [PERFORM] Reliability recommendations

2006-02-15 Thread Christopher Browne
After takin a swig o' Arrakan spice grog, [EMAIL PROTECTED] ("Joshua D. Drake") 
belched out:
> Jeremy Haile wrote:
>> We are a small company looking to put together the most cost effective
>> solution for our production database environment.  Currently in
>> production Postgres 8.1 is running on this machine:
>>
>> Dell 2850
>> 2 x 3.0 Ghz Xeon 800Mhz FSB 2MB Cache
>> 4 GB DDR2 400 Mhz
>> 2 x 73 GB 10K SCSI RAID 1 (for xlog and OS)
>> 4 x 146 GB 10K SCSI RAID 10 (for postgres data)
>> Perc4ei controller
>>
>> The above is a standard Dell box with nothing added or modified beyond
>> the options available directly through Dell.

> You should probably review the archives for PostgreSQL user
> experience with Dell's before you purchase one.

Hear, hear!  We found Dell servers were big-time underperformers.

Generic hardware put together with generally the same brand names of
components (e.g. - for SCSI controllers and such) would generally play
much better.

For the cheapo desktop boxes they obviously have to buy the "cheapest
hardware available this week;" it sure seems as though they engage in
the same sort of thing with the "server class" hardware.

I don't think anyone has been able to forcibly point out any
completely precise shortcoming; just that they underperform what the
specs suggest they ought to be able to provide.

>> I sent our scenario to our sales team at Dell and they came back
>> with all manner of SAN, DAS, and configuration costing as much as
>> $50k.

> HAHAHAHAHA Don't do that. Dell is making the assumption you
> won't do your homework. Make sure you cross quote with IBM, Compaq
> and Penguin Computing...

Indeed.
-- 
(format nil "[EMAIL PROTECTED]" "cbbrowne" "gmail.com")
http://linuxdatabases.info/info/rdbms.html
Rules of  the Evil  Overlord #141.  "As  an alternative to  not having
children, I  will have _lots_ of  children.  My sons will  be too busy
jockeying for  position to  ever be a  real threat, and  the daughters
will all sabotage each other's attempts to win the hero."




Re: [PERFORM] Reliability recommendations

2006-02-15 Thread Scott Marlowe
On Wed, 2006-02-15 at 12:44, Christopher Browne wrote:
> After takin a swig o' Arrakan spice grog, [EMAIL PROTECTED] ("Joshua D. 
> Drake") belched out:

> > You should probably review the archives for PostgreSQL user
> > experience with Dell's before you purchase one.
> 
> Hear, hear!  We found Dell servers were big-time underperformers.
> 
> Generic hardware put together with generally the same brand names of
> components (e.g. - for SCSI controllers and such) would generally play
> much better.

My experience has been that:

A:  Their rebranded LSI and Adaptec RAID controllers underperform.
B:  Their BIOS updates for said cards and the mobos for the 26xx series
come in a format that requires you to have a friggin' bootable DOS
floppy.  What is this, 1987???
C:  They use poorly performing mobo chipsets.

We had a dual P-III-750 with a REAL LSI RAID card and an intel mobo, and
replaced it with a dual P-IV 2800 Dell 2600 with twice the RAM.  As a
database server the P-III-750 was easily a match for the new dell, and
in some ways (i/o) outran it.

We also had a dual 2400 PIV Intel generic box, and it spanked the Dell
handily at everything, was easier to work on, the parts cost less, and
it used bog standard RAID cards and such.  I would highly recommend the
Intel Generic hardware over Dell any day.



Re: [PERFORM] Reliability recommendations

2006-02-15 Thread Ron

At 11:21 AM 2/15/2006, Jeremy Haile wrote:

We are a small company looking to put together the most cost effective
solution for our production database environment.  Currently in
production Postgres 8.1 is running on this machine:

Dell 2850
2 x 3.0 Ghz Xeon 800Mhz FSB 2MB Cache
4 GB DDR2 400 Mhz
2 x 73 GB 10K SCSI RAID 1 (for xlog and OS)
4 x 146 GB 10K SCSI RAID 10 (for postgres data)
Perc4ei controller

The above is a standard Dell box with nothing added or modified beyond
the options available directly through Dell. We had a bad processor last
week that effectively put us down for an entire weekend. Though it was
the web server that failed, the experience has caused us to step back
and spend time coming up with a more reliable/fail-safe solution that
can reduce downtime.

Our load won't be substantial so extreme performance and load balancing
are not huge concerns. We are looking for good performance, at a good
price, configured in the most redundant, high availability manner
possible. Availability is the biggest priority.

I sent our scenario to our sales team at Dell and they came back with
all manner of SAN, DAS, and configuration costing as much as $50k.

We have the budget to purchase 2-3 additional machines along the lines
of the one listed above. As a startup with a limited budget, what would
this list suggest as options for clustering/replication or setting our
database up well in general?


1= Tell Dell "Thanks but no thanks." and do not buy any more 
equipment from them.  Their value per $$ is less than other options 
available to you.


2= The current best bang for the buck HW (and in many cases, best 
performing as well) for pg:
  a= AMD K8 and K9 (dual core) CPUs.  Particularly the A64 X2 3800+ 
when getting the most for your $$ matters a lot.

   pg gets a nice performance boost from running in 64b.
  b= Decent Kx server boards are available from Gigabyte, IWill, 
MSI, Supermicro, and Tyan to name a few.
   IWill has a 2P 16 DIMM slot board that is particularly nice 
for a server that needs lots of RAM.
  c= Don't bother with SCSI or FC HD's unless you are doing the most 
demanding kind of OLTP.  SATA II HD's provide better value.
  d= HW RAID controllers are only worth it in certain 
scenarios.  Using RAID 5 almost always means you should use a HW RAID 
controller.
  e= The only HW RAID controllers worth the $$ for you are 3ware 
Escalade 9550SX's and Areca ARC-11xx or ARC-12xx's.
  *For the vast majority of throughput situations, the ARC-1xxx's 
with >= 1GB of battery backed WB cache are the best value*
  f= 1GB RAM sticks are cheap enough and provide enough value that 
you should max out any system you get with them.
  g= for +high+ speed failover, Chelsio and others are now making 
PCI-X and PCI-E 10GbE NICs at reasonable prices.
The above should serve as a good "pick list" for the components of 
any servers you need.


3= The most economically sound HW and SW architecture that best suits 
your performance and reliability needs is context dependent to your 
specific circumstances.



Where are you located?
Ron






[PERFORM] Strange Create Index behaviour

2006-02-15 Thread Gary Doades

Platform: FreeBSD 6.0, Postgresql 8.1.2 compiled from the ports collection.

Not sure if this belongs in performance or bugs..

A pg_restore of my 2.5GB database was taking up to 2 hours to complete 
instead of the expected 10-15 minutes. Checking the server, it was mostly 
CPU bound. Testing further, I discovered that it was spending huge 
amounts of CPU time creating some indexes.


It took a while to find out, but basically it boils down to this:

If the column that is having the index created has a certain 
distribution of values then create index takes a very long time. If the 
data values (integer in this case) are fairly evenly distributed then 
create index is very quick; if the data values are all the same it is 
also very quick. I discovered that in the slow cases the column had 
approximately half the values as zero and the rest fairly spread out. 
One column started off with around 400,000 zeros and the rest of the 
following rows spread between values of 1 to 500,000.


I have put together a test case that demonstrates the problem (see 
below). I create a simple table, as close in structure to one of my 
problem tables, and populate an integer column with 100,000 zeros followed 
by 100,000 random integers between 0 and 100,000. Then I create an index 
on this column. I then drop the table and repeat. The create index 
should take around 1-2 seconds. A fair proportion of the time it takes 
50 seconds!!!


If I fill the same column with all random data the create index always 
takes a second or two. If I fill the column with all zeros everything is 
still OK.


When the tables that I am trying to restore are over 2 million rows, 
creating one index can take an hour!! (almost all CPU time).


All other areas of performance, once the dump is restored and analysed, 
seem to be OK, even large hash/merge joins and sorts.


This is entirely repeatable in FreeBSD in that around half the time 
create index will be incredibly slow.


All postgresql.conf settings are at the defaults for the test initially 
(fresh install).


The final interesting thing is that as I increase shared_buffers to 2000 
or 3000 the problem gets *worse*.


The following text is output from the test script:

select version();
                                           version
------------------------------------------------------------------------------------------------
 PostgreSQL 8.1.2 on i386-portbld-freebsd6.0, compiled by GCC cc (GCC) 3.4.4 [FreeBSD] 20050518
(1 row)

\timing
Timing is on.

--  Many slow cases, note the 50+ second cases

create table atest(i int4, r int4,d1 timestamp, d2 timestamp);
CREATE TABLE
Time: 81.859 ms
insert into atest (i,r,d1,d2) select generate_series(1,100000),0,now(),now();
INSERT 0 100000
Time: 1482.141 ms
insert into atest (i,r,d1,d2) select generate_series(1,100000),random()*100000,now(),now();
INSERT 0 100000
Time: 1543.508 ms
create index idx on atest(r);
CREATE INDEX
Time: 56685.230 ms

drop table atest;
DROP TABLE
Time: 4.616 ms
create table atest(i int4, r int4,d1 timestamp, d2 timestamp);
CREATE TABLE
Time: 6.889 ms
insert into atest (i,r,d1,d2) select generate_series(1,100000),0,now(),now();
INSERT 0 100000
Time: 2009.787 ms
insert into atest (i,r,d1,d2) select generate_series(1,100000),random()*100000,now(),now();
INSERT 0 100000
Time: 1828.663 ms
create index idx on atest(r);
CREATE INDEX
Time: 3991.257 ms

drop table atest;
DROP TABLE
Time: 3.796 ms
create table atest(i int4, r int4,d1 timestamp, d2 timestamp);
CREATE TABLE
Time: 19.965 ms
insert into atest (i,r,d1,d2) select generate_series(1,100000),0,now(),now();
INSERT 0 100000
Time: 1625.059 ms
insert into atest (i,r,d1,d2) select generate_series(1,100000),random()*100000,now(),now();
INSERT 0 100000
Time: 2622.827 ms
create index idx on atest(r);
CREATE INDEX
Time: 1082.799 ms

drop table atest;
DROP TABLE
Time: 4.627 ms
create table atest(i int4, r int4,d1 timestamp, d2 timestamp);
CREATE TABLE
Time: 2.953 ms
insert into atest (i,r,d1,d2) select generate_series(1,100000),0,now(),now();
INSERT 0 100000
Time: 2068.744 ms
insert into atest (i,r,d1,d2) select generate_series(1,100000),random()*100000,now(),now();
INSERT 0 100000
Time: 2671.420 ms
create index idx on atest(r);
CREATE INDEX
Time: 8047.660 ms

drop table atest;
DROP TABLE
Time: 3.675 ms
create table atest(i int4, r int4,d1 timestamp, d2 timestamp);
CREATE TABLE
Time: 2.582 ms
insert into atest (i,r,d1,d2) select generate_series(1,100000),0,now(),now();
INSERT 0 100000
Time: 1723.987 ms
insert into atest (i,r,d1,d2) select generate_series(1,100000),random()*100000,now(),now();
INSERT 0 100000
Time: 2263.131 ms
create index idx on atest(r);
CREATE INDEX
Time: 50050.308 ms

drop table atest;
DROP TABLE
Time: 52.744 ms
create table atest(i int4, r int4,d1 timestamp, d2 timestamp);
CREATE TABLE
Time: 25.370 ms
insert into atest (i,r,d1,d2) select generate_series(1,100000),0,now(),now();
INSERT 0 100000
Time: 2052.733 ms
insert into atest (i,r,d1,d2) select generate_series(1,100000),random()*100000,now(),now();

Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Tom Lane
Gary Doades <[EMAIL PROTECTED]> writes:
> Platform: FreeBSD 6.0, Postgresql 8.1.2 compiled from the ports collection.

> If the column that is having the index created has a certain 
> distribution of values then create index takes a very long time. If the 
> data values (integer in this case) are fairly evenly distributed then 
> create index is very quick; if the data values are all the same it is 
> also very quick. I discovered that in the slow cases the column had 
> approximately half the values as zero and the rest fairly spread out. 

Interesting.  I tried your test script and got fairly close times
for all the cases on two different machines:
old HPUX machine: shortest 5800 msec, longest 7960 msec
new Fedora 4 machine: shortest 461 msec, longest 608 msec
(the HPUX machine was doing other stuff at the same time, so some
of its variation is probably only noise).

So what this looks like to me is a corner case that FreeBSD's qsort
fails to handle well.

You might try forcing Postgres to use our private copy of qsort, as we
do on Solaris for similar reasons.  (The easy way to do this by hand
is to configure as normal, then alter the LIBOBJS setting in
src/Makefile.global to add "qsort.o", then proceed with normal build.)
However, I think that our private copy is descended from *BSD sources,
so it might have the same failure mode.  It'd be worth finding out.

> The final interesting thing is that as I increase shared_buffers to 2000
> or 3000 the problem gets *worse*

shared_buffers is unlikely to impact index build time noticeably in
recent PG releases.  maintenance_work_mem would affect it a lot, though.
What setting were you using for that?

Can anyone else try these test cases on other platforms?

regards, tom lane
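
For reference, a minimal psql snippet for checking and raising the
setting Tom asks about (the 64MB value here is illustrative, not a
recommendation):

show maintenance_work_mem;
-- 16384 (kB), i.e. 16MB, is the 8.1 default
set maintenance_work_mem = 65536;  -- 64MB, for this session only
create index idx on atest(r);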



Re: [PERFORM] Reliability recommendations

2006-02-15 Thread Jeremy Haile
Thanks for everyone's feedback.  I will definitely take the hardware
comments into consideration when purchasing future hardware.  I am
located in Atlanta, GA.  If Dell has such a bad reputation with this
list, does anyone have good vendor recommendations?   

Although most of the responses were hardware-oriented (which was
probably my fault for not clearly stating my question), I am mostly
interested in replication/clustering ways of solving the issue.  My
example of Dell quoting us $50k for a SAN was meant to sound ridiculous
and is definitely not something we are considering.  

What we are really after is a good clustering or replication solution
where we can run PostgreSQL on a small set of servers and have failover
capabilities.  While RAID is great, our last failure was a CPU failure
so a multi-server approach is something we want.  Does anyone have any
recommendations for clustering/replication solutions, regardless
of hardware?  I know there are several open-source and commercial
postgres replication solutions - any good or bad experiences?  Also, any
opinions on shared storage and clustering vs. separate internal storage?

Since performance is not our current bottleneck, I would imagine
Master->Slave replication would be sufficient, although performance
gains are always welcome.  I don't have much experience with setting up
PostgreSQL in a replicated or clustered manner, so anything to point me
in the right direction both hardware and software wise would be
appreciated!

Thanks for all of the responses!

On Wed, 15 Feb 2006 14:53:28 -0500, "Ron" <[EMAIL PROTECTED]> said:
> At 11:21 AM 2/15/2006, Jeremy Haile wrote:
> >We are a small company looking to put together the most cost effective
> >solution for our production database environment.  Currently in
> >production Postgres 8.1 is running on this machine:
> >
> >Dell 2850
> >2 x 3.0 GHz Xeon, 800 MHz FSB, 2 MB cache
> >4 GB DDR2 400 MHz
> >2 x 73 GB 10K SCSI RAID 1 (for xlog and OS)
> >4 x 146 GB 10K SCSI RAID 10 (for postgres data)
> >Perc4ei controller
> >
> >The above is a standard Dell box with nothing added or modified beyond
> >the options available directly through Dell. We had a bad processor last
> >week that effectively put us down for an entire weekend. Though it was
> >the web server that failed, the experience has caused us to step back
> >and spend time coming up with a more reliable/fail-safe solution that
> >can reduce downtime.
> >
> >Our load won't be substantial so extreme performance and load balancing
> >are not huge concerns. We are looking for good performance, at a good
> >price, configured in the most redundant, high availability manner
> >possible. Availability is the biggest priority.
> >
> >I sent our scenario to our sales team at Dell and they came back with
> >all manner of SAN, DAS, and configuration costing as much as $50k.
> >
> >We have the budget to purchase 2-3 additional machines along the lines
> >of the one listed above. As a startup with a limited budget, what would
> >this list suggest as options for clustering/replication or setting our
> >database up well in general?
> 
> 1= Tell Dell "Thanks but no thanks." and do not buy any more 
> equipment from them.  Their value per $$ is less than other options 
> available to you.
> 
> 2= The current best bang for the buck HW (and in many cases, best 
> performing as well) for pg:
>a= AMD K8 and K9 (dual core) CPUs.  Particularly the A64 X2 3800+ 
> when getting the most for your $$ matters a lot
> pg gets a nice performance boost from running in 64b.
>b= Decent Kx server boards are available from Gigabyte, IWill, 
> MSI, Supermicro, and Tyan to name a few.
> IWill has a 2P 16 DIMM slot board that is particularly nice 
> for a server that needs lots of RAM.
>c= Don't bother with SCSI or FC HD's unless you are doing the most 
> demanding kind of OLTP.  SATA II HD's provide better value.
>d= HW RAID controllers are only worth it in certain 
> scenarios.  Using RAID 5 almost always means you should use a HW RAID 
> controller.
>e= The only HW RAID controllers worth the $$ for you are 3ware 
> Escalade 9550SX's and Areca ARC-11xx or ARC-12xx's.
>*For the vast majority of throughput situations, the ARC-1xxx's 
> with >= 1GB of battery backed WB cache are the best value*
>f= 1GB RAM sticks are cheap enough and provide enough value that 
> you should max out any system you get with them.
>g= for +high+ speed fail over, Chelsio and others are now making 
> PCI-X and PCI-E 10GbE NICs at reasonable prices.
> The above should serve as a good "pick list" for the components of 
> any servers you need.
> 
> 3= The most economically sound HW and SW architecture that best suits 
> your performance and reliability needs is context dependent to your 
> specific circumstances.
> 
> 
> Where are you located?
> Ron
> 
> 
> 


Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Gary Doades

Tom Lane wrote:
> Interesting.  I tried your test script and got fairly close times
> for all the cases on two different machines:
> old HPUX machine: shortest 5800 msec, longest 7960 msec
> new Fedora 4 machine: shortest 461 msec, longest 608 msec
> (the HPUX machine was doing other stuff at the same time, so some
> of its variation is probably only noise).
>
> So what this looks like to me is a corner case that FreeBSD's qsort
> fails to handle well.
>
> You might try forcing Postgres to use our private copy of qsort, as we
> do on Solaris for similar reasons.  (The easy way to do this by hand
> is to configure as normal, then alter the LIBOBJS setting in
> src/Makefile.global to add "qsort.o", then proceed with normal build.)
> However, I think that our private copy is descended from *BSD sources,
> so it might have the same failure mode.  It'd be worth finding out.
>
>> The final interesting thing is that as I increase shared_buffers to 2000
>> or 3000 the problem gets *worse*
>
> shared_buffers is unlikely to impact index build time noticeably in
> recent PG releases.  maintenance_work_mem would affect it a lot, though.
> What setting were you using for that?
>
> Can anyone else try these test cases on other platforms?

Thanks for that.

I've since tried it on Windows (pg 8.1.2) and the times were all 
similar, around 1200ms, so it might just be BSD.


I'll have to wait until tomorrow to get back to my BSD box. FreeBSD 
ports makes it easy to install, so I'll have to figure out how to get in 
and change things manually. I guess the appropriate files are still left 
around after the ports make command finishes, so I just edit the file 
and make again?


If it can't be fixed, though, I guess we may have a problem using BSD. I'm 
surprised this hasn't been brought up before; the case doesn't seem 
*that* rare. Maybe not that many people are using FreeBSD?


I'd certainly be interested if anyone else can repro it on FreeBSD though.

Regards,
Gary.




Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Gary Doades

Tom Lane wrote:
> shared_buffers is unlikely to impact index build time noticeably in
> recent PG releases.  maintenance_work_mem would affect it a lot, though.
> What setting were you using for that?

Also, I tried upping maintenance_work_mem to 65536 and it didn't make 
much difference (maybe 10% faster for the "normal" cases). Upping 
shared_buffers *definitely* makes the bad cases worse though, but I 
agree I don't see why...


Regards,
Gary.



Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Simon Riggs
On Wed, 2006-02-15 at 20:00 +, Gary Doades wrote:

> I have put together a test case 

Please enable trace_sort=on and then repeat tests and post the
accompanying log file.

I think this is simply the sort taking longer depending upon the data
distribution, but I'd like to know for sure.

Thanks,

Best Regards, Simon Riggs
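
For anyone repeating the test, a sketch of enabling that tracing from a
psql session (this assumes a build with TRACE_SORT defined, which is the
8.1 default):

set trace_sort = on;
set client_min_messages = log;  -- surface the LOG lines directly in psql
create index idx on atest(r);   -- logs "begin index sort" / "performsort" timings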




Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Tom Lane
I wrote:
> Interesting.  I tried your test script and got fairly close times
> for all the cases on two different machines:
>   old HPUX machine: shortest 5800 msec, longest 7960 msec
>   new Fedora 4 machine: shortest 461 msec, longest 608 msec

> So what this looks like to me is a corner case that FreeBSD's qsort
> fails to handle well.

I tried forcing PG to use src/port/qsort.c on the Fedora machine,
and lo and behold:
new Fedora 4 machine: shortest 434 msec, longest 8530 msec

So it sure looks like this script does expose a problem on BSD-derived
qsorts.  Curiously, the case that's much the worst for me is the third
in the script, while the shortest time is the first case, which was slow
for Gary.  So I'd venture that the *BSD code has been tweaked somewhere
along the way, in a manner that moves the problem around without really
fixing it.  (Anyone want to compare the actual FreeBSD source to what
we have?)

This is pretty relevant stuff, because there was a thread recently
advocating that we stop using the platform qsort on all platforms:
http://archives.postgresql.org/pgsql-hackers/2005-12/msg00610.php

It's really interesting to see a case where port/qsort is radically
worse than other qsorts ... unless we figure that out and fix it,
I think the idea of using port/qsort everywhere has just taken a
major hit.

regards, tom lane



Re: [PERFORM] Postgres slower than MS ACCESS

2006-02-15 Thread Jay Greenfield

I've been vacuuming between each test run.  

Not vacuuming results in times all the way up to 121 minutes.  For a direct
comparison with Access, the vacuuming time with Postgres should really be
included as this is not required with Access.

By removing all of the indexes I have been able to get the Postgres time
down to 4.35 minutes with default settings for all except the following:
fsync:  off
work_mem:  1024000
shared_buffers:  10000

I did a run with checkpoint_segments @ 30 (from 3 in 4.35 min run) and
posted a time of 6.78 minutes.  Any idea why this would increase the time?

Thanks, 

Jay.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jeff Trout
Sent: Wednesday, February 15, 2006 6:23 AM
To: Jay Greenfield
Cc: 'Tom Lane'; 'Stephen Frost'; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Postgres slower than MS ACCESS 


On Feb 14, 2006, at 3:56 PM, Jay Greenfield wrote:

>> How do you get 4,000+ lines of explain analyze for one update  
>> query in a
>> database with only one table?  Something a bit fishy there.   
>> Perhaps you
>> mean explain verbose, though I don't really see how that'd be so long
>> either, but it'd be closer.  Could you provide some more sane
>> information?
>
> My mistake - there was 4,000 lines in the EXPLAIN ANALYZE VERBOSE  
> output.
> Here is the output of EXPLAIN ANALYZE:
>
> QUERY PLAN
> "Seq Scan on ntdn  (cost=0.00..3471884.39 rows=1221391 width=1592)  
> (actual
> time=57292.580..1531300.003 rows=1221391 loops=1)"
> "Total runtime: 4472646.988 ms"
>

Have you been vacuuming or running autovacuum?
If you keep running queries like this you're certainly going to have  
a ton of dead tuples, which would definitely explain these times too.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
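
A quick way to act on Jeff's suggestion, using the table name from the
thread (VACUUM VERBOSE reports how many dead row versions it removes):

vacuum verbose analyze ntdn;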





Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Gary Doades

Tom Lane wrote:
> I tried forcing PG to use src/port/qsort.c on the Fedora machine,
> and lo and behold:
> new Fedora 4 machine: shortest 434 msec, longest 8530 msec
>
> So it sure looks like this script does expose a problem on BSD-derived
> qsorts.  Curiously, the case that's much the worst for me is the third
> in the script, while the shortest time is the first case, which was slow
> for Gary.  So I'd venture that the *BSD code has been tweaked somewhere
> along the way, in a manner that moves the problem around without really
> fixing it.  (Anyone want to compare the actual FreeBSD source to what
> we have?)



If I run the script again, it is not always the first case that is slow, 
it varies from run to run, which is why I repeated it quite a few times 
for the test.


Interestingly, if I don't delete the table after a run, but just drop 
and re-create the index repeatedly it stays a pretty consistent time, 
either repeatedly good or repeatedly bad!


Regards,
Gary.



Re: [PERFORM] Postgres slower than MS ACCESS

2006-02-15 Thread Tom Lane
"Jay Greenfield" <[EMAIL PROTECTED]> writes:
> I did a run with checkpoint_segments @ 30 (from 3 in 4.35 min run) and
> posted a time of 6.78 minutes.  Any idea why this would increase the time?

The first time through might take longer while the machine creates empty
xlog segment files (though I'd not have expected a hit that big).  Once
it's fully populated pg_xlog it'll just recycle the files, so you might
find that a second try is faster.

regards, tom lane



Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> Please enable trace_sort=on and then repeat tests and post the
> accompanying log file.

I did this on my Fedora machine with port/qsort.c, and got the results
attached.  Curiously, this run has the spikes in completely different
places than the prior one did.  So the random component of the test data
is affecting the results quite a lot.  There seems absolutely no doubt
that we are looking at data-dependent qsort misbehavior, though.  The
CPU time eaten by performsort accounts for all but about 100 msec of the
elapsed time reported on the psql side.

regards, tom lane


LOG:  begin index sort: unique = f, workMem = 16384, randomAccess = f
LOG:  performsort starting: CPU 0.00s/0.15u sec elapsed 0.15 sec
LOG:  performsort done: CPU 0.00s/12.43u sec elapsed 12.44 sec
LOG:  internal sort ended, 9861 KB used: CPU 0.01s/12.51u sec elapsed 12.52 sec
LOG:  begin index sort: unique = f, workMem = 16384, randomAccess = f
LOG:  performsort starting: CPU 0.00s/0.14u sec elapsed 0.15 sec
LOG:  performsort done: CPU 0.00s/0.78u sec elapsed 0.78 sec
LOG:  internal sort ended, 9861 KB used: CPU 0.02s/0.85u sec elapsed 0.87 sec
LOG:  begin index sort: unique = f, workMem = 16384, randomAccess = f
LOG:  performsort starting: CPU 0.01s/0.14u sec elapsed 0.15 sec
LOG:  performsort done: CPU 0.01s/0.96u sec elapsed 0.97 sec
LOG:  internal sort ended, 9861 KB used: CPU 0.02s/1.03u sec elapsed 1.06 sec
LOG:  begin index sort: unique = f, workMem = 16384, randomAccess = f
LOG:  performsort starting: CPU 0.00s/0.14u sec elapsed 0.15 sec
LOG:  performsort done: CPU 0.00s/0.31u sec elapsed 0.32 sec
LOG:  internal sort ended, 9861 KB used: CPU 0.02s/0.38u sec elapsed 0.40 sec
LOG:  begin index sort: unique = f, workMem = 16384, randomAccess = f
LOG:  performsort starting: CPU 0.00s/0.14u sec elapsed 0.15 sec
LOG:  performsort done: CPU 0.00s/7.91u sec elapsed 7.92 sec
LOG:  internal sort ended, 9861 KB used: CPU 0.02s/7.99u sec elapsed 8.01 sec
LOG:  begin index sort: unique = f, workMem = 16384, randomAccess = f
LOG:  performsort starting: CPU 0.01s/0.13u sec elapsed 0.15 sec
LOG:  performsort done: CPU 0.01s/0.61u sec elapsed 0.63 sec
LOG:  internal sort ended, 9861 KB used: CPU 0.04s/0.67u sec elapsed 0.71 sec
LOG:  begin index sort: unique = f, workMem = 16384, randomAccess = f
LOG:  performsort starting: CPU 0.01s/0.13u sec elapsed 0.15 sec
LOG:  performsort done: CPU 0.01s/11.52u sec elapsed 11.54 sec
LOG:  internal sort ended, 9861 KB used: CPU 0.03s/11.59u sec elapsed 11.62 sec
LOG:  begin index sort: unique = f, workMem = 16384, randomAccess = f
LOG:  performsort starting: CPU 0.00s/0.14u sec elapsed 0.15 sec
LOG:  performsort done: CPU 0.00s/0.45u sec elapsed 0.46 sec
LOG:  internal sort ended, 9861 KB used: CPU 0.02s/0.55u sec elapsed 0.57 sec
LOG:  begin index sort: unique = f, workMem = 16384, randomAccess = f
LOG:  performsort starting: CPU 0.00s/0.14u sec elapsed 0.15 sec
LOG:  performsort done: CPU 0.00s/0.45u sec elapsed 0.46 sec
LOG:  internal sort ended, 9861 KB used: CPU 0.04s/0.54u sec elapsed 0.57 sec
LOG:  begin index sort: unique = f, workMem = 16384, randomAccess = f
LOG:  performsort starting: CPU 0.02s/0.12u sec elapsed 0.15 sec
LOG:  performsort done: CPU 0.02s/0.44u sec elapsed 0.46 sec
LOG:  internal sort ended, 9861 KB used: CPU 0.03s/0.55u sec elapsed 0.58 sec
LOG:  begin index sort: unique = f, workMem = 16384, randomAccess = f
LOG:  performsort starting: CPU 0.02s/0.13u sec elapsed 0.15 sec
LOG:  performsort done: CPU 0.02s/0.44u sec elapsed 0.46 sec
LOG:  internal sort ended, 9861 KB used: CPU 0.03s/0.54u sec elapsed 0.58 sec
LOG:  begin index sort: unique = f, workMem = 16384, randomAccess = f
LOG:  performsort starting: CPU 0.02s/0.13u sec elapsed 0.15 sec
LOG:  performsort done: CPU 0.02s/0.44u sec elapsed 0.46 sec
LOG:  internal sort ended, 9861 KB used: CPU 0.04s/0.54u sec elapsed 0.59 sec



Re: [PERFORM] Reliability recommendations

2006-02-15 Thread Josh Rovero

Jeremy Haile wrote:
> Thanks for everyone's feedback.  I will definitely take the hardware
> comments into consideration when purchasing future hardware.  I am
> located in Atlanta, GA.  If Dell has such a bad reputation with this
> list, does anyone have good vendor recommendations?

I can recommend Penguin Computing (even our windows weenies like them),
ASA, and HP Proliant.  AMD Opteron *is* the way to go.





Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Tom Lane
Gary Doades <[EMAIL PROTECTED]> writes:
> Interestingly, if I don't delete the table after a run, but just drop 
> and re-create the index repeatedly it stays a pretty consistent time, 
> either repeatedly good or repeatedly bad!

This is consistent with the theory of a data-dependent performance
problem in qsort.  If you don't generate a fresh set of random test
data, then you get repeatable runtimes.  With a new set of test data,
you might or might not hit the not-so-sweet-spot that we seem to have
detected.

regards, tom lane



Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Gary Doades

Tom Lane wrote:
> So it sure looks like this script does expose a problem on BSD-derived
> qsorts.  Curiously, the case that's much the worst for me is the third
> in the script, while the shortest time is the first case, which was slow
> for Gary.  So I'd venture that the *BSD code has been tweaked somewhere
> along the way, in a manner that moves the problem around without really
> fixing it.  (Anyone want to compare the actual FreeBSD source to what
> we have?)
>
> It's really interesting to see a case where port/qsort is radically
> worse than other qsorts ... unless we figure that out and fix it,
> I think the idea of using port/qsort everywhere has just taken a
> major hit.



More specifically to BSD, is there any way I can use a non-BSD qsort for 
building the PostgreSQL server?


Regards,
Gary.



qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Tom Lane
Gary Doades <[EMAIL PROTECTED]> writes:
> If I run the script again, it is not always the first case that is slow, 
> it varies from run to run, which is why I repeated it quite a few times 
> for the test.

For some reason I hadn't immediately twigged to the fact that your test
script is just N repetitions of the exact same structure with random data.
So it's not so surprising that you get random variations in behavior
with different test data sets.

I did some experimentation comparing the qsort from Fedora Core 4
(glibc-2.3.5-10.3) with our src/port/qsort.c.  For those who weren't
following the pgsql-performance thread, the test case is just this
repeated a lot of times:

create table atest(i int4, r int4);
insert into atest (i,r) select generate_series(1,100000), 0;
insert into atest (i,r) select generate_series(1,100000), random()*100000;
\timing
create index idx on atest(r);
\timing
drop table atest;

I did this 100 times and sorted the reported runtimes.  (Investigation
with trace_sort = on confirms that the runtime is almost entirely spent
in qsort() called from our performsort --- the Postgres overhead is
about 100msec on this machine.)  Results are below.

It seems clear that our qsort.c is doing a pretty awful job of picking
qsort pivots, while glibc is mostly managing not to make that mistake.
I haven't looked at the glibc code yet to see what they are doing
differently.

I'd say this puts a considerable damper on my enthusiasm for using our
qsort all the time, as was recently debated in this thread:
http://archives.postgresql.org/pgsql-hackers/2005-12/msg00610.php
We need to fix our qsort.c before pushing ahead with that idea.

regards, tom lane


100 runtimes for glibc qsort, sorted ascending:

Time: 459.860 ms
Time: 460.209 ms
Time: 460.704 ms
Time: 461.317 ms
Time: 461.538 ms
Time: 461.652 ms
Time: 461.988 ms
Time: 462.573 ms
Time: 462.638 ms
Time: 462.716 ms
Time: 462.917 ms
Time: 463.219 ms
Time: 463.455 ms
Time: 463.650 ms
Time: 463.723 ms
Time: 463.737 ms
Time: 463.750 ms
Time: 463.852 ms
Time: 463.964 ms
Time: 463.988 ms
Time: 464.003 ms
Time: 464.135 ms
Time: 464.372 ms
Time: 464.458 ms
Time: 464.496 ms
Time: 464.551 ms
Time: 464.599 ms
Time: 464.655 ms
Time: 464.656 ms
Time: 464.722 ms
Time: 464.814 ms
Time: 464.827 ms
Time: 464.878 ms
Time: 464.899 ms
Time: 464.905 ms
Time: 464.987 ms
Time: 465.055 ms
Time: 465.138 ms
Time: 465.159 ms
Time: 465.194 ms
Time: 465.310 ms
Time: 465.316 ms
Time: 465.375 ms
Time: 465.450 ms
Time: 465.535 ms
Time: 465.595 ms
Time: 465.680 ms
Time: 465.769 ms
Time: 465.865 ms
Time: 465.892 ms
Time: 465.903 ms
Time: 466.003 ms
Time: 466.154 ms
Time: 466.164 ms
Time: 466.203 ms
Time: 466.305 ms
Time: 466.344 ms
Time: 466.364 ms
Time: 466.388 ms
Time: 466.502 ms
Time: 466.593 ms
Time: 466.725 ms
Time: 466.794 ms
Time: 466.798 ms
Time: 466.904 ms
Time: 466.971 ms
Time: 466.997 ms
Time: 467.122 ms
Time: 467.146 ms
Time: 467.221 ms
Time: 467.224 ms
Time: 467.244 ms
Time: 467.277 ms
Time: 467.587 ms
Time: 468.142 ms
Time: 468.207 ms
Time: 468.237 ms
Time: 468.471 ms
Time: 468.663 ms
Time: 468.700 ms
Time: 469.235 ms
Time: 469.840 ms
Time: 470.472 ms
Time: 471.140 ms
Time: 472.811 ms
Time: 472.959 ms
Time: 474.858 ms
Time: 477.210 ms
Time: 479.571 ms
Time: 479.671 ms
Time: 482.797 ms
Time: 488.852 ms
Time: 514.639 ms
Time: 529.287 ms
Time: 612.185 ms
Time: 660.748 ms
Time: 742.227 ms
Time: 866.814 ms
Time: 1234.848 ms
Time: 1267.398 ms


100 runtimes for port/qsort.c, sorted ascending:

Time: 418.905 ms
Time: 420.611 ms
Time: 420.764 ms
Time: 420.904 ms
Time: 421.706 ms
Time: 422.466 ms
Time: 422.627 ms
Time: 423.189 ms
Time: 423.302 ms
Time: 425.096 ms
Time: 425.731 ms
Time: 425.851 ms
Time: 427.253 ms
Time: 430.113 ms
Time: 432.756 ms
Time: 432.963 ms
Time: 440.502 ms
Time: 440.640 ms
Time: 450.452 ms
Time: 458.143 ms
Time: 459.212 ms
Time: 467.706 ms
Time: 468.006 ms
Time: 468.574 ms
Time: 470.003 ms
Time: 472.313 ms
Time: 483.622 ms
Time: 492.395 ms
Time: 509.564 ms
Time: 531.037 ms
Time: 533.366 ms
Time: 535.610 ms
Time: 575.523 ms
Time: 582.688 ms
Time: 593.545 ms
Time: 647.364 ms
Time: 660.612 ms
Time: 677.312 ms
Time: 680.288 ms
Time: 697.626 ms
Time: 833.066 ms
Time: 834.511 ms
Time: 851.819 ms
Time: 920.443 ms
Time: 926.731 ms
Time: 954.289 ms
Time: 1045.214 ms
Time: 1059.200 ms
Time: 1062.328 ms
Time: 1136.018 ms
Time: 1260.091 ms
Time: 1276.883 ms
Time: 1319.351 ms
Time: 1438.854 ms
Time: 1475.457 ms
Time: 1538.211 ms
Time: 1549.004 ms
Time: 1744.642 ms
Time: 1771.258 ms
Time: 1959.530 ms
Time: 2300.140 ms
Time: 2589.641 ms
Time: 2612.780 ms
Time: 3100.024 ms
Time: 3284.125 ms
Time: 3379.792 ms
Time: 3750.278 ms
Time: 4302.278 ms
Time: 4780.624 ms
Time: 5000.056 ms
Time: 5092.604 ms
Time: 5168.722 ms
Time: 5292.941 ms
Time: 5895.964 ms
Time: 7003.164 ms
Time: 7099.449 ms
Time: 7115.083 ms
Time: 7384.940 ms
Time: 8214.010 ms
Time: 8700.771 ms
Time: 9331.225 ms
Time: 10503.360 ms
Time: 12496.026 ms
Time: 12982.474 ms
Time: 15192.39

Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Simon Riggs
On Wed, 2006-02-15 at 16:51 -0500, Tom Lane wrote:
> Gary Doades <[EMAIL PROTECTED]> writes:
> > Interestingly, if I don't delete the table after a run, but just drop 
> > and re-create the index repeatedly it stays a pretty consistent time, 
> > either repeatedly good or repeatedly bad!
> 
> This is consistent with the theory of a data-dependent performance
> problem in qsort.  If you don't generate a fresh set of random test
> data, then you get repeatable runtimes.  With a new set of test data,
> you might or might not hit the not-so-sweet-spot that we seem to have
> detected.

Agreed. Good analysis...

Best Regards, Simon Riggs




Re: qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Gary Doades

Tom Lane wrote:
> For some reason I hadn't immediately twigged to the fact that your test
> script is just N repetitions of the exact same structure with random data.
> So it's not so surprising that you get random variations in behavior
> with different test data sets.
>
> It seems clear that our qsort.c is doing a pretty awful job of picking
> qsort pivots, while glibc is mostly managing not to make that mistake.
> I haven't looked at the glibc code yet to see what they are doing
> differently.
>
> I'd say this puts a considerable damper on my enthusiasm for using our
> qsort all the time, as was recently debated in this thread:
> http://archives.postgresql.org/pgsql-hackers/2005-12/msg00610.php
> We need to fix our qsort.c before pushing ahead with that idea.

[snip]

> Time: 28314.182 ms
> Time: 29400.278 ms
> Time: 34142.534 ms


Ouch! That confirms my problem. I generated the random test case because 
it was easier than including the dump of my tables, but you can 
appreciate that tables 20 times the size are basically crippled when it 
comes to creating an index on them.


Examining the dump and the associated times during restore it looks like 
I have 7 tables with this approximate distribution, thus the 
ridiculously long restore time. Better not re-index soon!


Is this likely to hit me in a random fashion during normal operation, 
joins, sorts, order by for example?


So the options are:
1) Fix the included qsort.c code and use that
2) Get FreeBSD to fix their qsort code
3) Both

I guess that 1 is the real solution in case anyone else's qsort is 
broken in the same way. Then at least you *could* use it all the time :)


Regards,
Gary.






Re: qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Tom Lane
Gary Doades <[EMAIL PROTECTED]> writes:
> Is this likely to hit me in a random fashion during normal operation, 
> joins, sorts, order by for example?

Yup, anytime you're passing data with that kind of distribution
through a sort.

> So the options are:
> 1) Fix the included qsort.c code and use that
> 2) Get FreeBSD to fix their qsort code
> 3) Both

> I guess that 1 is the real solution in case anyone else's qsort is 
> broken in the same way. Then at least you *could* use it all the time :)

It's reasonable to assume that most of the *BSDen have basically the
same qsort code.  Ours claims to have come from NetBSD sources, but
I don't doubt that they all trace back to a common ancestor.

regards, tom lane



Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Tom Lane
Gary Doades <[EMAIL PROTECTED]> writes:
> Ouch! That confirms my problem. I generated the random test case because 
> it was easier than including the dump of my tables, but you can 
> appreciate that tables 20 times the size are basically crippled when it 
> comes to creating an index on them.

Actually... we only use qsort when we have a sorting problem that fits
within the allowed sort memory.  The external-sort logic doesn't go
through that code at all.  So all the analysis we just did on your test
case doesn't necessarily apply to sort problems that are too large for
the sort_mem setting.

The test case would be sorting 200,000 index entries, which'd probably
occupy at least 24 bytes apiece of sort memory, so probably about 5 meg.
A problem 20 times that size would definitely not fit in the default
16MB maintenance_work_mem.  Were you using a large value of
maintenance_work_mem for your restore?

regards, tom lane
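
Tom's arithmetic can be sanity-checked from psql (pg_size_pretty() is
available in 8.1):

select pg_size_pretty(200000::bigint * 24);
-- 4688 kB, i.e. roughly the 5 meg estimated above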



Re: qsort again (was Re: [PERFORM] Strange Create Index

2006-02-15 Thread Ron
This behavior is consistent with the pivot choosing algorithm 
assuming certain distribution(s) for the data.  For instance, 
median-of-three partitioning is known to be pessimal when the data is 
geometrically or hyper-geometrically distributed.  Care must also be 
taken when there are many equal values in the data.  Even pseudo 
random number generator based pivot choosing algorithms are not 
immune if the PRNG is flawed in some way.


How are we choosing our pivots?


At 06:28 PM 2/15/2006, Tom Lane wrote:

> I did some experimentation comparing the qsort from Fedora Core 4
> (glibc-2.3.5-10.3) with our src/port/qsort.c.  For those who weren't
> following the pgsql-performance thread, the test case is just this
> repeated a lot of times:
>
> create table atest(i int4, r int4);
> insert into atest (i,r) select generate_series(1,100000), 0;
> insert into atest (i,r) select generate_series(1,100000), random()*100000;
> \timing
> create index idx on atest(r);
> \timing
> drop table atest;
>
> I did this 100 times and sorted the reported runtimes.  (Investigation
> with trace_sort = on confirms that the runtime is almost entirely spent
> in qsort() called from our performsort --- the Postgres overhead is
> about 100msec on this machine.)  Results are below.
>
> It seems clear that our qsort.c is doing a pretty awful job of picking
> qsort pivots, while glibc is mostly managing not to make that mistake.
> I haven't looked at the glibc code yet to see what they are doing
> differently.
>
> I'd say this puts a considerable damper on my enthusiasm for using our
> qsort all the time, as was recently debated in this thread:
> http://archives.postgresql.org/pgsql-hackers/2005-12/msg00610.php
> We need to fix our qsort.c before pushing ahead with that idea.
>
> regards, tom lane
>
> 100 runtimes for glibc qsort, sorted ascending:
>
> Time: 459.860 ms
>
> Time: 488.852 ms
> Time: 514.639 ms
> Time: 529.287 ms
> Time: 612.185 ms
> Time: 660.748 ms
> Time: 742.227 ms
> Time: 866.814 ms
> Time: 1234.848 ms
> Time: 1267.398 ms
>
> 100 runtimes for port/qsort.c, sorted ascending:
>
> Time: 418.905 ms
>
> Time: 20865.979 ms
> Time: 21000.907 ms
> Time: 21297.585 ms
> Time: 21714.518 ms
> Time: 25423.235 ms
> Time: 27543.052 ms
> Time: 28314.182 ms
> Time: 29400.278 ms
> Time: 34142.534 ms






Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Tom Lane
I wrote:
> Gary Doades <[EMAIL PROTECTED]> writes:
>> Ouch! That confirms my problem. I generated the random test case because 
>> it was easier than including the dump of my tables, but you can 
>> appreciate that tables 20 times the size are basically crippled when it 
>> comes to creating an index on them.

> Actually... we only use qsort when we have a sorting problem that fits
> within the allowed sort memory.  The external-sort logic doesn't go
> through that code at all.  So all the analysis we just did on your test
> case doesn't necessarily apply to sort problems that are too large for
> the sort_mem setting.

I increased the size of the test case by 10x (basically s/100000/1000000/)
which is enough to push it into the external-sort regime.  I get
amazingly stable runtimes now --- I didn't have the patience to run 100
trials, but in 30 trials I have slowest 11538 msec and fastest 11144 msec.
So this code path is definitely not very sensitive to this data
distribution.

While these numbers aren't glittering in comparison to the best-case
qsort times (~450 msec to sort 10% as much data), they are sure a lot
better than the worst-case times.  So maybe a workaround for you is
to decrease maintenance_work_mem, counterintuitive though that be.
(Now, if you *weren't* using maintenance_work_mem of 100MB or more
for your problem restore, then I'm not sure I know what's going on...)

We still ought to try to fix qsort of course.

regards, tom lane
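
A sketch of that workaround in session terms; the value is illustrative
and just needs to be smaller than the sort's in-memory footprint:

set maintenance_work_mem = 4096;  -- 4MB, below the ~5MB the test sort needs
create index idx on atest(r);     -- now takes the external-sort path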



Re: qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Tom Lane
Ron <[EMAIL PROTECTED]> writes:
> How are we choosing our pivots?

See qsort.c: it looks like median of nine equally spaced inputs (ie,
the 1/8th points of the initial input array, plus the end points),
implemented as two rounds of median-of-three choices.  With half of the
data inputs zero, it's not too improbable for two out of the three
samples to be zeroes in which case I think the med3 result will be zero
--- so choosing a pivot of zero is much more probable than one would
like, and doing so in many levels of recursion causes the problem.

I think.  I'm not too sure if the code isn't just being sloppy about the
case where many data values are equal to the pivot --- there's a special
case there to switch to insertion sort, and maybe that's getting invoked
too soon.  It'd be useful to get a line-level profile of the behavior of
this code in the slow cases...

regards, tom lane



Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Simon Riggs
On Wed, 2006-02-15 at 20:00 +, Gary Doades wrote:

> I have put together a test case that demonstrates the problem (see 
> below). I create a simple table, as close in structure to one of my 
> problem tables and populate an integer column with 100,000 zeros follow 
> by 100,000 random integers between 0 and 100,000. Then create an index 
> on this column. I then drop the table and repeat. The create index 
> should take around 1-2 seconds. A fair proportion of the time it takes 
> 50 seconds!!!
> 
> If I fill the same row with all random data the create index always 
> takes a second or two. If I fill the column with all zeros everything is 
> still OK.

Aside from the importance of investigating sort behaviour, have you
tried to build a partial index WHERE col > 0 ? That way you wouldn't
even be indexing the zeros.

Best Regards, Simon Riggs
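
Against the test table, Simon's suggestion would look something like
this (the index name is made up for illustration; note that a query
must repeat the predicate for the planner to consider the partial
index):

create index idx_nonzero on atest(r) where r > 0;
select count(*) from atest where r > 0 and r < 1000;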






Re: qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Christopher Kings-Lynne
> Ouch! That confirms my problem. I generated the random test case because 
> it was easier than including the dump of my tables, but you can 
> appreciate that tables 20 times the size are basically crippled when it 
> comes to creating an index on them.

I have to say that I restored a few-gigabyte dump on FreeBSD 4.x the 
other day, and most of the restore time was in index creation - I didn't 
think too much of it at the time though.


Chris




Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-15 Thread Simon Riggs
On Wed, 2006-02-15 at 19:59 -0500, Tom Lane wrote:

>  I get
> amazingly stable runtimes now --- I didn't have the patience to run 100
> trials, but in 30 trials I have slowest 11538 msec and fastest 11144 msec.
> So this code path is definitely not very sensitive to this data
> distribution.

"The worst-case behavior of replacement-selection is very close to its
average behavior, while the worst-case behavior of QuickSort is terrible
(N^2) – a strong argument in favor of replacement-selection. Despite this
risk, QuickSort is widely used because, in practice, it has superior
performance." p.8, "AlphaSort: A Cache-Sensitive Parallel External
Sort", Nyberg et al, VLDB Journal 4(4): 603-627 (1995)

I think your other comment about flipping to insertion sort too early
(and not returning...) is a plausible cause for the poor pg qsort
behaviour, but the overall spread of values seems as expected.

Some test results I've seen seem consistent with the view that
increasing memory also increases run-time for larger settings of
work_mem/maintenance_work_mem. Certainly, as I observed a while back,
having a large memory settings doesn't help you at all when you are
doing final run merging on the external sort. Whatever we do, we should
look at the value high memory settings bring to each phase of a sort
separately from the other phases.

There is work underway on improving external sorts, so I hear (not me).
Plus my WIP on randomAccess requirements.

Best Regards, Simon Riggs




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-15 Thread Neil Conway
On Wed, 2006-02-15 at 18:28 -0500, Tom Lane wrote:
> It seems clear that our qsort.c is doing a pretty awful job of picking
> qsort pivots, while glibc is mostly managing not to make that mistake.
> I haven't looked at the glibc code yet to see what they are doing
> differently.

glibc qsort is actually merge sort, so I'm not surprised it avoids this
problem.

-Neil





Re: qsort again (was Re: [PERFORM] Strange Create Index

2006-02-15 Thread Ron

At 08:21 PM 2/15/2006, Tom Lane wrote:

> Ron <[EMAIL PROTECTED]> writes:
>> How are we choosing our pivots?
>
> See qsort.c: it looks like median of nine equally spaced inputs (ie,
> the 1/8th points of the initial input array, plus the end points),
> implemented as two rounds of median-of-three choices.


OK, this is a bad way to do median-of-n partitioning for a few 
reasons.  See Sedgewick's PhD thesis for details.


Basically, if one is using median-of-n partitioning to choose a 
pivot, one should do it in =one= pass, and n for that pass should be 
<= the number of registers in the CPU.  Since the x86 ISA has 8 
GPRs, n should be <= 8.  7 for instance.


Special purposing the median-of-n code so that the minimal number of 
comparisons and moves is used to sort the sample and then 
"partitioning in place" is the best way to do it.  In addition, care 
must be taken to deal with the possibility that many of the keys may be equal.


The (pseudo) code looks something like this:

qs(a[], L, R){
  if((R-L) > SAMPLE_SIZE){ // Not worth using qs for too few elements
    SortSample(SAMPLE_SIZE, a[], L, R);
    // Sorts SAMPLE_SIZE= n elements and does median-of-n partitioning
    // for small n, using the minimal number of comparisons and moves.
    // In the process it ends up partitioning the first n/2 and last
    // n/2 elements.
    // SAMPLE_SIZE is a constant chosen to work best for a given CPU:
    // #GPRs - 1 is a good initial guess.
    // For the x86 ISA, #GPRs - 1 = 7.  For native x86-64, it's 15.
    // For most RISC CPUs it's 31 or 63.  For Itanium, it's 127 (!)
    pivot= a[(L+R)>>1]; i= L+(SAMPLE_SIZE>>1); j= R-(SAMPLE_SIZE>>1);
    for(;;){
      while(a[++i] < pivot);
      while(a[--j] > pivot);
      if(i >= j) break;
      if(a[i] > a[j]) swap(a[i], a[j]);
    }
    // recurse into the smaller partition first to bound stack depth
    if((i-L) <= (R-i)){ qs(a, L, i-1); qs(a, i, R); }
    else{ qs(a, i, R); qs(a, L, i-1); }
  } else {
    OofN^2_Sort(a, L, R);
    // SelectSort may be better than InsertSort if KeySize in bits <<
    // RecordSize in bits
  }
} // End of qs

Given that the most common CPU ISA in existence has 8 GPRs, 
SAMPLE_SIZE= 7 is probably optimal:

t= (L+R);
the set would be {L; t/8; t/4; t/2; 3*t/4; 7*t/8; R}
==> {L; t>>3; t>>2; t>>1; (3*t)>>2; (7*t)>>3; R} as the locations.

Even better (and more easily scaled as the number of GPRs in the CPU 
changes) is to use the set {L; L+1; L+2; t>>1; R-2; R-1; R}.
This means that instead of 7 random memory accesses, we have 3, two 
of which result in a burst access for three elements each.
That's much faster; _and_ using a sample of 9, 15, 31, 63, etc (to a 
max of ~#GPRs - 1) elements is more easily done.


It also means that the work we do sorting the sample can be taken 
advantage of when starting the inner loop of quicksort: items L..L+2, 
t, and R-2..R are already partitioned by SortSample().


Ensuring that the minimum number of comparisons and moves is done in 
SortSample can be done by using a code generator to create a 
comparison tree that identifies which permutation(s) of n we are 
dealing with and then moves them into place with the minimal number 
of moves.


SIDE NOTE: IIRC glibc's qsort is actually merge sort.  Merge sort 
performance is insensitive to all inputs, and there are ways to 
optimize it as well.


I'll leave the actual coding to someone who knows the pg source 
better than I do.
Ron 



