[PERFORM] One huge db vs many small dbs

2013-12-05 Thread Max
Hello,

We are starting a new project to deploy a solution in the cloud, with the 
possibility of being used by 2,000+ clients. Each of these clients will use 
several tables to store their information (our model has about 500+ tables, but 
fewer than 100 core tables see heavy use). Also, the projected amount of 
information per client could range from small (a few hundred tuples/MB) to huge 
(a few million tuples/GB).

One of the many questions we have is about the performance of the db if we work 
with only one (using a ClientID to separate each client's info) or thousands of 
separate dbs. Managing the dbs is not a huge concern, as we have an 
automated tool.
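For illustration, the two designs being compared can be sketched in SQL (the table and column names here are hypothetical, not from our actual model):

```sql
-- Design 1: one database, shared tables, rows discriminated by a ClientID
-- column; every query must filter on it, and it leads each key/index.
CREATE TABLE invoice (
    client_id  integer NOT NULL,
    invoice_id integer NOT NULL,
    amount     numeric,
    PRIMARY KEY (client_id, invoice_id)
);

-- Design 2 (a common middle ground between one db and thousands of dbs):
-- one database, one schema per client, identical tables in each schema.
CREATE SCHEMA client_0001;
CREATE TABLE client_0001.invoice (
    invoice_id integer PRIMARY KEY,
    amount     numeric
);
```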

Google turns up lots of cases on this subject, but none with a scenario that 
matches the one I presented above, so I would like to know if anyone 
here has a similar situation or relevant knowledge and could share some thoughts.


Thanks

Max

[PERFORM] Shared memory usage

2007-08-25 Thread Max Zorloff

Hello.

I have a postgres 8.0 and a ~400mb database with lots of simple selects  
using indexes.
I've installed pgpool on the system. I've set num_init_children to 5, and  
here is the top output.
One of the postmasters is my daemon running some insert/update tasks. I see  
that they all use the cpu heavily, but do not use the shared memory.  
shared_buffers is set to 6, yet they use a minimal part of that. I'd  
like to know why they won't use more. All the indexes and half of the  
database should be in shared memory, shouldn't they? Or am I completely  
missing what shared_buffers is for? If so, then how do I put my  
indexes and at least part of the data into memory?


top - 00:12:35 up 50 days, 13:22,  8 users,  load average: 4.84, 9.71, 13.22

Tasks: 279 total,  10 running, 268 sleeping,   1 stopped,   0 zombie
Cpu(s): 50.0% us, 12.9% sy,  0.0% ni, 33.2% id,  1.8% wa,  0.0% hi,  2.1% si

Mem:   6102304k total,  4206948k used,  1895356k free,   159436k buffers
Swap:  1959888k total,    12304k used,  1947584k free,  2919816k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
11492 postgres  16   0  530m  72m  60m S   14  1.2   0:50.91 postmaster
11493 postgres  16   0  531m  72m  60m R   14  1.2   0:48.78 postmaster
11490 postgres  15   0  530m  71m  59m S   13  1.2   0:50.26 postmaster
11491 postgres  15   0  531m  75m  62m S   11  1.3   0:50.67 postmaster
11495 postgres  16   0  530m  71m  59m R   10  1.2   0:50.71 postmaster
10195 postgres  15   0  536m  84m  66m S    6  1.4   1:11.72 postmaster

postgresql.conf:

shared_buffers = 6
work_mem = 2048
maintenance_work_mem = 256000

The rest are basically default values

Thank you in advance.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] [GENERAL] Shared memory usage

2007-08-25 Thread Max Zorloff
On Sun, 26 Aug 2007 00:39:52 +0400, Martijn van Oosterhout  
<[EMAIL PROTECTED]> wrote:



On Sun, Aug 26, 2007 at 01:22:58AM +0400, Max Zorloff wrote:

Hello.

I have a postgres 8.0 and ~400mb database with lots of simple selects
using indexes.
I've installed pgpool on the system. I've set num_init_children to 5 and
here is the top output.
One of postmasters is my demon running some insert/update tasks. I see
that they all use cpu heavily, but do not use the shared memory.
shared_buffers is set to 6, yet they use a minimal part of that. I'd
like to know why won't they use more? All the indexes and half of the
database should be in the shared memory, is it not? Or am I completely
missing what are the shared_buffers for? If so, then how do I put my
indexes and at least a part of the data into memory?


shared_memory is used for caching. It is filled as stuff is used. If
you're not using all of it that means it isn't needed. Remember, it is
not the only cache. Since your database is only 400MB it will fit
entirely inside the OS disk cache, so you really don't need much shared
memory at all.

Loading stuff into memory for the hell of it is a waste, let the system
manage the memory itself, if it needs it, it'll use it.

Have a nice day,


Could it be that most of the cpu usage is from lots of fast indexed sql  
queries wrapped in sql functions?



Re: [PERFORM] [GENERAL] Shared memory usage

2007-08-25 Thread Max Zorloff
On Sun, 26 Aug 2007 00:39:52 +0400, Martijn van Oosterhout  
<[EMAIL PROTECTED]> wrote:



On Sun, Aug 26, 2007 at 01:22:58AM +0400, Max Zorloff wrote:

Hello.


shared_memory is used for caching. It is filled as stuff is used. If
you're not using all of it that means it isn't needed. Remember, it is
not the only cache. Since your database is only 400MB it will fit
entirely inside the OS disk cache, so you really don't need much shared
memory at all.

Loading stuff into memory for the hell of it is a waste, let the system
manage the memory itself, if it needs it, it'll use it.



Where do I find my OS disk cache settings? I'm using Linux.



Re: [PERFORM] Shared memory usage

2007-08-27 Thread Max Zorloff
On Mon, 27 Aug 2007 14:21:43 +0400, Adam Tauno Williams  
<[EMAIL PROTECTED]> wrote:



I have a postgres 8.0 and ~400mb database with lots of simple selects
using indexes.
I've installed pgpool on the system. I've set num_init_children to 5 and
here is the top output.
One of postmasters is my demon running some insert/update tasks. I see
that they all use cpu heavily, but do not use the shared memory.
shared_buffers is set to 6, yet they use a minimal part of that. I'd
like to know why won't they use more?


This just looks like the output of top;  what is telling you that
PostgreSQL is not using the shared memory?  Enable statistics collection
and then look in pg_statio_user_tables.


I have it enabled. How can I tell whether the shared memory is used from  
the information in this table?
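As a sketch of one way to read that view (with statistics collection enabled): blocks found in shared_buffers count as heap_blks_hit, and blocks fetched from the kernel count as heap_blks_read, so the two can be turned into a per-table cache hit ratio:

```sql
SELECT relname,
       heap_blks_read,
       heap_blks_hit,
       round(heap_blks_hit::numeric
             / nullif(heap_blks_hit + heap_blks_read, 0), 2) AS hit_ratio
FROM pg_statio_user_tables
ORDER BY heap_blks_hit + heap_blks_read DESC;
```

A ratio near 1.0 means the shared buffers are being used, even if top's SHR column doesn't show it.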




Re: [PERFORM] Shared memory usage

2007-08-29 Thread Max Zorloff
On Wed, 29 Aug 2007 23:26:06 +0400, Greg Smith <[EMAIL PROTECTED]>  
wrote:


First off, posting to two lists like you did (-general and -performance)  
is frowned on here.  Pick whichever is more appropriate for the topic  
and post to just that one; in your case, the performance list would be  
more appropriate, and I'm only replying to there.


Sorry, didn't know that.


On Sun, 26 Aug 2007, Max Zorloff wrote:


shared_buffers is set to 6, yet they use a minimal part of that.
PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
11492 postgres  16   0  530m  72m  60m S   14  1.2   0:50.91 postmaster


Looks to me like PostgreSQL is grabbing 530MB worth of memory on your  
system.  Run the ipcs command to see how big the block dedicated  
to the main server is; I suspect you'll find it's at 400MB, just as you  
expect it to be.  Here's an example from my server, which has 256MB of  
shared_buffers:


-bash-3.00$ ipcs
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x0052e2c1 1114114    postgres   600        277856256  3

Also:  when you've got top running, hit the "c" key and the postmaster  
processes will give you more information about what they're doing, which  
you may find helpful.


All the indexes and half of the database should be in the shared  
memory, is it not? Or am I completely missing what are the  
shared_buffers for? If so, then how do I put my indexes and at least a  
part of the data into memory?


You can find out what's inside the shared_buffers cache by installing the  
contrib/pg_buffercache module against your database.  The  
README.pg_buffercache file in there gives instructions on how to install  
it, and the sample query provided there should tell you what you're  
looking for here.
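The sample query in that README is along these lines (a sketch; on the 8.x contrib module, the view exposes a relfilenode column to join against pg_class):

```sql
-- Which relations occupy shared_buffers, by buffer count
SELECT c.relname, count(*) AS buffers
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = c.relfilenode
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 10;
```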


Thanks, I'll look into that.


Where do I find my OS disk cache settings? I'm using Linux.


You can get a summary of how much memory Linux is using to cache data by  
running the free command, and more in-depth information is available if  
you look at the /proc/meminfo information.  I have a paper you may find  
helpful here, it has more detail in it than you need but it provides  
some pointers to resources to help you better understand how memory  
management in Linux works:  
http://www.westnet.com/~gsmith/content/linux-pdflush.htm


Thanks for that, too.



[PERFORM] Large (almost 50%!) performance drop after upgrading to 8.4.4?

2010-06-09 Thread Max Williams
Hi,
I was doing some benchmarking while changing configuration options to try to 
get more performance out of our postgresql servers, and I noticed that when 
running pgbench against 8.4.3 vs 8.4.4 on identical hardware and configuration 
there is a large difference in performance. I know tuning is a very deep topic 
and benchmarking is hardly an accurate indication of real-world performance, but 
I was still surprised by these results and wanted to know what I am doing wrong.

OS is CentOS 5.5 and the postgresql packages are from the pgdg repo.

Hardware specs are:
2x Quad core Xeons 2.4Ghz
16GB RAM
2x RAID1 7.2k RPM disks (slow I know, but we are upgrading them soon..)

Relevant Postgresql Configuration:
max_connections = 1000
shared_buffers = 4096MB
temp_buffers = 8MB
max_prepared_transactions = 1000
work_mem = 8MB
maintenance_work_mem = 512MB
wal_buffers = 8MB
checkpoint_segments = 192
checkpoint_timeout = 30min
effective_cache_size = 12288MB

Results for the 8.4.3 (8.4.3-2PGDG.el5) host:
[r...@some-host ~]# pgbench -h dbs3 -U postgres -i -s 100 pgbench1 > /dev/null 
2>&1 && pgbench -h dbs3 -U postgres -c 100 -t 10 pgbench1
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
query mode: simple
number of clients: 100
number of transactions per client: 10
number of transactions actually processed: 1000/1000
tps = 5139.554921 (including connections establishing)
tps = 5140.325850 (excluding connections establishing)
opreport:
CPU: Intel Core/i7, speed 2394.07 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask 
of 0x00 (No unit mask) count 10
CPU_CLK_UNHALT...|
  samples|  %|
--
 37705832 61.3683 postgres
 18472598 30.0652 no-vmlinux
  4982274  8.1089 libc-2.5.so
   138517  0.2254 oprofiled
   134628  0.2191 libm-2.5.so
 1465  0.0024 libc-2.5.so
 1454  0.0024 libperl.so
  793  0.0013 libdcsupt.so.5.9.2
  444 7.2e-04 dsm_sa_datamgrd
CPU_CLK_UNHALT...|
  samples|  %|
--
  401 90.3153 dsm_sa_datamgrd
   43  9.6847 anon (tgid:8013 range:0xe000-0xf000)
  410 6.7e-04 libxml2.so.2.6.26
  356 5.8e-04 ld-2.5.so
  332 5.4e-04 libnetsnmp.so.10.0.3
  327 5.3e-04 dsm_sa_snmpd
CPU_CLK_UNHALT...|
  samples|  %|
--
  255 77.9817 dsm_sa_snmpd
   72 22.0183 anon (tgid:8146 range:0xe000-0xf000)
  304 4.9e-04 libcrypto.so.0.9.8e
  290 4.7e-04 libpthread-2.5.so
  199 3.2e-04 libdcsmil.so.5.9.2
  139 2.3e-04 modclusterd


Results for the 8.4.4 (8.4.4-1PGDG.el5) host:
[root@ some-host ~]# pgbench -h dbs4 -U postgres -i -s 100 pgbench1 > /dev/null 
2>&1 && pgbench -h dbs4 -U postgres -c 100 -t 10 pgbench1
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
query mode: simple
number of clients: 100
number of transactions per client: 10
number of transactions actually processed: 1000/1000
tps = 2765.643549 (including connections establishing)
tps = 2765.931203 (excluding connections establishing)
opreport:
CPU: Intel Core/i7, speed 2394.07 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask 
of 0x00 (No unit mask) count 10
CPU_CLK_UNHALT...|
  samples|  %|
--
312481395 84.5038 postgres
 41861164 11.3204 no-vmlinux
 14290652  3.8646 libc-2.5.so
   812148  0.2196 oprofiled
   305909  0.0827 libm-2.5.so
 7647  0.0021 libc-2.5.so
 3809  0.0010 libdcsupt.so.5.9.2
 3077 8.3e-04 libperl.so
 2302 6.2e-04 dsm_sa_datamgrd
CPU_CLK_UNHALT...|
  samples|  %|
--
 2113 91.7897 dsm_sa_datamgrd
  189  8.2103 anon (tgid:8075 range:0xe000-0xf000)
 2175 5.9e-04 libxml2.so.2.6.26
 1455 3.9e-04 dsm_sa_snmpd
CPU_CLK_UNHALT...|
  samples|  %|
--
 1226 84.2612 dsm_sa_snmpd
  229 15.7388 anon (tgid:8208 range:0xe000-0xf000)
 1227 3.3e-04 libdchipm.so.5.9.2
 1192 3.2e-04 libpthread-2.5.so
  804 2.2e-04 libnetsnmp.so.10.0.3
  745 2.0e-04 modclusterd


Any input? I can reproduce these numbers consistently. If you need more 
information then just let me know. By the way, I am a new postgresql user so my 
experience is limited.
Cheers,
Max


Re: [PERFORM] Large (almost 50%!) performance drop after upgrading to 8.4.4?

2010-06-10 Thread Max Williams
Well, the packages are from the pgdg repo, which I would have thought is pretty 
common?
https://public.commandprompt.com/projects/pgcore/wiki


-Original Message-
From: Robert Haas [mailto:robertmh...@gmail.com] 
Sent: 10 June 2010 02:52
To: Max Williams
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Large (almost 50%!) performance drop after upgrading to 
8.4.4?

On Wed, Jun 9, 2010 at 6:56 AM, Max Williams  wrote:
> Any input? I can reproduce these numbers consistently. If you need more
> information then just let me know. By the way, I am a new postgresql user so
> my experience is limited.

Maybe different compile options?  If we'd really slowed things down by
50% between 8.4.3 and 8.4.4, there'd be an awful lot of people
screaming about it...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Large (almost 50%!) performance drop after upgrading to 8.4.4?

2010-06-10 Thread Max Williams
How do I tell if it was built with debugging options?


-Original Message-
From: Devrim GÜNDÜZ [mailto:dev...@gunduz.org] 
Sent: 10 June 2010 09:30
To: Robert Haas
Cc: Max Williams; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Large (almost 50%!) performance drop after upgrading to 
8.4.4?

On Wed, 2010-06-09 at 21:51 -0400, Robert Haas wrote:
> On Wed, Jun 9, 2010 at 6:56 AM, Max Williams 
> wrote:
> > Any input? I can reproduce these numbers consistently. If you need more
> > information then just let me know. By the way, I am a new postgresql
> > user so my experience is limited.
> 
> Maybe different compile options?  If we'd really slowed things down by 
> 50% between 8.4.3 and 8.4.4, there'd be an awful lot of people 
> screaming about it...

Given that there are 2 recent reports on the same issue, I wonder if the new 
packages were built with debugging options or not.

--
Devrim GÜNDÜZ
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer PostgreSQL RPM 
Repository: http://yum.pgrpms.org
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr 
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz



Re: [PERFORM] Large (almost 50%!) performance drop after upgrading to 8.4.4?

2010-06-10 Thread Max Williams
I'm afraid pg_config is not part of the pgdg packages.


-Original Message-
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 
Sent: 10 June 2010 15:11
To: Max Williams
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Large (almost 50%!) performance drop after upgrading to 
8.4.4? 

Max Williams  writes:
> How do I tell if it was built with debugging options?

Run pg_config --configure and see if --enable-cassert is mentioned.

regards, tom lane



Re: [PERFORM] Large (almost 50%!) performance drop after upgrading to 8.4.4?

2010-06-10 Thread Max Williams
Ah, yes, it's OFF for 8.4.3 and ON for 8.4.4!

Can I just turn this off on 8.4.4, or is it a compile-time option?
Also, is this a mistake or intended? Perhaps I should tell the person who builds 
the pgdg packages?

Cheers,
Max


-Original Message-
From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov] 
Sent: 10 June 2010 16:16
To: Max Williams; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Large (almost 50%!) performance drop after upgrading to 
8.4.4?

Max Williams  wrote:
 
> I'm afraid pg_config is not part of the pgdg packages.
 
Connect (using psql or your favorite client) and run:
 
show debug_assertions;
 
-Kevin



[PERFORM] test post

2004-10-05 Thread Max Baker
Please ignore this if it goes through.  My messages have been bouncing and I'm
trying to find out why.

-m



[PERFORM] Vacuum takes a really long time, vacuum full required

2004-10-19 Thread Max Baker
Hi Folks,

This is my _4th_ time trying to post this; the mailing list software and I
are fighting.  I think it's because of the attachments, so I'll just put
links to them instead.  All apologies if this gets duplicated.

I've been having problems maintaining the speed of the database in the
long run.  VACUUMs of the main tables happen a few times a day, after maybe
50,000 or fewer rows are added and deleted (say 6 times a day).

I have a whole lot of indexing (probably too much) going on to try to speed
things up.

Whatever the case, the database still slows down to a halt after a month or
so, and I have to go in and shut everything down and do a VACUUM FULL by
hand.  One index (of many many) takes 2000 seconds to vacuum.  The whole
process takes a few hours.

I would love suggestions on what I can do, either inside my application or
from a dba point of view, to keep the database maintained without having to
inflict downtime.  This is for 'Netdisco' -- an open source network
management application, by the way.  I'd like to fix this for everyone who
uses it.


Sys Info :

$ uname -a
FreeBSD .ucsc.edu 4.10-STABLE FreeBSD 4.10-STABLE #0: Mon Aug 16
14:56:19 PDT 2004 [EMAIL PROTECTED]:/usr/src/sys/compile/  i386

$ pg_config --version
PostgreSQL 7.3.2

$ cat postgresql.conf
max_connections = 32
shared_buffers = 3900   # 30Mb - Bsd current kernel limit
max_fsm_relations = 1000# min 10, fsm is free space map, ~40 bytes
max_fsm_pages = 1   # min 1000, fsm is free space map, ~6 bytes
max_locks_per_transaction = 64  # min 10
wal_buffers = 8 # min 4, typically 8KB each

The log of the vacuum and the db schema could not be attached, so they are
at : 
http://netdisco.net/db_vacuum.txt
http://netdisco.net/pg_all.input

Thanks for any help!
-m



Re: [PERFORM] Vacuum takes a really long time, vacuum full required

2004-10-19 Thread Max Baker
Hi Rod,

On Tue, Oct 19, 2004 at 11:40:17AM -0400, Rod Taylor wrote:
> > Whatever the case, the database still slows down to a halt after a month or
> > so, and I have to go in and shut everything down and do a VACUUM FULL by
> > hand.  One index (of many many) takes 2000 seconds to vacuum.  The whole
> > process takes a few hours.
> 
> Do a REINDEX on that table instead, and regular vacuum more frequently.

Great, this is exactly what I think it needs.  Meanwhile, I was checking out

http://www.postgresql.org/docs/7.3/static/sql-reindex.html

Which suggests I might be able to do a drop/add on each index with the
database 'live'.

However, the DROP INDEX command was taking an awfully long time to complete,
and it hung my app in the meantime.  Does anyone know if DROP INDEX
takes an exclusive lock, or is it just a lengthy process?

> > $ pg_config --version
> > PostgreSQL 7.3.2
> 
> 7.4.x deals with index growth a little better than 7.3 and older did.

Will do.  Meanwhile I'm stuck supporting older 7.x versions, so I'm still
looking for a solution for them.
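For the 7.x installations, the maintenance discussed above boils down to something like this (index and table names are taken from the Netdisco schema shown elsewhere in the thread; on 7.3, REINDEX takes an exclusive lock on the table, so it still needs a quiet window, just a much shorter one than a full-database VACUUM FULL):

```sql
-- rebuild the one pathological index
REINDEX INDEX idx_node_switch_port_active;
-- or rebuild every index on the table at once
REINDEX TABLE node;
-- then keep up with plain (non-FULL) vacuums more frequently
VACUUM ANALYZE node;
```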

Thanks!
-m



Re: [PERFORM] Free PostgreSQL Training, Philadelphia, Oct 30

2004-10-19 Thread Max Baker
On Wed, Oct 13, 2004 at 12:21:27PM -0400, Aaron Mulder wrote:
> All,
>   My company (Chariot Solutions) is sponsoring a day of free
> PostgreSQL training by Bruce Momjian (one of the core PostgreSQL
> developers).  The day is split into 2 sessions (plus a Q&A session):
> 
>  * Mastering PostgreSQL Administration
>  * PostgreSQL Performance Tuning
> 
>   Registration is required, and space is limited.  The location is
> Malvern, PA (suburb of Philadelphia) and it's on Saturday Oct 30.  For
> more information or to register, see
> 
> http://chariotsolutions.com/postgresql.jsp

I'm up in New York City and would be taking the train down to Philly.  Is
anyone coming from Philly or New York that would be able to give me a lift
to/from the train station?  Sounds like a great event.

Cheers,
-m



Re: [PERFORM] Vacuum takes a really long time, vacuum full required

2004-10-23 Thread Max Baker
On Tue, Oct 19, 2004 at 11:40:17AM -0400, Rod Taylor wrote:
> > Whatever the case, the database still slows down to a halt after a month or
> > so, and I have to go in and shut everything down and do a VACUUM FULL by
> > hand.  One index (of many many) takes 2000 seconds to vacuum.  The whole
> > process takes a few hours.
> 
> Do a REINDEX on that table instead, and regular vacuum more frequently.
> 
> > $ pg_config --version
> > PostgreSQL 7.3.2
> 
> 7.4.x deals with index growth a little better than 7.3 and older did.

I did a REINDEX of the database.  The results are pretty insane, the db went
from 16GB to 381MB.  Needless to say things are running a lot faster. 

I will now take Tom's well-given advice and upgrade to 7.4.  But at least
now I have something to tell my users who are not able to do a DB upgrade
for whatever reason.

Thanks for all your help folks!
-m

Before:
# du -h pgsql
135K    pgsql/global
128M    pgsql/pg_xlog
 80M    pgsql/pg_clog
3.6M    pgsql/base/1
3.6M    pgsql/base/16975
1.0K    pgsql/base/16976/pgsql_tmp
 16G    pgsql/base/16976
 16G    pgsql/base
 16G    pgsql

After Reindex:
# du /data/pgsql/
131K    /data/pgsql/global
128M    /data/pgsql/pg_xlog
 81M    /data/pgsql/pg_clog
3.6M    /data/pgsql/base/1
3.6M    /data/pgsql/base/16975
1.0K    /data/pgsql/base/16976/pgsql_tmp
268M    /data/pgsql/base/16976
275M    /data/pgsql/base
484M    /data/pgsql/

After Vacuum:
# du /data/pgsql/
131K    /data/pgsql/global
144M    /data/pgsql/pg_xlog
 81M    /data/pgsql/pg_clog
3.6M    /data/pgsql/base/1
3.6M    /data/pgsql/base/16975
1.0K    /data/pgsql/base/16976/pgsql_tmp
149M    /data/pgsql/base/16976
156M    /data/pgsql/base
381M    /data/pgsql/

netdisco=> select relname, relpages from pg_class order by relpages desc;

Before:
 relname | relpages 
-+--
 idx_node_switch_port_active |   590714
 idx_node_switch_port|   574344
 idx_node_switch |   482202
 idx_node_mac|   106059
 idx_node_mac_active |99842

After:
 relname | relpages 
-+--
 node_ip |13829
 node| 9560
 device_port | 2124
 node_ip_pkey| 1354
 idx_node_ip_ip  | 1017
 idx_node_ip_mac_active  |  846




[PERFORM] perl garbage collector

2005-06-27 Thread Jean-Max Reymond
Hi, 
I have a stored procedure written in perl, and I doubt that perl's
garbage collector is working :-(
After a lot of work, postmaster has a size of 1100 Mb, and I think
that the keyword "undef" has no effect.
Before tuning my procedure: is there a known issue, or a workaround?

-- 
Jean-Max Reymond
CKR Solutions Open Source
Nice France
http://www.ckr-solutions.com



Re: [PERFORM] perl garbage collector

2005-06-28 Thread Jean-Max Reymond
2005/6/28, Tom Lane <[EMAIL PROTECTED]>:
> Jean-Max Reymond <[EMAIL PROTECTED]> writes:
> > I have a stored procedure written in perl and I doubt that perl's
> > garbage collector is working :-(
> > after a lot of work, postmaster has a size of 1100 Mb and  I think
> > that the keyword "undef" has no effects.
> 
> Check the PG list archives --- there's been previous discussion of
> similar issues.  I think we concluded that when Perl is built to use
> its own private memory allocator, the results of that competing with
> malloc are not very pretty :-(.  You end up with a fragmented memory
> map and no chance to give anything back to the OS.

Thanks, Tom, for your advice. I have read the discussion, but a small
test is very confusing to me.
Consider this function:

CREATE FUNCTION jmax() RETURNS integer
AS $_$use strict;

my $i=0;
for ($i=0; $i<1;$i++) {
    my $ch = "0123456789"x10;
    my $res = spi_exec_query("select * from xdb_child where doc_id=100 and ele_id=3 ");
}
my $j=1;$_$
LANGUAGE plperlu SECURITY DEFINER;


ALTER FUNCTION public.jmax() OWNER TO postgres;

The line my $ch = "0123456789"x10; is used to allocate 1Mb.
The line my $res = spi_exec_query("select * from xdb_child where
doc_id=100 and ele_id=3 limit 5"); simulates a query.

Without spi_exec_query, the memory used by postmaster is constant,
so I think pl/perl manages memory correctly in this case.
With spi_exec_query, postmaster grows and grows until the end of the loop.
So, it seems that spi_exec_query does not release all the memory after
each call.
For my application (in real life), after millions of spi_exec_query calls,
it grows up to 1Gb :-(



-- 
Jean-Max Reymond
CKR Solutions Open Source
Nice France
http://www.ckr-solutions.com



Re: [PERFORM] perl garbage collector

2005-06-28 Thread Jean-Max Reymond
2005/6/28, Jean-Max Reymond <[EMAIL PROTECTED]>:
> For my application (in real life) afer millions of spi_exec_query, it
> grows up to 1Gb :-(

OK, now in 2 lines:

CREATE FUNCTION jmax() RETURNS integer
AS $_$use strict;

for (my $i=0; $i<1000;$i++) {
spi_exec_query("select 'foo'");
}
my $j=1;$_$
LANGUAGE plperlu SECURITY DEFINER

Run this test, and your postmaster eats a lot of memory.
It seems that there is a memory leak in spi_exec_query :-(


-- 
Jean-Max Reymond
CKR Solutions Open Source
Nice France
http://www.ckr-solutions.com



[PERFORM] start time very high

2005-06-30 Thread Jean-Max Reymond
hi,

I have two computers: one laptop (1.5 GHz, 512 Mb RAM, one 4200 RPM disk)
and one big Sun (8Gb RAM, 2 SCSI disks).

On my laptop, I have this EXPLAIN ANALYZE

Sort  (cost=7.56..7.56 rows=1 width=28) (actual time=0.187..0.187 
rows=0 loops=1)
   Sort Key: evolution, indx
   ->  Index Scan using index_xdb_child on xdb_child c1  
(cost=0.00..7.55 rows=1 width=28) (actual time=0.045..0.045 rows=0 loops=1)
 Index Cond: ((doc_id = 100) AND (ele_id = 1) AND (isremoved = 0))
 Filter: (evolution = (subplan))
 SubPlan
   ->  Aggregate  (cost=3.78..3.78 rows=1 width=4) (never executed)
 ->  Index Scan using index_xdb_child on xdb_child c2  
(cost=0.00..3.77 rows=1 width=4) (never executed)
   Index Cond: ((doc_id = 100) AND (ele_id = 1))
   Filter: ((evolution <= 0) AND (child_id = $0) AND 
(child_class = $1))
 Total runtime: 0.469 ms
(11 rows)


and on the SUN:

"Sort  (cost=7.56..7.56 rows=1 width=28) (actual time=26.335..26.335
rows=0 loops=1)"
"  Sort Key: evolution, indx"
"  ->  Index Scan using index_xdb_child on xdb_child c1 
(cost=0.00..7.55 rows=1 width=28) (actual time=26.121..26.121 rows=0
loops=1)"
"Index Cond: ((doc_id = 100) AND (ele_id = 1) AND (isremoved = 0))"
"Filter: (evolution = (subplan))"
"SubPlan"
"  ->  Aggregate  (cost=3.78..3.78 rows=1 width=4) (never executed)"
"->  Index Scan using index_xdb_child on xdb_child c2 
(cost=0.00..3.77 rows=1 width=4) (never executed)"
"  Index Cond: ((doc_id = 100) AND (ele_id = 1))"
"  Filter: ((evolution <= 0) AND (child_id = $0)
AND (child_class = $1))"
"Total runtime: 26.646 ms"




So the query runs in 26.646 ms on the Sun and 0.469 ms on my laptop :-( 
The databases are the same, vacuumed, and I think the Postgres (8.0.3)
installations are well configured.
The Sun has two disks and uses TABLESPACEs to keep the indexes on one disk
and the data on the other.
It seems that the cost of the first sort is very high on the Sun.
How is this possible?

The query:

explain analyze select * from XDB_CHILD c1
where c1.doc_id = 100
  and c1.ele_id = 1
  and c1.isremoved = 0
  and c1.evolution = (select max(evolution)
                      from XDB_CHILD c2
                      where c2.doc_id = 100
                        and c2.ele_id = 1
                        and c2.evolution <= 0
                        and c2.child_id = c1.child_id
                        and c2.child_class = c1.child_class)
order by c1.evolution, c1.indx

-- 
Jean-Max Reymond
CKR Solutions Open Source
Nice France
http://www.ckr-solutions.com



Re: [PERFORM] start time very high

2005-06-30 Thread Jean-Max Reymond
2005/6/30, Jean-Max Reymond <[EMAIL PROTECTED]>:
> so the query runs in 26.646 ms on the Sun and 0.469 ms on my laptop :-(
> the databases are the same, vacuumed, and I think the Postgres (8.0.3)
> installations are well configured.
> The Sun has two disks and uses TABLESPACEs to keep the indexes on one
> disk and the data on the other.
> It seems that the cost of the first sort is very high on the Sun.
> How is this possible?

Maybe the data is not loaded in memory but still on disk?
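One crude way to test that theory is to force the table through the cache before timing the query again (a sketch, using the table from the plans above):

```sql
-- a sequential scan reads every heap page into cache
SELECT count(*) FROM xdb_child;
-- then re-run the EXPLAIN ANALYZE on the Sun and compare the timings
```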

-- 
Jean-Max Reymond
CKR Solutions Open Source
Nice France
http://www.ckr-solutions.com




[PERFORM] Surprizing performances for Postgres on Centrino

2005-07-07 Thread Jean-Max Reymond
Hi,
These last two days, I have had some trouble with a very strange phenomenon:
I have a 400 MB database and a stored procedure written in Perl which
calls spi_exec_query 14 million times (thanks to Tom for fixing the memory
leak ;-) ).
On my laptop with a Centrino 1.6 GHz and 512 MB RAM,
- it is solved in 1h50' on Linux 2.6
- it is solved in 1h37' on WXP Professional (WXP better
than Linux ;-) )
On a desktop with a PIV 2.8 GHz,
- it is solved in 3h30 on W2K
On a desktop with a PIV 1.8 GHz, with data and indexes on two separate disks,
- it is solved in 4h on W2K

I tested CPU and memory performance on my laptop and the results seem
unremarkable, except for one single test: string sort.

So, it seems that for my application (database in memory, 14 million
very small requests), the Centrino (aka Pentium M) has built-in
hardware to boost Postgres performance :-)
Any experience to confirm this?
Any tips to speed up Postgres on non-Centrino processors?

-- 
Jean-Max Reymond
CKR Solutions Open Source
Nice France
http://www.ckr-solutions.com

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Surprising performances for Postgres on Centrino

2005-07-07 Thread Jean-Max Reymond
On 7/7/05, Alvaro Herrera <[EMAIL PROTECTED]> wrote:

> 
> Do you have the same locale settings on all of them?
> 

Interesting:
UNICODE on the fast laptop,
SQL_ASCII on the slowest desktops.
Is a UNICODE database faster than SQL_ASCII?
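A quick way to compare the settings on each machine, assuming psql access (these are standard catalog queries, shown here only as a sketch):

```sql
-- List every database with its character encoding.
SELECT datname, pg_encoding_to_char(encoding) FROM pg_database;

-- The collation locale, which drives the cost of string sorts.
SHOW lc_collate;
```

A C locale with SQL_ASCII typically sorts faster than a full locale-aware collation, so differing settings can plausibly explain part of a sort-speed gap.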

-- 
Jean-Max Reymond
CKR Solutions Open Source
Nice France
http://www.ckr-solutions.com

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Projecting currentdb to more users

2005-07-12 Thread Jean-Max Reymond
2005/7/12, Mohan, Ross <[EMAIL PROTECTED]>:
> From AMD's suit against Intel. Perhaps relevant to some PG/AMD issues.

Postgres is compiled with the GNU compiler, isn't it?
I don't know how much Postgres could benefit from the Intel compiler's optimizations.

-- 
Jean-Max Reymond
CKR Solutions Open Source
Nice France
http://www.ckr-solutions.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster



[PERFORM] size of cache

2005-07-13 Thread Jean-Max Reymond
with my application, it seems that the L2 cache size has a great effect:
going from 512 KB to 1 MB of L2 cache boosts performance by a factor of 3,
and going from 1 MB to 2 MB gains another 20%.
I don't understand why a 512 KB L2 cache is too small to fit the data.
Does a tool exist to trace processor activity and confirm that the
processor is waiting for memory?
Does a tool exist to snapshot Postgres activity and understand
where we spend time, to potentially avoid the bottleneck?

thanks for your tips.

-- 
Jean-Max Reymond
CKR Solutions Open Source
Nice France
http://www.ckr-solutions.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] blue prints please

2005-10-26 Thread Jean-Max Reymond
2005/10/26, Sidar López Cruz <[EMAIL PROTECTED]>:
> where can I find best practices for tuning PostgreSQL?

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

--
Jean-Max Reymond
CKR Solutions Open Source
Nice France
http://www.ckr-solutions.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] VACUUM FULL does not works.......

2006-12-06 Thread Jean-Max Reymond

2006/12/6, asif ali <[EMAIL PROTECTED]>:

Hi,
 I have a "product" table with 350 records. It takes approx. 1.8 seconds to
get all records from this table. I copied this table to a "product_temp"
table and ran the same query to select all records; it took 10 ms (much
faster).
 I did "VACUUM FULL" on the "product" table but it did not help.

 I checked the file sizes of these two tables:
 the "product" table's file size is 32 MB and
 the "product_temp" table's file size is 72 KB.

 So, it seems that "VACUUM FULL" is not doing anything.
 Please advise.


try VACUUM FULL VERBOSE and report the result.
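For example, a sketch of the session to run, assuming a release that has pg_relation_size (8.1 or later; on older releases compare the file sizes on disk as you already did):

```sql
-- Measure the table, rewrite it with verbose output, then measure again.
SELECT pg_size_pretty(pg_relation_size('product'));
VACUUM FULL VERBOSE product;
SELECT pg_size_pretty(pg_relation_size('product'));
```

The VERBOSE output shows, page by page, how many dead tuples were removed and whether the tail of the file could actually be truncated.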

--
Jean-Max Reymond
CKR Solutions Open Source
Nice France
http://www.ckr-solutions.com

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] TRUNCATE TABLE

2007-07-13 Thread Jean-Max Reymond

Adriaan van Os wrote:
That's remarkable advice, because XFS is known to be slow at creating
and deleting files, see  and .




date of the article: Fri Jul 25 2003!

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Table Statistics with pgAdmin III

2007-07-24 Thread Jean-Max Reymond

Campbell, Lance wrote:
I have installed pgAdmin III 1.6.  In the tool when you click on a 
particular table you can select a tab called “Statistics”.  This tab has 
all kinds of info on your table.  For some reason the only info I see is 
for table size, toast table size and indexes size.  Is there a reason 
that the other 15 fields have zeros in them?  I was thinking that maybe 
I needed to turn on a setting within my database in order to get 
statistics reported.


it seems that the module pgstattuple is needed
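On PostgreSQL 9.1 and later the module can be installed per database with CREATE EXTENSION; on the 8.x releases current at the time it was loaded from the contrib SQL script instead. A sketch (the table name is only an example):

```sql
CREATE EXTENSION pgstattuple;
-- Physical per-table statistics: live/dead tuple counts, free space, etc.
SELECT * FROM pgstattuple('my_table');
```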

--
Jean-Max Reymond
CKR Solutions http://www.ckr-solutions.com

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Keeping processes open for re-use

2006-11-16 Thread Jean-Max Reymond

2006/11/10, Joshua D. Drake <[EMAIL PROTECTED]>:


I would actually suggest pg_pool over pg_pconnect.



Please, can you explain the advantages of pg_pool over pg_pconnect?

--
Jean-Max Reymond
CKR Solutions Open Source
Nice France
http://www.ckr-solutions.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[PERFORM] Performance with Intel Compiler

2004-08-31 Thread Jean-Max Reymond
hi,
has anyone compiled Postgres with the Intel compiler?
Is there a substantial performance gain?


-- 
Jean-Max Reymond
CKR Solutions
http://www.ckr-solutions.com

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] Optimizing a request

2004-08-31 Thread Jean-Max Reymond
hi,
I have the following structure in my 7.4.2 database:

CREATE TABLE "public"."article" (
 "art_id" INTEGER NOT NULL,
 "rub_id" INTEGER DEFAULT '0' NOT NULL,
 "art_titre" VARCHAR(100) DEFAULT '' NOT NULL,
 "art_texte" TEXT NOT NULL,
 "art_date" DATE NOT NULL,
 "aut_id" INTEGER,
 CONSTRAINT "article_pkey" PRIMARY KEY("art_id")
) WITH OIDS;

CREATE INDEX "article_art_date_index" ON "public"."article"
USING btree ("art_date");


CREATE INDEX "article_aut_id_index" ON "public"."article"
USING btree ("aut_id");


CREATE INDEX "article_rub_id_index" ON "public"."article"
USING btree ("rub_id");


CREATE INDEX "article_titre" ON "public"."article"
USING btree ("art_id", "art_titre");


CREATE TABLE "public"."auteur" (
 "aut_id" INTEGER NOT NULL,
 "aut_name" VARCHAR(100) DEFAULT '' NOT NULL,
 CONSTRAINT "auteur_pkey" PRIMARY KEY("aut_id")
) WITH OIDS;


CREATE TABLE "public"."rubrique" (
 "rub_id" INTEGER NOT NULL,
 "rub_titre" VARCHAR(100) DEFAULT '' NOT NULL,
 "rub_parent" INTEGER DEFAULT '0' NOT NULL,
 "rub_date" DATE,
 CONSTRAINT "rubrique_pkey" PRIMARY KEY("rub_id")
) WITH OIDS;

CREATE INDEX "rub_rub" ON "public"."rubrique"
USING btree ("rub_parent");

CREATE INDEX "rubrique_rub_date_index" ON "public"."rubrique"
USING btree ("rub_date");

CREATE INDEX "rubrique_rub_titre_index" ON "public"."rubrique"
USING btree ("rub_titre");

I want to optimize the following request and avoid the seq scan on the
table article (1000 rows).



explain SELECT art_id, art_titre, art_texte, rub_titre
FROM article inner join rubrique on article.rub_id = rubrique.rub_id
where rub_parent = 8;

Hash Join  (cost=8.27..265637.59 rows=25 width=130)
  Hash Cond: ("outer".rub_id = "inner".rub_id)
  ->  Seq Scan on article  (cost=0.00..215629.00 rows=1000 width=108)
  ->  Hash  (cost=8.26..8.26 rows=3 width=22)
->  Index Scan using rubrique_parent on rubrique 
(cost=0.00..8.26 rows=3 width=22)
  Index Cond: (rub_parent = 8)


thanks for your answers,

-- 
Jean-Max Reymond
CKR Solutions
http://www.ckr-solutions.com

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Optimizing a request

2004-08-31 Thread Jean-Max Reymond
On Tue, 31 Aug 2004 12:15:40 -0700, Josh Berkus <[EMAIL PROTECTED]> wrote:

> Those look suspiciously like stock estimates.  When was the last time you ran
> ANALYZE?

the vacuum analyze ran just before the explain

-- 
Jean-Max Reymond
CKR Solutions
http://www.ckr-solutions.com

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Optimizing a request

2004-08-31 Thread Jean-Max Reymond
- Original Message -
From: Gary Doades <[EMAIL PROTECTED]>
Date: Tue, 31 Aug 2004 20:21:49 +0100
Subject: Re: [PERFORM] Optimizing a request
To: [EMAIL PROTECTED]

 

> Have you run ANALYZE on this database after creating the indexes or loading the
> data?

the indexes were created and the data loaded, and then I ran vacuum analyze.

> What percentage of rows in the "article" table are likely to match
> the keys selected from the "rubrique" table?

only 1 record.

> If it is likely to fetch a high proportion of the rows from article
> then it may be best that a seq scan is performed.
>
> What are your non-default postgresql.conf settings? It may be better
> to increase the default_statistics_target (to say 100 to 200) before
> running ANALYZE and then re-run the query.
 
yes, default_statistics_target is set to the default value.
I have just increased shared_buffers and effective_cache_size to take
advantage of the 1 Mb RAM
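Gary's suggestion amounts to something like the following sketch (the target value 200 is only an example; ALTER TABLE ... SET STATISTICS raises the target for a single column, which is often enough):

```sql
-- Collect finer-grained statistics for the join column, then re-plan
-- with actual timings.
ALTER TABLE article ALTER COLUMN rub_id SET STATISTICS 200;
ANALYZE article;
EXPLAIN ANALYZE
SELECT art_id, art_titre, art_texte, rub_titre
FROM article INNER JOIN rubrique ON article.rub_id = rubrique.rub_id
WHERE rub_parent = 8;
```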
 



-- 
Jean-Max Reymond
CKR Solutions
http://www.ckr-solutions.com

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Optimizing a request

2004-08-31 Thread Jean-Max Reymond
On Tue, 31 Aug 2004 21:16:46 +0100, Gary Doades <[EMAIL PROTECTED]> wrote:

> I can only presume you mean 1 GB RAM. What exactly are your
> settings for shared buffers and effective_cache_size?

for 1 GB RAM,
shared_buffers = 65536
effective_cache_size = 16384 

> 
> Can you increase default_statistics_target and re-test? It is possible
> that with such a large table that the distribution of values is skewed and
> postgres does not realise that an index scan would be better.

OK, tomorrow, I'll try with the new value of default_statistics_target

> It seems very odd otherwise that only one row out of 10,000,000 could
> match and postgres does not realise this.
> 
> Can you post an explain analyse (not just explain) for this query?

yes, of course


-- 
Jean-Max Reymond
CKR Solutions
http://www.ckr-solutions.com

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Optimizing a request

2004-08-31 Thread Jean-Max Reymond
On Tue, 31 Aug 2004 16:13:58 -0400, Tom Lane <[EMAIL PROTECTED]> wrote:

> That seems like a very strange plan choice given those estimated row
> counts.  I'd have expected it to use a nestloop with inner index scan
> on article_rub_id_index.  You haven't done anything odd like disable
> nestloop, have you?
> 

no, nothing in the optimizer is disabled.

> What plan do you get if you turn off enable_hashjoin?  (If it's a merge
> join, then turn off enable_mergejoin and try again.)  Also, could we see
> EXPLAIN ANALYZE not just EXPLAIN output for all these cases?
> 
> regards, tom lane
> 

OK, Tom, thanks for your help
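Tom's experiment, sketched (the enable_* settings are session-local planner toggles meant for diagnosis only, never for production):

```sql
SET enable_hashjoin = off;   -- forbid hash joins for this session
EXPLAIN ANALYZE
SELECT art_id, art_titre, art_texte, rub_titre
FROM article INNER JOIN rubrique ON article.rub_id = rubrique.rub_id
WHERE rub_parent = 8;
RESET enable_hashjoin;       -- restore the default
```

Comparing the actual times of the hash-join plan and the forced alternative shows whether the planner's cost estimates, not the executor, are at fault.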

-- 
Jean-Max Reymond
CKR Solutions
http://www.ckr-solutions.com

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] PostgreSQL clustering VS MySQL clustering

2005-01-20 Thread Jean-Max Reymond
On Thu, 20 Jan 2005 15:03:31 +0100, Hervé Piedvache <[EMAIL PROTECTED]> wrote:

> We were at this moment thinking about a Cluster solution ... We saw on the
> Internet many solution talking about Cluster solution using MySQL ... but
> nothing about PostgreSQL ... the idea is to use several servers to make a
> sort of big virtual server using the disk space of each server as one, and
> having the ability to use the CPU and RAM of each servers in order to
> maintain good service performance ...one can imagine it is like a GFS but
> dedicated to postgreSQL...
> 

forget MySQL Cluster for now.
We have a small database whose size is 500 Mb.
Loading the database in RAM is required, and it was not possible to load
it on a computer with 2 Gb of RAM.
So, we shrank the database to 350 Mb so that it fits in the 2 Gb of RAM.
First performance tests on a basic request: 500x slower, yes 500x.
This issue was reported to the MySQL team but there has been no answer
(and no fix).

Currently, the solution runs with a replicated database: one node for
write requests and all the other nodes for read requests, with the load
balancer doing round-robin.


-- 
Jean-Max Reymond
CKR Solutions
Nice France
http://www.ckr-solutions.com

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] PostgreSQL clustering VS MySQL clustering

2005-01-20 Thread Jean-Max Reymond
On Thu, 20 Jan 2005 12:13:17 -0700, Steve Wampler <[EMAIL PROTECTED]> wrote:
> Mitch Pirtle wrote:

> But that's not enough, because you're going to be running separate
> postgresql backends on the different hosts, and there are
> definitely consistency issues with trying to do that.  So far as
> I know (right, experts?) postgresql isn't designed with providing
> distributed consistency in mind (isn't shared memory used for
> consistency, which restricts all the backends to a single host?).

yes, you're right: you'll need a Distributed Lock Manager and an
application to manage it (Postgres?)

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Less expensive proprietary or Open source ETL tools

2009-08-16 Thread Jean-Max Reymond

Rstat wrote:
Hi,


As of today, we are still enjoying our Informatica tool, but in a few months
we will need to change. Basically we do not use our software at its full
capacity and don't feel we need it anymore.
So we are trying to find a less expensive solution that would have the same
features (or almost...).


We are looking at less expensive tools and open-source software. We have
pretty much targeted a few companies and would like to know which ones would
be the better solution compared to Informatica.

-Apatar
-Expressor
-Pentaho
-Talend


Some are paid software, some are open source (but not free...), so I'm
asking you which is the best software on the market.


Talend is a great open-source tool and works well with Postgres databases


--
Jean-Max Reymond
CKR Solutions Open Source http://www.ckr-solutions.com

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] not using my GIN index in JOIN expression

2014-02-27 Thread Jean-Max Reymond

I am running the latest version of PostgreSQL, 9.3.3.
I have two tables, detm and corm, and a lot of data in the cormdata
column of the corm table (1.4 GB).


I have a GIN index on cormdata:
CREATE INDEX ix_corm_fulltext_cormdata  ON corm
  USING gin (to_tsvector('french'::regconfig, cormdata))
  WHERE cormishtml IS FALSE AND length(cormdata) < 2;

select distinct b.detmmailid from corm b where 
(to_tsvector('french',b.cormdata) @@ to_tsquery('mauritanie') and 
b.cormishtml is false and length(b.cormdata) < 2)

is very fast and uses the GIN index.

"HashAggregate  (cost=2027.72..2031.00 rows=328 width=52)"
"  ->  Bitmap Heap Scan on corm b  (cost=24.25..2026.35 rows=548 width=52)"
"Recheck Cond: ((to_tsvector('french'::regconfig, cormdata) @@ 
to_tsquery('mauritanie'::text)) AND (cormishtml IS FALSE) AND 
(length(cormdata) < 2))"
"->  Bitmap Index Scan on ix_corm_fulltext_cormdata 
(cost=0.00..24.11 rows=548 width=0)"
"  Index Cond: (to_tsvector('french'::regconfig, cormdata) 
@@ to_tsquery('mauritanie'::text))"



With a join to another table, detm, the GIN index is not used:


 explain select distinct a.detmmailid from detm a  JOIN corm b on 
a.detmmailid = b.detmmailid  where ((to_tsvector('french',b.cormdata) @@ 
to_tsquery('mauritanie') and b.cormishtml is false and 
length(b.cormdata) < 2)  OR ( detmobjet ~* 'mauritanie' ))


"HashAggregate  (cost=172418.27..172423.98 rows=571 width=52)"
"  ->  Hash Join  (cost=28514.92..172416.85 rows=571 width=52)"
"Hash Cond: (b.detmmailid = a.detmmailid)"
"Join Filter: (((to_tsvector('french'::regconfig, b.cormdata) @@ 
to_tsquery('mauritanie'::text)) AND (b.cormishtml IS FALSE) AND 
(length(b.cormdata) < 2)) OR (a.detmobjet ~* 'mauritanie'::text))"
"->  Seq Scan on corm b  (cost=0.00..44755.07 rows=449507 
width=689)"

"->  Hash  (cost=19322.74..19322.74 rows=338574 width=94)"
"  ->  Seq Scan on detm a  (cost=0.00..19322.74 rows=338574 
width=94)"



If I remove   OR ( detmobjet ~* 'mauritanie' ) from the select, the GIN
index is used:
 explain select distinct a.detmmailid from detm a  JOIN corm b on 
a.detmmailid = b.detmmailid  where ((to_tsvector('french',b.cormdata) @@ 
to_tsquery('mauritanie') and b.cormishtml is false and 
length(b.cormdata) < 2))


"HashAggregate  (cost=4295.69..4301.17 rows=548 width=52)"
"  ->  Nested Loop  (cost=24.67..4294.32 rows=548 width=52)"
"->  Bitmap Heap Scan on corm b  (cost=24.25..2026.35 rows=548 
width=52)"
"  Recheck Cond: ((to_tsvector('french'::regconfig, 
cormdata) @@ to_tsquery('mauritanie'::text)) AND (cormishtml IS FALSE) 
AND (length(cormdata) < 2))"
"  ->  Bitmap Index Scan on ix_corm_fulltext_cormdata 
(cost=0.00..24.11 rows=548 width=0)"
"Index Cond: (to_tsvector('french'::regconfig, 
cormdata) @@ to_tsquery('mauritanie'::text))"
"->  Index Only Scan using pkey_detm on detm a  (cost=0.42..4.13 
rows=1 width=52)"

"  Index Cond: (detmmailid = b.detmmailid)"

How can I force the use of the GIN index?
thanks for your tips,

--
Jean-Max Reymond
CKR Solutions Open Source http://www.ckr-solutions.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] not using my GIN index in JOIN expression

2014-02-27 Thread Jean-Max Reymond

On 27/02/2014 15:19, Heikki Linnakangas wrote:

On 02/27/2014 04:06 PM, Jean-Max Reymond wrote:

I am running the latest version of PostgreSQL, 9.3.3.
I have two tables, detm and corm, and a lot of data in the cormdata
column of the corm table (1.4 GB).

I have a GIN index on cormdata:
CREATE INDEX ix_corm_fulltext_cormdata  ON corm
USING gin (to_tsvector('french'::regconfig, cormdata))
WHERE cormishtml IS FALSE AND length(cormdata) < 2;

select distinct b.detmmailid from corm b where
(to_tsvector('french',b.cormdata) @@ to_tsquery('mauritanie') and
b.cormishtml is false and length(b.cormdata) < 2)
is very fast and uses the GIN index.

"HashAggregate  (cost=2027.72..2031.00 rows=328 width=52)"
"  ->  Bitmap Heap Scan on corm b  (cost=24.25..2026.35 rows=548
width=52)"
"Recheck Cond: ((to_tsvector('french'::regconfig, cormdata) @@
to_tsquery('mauritanie'::text)) AND (cormishtml IS FALSE) AND
(length(cormdata) < 2))"
"->  Bitmap Index Scan on ix_corm_fulltext_cormdata
(cost=0.00..24.11 rows=548 width=0)"
"  Index Cond: (to_tsvector('french'::regconfig, cormdata)
@@ to_tsquery('mauritanie'::text))"


With a join to another table, detm, the GIN index is not used:


   explain select distinct a.detmmailid from detm a  JOIN corm b on
a.detmmailid = b.detmmailid  where ((to_tsvector('french',b.cormdata) @@
to_tsquery('mauritanie') and b.cormishtml is false and
length(b.cormdata) < 2)  OR ( detmobjet ~* 'mauritanie' ))

"HashAggregate  (cost=172418.27..172423.98 rows=571 width=52)"
"  ->  Hash Join  (cost=28514.92..172416.85 rows=571 width=52)"
"Hash Cond: (b.detmmailid = a.detmmailid)"
"Join Filter: (((to_tsvector('french'::regconfig, b.cormdata) @@
to_tsquery('mauritanie'::text)) AND (b.cormishtml IS FALSE) AND
(length(b.cormdata) < 2)) OR (a.detmobjet ~* 'mauritanie'::text))"
"->  Seq Scan on corm b  (cost=0.00..44755.07 rows=449507
width=689)"
"->  Hash  (cost=19322.74..19322.74 rows=338574 width=94)"
"  ->  Seq Scan on detm a  (cost=0.00..19322.74 rows=338574
width=94)"


If I remove   OR ( detmobjet ~* 'mauritanie' ) from the select, the GIN
index is used:
   explain select distinct a.detmmailid from detm a  JOIN corm b on
a.detmmailid = b.detmmailid  where ((to_tsvector('french',b.cormdata) @@
to_tsquery('mauritanie') and b.cormishtml is false and
length(b.cormdata) < 2))

"HashAggregate  (cost=4295.69..4301.17 rows=548 width=52)"
"  ->  Nested Loop  (cost=24.67..4294.32 rows=548 width=52)"
"->  Bitmap Heap Scan on corm b  (cost=24.25..2026.35 rows=548
width=52)"
"  Recheck Cond: ((to_tsvector('french'::regconfig,
cormdata) @@ to_tsquery('mauritanie'::text)) AND (cormishtml IS FALSE)
AND (length(cormdata) < 2))"
"  ->  Bitmap Index Scan on ix_corm_fulltext_cormdata
(cost=0.00..24.11 rows=548 width=0)"
"Index Cond: (to_tsvector('french'::regconfig,
cormdata) @@ to_tsquery('mauritanie'::text))"
"->  Index Only Scan using pkey_detm on detm a  (cost=0.42..4.13
rows=1 width=52)"
"  Index Cond: (detmmailid = b.detmmailid)"

How can I force the use of the GIN index?
thanks for your tips,


The problem with the OR detmobjet ~* 'mauritanie' restriction is that
the rows that match that condition cannot be found using the GIN index.
I think you'd want the system to fetch all the rows that match the other
condition using the GIN index, and do something else to find the other
rows. The planner should be able to do that if you rewrite the query as
a UNION:

select a.detmmailid from detm a JOIN corm b on
a.detmmailid = b.detmmailid
where (to_tsvector('french',b.cormdata) @@ to_tsquery('mauritanie') and
b.cormishtml is false and length(b.cormdata) < 2)
union
select a.detmmailid from detm a  JOIN corm b on
a.detmmailid = b.detmmailid
where detmobjet ~* 'mauritanie'

Note that that will not return rows in 'detm' that have no matching rows
in 'corm' table, even if they match the "detmobjet ~* 'mauritanie"
condition. That's what your original query also did, but if that's not
what you want, leave out the JOIN from the second part of the union.
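For completeness, a sketch of the variant Heikki describes, with the JOIN left out of the second branch (so detm rows with no corm rows at all can still qualify):

```sql
SELECT a.detmmailid
FROM detm a JOIN corm b ON a.detmmailid = b.detmmailid
WHERE to_tsvector('french', b.cormdata) @@ to_tsquery('mauritanie')
  AND b.cormishtml IS FALSE
  AND length(b.cormdata) < 2
UNION   -- UNION already removes duplicates, so no DISTINCT is needed
SELECT a.detmmailid
FROM detm a
WHERE a.detmobjet ~* 'mauritanie';
```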

- Heikki


It works great: thanks a lot :-)

--
Jean-Max Reymond
CKR Solutions Open Source http://www.ckr-solutions.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Performance of UNION vs IN

2014-03-20 Thread Jean-Max Reymond

I have a very complex view, zinfoexp, and running the view as:
SELECT * FROM zinfoexp  WHERE idmembre in (1,84)
takes 2700 ms.

So, I tried another syntax:
SELECT * FROM zinfoexp  WHERE idmembre = 1
union
SELECT * FROM zinfoexp  WHERE idmembre = 84

I expected two calls to my view to take about twice as long, yet this
form takes only 134 ms!


How is it possible that the optimizer cannot optimize the IN into a UNION?
I have a freshly vacuumed PostgreSQL 9.3 database.
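A useful first step when two forms of the same query differ this much is to capture both plans with actual timings and buffer counts (this EXPLAIN syntax is valid on 9.3):

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM zinfoexp WHERE idmembre IN (1, 84);

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM zinfoexp WHERE idmembre = 1
UNION
SELECT * FROM zinfoexp WHERE idmembre = 84;
```

Comparing the two outputs shows where the IN form spends its time, typically in how the predicate is pushed (or not pushed) down into the view.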

--
Jean-Max Reymond
CKR Solutions Open Source http://www.ckr-solutions.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Performance of UNION vs IN

2014-03-20 Thread Jean-Max Reymond

On 20/03/2014 18:13, Torsten Förtsch wrote:

On 20/03/14 17:57, Jean-Max Reymond wrote:

I have a very complex view, zinfoexp, and running the view as:
SELECT * FROM zinfoexp  WHERE idmembre in (1,84)
takes 2700 ms.

So, I tried another syntax:
SELECT * FROM zinfoexp  WHERE idmembre = 1
union
SELECT * FROM zinfoexp  WHERE idmembre = 84

and I expected two calls to my view to take about twice as long, yet it
takes only 134 ms!


try

   SELECT * FROM zinfoexp  WHERE idmembre=1 OR idmembre=84

This will probably be even faster.

Also, your two statements are not semantically equivalent. UNION
implies DISTINCT, see:

select * from (values (1), (1), (2)) t(i) UNION select 19;
  i
----
 19
  1
  2
(3 rows)

What you want is UNION ALL:

select * from (values (1), (1), (2)) t(i) UNION ALL select 19;
  i
----
  1
  1
  2
 19
(4 rows)


Torsten



same numbers with DISTINCT and with UNION ALL (the construction of the
view does an implicit DISTINCT).


--
Jean-Max Reymond
Éruption de l'Etna: http://jmreymond.free.fr/Etna2002


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance