Re: [GENERAL] [PERFORM] Partioning tsearch2 a table into chunks and accessing via views

2007-08-25 Thread Joshua D. Drake

Benjamin Arai wrote:
> This is kind of disappointing; I was hoping there was more that could be done.
> 
> There has to be another way to do incremental indexing without losing
> that much performance.

What makes you think you are losing performance by using partitioning?

Joshua D. Drake

> 
> Benjamin
> 
> On Aug 24, 2007, at 6:29 PM, Joshua D. Drake wrote:
> 
>> -BEGIN PGP SIGNED MESSAGE-
>> Hash: SHA1
> 
>> Brandon Shalton wrote:
>>> Benjamin,
>>>
>>>

 In order to avoid the re-indexing I was thinking of instead creating
 a new
 table each month (building its indexes and etc) and accessing them all
 through a view. This way I only have to index the new data each month.

>>>
>>> Take a look at bizgres.org (based on postgres).
>>>
>>> They have a parent-child structure.
>>>
>>> The way i use it, is I have about 30M records a day that are inserted
>>> into the database.
>>>
>>> Each day is a "child" table to the "parent".
>>>
>>> so example:
>>>
>>> the parent table is called  "logfile"
>>>
>>> each day, is a child, with the structure like  "logfile_YYMMDD"
>>>
>>> the "child" inherits the table structure of the parent, such that you
>>> could query the child table name directly, or you run the query against
>>> the parent (ie. logfile table) and get all the data.
>>>
>>> the indexes are done on a per table basis, so new data that comes in, is
>>> a lesser amount, and doesn't require re-indexing.
> 
> 
>> PostgreSQL can do all of this too.
> 
>> Sincerely,
> 
>> Joshua D. Drake
> 
>>>
>>>
>>> example:
>>>
>>> select * from logfile_070825 where datafield = 'foo'
>>>
>>> if i knew i wanted to specifically go into that child, or:
>>>
>>> select * from logfile where datafield = 'foo'
>>>
>>> and all child tables are searched and results merged.  You can perform
>>> any kind of sql query and field structures as you normally do.
>>>
>>> the downside is that the queries are run sequentially.
>>>
>>> so if you had 100 child tables, each table is queried via indexes, then
>>> results are merged.
>>>
>>> but, this approach does allow me to dump a lot of data in, without having
>>> the re-indexing issues you are facing.
>>>
>>> at some point, you could roll up the days, in to weekly child tables,
>>> then monthly tables, etc.
>>>
>>> I believe Bizgres has a new version of their system that does parallel
>>> queries which would certainly speed things up.
>>>
>>> For your documents, you can do it by the day it was checked in, or maybe
>>> you have some other way of logically grouping, but the parent/child
>>> table structure really helped to solve my problem of adding in millions
>>> of records each day.
>>>
>>> The closest thing in mysql is using merge tables, which is not really
>>> practical when it comes time to do the joins to the tables.
>>>
>>> -brandon
>>>
>>> http://www.t3report.com - marketing intelligence for online marketing
>>> and affiliate programs
>>>
>>> ---(end of broadcast)---
>>> TIP 6: explain analyze is your friend
>>>
> 
> 





- --

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [GENERAL] [PERFORM] Partioning tsearch2 a table into chunks and accessing via views

2007-08-25 Thread Benjamin Arai


As stated in the previous email, if I use partitioning then queries will be
executed sequentially - i.e., instead of log(n) it would be
(# partitions) * log(n).  Right?


Benjamin

On Aug 25, 2007, at 9:18 AM, Joshua D. Drake wrote:


> What makes you think you are losing performance by using partitioning?
>
> Joshua D. Drake

Re: [GENERAL] [PERFORM] Partioning tsearch2 a table into chunks and accessing via views

2007-08-25 Thread Brandon Shalton


> As stated in the previous email, if I use partitioning then queries will
> be executed sequentially - i.e., instead of log(n) it would be
> (# partitions) * log(n).  Right?




depends.. since the indexes are hit for each child table, query time is
dependent on the amount of data that is indexed in each table.


the querying of the parent is still pretty quick given a dual processor and a
fast array filestorage device.


given your situation, i would give the parent/child approach a try.  I
haven't checked whether stock postgres has it, as Joshua replied it does, but
I do know bizgres does, as i have been running this configuration for the
last 3 years: it solved my problem of importing 30-60M records in a day while
still being able to query the database for data.


-brandon

http://www.t3report.com - marketing intelligence for online marketing and 
affiliate programs



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [GENERAL] [PERFORM] Partioning tsearch2 a table into chunks and accessing via views

2007-08-25 Thread Joshua D. Drake

Benjamin Arai wrote:
> As stated in the previous email if I use partitioning then queries will
> be executed sequentially - i.e., instead of log(n) it would be (#
> partitions) * log(n).  Right?

The planner will consider every relevant partition during execution. That
may or may not be a performance hit; it depends on many factors. In
general, however, partitioning done correctly is both a performance
benefit and a maintenance benefit.

Sincerely,

Joshua D. Drake



Re: [GENERAL] [PERFORM] Partioning tsearch2 a table into chunks and accessing via views

2007-08-25 Thread Benjamin Arai


Since I am using tsearch2 on the table, I think there is going to be a
significant performance hit - e.g., I partition by batch (batches are not
separated by date; they are essentially random subsets of a much larger
data set).  I am querying this database using tsearch2, so I am assuming
all tables are going to be queried each time, since the text is not
partitioned by any specific constraint - e.g., >R goes to table 1 and <=R
goes to table 2.
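To make the concern concrete: constraint exclusion can only prune
partitions using their CHECK constraints, so a pure full-text predicate
gives the planner nothing to exclude with (a sketch; table and column
names are illustrative):

```sql
-- Batches carry no range constraint, so no partition can be
-- excluded: the planner appends a scan of every child table,
-- each using its own tsearch2 index, and merges the results.
SELECT * FROM docs
 WHERE body_tsv @@ to_tsquery('foo');
```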


Benjamin

On Aug 25, 2007, at 11:18 AM, Joshua D. Drake wrote:


> The planner will consider every relevant partition during execution. That
> may or may not be a performance hit; it depends on many factors. In
> general, however, partitioning done correctly is both a performance
> benefit and a maintenance benefit.
>
> Joshua D. Drake

Re: [PERFORM] Partioning tsearch2 a table into chunks and accessing via views

2007-08-25 Thread Erik Jones


On Aug 24, 2007, at 7:41 PM, Benjamin Arai wrote:

> Hi,
>
> I have an application which loads millions of NEW documents each month
> into a PostgreSQL tsearch2 table.  I have the initial version completed
> and searching performance is great, but my problem is that each time a
> new month rolls around I have to drop all the indexes, do a COPY, and
> re-index the entire table.  This is problematic considering that each
> month takes longer than the previous to rebuild the indexes, and the
> application is unavailable during the rebuilding process.
>
> In order to avoid the re-indexing I was thinking of instead creating a
> new table each month (building its indexes and etc) and accessing them
> all through a view.  This way I only have to index the new data each
> month.
>
> Does this work?  Does a view with N tables make it N times slower for
> tsearch2 queries?  Is there a better solution?


You can use Postgres's inheritance mechanism as your partitioning
mechanism and combine it with constraint exclusion to avoid the N^2
issues.  See:

http://www.postgresql.org/docs/8.2/interactive/ddl-inherit.html

and

http://www.postgresql.org/docs/8.2/interactive/ddl-partitioning.html

Basically, create a table from which all of your partitioned tables
inherit.  Partition in such a way that you can use constraint exclusion,
and then you can treat the parent table like the view you were
suggesting.
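A minimal sketch of that pattern, assuming documents could be partitioned
by a check-in date (all names, dates, and the index choice here are
illustrative assumptions, not taken from the thread):

```sql
-- Parent table all monthly partitions inherit from.
CREATE TABLE docs (
    checked_in  date NOT NULL,
    body        text,
    body_tsv    tsvector
);

-- One partition per month, with a CHECK constraint the planner
-- can use to exclude partitions that cannot match a query.
CREATE TABLE docs_2007_08 (
    CHECK (checked_in >= DATE '2007-08-01'
       AND checked_in <  DATE '2007-09-01')
) INHERITS (docs);

CREATE INDEX docs_2007_08_tsv_idx
    ON docs_2007_08 USING gist (body_tsv);

-- With constraint exclusion enabled, a date-constrained query
-- against the parent only touches the matching partition.
SET constraint_exclusion = on;
SELECT * FROM docs
 WHERE checked_in >= DATE '2007-08-01'
   AND checked_in <  DATE '2007-09-01'
   AND body_tsv @@ to_tsquery('foo');
```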


Erik Jones

Software Developer | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] [PERFORM] Partioning tsearch2 a table into chunks and accessing via views

2007-08-25 Thread Erik Jones




Sorry, I didn't see that you had crossposted and carried the conversation
on another list.  Please, don't do that.  Avoid the top posting as well.
They both make it difficult for others to join in or follow the
conversations and issues.







[PERFORM] Shared memory usage

2007-08-25 Thread Max Zorloff

Hello.

I have a postgres 8.0 and a ~400MB database with lots of simple selects
using indexes.  I've installed pgpool on the system and set
num_init_children to 5; here is the top output.  One of the postmasters
is my daemon running some insert/update tasks.  I see that they all use
the CPU heavily but do not use the shared memory.  shared_buffers is set
to 6, yet they use a minimal part of that.  I'd like to know why they
won't use more - shouldn't all the indexes and half of the database be in
shared memory?  Or am I completely missing what shared_buffers is for?
If so, how do I put my indexes and at least part of the data into memory?


top - 00:12:35 up 50 days, 13:22,  8 users,  load average: 4.84, 9.71, 13.22
Tasks: 279 total,  10 running, 268 sleeping,   1 stopped,   0 zombie
Cpu(s): 50.0% us, 12.9% sy,  0.0% ni, 33.2% id,  1.8% wa,  0.0% hi,  2.1% si
Mem:   6102304k total,  4206948k used,  1895356k free,   159436k buffers
Swap:  1959888k total,12304k used,  1947584k free,  2919816k cached

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
11492 postgres  16   0  530m  72m  60m S   14  1.2   0:50.91 postmaster
11493 postgres  16   0  531m  72m  60m R   14  1.2   0:48.78 postmaster
11490 postgres  15   0  530m  71m  59m S   13  1.2   0:50.26 postmaster
11491 postgres  15   0  531m  75m  62m S   11  1.3   0:50.67 postmaster
11495 postgres  16   0  530m  71m  59m R   10  1.2   0:50.71 postmaster
10195 postgres  15   0  536m  84m  66m S6  1.4   1:11.72 postmaster

postgresql.conf:

shared_buffers = 6
work_mem = 2048
maintenance_work_mem = 256000

The rest are basically default values

Thank you in advance.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] [GENERAL] Shared memory usage

2007-08-25 Thread Martijn van Oosterhout
On Sun, Aug 26, 2007 at 01:22:58AM +0400, Max Zorloff wrote:
> Hello.
> 
> I have a postgres 8.0 and ~400mb database with lots of simple selects  
> using indexes.
> I've installed pgpool on the system. I've set num_init_children to 5 and  
> here is the top output.
> One of postmasters is my demon running some insert/update tasks. I see  
> that they all use cpu heavily, but do not use the shared memory.  
> shared_buffers is set to 6, yet they use a minimal part of that. I'd  
> like to know why won't they use more? All the indexes and half of the  
> database should be in the shared memory, is it not? Or am I completely  
> missing what are the shared_buffers for? If so, then how do I put my  
> indexes and at least a part of the data into memory?

shared_memory is used for caching. It is filled as stuff is used. If
you're not using all of it that means it isn't needed. Remember, it is
not the only cache. Since your database is only 400MB it will fit
entirely inside the OS disk cache, so you really don't need much shared
memory at all.

Loading stuff into memory for the hell of it is a waste, let the system
manage the memory itself, if it needs it, it'll use it.
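On Linux, the OS disk cache described above is visible directly in
/proc/meminfo; the "Cached" figure is file data (including Postgres data
files) that the kernel keeps in RAM, independent of shared_buffers (a
quick sketch, assuming a stock Linux /proc):

```shell
# Page cache, buffers, and free memory as the kernel reports them;
# "Cached" is file data the OS holds in RAM on Postgres's behalf.
grep -E '^(MemTotal|MemFree|Buffers|Cached):' /proc/meminfo
```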

Have a nice day,
-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.




Re: [PERFORM] [GENERAL] Shared memory usage

2007-08-25 Thread Max Zorloff
On Sun, 26 Aug 2007 00:39:52 +0400, Martijn van Oosterhout  
<[EMAIL PROTECTED]> wrote:



> shared_memory is used for caching. It is filled as stuff is used. If
> you're not using all of it that means it isn't needed. Remember, it is
> not the only cache. Since your database is only 400MB it will fit
> entirely inside the OS disk cache, so you really don't need much shared
> memory at all.
>
> Loading stuff into memory for the hell of it is a waste, let the system
> manage the memory itself, if it needs it, it'll use it.


Could it be that most of the cpu usage is from lots of fast indexed sql
queries wrapped in sql functions?



Re: [PERFORM] [GENERAL] Shared memory usage

2007-08-25 Thread Max Zorloff
On Sun, 26 Aug 2007 00:39:52 +0400, Martijn van Oosterhout  
<[EMAIL PROTECTED]> wrote:



> Since your database is only 400MB it will fit entirely inside the OS
> disk cache, so you really don't need much shared memory at all.
>
> Loading stuff into memory for the hell of it is a waste, let the system
> manage the memory itself, if it needs it, it'll use it.


Where do I find my OS disk cache settings? I'm using Linux.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match