Re: [PERFORM] Defining performance.

2006-11-30 Thread Chris

Tobias Brox wrote:

[[EMAIL PROTECTED] - Thu at 06:37:12PM -0600]

As my dataset has gotten larger I have had to throw more metal at the
problem, but I have also had to rethink my table and query design.  Just
because your data set grows linearly does NOT mean that the performance of
your query is guaranteed to grow linearly!  A sloppy query that runs OK
with 3000 rows in your table may choke horribly when you hit 5.


Then some limit is hit ... either the memory cache, or the planner
makes an unlucky change of strategy when hitting 5.


Not really. A bad query is a bad query (eg missing a join element). It 
won't show up for 3000 rows, but will very quickly if you increase that 
by a reasonable amount. Even something as simple as a missing index on a 
join column won't show up for a small dataset but will for a larger one.


It's a pretty common mistake to assume that a small dataset will behave 
exactly the same as a larger one - not always the case.


--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] SQL_CALC_FOUND_ROWS in POSTGRESQL / Some one can

2006-12-10 Thread Chris

Joshua D. Drake wrote:

On Mon, 2006-12-11 at 14:33 +1100, Chris wrote:

Marcos Borges wrote:

07/12/2006 04:31
*SQL_CALC_FOUND_ROWS in POSTGRESQL*

In MySQL I am using the SQL_CALC_FOUND_ROWS command with the following syntax:
SELECT SQL_CALC_FOUND_ROWS name, email, tel FROM mytable WHERE name <> 
'' LIMIT 0, 10

to get the recordset data, and
SELECT FOUND_ROWS();
to get the total number of rows found.

I don't want to use count(*), because performance drops off depending on 
the number of tables and "joins" involved.


Does PostgreSQL have something similar?

Nope, you're out of luck, sorry. That's a MySQL-ism and I doubt postgres 
will ever include something similar.


Your language will have a similar binding. Something like pg_numrows.


I guess they are similar but also not really :)

The SQL_CALC_FOUND_ROWS directive in mysql will run the same query but 
without the limit.


It's the same as doing a select count(*) type query using the same 
clauses, but all in one query instead of two.


It doesn't return any extra rows on top of the limit query so it's 
better than using pg_numrows which runs the whole query and returns it 
to php (in this example).



Their docs explain it:

http://dev.mysql.com/doc/refman/4.1/en/information-functions.html

See "FOUND_ROWS()"

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] SQL_CALC_FOUND_ROWS in POSTGRESQL / Some one can

2006-12-10 Thread Chris

Mark Kirkwood wrote:

Chris wrote:

It's the same as doing a select count(*) type query using the same 
clauses, but all in one query instead of two.


It doesn't return any extra rows on top of the limit query so it's 
better than using pg_numrows which runs the whole query and returns it 
to php (in this example).



Their docs explain it:

http://dev.mysql.com/doc/refman/4.1/en/information-functions.html

See "FOUND_ROWS()"



Note that from the same page:

"If you are using SELECT SQL_CALC_FOUND_ROWS, MySQL must calculate how 
many rows are in the full result set. However, this is faster than 
running the query again without LIMIT, because the result set need not 
be sent to the client."


So it is not as cost-free as it would seem - the CALC step is 
essentially doing "SELECT count(*) FROM (your-query)" in addition to 
your-query-with-the-limit.


I don't buy the "it's cheap 'cause nothing is returned to the client" 
bit, because 'SELECT count(*) ...' returns only 1 tuple of 1 element to 
the client anyway. On the face of it, it *looks* like you save an extra 
set of parse, execute, construct (trivially small) resultset calls - but 
'SELECT FOUND_ROWS()' involves that set of steps too, so I'm not 
entirely convinced that doing a 2nd 'SELECT count(*)...' is really any 
different in impact.


Sorry - I created a bit of confusion here. It's not doing the count(*), 
it's doing the query again without the limit.


ie:

select SQL_CALC_FOUND_ROWS userid, username, password from users limit 10;

will do:

select userid, username, password from users limit 10;

and calculate this:

select userid, username, password from users;

and tell you how many rows that will return (so you can call 
'found_rows()').



the second one does do a lot more because it has to send the results 
across to the client program - whether the client uses that info or not 
doesn't matter.



The OP didn't want to have to change to using two different queries:
select count(*) from table;
select * from table limit 10 offset 0;


Josh's comment was to do the query again without the limit:
select userid, username, password from users;

and then use something like http://www.php.net/pg_numrows to work out 
the number of results the query would have returned.. but that would 
keep the dataset in memory and eventually with a large enough dataset 
cause a problem.
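
For reference, the usual PostgreSQL pattern is the two-query form shown 
above; a sketch (table and column names are illustrative, and the count 
can be cached if it doesn't need to be exact on every page load):

SELECT count(*) FROM mytable WHERE name <> '';

SELECT name, email, tel
FROM mytable
WHERE name <> ''
ORDER BY name
LIMIT 10 OFFSET 0;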


--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Writting a "search engine" for a pgsql DB

2007-02-27 Thread Chris

Madison Kelly wrote:

Hi all,

  I am asking in this list because, at the end of the day, this is a 
performance question.


  I am looking at writing a search engine of sorts for my database. I 
have only ever written very simple search engines before which amounted 
to not much more than the query string being used with ILIKE on a pile 
of columns. This was pretty rudimentary and didn't offer anything like 
relevance sorting and such (I'd sort by result name, age or whatnot).


  So I am hoping some of you guys and gals might be able to point me 
towards some resources or offer some tips or gotchas before I get 
started on this. I'd really like to come up with a more intelligent 
search engine that doesn't take two minutes to return results. :) I 
know, in the end good indexes and underlying hardware will be important, 
but as sane a query structure as possible helps to start with.


As someone mentioned, tsearch2 is a good option.

 I wrote a small article about how to get it set up relatively 
easily: http://www.designmagick.com/article/27/ 
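
For reference, a rough sketch of the tsearch2 (contrib) setup that article 
walks through - the table and column names here are illustrative, not from 
the original post:

-- add a tsvector column and populate it from the searchable text
ALTER TABLE articles ADD COLUMN fti tsvector;
UPDATE articles SET fti =
  to_tsvector('default', coalesce(title,'') || ' ' || coalesce(body,''));

-- index it and keep it up to date with the contrib trigger function
CREATE INDEX articles_fti_idx ON articles USING gist (fti);
CREATE TRIGGER articles_fti_update BEFORE INSERT OR UPDATE ON articles
  FOR EACH ROW EXECUTE PROCEDURE tsearch2(fti, title, body);

-- query with the full text operator instead of ILIKE
SELECT * FROM articles WHERE fti @@ to_tsquery('default', 'postgres & tuning');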


--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Best OS for Postgres 8.2

2007-05-07 Thread Chris

David Levy wrote:

Hi,

I am about to order a new server for my Postgres cluster. I will
probably get a Dual Xeon Quad Core instead of my current Dual Xeon.
Which OS would you recommend to optimize Postgres behaviour (i/o
access, multithreading, etc) ?

I am hesitating between Fedora Core 6, CentOS and Debian. Can anyone
help with this ?


Use the one you're most comfortable with.

I don't think you'll notice *that* much difference between linux systems 
for performance - but whether you're comfortable using any of them will 
make a difference in managing it in general.


--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] Best OS for Postgres 8.2

2007-05-07 Thread Chris

[EMAIL PROTECTED] wrote:

On Tue, 8 May 2007, Chris wrote:


David Levy wrote:

 Hi,

 I am about to order a new server for my Postgres cluster. I will
 probably get a Dual Xeon Quad Core instead of my current Dual Xeon.
 Which OS would you recommend to optimize Postgres behaviour (i/o
 access, multithreading, etc) ?

 I am hesitating between Fedora Core 6, CentOS and Debian. Can anyone
 help with this ?


Use the one you're most comfortable with.

I don't think you'll notice *that* much difference between linux 
systems for performance - but whether you're comfortable using any of 
them will make a difference in managing it in general.


the tuning that you do (both of the OS and of postgres) will make more 
of a difference than anything else.


Which is why it's best to know/understand the OS first ;)

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Postgres Benchmark Results

2007-05-27 Thread Chris


I am re-running it with other tuning, notably cost-based vacuum 
delay and less frequent checkpoints, and it is a *lot* smoother.
These take a full night to run, so I'll post more results when I 
have useful stuff to show.

This has proven to be a very interesting trip to benchmarkland...


[ rather late in my reply but I had to ]

Are you tuning mysql in a similar fashion ?

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


[PERFORM] update query taking too long

2007-06-27 Thread Chris

Hi all,

I'm trying to do an update of a reasonably large table and it's taking 
way too long so I'm trying to work out why and if I need to tweak any 
settings to speed it up.


The table is around 3.5 million records.

The query is

update table set domainname=substring(emailaddress from position('@' in 
emailaddress));


I've left it running for over 20 minutes and it hasn't finished, so I'm 
probably doing something terribly wrong but I have no idea what ;)


Maybe there's another way to write the query but I'm not sure how to 
make it better.


Most settings are default, I have bumped up shared_buffers a bit to 
65536 - I think that works out to 512Meg ? The machine has 2G ram.
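
(For reference, shared_buffers is counted in 8 kB pages, so 65536 x 8 kB = 
524288 kB = 512 MB - the estimate above is right.)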


Running version 8.1.9.

Any pointers about settings etc are most welcome.

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] update query taking too long

2007-06-27 Thread Chris

Tom Lane wrote:

Chris <[EMAIL PROTECTED]> writes:
I'm trying to do an update of a reasonably large table and it's taking 
way too long so I'm trying to work out why and if I need to tweak any 
settings to speed it up.


Any foreign keys leading to or from that table?


Nope :(


3.5 million row updates are not exactly gonna be instantaneous anyway,
but only FK checks or really slow user-written triggers would make it
take upwards of an hour ...


No triggers or functions.

Table is pretty basic.

I have a few indexes (one on the primary key, one on emailaddress etc) 
but the 'domainname' column is a new one not referenced by any of the 
indexes.


FWIW (while the other update is still going in another window):

select SUBSTRING(emailaddress FROM POSITION('@' IN emailaddress)) from 
table;

Time: 28140.399 ms

Is there a better way to write the update? I thought about something 
like this (but couldn't get it working - guess I don't have the right 
syntax):


update t1 set domainname=(select id, SUBSTRING(emailaddress FROM 
POSITION('@' IN emailaddress)) from table t2) AS t2 where t1.id=t2.id
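
For reference, the join form being attempted here is spelled with 
UPDATE ... FROM - a sketch with an illustrative table name, and redundant 
in this case since the row's own emailaddress is all the update needs:

UPDATE emails
SET domainname = substring(e2.emailaddress from position('@' in e2.emailaddress))
FROM emails AS e2
WHERE emails.id = e2.id;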


--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] update query taking too long

2007-06-27 Thread Chris

A. Kretschmer wrote:

On Thu, 28 Jun 2007 at 15:03:32 +1000, Chris wrote the following:

Hi all,

I'm trying to do an update of a reasonably large table and it's taking 
way too long so I'm trying to work out why and if I need to tweak any 
settings to speed it up.


The table is around 3.5 million records.

The query is

update table set domainname=substring(emailaddress from position('@' in 
emailaddress));


I think this is a bad idea, because you end up with two columns holding 
nearly identical data (emailaddress already includes the domain, plus an 
extra domain field).


Yeah, I know. I might have to go back to the drawing board on this one. 
The app has to work in mysql & postgres so I'm a bit limited in some of 
my approaches.


--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] update query taking too long

2007-06-27 Thread Chris

A. Kretschmer wrote:

On Thu, 28 Jun 2007 at 16:16:50 +1000, Chris wrote the following:
Is there a better way to write the update? I thought about something 
like this (but couldn't get it working - guess I don't have the right 
syntax):


update t1 set domainname=(select id, SUBSTRING(emailaddress FROM 
POSITION('@' IN emailaddress)) from table t2) AS t2 where t1.id=t2.id


test=# select * from foo;
 id |       mail        | domain
----+-------------------+--------
  1 | [EMAIL PROTECTED] |
  2 | [EMAIL PROTECTED] |
(2 rows)

test=*# update foo set domain=SUBSTRING(mail FROM (POSITION('@' IN
mail)+1));


That's what my original query is (apart from the +1 at the end) ;)

I was just trying to approach it differently with the other attempt.

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] update query taking too long

2007-06-27 Thread Chris

Richard Huxton wrote:

Chris wrote:

Tom Lane wrote:

Any foreign keys leading to or from that table?


Nope :(


3.5 million row updates are not exactly gonna be instantaneous anyway,
but only FK checks or really slow user-written triggers would make it
take upwards of an hour ...


No triggers, functions.


Of course you really want a trigger on this, since presumably domainname 
should always be kept in sync with emailaddress. But that's not the 
immediate issue.



Table is pretty basic.

I have a few indexes (one on the primary key, one on emailaddress etc) 
but the 'domainname' column is a new one not referenced by any of the 
indexes.


FWIW (while the other update is still going in another window):


What's saturated? Is the system I/O limited or CPU limited? You *should* 
be limited by the write speed of your disk with something simple like this.


What happens if you do the following?


db=# CREATE TABLE email_upd_test (id SERIAL, email text, domainname 
text, PRIMARY KEY (id));
NOTICE:  CREATE TABLE will create implicit sequence 
"email_upd_test_id_seq" for serial column "email_upd_test.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index 
"email_upd_test_pkey" for table "email_upd_test"

CREATE TABLE
Time: 276.500 ms
db=# INSERT INTO email_upd_test (email) SELECT n::text || '@' || n::text 
FROM (SELECT generate_series(1,1000000) AS n) AS numbers;

INSERT 0 1000000
Time: 14104.663 ms
db=# ANALYSE email_upd_test;
ANALYZE
Time: 121.775 ms
db=# UPDATE email_upd_test SET domainname=substring(email from 
position('@' in email));

UPDATE 1000000
Time: 43796.030 ms


I think I'm I/O bound from my very limited understanding of vmstat.

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] update query taking too long

2007-06-28 Thread Chris

Richard Huxton wrote:

Chris wrote:
db=# UPDATE email_upd_test SET domainname=substring(email from 
position('@' in email));

UPDATE 1000000
Time: 43796.030 ms

I think I'm I/O bound from my very limited understanding of vmstat.


Well, 43 seconds to update 1 million rows suggests your real query 
should be complete in a few minutes, even if your real table has more 
columns.


Yep.

I think I have solved it though - the server was checkpointing so much 
that not much else was going on.


I didn't have logging set up before but it's up and running now and I 
was getting


LOG:  checkpoints are occurring too frequently (26 seconds apart)
HINT:  Consider increasing the configuration parameter 
"checkpoint_segments".


So I increased that from 10 to 30 and it finished:

UPDATE 3500101
Time: 146513.349 ms
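
For reference, the postgresql.conf change described above is just:

checkpoint_segments = 30    # was 10; fewer, larger checkpoints during the bulk update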

Thanks for all the help :)

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Query performance issue

2007-07-24 Thread Chris

Jonathan Gray wrote:
We’re experiencing a query performance problem related to the planner 
and its ability to perform a specific type of merge.


 

We have created a test case (as attached, or here: 
http://www3.streamy.com/postgres/indextest.sql) which involves a 
hypothetical customer ordering system, with customers, orders, and 
customer groups.


 

If we want to retrieve a single customer's 10 most recent orders, sorted 
by date, we can use a double index on (customer,date); Postgres’s query 
planner will use the double index with a backwards index scan on the 
second indexed column (date).


 

However, if we want to retrieve a “customer class’s” 10 most recent 
orders, sorted by date, we are not able to get Postgres to use double 
indexes.


You don't have any indexes on the 'customerclass' table.

Creating a foreign key doesn't create an index, you need to do that 
separately.


Try

create index cc_customerid_class on indextest.customerclass(classid, 
customerid);


--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Query performance issue

2007-07-24 Thread Chris

Chris wrote:

Jonathan Gray wrote:
We’re experiencing a query performance problem related to the planner 
and its ability to perform a specific type of merge.


 

We have created a test case (as attached, or here: 
http://www3.streamy.com/postgres/indextest.sql) which involves a 
hypothetical customer ordering system, with customers, orders, and 
customer groups.


 

If we want to retrieve a single customer's 10 most recent orders, 
sorted by date, we can use a double index on (customer,date); 
Postgres’s query planner will use the double index with a backwards 
index scan on the second indexed column (date).


 

However, if we want to retrieve a “customer class’s” 10 most recent 
orders, sorted by date, we are not able to get Postgres to use double 
indexes.


You don't have any indexes on the 'customerclass' table.

Creating a foreign key doesn't create an index, you need to do that 
separately.


Try

create index cc_customerid_class on indextest.customerclass(classid, 
customerid);




It could also be that since you don't have very much data (10,000 rows), 
postgres is ignoring the indexes because it'll be quicker to scan the 
tables.


If you bump it up to say 100k rows, what happens?
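
A sketch of bumping the row count with generate_series (the orders table 
and column names here are assumptions, not the actual indextest.sql schema):

INSERT INTO indextest.orders (customerid, orderstamp)
SELECT (random() * 10000)::int + 1,
       now() - random() * interval '365 days'
FROM generate_series(1, 100000);

ANALYZE indextest.orders;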

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Query performance issue

2007-07-24 Thread Chris

Jonathan Gray wrote:

Chris,

Creating indexes on the customerclass table does speed up the queries but
still does not create the plan we are looking for (using the double index
with a backward index scan on the orders table).


Stupid question - why is that particular plan your "goal" plan?


The plans we now get, with times on par or slightly better than with the
plpgsql hack, are:

  EXPLAIN ANALYZE
  SELECT o.orderid,o.orderstamp FROM indextest.orders o 
  INNER JOIN indextest.customerclass cc ON (cc.classid = 2) 
  WHERE o.customerid = cc.customerid ORDER BY o.orderstamp DESC LIMIT 5;


Didn't notice this before...

Shouldn't this be:

INNER JOIN indextest.customerclass cc ON (o.customerid = cc.customerid)
WHERE cc.classid = 2

i.e. join on the common field, not on classid, which doesn't appear in 
the second table?



As I said, this is a hypothetical test case we have arrived at that
describes our situation as best as we can given a simple case.  We're
interested in potential issues with the approach, why postgres would not
attempt something like it, and how we might go about implementing it
ourselves at a lower level than we currently have (in SPI, libpq, etc). 


If it could be generalized then we could use it in cases where we aren't
pulling from just one table (the orders table) but rather trying to merge,
in sorted order, results from different conditions on different tables.
Right now we use something like the plpgsql or plpythonu functions in the
example and they outperform our regular SQL queries by a fairly significant
margin.


I'm sure if you posted the queries you are running with relevant info 
you'd get some help ;)


--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] Partitioning in postgres - basic question

2007-10-04 Thread Chris

Tore Lukashaugen wrote:

Hi,

I am new to postgres having worked with Oracle in the past. I am interested 
in understanding Postgres's table partition functionality better. 
Specifically, I have a third party application running against my postgres 
database, but the database is becoming rather large to maintain. I am 
thinking about partitioning the biggest table.


Would I be able to set up partitioning on this table with it being seamless 
to the third party app (assuming that it performs pretty standard DML 
statements against the table in question)?


http://www.postgresql.org/docs/8.2/static/ddl-partitioning.html#DDL-PARTITIONING-IMPLEMENTATION

The examples use rules but some on the list have said triggers work 
better if you have a lot of partitions.
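
A minimal sketch of the trigger approach, following the pattern on that 
docs page (table, column and partition names are illustrative):

CREATE TABLE bigtable_2007_10 (
    CHECK (created >= DATE '2007-10-01' AND created < DATE '2007-11-01')
) INHERITS (bigtable);

CREATE OR REPLACE FUNCTION bigtable_insert_trigger() RETURNS trigger AS $$
BEGIN
    IF NEW.created >= DATE '2007-10-01' AND NEW.created < DATE '2007-11-01' THEN
        INSERT INTO bigtable_2007_10 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'No partition for date %', NEW.created;
    END IF;
    RETURN NULL;  -- the row has been redirected, so don't insert into the parent
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER bigtable_partition_insert BEFORE INSERT ON bigtable
    FOR EACH ROW EXECUTE PROCEDURE bigtable_insert_trigger();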


--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[PERFORM] stats collector suddenly causing lots of IO

2010-04-13 Thread Chris
I have a lot of centos servers which are running postgres.  Postgres isn't used
that heavily on any of them, but lately, the stats collector process keeps
causing tons of IO load.  It seems to happen only on servers with centos 5.
The versions of postgres that are running are:

8.1.18
8.2.6
8.3.1
8.3.5
8.3.6
8.3.7
8.3.8
8.3.9
8.4.2
8.4.3

I've tried turning off everything under RUNTIME STATISTICS in postgresql.conf
except track_counts (since autovacuum says it needs it), but it seems to have
little effect on the IO caused by the stats collector.
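
For reference, a sketch of what that leaves in postgresql.conf on the 
8.3/8.4 boxes (the 8.1/8.2 servers use the older stats_* setting names 
instead):

track_activities = off
track_counts = on          # autovacuum needs this one
#track_functions = none    # 8.4 only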

Has anyone else noticed this?  Have there been recent kernel changes
that could cause this that anyone knows about?  Since we haven't touched
postgres on these boxes since they were set up initially, I'm a bit baffled as
to what might be causing the problem, and why I can't make it go away short of
kill -STOP.

Any suggestions would be much appreciated!


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] stats collector suddenly causing lots of IO

2010-04-20 Thread Chris
I have a lot of centos servers which are running postgres.  Postgres isn't used
that heavily on any of them, but lately, the stats collector process keeps
causing tons of IO load.  It seems to happen only on servers with centos 5.
The versions of postgres that are running are:

8.1.18
8.2.6
8.3.1
8.3.5
8.3.6
8.3.7
8.3.8
8.3.9
8.4.2
8.4.3

I've tried turning off everything under RUNTIME STATISTICS in postgresql.conf
except track_counts (since autovacuum says it needs it), but it seems to have
little effect on the IO caused by the stats collector.

Has anyone else noticed this?  Have there been recent kernel changes
that could cause this that anyone knows about?  Since we haven't touched
postgres on these boxes since they were set up initially, I'm a bit baffled as
to what might be causing the problem, and why I can't make it go away short of
kill -STOP.

Any suggestions would be much appreciated!

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] planner index choice

2010-07-28 Thread Chris

Hi there,

I have a simple query where I don't understand the planner's choice to 
use a particular index.


The main table looks like this:

# \d sq_ast_attr_val
              Table "public.sq_ast_attr_val"
   Column    |         Type          |          Modifiers
-------------+-----------------------+------------------------------
 assetid     | character varying(15) | not null
 attrid      | integer               | not null
 contextid   | integer               | not null default 0
 custom_val  | text                  |
 use_default | character(1)          | not null default '1'::bpchar
Indexes:
    "ast_attr_val_pk" PRIMARY KEY, btree (assetid, attrid, contextid)
    "sq_ast_attr_val_assetid" btree (assetid)
    "sq_ast_attr_val_attrid" btree (attrid)
    "sq_ast_attr_val_concat" btree (((assetid::text || '~'::text) || attrid))
    "sq_ast_attr_val_contextid" btree (contextid)


The query:

SELECT
  assetid, custom_val
FROM
  sq_ast_attr_val
WHERE
  attrid IN (SELECT attrid FROM sq_ast_attr WHERE name = 
'is_contextable' AND (type_code = 'metadata_field_select' OR 
owning_type_code = 'metadata_field'))

  AND contextid = 0
INTERSECT
SELECT
  assetid, custom_val
FROM
  sq_ast_attr_val
WHERE
  assetid = '62321'
  AND contextid = 0;


The explain analyze plan:

http://explain.depesz.com/s/nWs

I'm not sure why it's picking the sq_ast_attr_val_contextid index to do 
the contextid = 0 check; the other predicates (attrid/assetid) are much 
more selective.


If I drop that particular index:

http://explain.depesz.com/s/zp


All (I hope) relevant postgres info:

Centos 5.5 x86_64 running pg8.4.4.

Server has 8gig memory.

# select name, setting, source from pg_settings where name in 
('shared_buffers', 'effective_cache_size', 'work_mem');

         name         | setting
----------------------+---------
 shared_buffers       | 262144
 effective_cache_size | 655360
 work_mem             | 32768

All planner options are enabled:

# select name, setting, source from pg_settings where name like 'enable_%';
       name        | setting | source
-------------------+---------+---------
 enable_bitmapscan | on      | default
 enable_hashagg    | on      | default
 enable_hashjoin   | on      | default
 enable_indexscan  | on      | default
 enable_mergejoin  | on      | default
 enable_nestloop   | on      | default
 enable_seqscan    | on      | default
 enable_sort       | on      | default
 enable_tidscan    | on      | default

Any insights welcome - thanks!

--
Postgresql & php tutorials
http://www.designmagick.com/


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] planner index choice

2010-07-29 Thread Chris

Hi,


Hrm ... are you *certain* that's an 8.4 server?


Yep.

# psql -U postgres -d db
psql (8.4.4)

db=# select version();
                                               version
------------------------------------------------------------------------------------------------------
 PostgreSQL 8.4.4 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 
4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit
(1 row)



The actual problem seems to be that choose_bitmap_and() is choosing to
add an indexscan on sq_ast_attr_val_contextid, even though this index
is a lot less selective than the sq_ast_attr_val_attrid scan it had
already picked.  I've seen that behavior before, and there were a series
of patches back in 2006-2007 that seem to have pretty much fixed it.
So that's another reason for suspecting you've got an old server version
there...


I just recreated the index and re-ran the explain analyze and it doesn't 
give the "outer" bit any more - not sure how I got that before.


db=# begin;
BEGIN
db=# create index attr_val_contextid on sq_ast_attr_val(contextid);
CREATE INDEX
db=# analyze sq_ast_attr_val;
ANALYZE
db=# explain analyze SELECT
db-# assetid, custom_val
db-# FROM
db-# sq_ast_attr_val
db-# WHERE
db-# attrid IN (SELECT attrid FROM sq_ast_attr WHERE name =
db(# 'is_contextable' AND (type_code = 'metadata_field_select' OR
db(# owning_type_code = 'metadata_field'))
db-#AND contextid = 0
db-# INTERSECT
db-# SELECT
db-#assetid, custom_val
db-# FROM
db-#sq_ast_attr_val
db-# WHERE
db-#assetid = '62321'
db-#AND contextid = 0;

http://explain.depesz.com/s/br9

Without that index (again with an analyze after doing a rollback):

http://explain.depesz.com/s/gxH

--
Postgresql & php tutorials
http://www.designmagick.com/


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] good old VACUUM FULL

2011-03-22 Thread Chris

On 23/03/11 11:52, felix wrote:

I posted many weeks ago about a severe problem with a table that was
obviously bloated and was stunningly slow. Up to 70 seconds just to get
a row count on 300k rows.

I removed the text column, so it really was just a few columns of fixed
data.
Still very bloated.  Table size was 450M

The advice I was given was to do CLUSTER, but this did not reduce the
table size in the least.
Nor performance.

Also to resize my free space map (which still does need to be done).
Since that involves tweaking the kernel settings, taking the site down
and rebooting postgres and exposing the system to all kinds of risks and
unknowns and expensive experimentations I was unable to do it and have
had to hobble along with a slow table in my backend holding up jobs.

Much swearing that nobody should ever do VACUUM FULL.  Manual advises
against it.  Only crazy people do that.





moral of the story:  if your table is really bloated, just do VACUUM FULL


You'll need to reindex that table now - vacuum full can bloat your 
indexes which will affect your other queries.


reindex table fastadder_fastadderstatus;

--
Postgresql & php tutorials
http://www.designmagick.com/


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimizing performance of a like '%...%' condition

2006-02-17 Thread Chris



Indexing the t_name.name field, I can increase speed, but only if I
restrict my search to something like :

select *
from t_name
where t_name.name like 'my_search%'

(In this case it takes generally less than 1 second)


My question : Are there algorithms or tools that can speed up such a
type of queries (a "like" condition beginning with a "%" symbol)?


Apart from indexing the field you could use full text indexing. See 
http://techdocs.postgresql.org/techdocs/fulltextindexing.php



What other types of queries are you running that you want to speed up ?

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Creating a correct and real benchmark

2006-02-20 Thread Chris

PFC wrote:



I'm developing a search engine using a PostgreSQL database. I've
already done some tuning to try to improve performance.

Now I'd like to do a realistic test with some number X of queries,
to see how it performs under load.

What is the correct way to do this?




I guess the only way to know how it will perform with your own 
application is to benchmark it with queries coming from your own 
application. You can create a test suite with a number of typical queries 
and use your favourite scripting language to spawn a number of threads 
and hammer the database. I find it interesting to measure the 
responsiveness of the server while torturing it, simply by measuring 
the time it takes to respond to a simple query and graphing it. Also 
you should not have N threads issue the exact same queries, because 
then you will hit a too small dataset. Introduce some randomness in the 
testing, for instance. Benchmarking from another machine makes sure the 
test client's CPU usage is not a part of the problem.


The other advice on top of this is don't just import a small amount of data.

If your application is going to end up with 200,000 rows - then test 
with 200,000 rows or more so you know exactly how it will handle under 
"production" conditions.


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Joins and full index scans...mysql vs postgres?

2006-02-22 Thread Chris

ryan groth wrote:

I am issuing a query like this:
SELECT *
   FROM users users
   LEFT JOIN phorum_users_base ON users.uid = phorum_users_base.user_id
   LEFT JOIN useraux ON useraux.uid = users.uid;




I'm not sure if postgres would rewrite your query to do the joins 
properly, though I guess someone else might've already suggested this :)



I'm probably wrong but I read that as:

join users -> phorum_users_base (ON users.uid = phorum_users_base.user_id)

join phorum_users_base -> useraux (ON useraux.uid = users.uid) which 
won't be indexable because u.uid doesn't exist in phorum_users_base.




Try

SELECT *
FROM users users
LEFT JOIN phorum_users_base ON users.uid = phorum_users_base.user_id
LEFT JOIN useraux ON useraux.uid = phorum_users_base.user_id

or

SELECT *
FROM users u, phorum_users_base pub, useraux ua WHERE u.uid = 
pub.user_id AND ua.uid = u.uid AND pub.user_id = ua.uid;



--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Help understanding indexes, explain, and optimizing

2006-03-06 Thread Chris

i.v.r. wrote:

Hi everyone,

I'm experimenting with PostgreSQL, but since I'm no expert DBA, I'm 
experiencing some performance issues.


Please take a look at the following query:

SELECT
 /*groups."name" AS t2_r1,
 groups."id" AS t2_r3,
 groups."user_id" AS t2_r0,
 groups."pretty_url" AS t2_r2,
 locations."postal_code" AS t0_r6,
 locations."pretty_url" AS t0_r7,
 locations."id" AS t0_r8,
 locations."colony_id" AS t0_r0,
 locations."user_id" AS t0_r1,
 locations."group_id" AS t0_r2,
 locations."distinction" AS t0_r3,
 locations."street" AS t0_r4,
 locations."street_2" AS t0_r5,
 schools."updated" AS t1_r10,
 schools."level_id" AS t1_r4,
 schools."pretty_url" AS t1_r11,
 schools."user_id" AS t1_r5,
 schools."id" AS t1_r12,
 schools."type_id" AS t1_r6,
 schools."distinction" AS t1_r7,
 schools."cct" AS t1_r8,
 schools."created_on" AS t1_r9,
 schools."location_id" AS t1_r0,
 schools."service_id" AS t1_r1,
 schools."sustentation_id" AS t1_r2,
 schools."dependency_id" AS t1_r3*/
 groups.*,
 locations.*,
 schools.*
FROM locations
LEFT OUTER JOIN groups ON groups.id = locations.group_id
LEFT OUTER JOIN schools ON schools.location_id = locations.id
WHERE (colony_id = 71501)
ORDER BY groups.name, locations.distinction, schools.distinction

As you can see, I've commented out some parts. I did that as an 
experiment, and it improved the query by 2x. I really don't understand 
how that is possible... I also tried changing the second join to an 
INNER join, and that improves it a little bit as well.


Anyway, the main culprit seems to be that second join. Here's the output 
from EXPLAIN:


Sort  (cost=94315.15..94318.02 rows=1149 width=852)
  Sort Key: groups.name, locations.distinction, schools.distinction
  ->  Merge Left Join  (cost=93091.96..94256.74 rows=1149 width=852)
Merge Cond: ("outer".id = "inner".location_id)
->  Sort  (cost=4058.07..4060.94 rows=1148 width=646)
  Sort Key: locations.id
  ->  Hash Left Join  (cost=1.01..3999.72 rows=1148 width=646)
Hash Cond: ("outer".group_id = "inner".id)
->  Index Scan using locations_colony_id on 
locations  (cost=0.00..3992.91 rows=1148 width=452)

  Index Cond: (colony_id = 71501)
->  Hash  (cost=1.01..1.01 rows=1 width=194)
  ->  Seq Scan on groups  (cost=0.00..1.01 
rows=1 width=194)

->  Sort  (cost=89033.90..89607.67 rows=229510 width=206)
  Sort Key: schools.location_id
  ->  Seq Scan on schools  (cost=0.00..5478.10 rows=229510 
width=206)


I don't completely understand what that output means, but it would seem 
that the first join costs about 4000, but if I remove that join from the 
query, the performance difference is negligible. So as I said, it seems 
the problem is the join on the schools table.


I hope it's ok for me to post the relevant tables here, so here they are 
(I removed some constraints and indexes that aren't relevant to the 
query above):


CREATE TABLE groups
(
 user_id int4 NOT NULL,
 name varchar(50) NOT NULL,
 pretty_url varchar(50) NOT NULL,
 id serial NOT NULL,
 CONSTRAINT groups_pk PRIMARY KEY (id)
)

CREATE TABLE locations
(
 colony_id int4 NOT NULL,
 user_id int4 NOT NULL,
 group_id int4 NOT NULL,
 distinction varchar(60) NOT NULL,
 street varchar(60) NOT NULL,
 street_2 varchar(50) NOT NULL,
 postal_code varchar(5) NOT NULL,
 pretty_url varchar(60) NOT NULL,
 id serial NOT NULL,
 CONSTRAINT locations_pk PRIMARY KEY (id),
 CONSTRAINT colony FOREIGN KEY (colony_id)
 REFERENCES colonies (id) MATCH SIMPLE
 ON UPDATE NO ACTION ON DELETE NO ACTION,
 CONSTRAINT "group" FOREIGN KEY (group_id)
 REFERENCES groups (id) MATCH SIMPLE
 ON UPDATE NO ACTION ON DELETE NO ACTION
)
CREATE INDEX locations_fki_colony
 ON locations
 USING btree
 (colony_id);
CREATE INDEX locations_fki_group
 ON locations
 USING btree
 (group_id);

CREATE TABLE schools
(
 location_id int4 NOT NULL,
 service_id int4 NOT NULL,
 sustentation_id int4 NOT NULL,
 dependency_id int4 NOT NULL,
 level_id int4 NOT NULL,
 user_id int4 NOT NULL,
 type_id int4 NOT NULL,
 distinction varchar(25) NOT NULL,
 cct varchar(20) NOT NULL,
 created_on timestamp(0) NOT NULL,
 updated timestamp(0),
 pretty_url varchar(25) NOT NULL,
 id serial NOT NULL,
 CONSTRAINT schools_pk PRIMARY KEY (id),
 CONSTRAINT "location" FOREIGN KEY (location_id)
 REFERENCES locations (id) MATCH SIMPLE
 ON UPDATE NO ACTION ON DELETE NO ACTION
)
CREATE INDEX schools_fki_location
 ON schools
 USING btree
 (location_id);

So I'm wondering what I'm doing wrong. I migrated this database from 
MySQL, and on there it ran pretty fast.


Have you done an 'analyze' or 'vacuum analyze' over these tables?

A left outer join gets *everything* from the second table:

> LEFT OUTER JOIN groups ON groups.id = locations.group_id
> LEFT OUTER JOIN schools ON schools.location_id = locations.id

So they will load everything from groups and schools.

Re: [PERFORM] pgCluster and postgres 8.1

2006-03-09 Thread Chris

Javier Somoza wrote:

I want to test my system's performance when using pgCluster.
I'm using PostgreSQL 8.1.0 and I've downloaded pgcluster-1.5.0rc7
and pgcluster-1.5.0rc7-patch.

Do I need to recompile PostgreSQL with the patch?
Can I use pgcluster-1.5 with this version of PostgreSQL?


What does the documentation that comes with the patch say?

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[PERFORM] import performance

2006-03-13 Thread Chris

Hi all,

I'm trying to work out why my 8.1 system is slower than my 7.4 system 
for importing data.


The import is a lot of "insert into" commands - it's a converted 
database from another system so I can't change it to copy commands.



My uncommented config options:


autovacuum = off

bgwriter_all_maxpages = 15
bgwriter_all_percent = 10.0
bgwriter_delay = 2000
bgwriter_lru_maxpages = 10
bgwriter_lru_percent = 5.0

checkpoint_segments = 10

commit_delay = 10
commit_siblings = 500

temp_buffers = 500

wal_buffers = 16

max_connections = 16

shared_buffers = 256


(I was playing around with the bgwriter stuff to see if it made any 
difference, so I could be making it worse).


It's a pretty small machine - 2.6GHz with 512M RAM.

My main concern is that 7.4 on a smaller machine with less memory is 
faster at importing this data.



Suggestions on what I need to do would be fantastic, thanks!

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] import performance

2006-03-13 Thread Chris

Gavin Sherry wrote:

On Tue, 14 Mar 2006, Chris wrote:



Hi all,

I'm trying to work out why my 8.1 system is slower than my 7.4 system
for importing data.

The import is a lot of "insert into" commands - it's a converted
database from another system so I can't change it to copy commands.





new config variables...

autovacuum = off

bgwriter_all_maxpages = 15
bgwriter_all_percent = 2.0
bgwriter_delay = 500
bgwriter_lru_maxpages = 10
bgwriter_lru_percent = 5.0

checkpoint_segments = 10
checkpoint_timeout = 300

commit_delay = 1
commit_siblings = 10

fsync = on

lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'

log_connections = on
log_destination = 'syslog'
log_disconnections = on
log_duration = on
log_statement = 'all'

max_connections = 16

redirect_stderr = on

shared_buffers = 12500

silent_mode = off

stats_command_string = off

syslog_facility = 'LOCAL0'
syslog_ident = 'postgres'

temp_buffers = 500

wal_buffers = 256


I changed a couple of things and restarted postgres before trying again. 
Still getting pretty poor insert times :(


INSERT 0 1
Time: 1251.956 ms
INSERT 0 1
Time: 700.244 ms
INSERT 0 1
Time: 851.254 ms
INSERT 0 1
Time: 407.725 ms
INSERT 0 1
Time: 267.881 ms
INSERT 0 1
Time: 575.834 ms
INSERT 0 1
Time: 371.914 ms
INSERT 0 1


The table schema is bare:

CREATE TABLE ArticleLive_articlepages (
  PageID serial not null,
  ArticleID integer default '0',
  SortOrderID integer default '0',
  Title varchar(100) NOT NULL default '',
  Content text,
  PRIMARY KEY  (PageID)
);

(I know the fields will be lowercased...).

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] import performance

2006-03-13 Thread Chris

Frank Wiles wrote:

On Tue, 14 Mar 2006 12:24:22 +1100
Chris <[EMAIL PROTECTED]> wrote:



Gavin Sherry wrote:


On Tue, 14 Mar 2006, Chris wrote:




Hi all,

I'm trying to work out why my 8.1 system is slower than my 7.4
system for importing data.

The import is a lot of "insert into" commands - it's a converted
database from another system so I can't change it to copy commands.



  Are you on the same hardware, specifically in your disk subsystem?
  Anything else different about how the two servers are used?


Different hardware.

7.4 is running on a 500MHz computer with 256M compared to 8.1 running on 
a 2.6GHz with 512M.


The only notable config variables on that machine (the rest are logging):

commit_delay = 1

checkpoint_segments = 10
checkpoint_warning = 300

insert times:

Time: 63.756 ms
INSERT 13584074 1
Time: 46.465 ms
INSERT 13584075 1
Time: 70.518 ms
INSERT 13584077 1
Time: 59.864 ms
INSERT 13584078 1
Time: 35.984 ms

Tons of difference :/

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] import performance

2006-03-13 Thread Chris

Tom Lane wrote:

Chris <[EMAIL PROTECTED]> writes:


Tons of difference :/



Have you checked that the I/O performance is comparable?  It seems
possible that there's something badly misconfigured about the disks
on your new machine.  Benchmarking with "bonnie" or some such would
be useful; also try looking at "iostat 1" output while running the
inserts on both machines.


I'll check out bonnie, thanks.

hdparm shows a world of difference (which I can understand) - that being 
the old server is a lot slower.


hdparm -t /dev/hda
/dev/hda:
 Timing buffered disk reads:   24 MB in  3.13 seconds =   7.67 MB/sec

hdparm -T /dev/hda
/dev/hda:
 Timing cached reads:   596 MB in  2.00 seconds = 298.00 MB/sec



Newer server:
hdparm -t /dev/hda
/dev/hda:
 Timing buffered disk reads:   70 MB in  3.02 seconds =  23.15 MB/sec

hdparm -T /dev/hda
/dev/hda:
 Timing cached reads:   1512 MB in  2.00 seconds = 754.44 MB/sec


Also, are the inserts just trivial "insert values (... some constants ...)"
or is there more to it than that?


Straight inserts, no foreign keys, triggers etc.


The only other thing I can see is the old server is ext2:
/dev/hda4 on / type ext2 (rw,errors=remount-ro)

the new one is ext3:
/dev/hda2 on / type ext3 (rw)


If it's a server issue not a postgres issue I'll keep playing :) I 
thought my config was bad but I guess not.


Thanks for all the help.

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] import performance

2006-03-13 Thread Chris

David Lang wrote:

On Tue, 14 Mar 2006, Chris wrote:


The only other thing I can see is the old server is ext2:
/dev/hda4 on / type ext2 (rw,errors=remount-ro)

the new one is ext3:
/dev/hda2 on / type ext3 (rw)



this is actually a fairly significant difference.

with ext3 most of your data actually gets written twice, once to the 
journal and a second time to the spot on the disk it's actually going to 
live.


in addition there are significant differences in how things are arranged 
on disk between the two filesystems, (overridable at mount, but only 
changes future new files). the ext3 layout is supposed to be better for 
a general purpose filesystem, but I've found common cases (lots of files 
and directories) where it's significantly slower, and I think postgres 
will fall into those layouts.


try making an xfs filesystem for your postgres data and see what sort of 
performance you get on it.


Interesting.

To be honest I think I'm just lucky with my really old server. I can't 
see any particular tweaks in regards to drives or anything else. I have 
another server running postgres 7.4.something and it's as slow as the 
8.1 system.


#1 is running 2.4.x kernel - pg 7.4 (debian package) - good performance. 
ext2.


#2 is running 2.2.x kernel (I know I know)..  - pg 7.4 (debian package) 
- reasonable performance. ext2.


#3 is running 2.6.x kernel - pg 8.1 (fedora package) - reasonable 
performance. ext3.


--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Slow SELECTS after large update cycle

2006-03-15 Thread Chris

Jan de Visser wrote:

Hello,

After fixing the hanging problems I reported here earlier (by uninstalling 
W2K3 SP1), I'm running into another weird one.


After doing a +/- 8hr cycle of updates and inserts (what we call a 'batch'), 
the first 'reporting' type query on tables involved in that write cycle is 
very slow. As an example, I have a query which according to EXPLAIN ANALYZE 
takes about 1.1s taking 46s. After this one hit, everything is back to 
normal, and subsequent executions of the same query are in fact subsecond. 
Restarting the appserver and pgsql does not make the slowness re-appear, only 
running another batch will.


During the 'write'/batch cycle, a large number of rows in various tables are 
inserted and subsequently (repeatedly) updated. The reporting type queries 
after that are basically searches on those tables.


After a large batch you need to run 'analyze' over the tables involved 
to get postgresql to update its statistics so it can work out which 
indexes etc it should use.
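
For reference, that can be as simple as (table names are illustrative):

ANALYZE orders;
ANALYZE order_lines;
-- or VACUUM ANALYZE to also reclaim the dead row versions left by the updates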


--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] simple join uses indexes, very slow

2006-03-28 Thread Chris

george young wrote:

[PostgreSQL 8.1.0 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.0.1]
I have a simple join on two tables that takes way too long.  Can you help
me understand what's wrong?  There are indexes defined on the relevant columns.
I just did a fresh vacuum --full --analyze on the two tables.
Is there something I'm not seeing?
[CPU is 950Mhz AMD, 256MB RAM, 15k rpm scsi disk]
-- George Young

Table sizes: parameters has 2.1512e+07 tuples, run_opsets has 211745 tuples.

explain analyze SELECT ro.run, ro.opset_num, p.step_num FROM run_opsets ro, 
parameters p WHERE ro.run = p.run AND ro.opset_num = p.opset_num and 
ro.run='team9';
 QUERY PLAN

 Nested Loop  (cost=2.16..7957.40 rows=6707 width=22) (actual 
time=14.986..70197.129 rows=43050 loops=1)
   ->  Index Scan using run_opsets_pkey on run_opsets ro  (cost=0.00..128.75 
rows=71 width=18) (actual time=0.386..62.959 rows=263 loops=1)
 Index Cond: (run = 'team9'::text)
   ->  Bitmap Heap Scan on parameters p  (cost=2.16..109.93 rows=27 width=22) 
(actual time=1.591..266.211 rows=164 loops=263)
 Recheck Cond: (('team9'::text = p.run) AND ("outer".opset_num = 
p.opset_num))
 ->  Bitmap Index Scan on parameters_idx  (cost=0.00..2.16 rows=27 
width=0) (actual time=1.153..1.153 rows=164 loops=263)
   Index Cond: (('team9'::text = p.run) AND ("outer".opset_num = 
p.opset_num))
 Total runtime: 70237.727 ms
(8 rows)

                   Table "public.run_opsets"
    Column    |            Type             |        Modifiers
--------------+-----------------------------+--------------------------
 run          | text                        | not null
 opset        | text                        |
 opset_ver    | integer                     |
 opset_num    | integer                     | not null
 status       | opset_status                |
 date_started | timestamp without time zone |
 date_done    | timestamp without time zone |
 work_started | timestamp without time zone |
 lock_user    | text                        | default 'NO-USER'::text
 lock_pid     | integer                     |
 needs_review | text                        |
Indexes:
    "run_opsets_pkey" PRIMARY KEY, btree (run, opset_num) CLUSTER


--             Table "public.parameters"
   Column  |  Type   |           Modifiers
-----------+---------+--------------------------------
 run       | text    | not null
 opset_num | integer | not null
 opset     | text    | not null
 opset_ver | integer | not null
 step_num  | integer | not null
 step      | text    | not null
 step_ver  | integer | not null
 name      | text    | not null
 value     | text    |
 split     | boolean | not null default false
 wafers    | text[]  | not null default '{}'::text[]
Indexes:
    "parameters_idx" btree (run, opset_num, step_num, opset, opset_ver, step,
        step_ver, name, split, wafers)
    "parameters_opset_idx" btree (opset, step, name)
    "parameters_step_idx" btree (step, name)


More for my own information (because nobody else has suggested it), 
would it make a difference if 'run' was a varchar field rather than text?



--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Slow performance on Windows .NET and OleDb

2006-03-28 Thread Chris

Ruben Rubio Rey wrote:

Greg Quinn wrote:


The query is,

select * from users

which returns 4 varchar fields, there is no where clause

Yes, I am running the default postgres config. Basically I have been a 
MySQL user and thought I would like to check out PostgreSQL. So I did 
a quick performance test. The performance was so different that I 
thought PostgreSQL was nothing compared to MySQL, but now it seems it's 
just a few configuration options. Strange how the default config would 
be so slow...



- Do you have any indexes? Remove them. To get all rows you do not need them


I wouldn't do that. Postgres needs indexing just like any other database.

It might affect this query but it's not going to help other queries.

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Inserts optimization?

2006-04-12 Thread Chris

Francisco Reyes wrote:

Doing my first write heavy database.
What settings will help improve inserts?
Only a handful of connections, but each doing up to 30 inserts/second.
Plan to have 2 to 3 clients which most of the time will not run at the 
same time, but occasionally it's possible two of them may bump into each 
other.


If you can, use copy instead:

http://www.postgresql.org/docs/8.1/interactive/sql-copy.html

MUCH quicker (and don't worry about using multiple clients).
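
A sketch of what that looks like (table, column and file names are 
illustrative):

COPY mytable (col1, col2, col3) FROM '/path/to/data.csv' WITH CSV;

-- or from the client side via psql, so the file doesn't have to live on the server:
-- \copy mytable (col1, col2, col3) from 'data.csv' with csv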

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Easy question

2006-05-02 Thread Chris

Bert wrote:

No, I didn't define any indexes for the table. I know the performance
will increase with an index, but that was not my question. My question
is really about the access mode of the SQL statement. Furthermore, I do
not understand why the Upper function should increase the performance.


The index will have entries like:

CHRIS
BERT
JOE

and so on.

If you run a query like:

select * from table where UPPER(name) = 'CHRIS';

It's an easy match.

If you don't create an UPPER index, it has to do a comparison with each 
row - so the index can't be used because postgres has to convert the 
field to upper and then do the comparison.
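
For reference, the expression index looks like this (table and column 
names are illustrative):

CREATE INDEX mytable_name_upper_idx ON mytable (UPPER(name));

-- this comparison can now use the index:
SELECT * FROM mytable WHERE UPPER(name) = 'CHRIS';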


--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Postgres gets stuck

2006-05-09 Thread Chris


This is a deadly bug, because our web site goes dead when this happens, 
and it requires an administrator to log in and kill the stuck postgres 
process then restart Postgres.  We've installed failover system so that 
the web site is diverted to a backup server, but since this has happened 
twice in one week, we're worried.


Any ideas?


Sounds like a deadlock issue.

Do you have query logging turned on?

Also, edit your postgresql.conf file and add (or uncomment):

stats_command_string = true

and restart postgresql.

then you'll be able to:

select * from pg_stat_activity;

to see what queries postgres is running and that might give you some clues.
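
If it does turn out to be a locking problem, a look at pg_locks can help 
too - a sketch:

SELECT pid, relation::regclass AS relation, mode, granted
FROM pg_locks
ORDER BY granted, pid;

-- ungranted entries (granted = false) are sessions waiting on someone else's lock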

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] UNSUBSCRIBE

2006-05-09 Thread Chris

Shoaib Burq wrote:

UNSUBSCRIBE


To unsubscribe:

List-Unsubscribe: 




Email admins - Could we add this above or below the random tips that get 
appended to every email ?


--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] UNSUBSCRIBE

2006-05-09 Thread Chris

Tom Lane wrote:

Chris <[EMAIL PROTECTED]> writes:

Email admins - Could we add this above or below the random tips that get 
appended to every email ?



You mean like these headers that already get added to every list
message (these copied-and-pasted from your own message):


The headers aren't the first place you'd go looking for such info.. once 
you know they are there it's ok.



Maybe the real problem is at the other end of the process, ie we should
require some evidence of a greater-than-room-temp IQ to subscribe in the
first place?


Maybe :) The php-general list has

To unsubscribe, visit: http://www.php.net/unsub.php

at the bottom of every email, and there are still random unsubscribe 
requests..


Ah well :)

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Speed Up Offset and Limit Clause

2006-05-10 Thread Chris

Christian Paul Cosinas wrote:

Hi!

How can I speed up my server's performance when I use offset and limit 
clauses?

For example I have a query:
SELECT * FROM table ORDER BY id, name OFFSET 10 LIMIT 1

This query takes a long time about more than 2 minutes.

If my query is:
SELECT * FROM table ORDER BY id, name OFFSET 5 LIMIT 1
It takes about 2 seconds.


Please create a new thread rather than replying to someone else's post 
and changing the subject. These threads can sometimes get missed.


You do have an index on id and name don't you?
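
For reference, an index that matches the ORDER BY lets the planner walk 
the index and stop after OFFSET + LIMIT rows instead of sorting the whole 
table first (the table name here is illustrative):

CREATE INDEX mytable_id_name_idx ON mytable (id, name);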

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] optimizing LIKE '%2345' queries

2006-07-04 Thread Chris

Gene wrote:

Thanks for the suggestion. Actually I went ahead and created a reverse
function using plpgsql, created an index on reverse(column), and now
my queries use "where reverse(column) like reverse('%2345')" and it's
using the index like I hoped it would! Now if I could figure out how
to optimize like '%2345%' queries. I don't want to create many
indexes though; the table is very write heavy.


You can't because that text can be anywhere inside the database field, 
so the whole field basically has to be checked to see if it's there.


You could check out full text indexing (tsearch2).


http://www.designmagick.com/article/27/PostgreSQL/Introduction-to-Full-Text-Indexing
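As a rough sketch of the idea - on newer releases with the built-in full 
text search it looks something like this (tsearch2 on older versions works 
along the same lines, the setup just differs a bit; mytable/body are 
made-up names):

create index mytable_body_fts on mytable using gin (to_tsvector('english', body));

-- matches whole words/lexemes, not arbitrary substrings:
select * from mytable
 where to_tsvector('english', body) @@ to_tsquery('english', '2345');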

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] managing database with thousands of tables

2006-07-05 Thread Chris

Eugeny N Dzhurinsky wrote:

On Wed, Jul 05, 2006 at 09:39:31AM -0400, Tom Lane wrote:

Eugeny N Dzhurinsky <[EMAIL PROTECTED]> writes:

but it seems pg_autovacuum does not do vacuuming on system tables.

There was a bug awhile back whereby autovac failed to notice temp table
cleanup at connection end --- maybe you need to update?


May be. So should I update to newer postgres 8.1 or just upgrade
pg_autovacuum somehow (I don't know how btw ;) )?


Update the whole lot. You should be able to do the upgrade "in place" 
but take a backup "just in case".


http://www.postgresql.org/docs/8.1/interactive/release.html

will list all changes between versions.

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Query plan issue when upgrading to postgres 8.14 (from

2006-07-06 Thread Chris

Ioana Danes wrote:
I have a problem with a query that in postgres 7.4 and 8.12 has an 
acceptable response time but in postgres 8.14 is very slow.


This is the table I use:
create table TEST (
   TESTID    INT8 not null,
   TESTTYPE  INT4 null,
   constraint PK_TESTID primary key (TESTID));
create index IX_TEST_TESTTYPE on TEST (TESTTYPE);
 
And this is the query with the problem:
 
explain select max(TESTID) from TEST where TESTTYPE = 1577;
 
The query plan in postgres 7.4 and 8.12 is using the index by TESTTYPE 
field, which is what I want in this case.
 
QUERY PLAN 
Aggregate  (cost=25.97..25.97 rows=1 width=8)   
  ->  Index Scan using ix_test_testtype on test  (cost=0.00..25.95 
rows=9 width=8)   
Index Cond: (testtype = 1577)
 
 
With postgres 8.14 the query plan uses the primary key PK_TESTID with 
filter by TESTTYPE, which it takes almost 10 minutes to execute:
 
QUERY PLAN 
Limit  (cost=0.00..41.46 rows=1 width=8)   
  ->  Index Scan Backward using pk_testid on test  (cost=…)   
Filter: ((testid IS NOT NULL) and (testtype = 1577))
 
When replacing the index

create index IX_TEST_TESTTYPE on TEST (TESTTYPE);
with
create index IX_TEST_TESTTYPE on TEST (TESTTYPE, TESTID);
the query plan uses this index and the execution of this select is 
extremely fast.
 
 From what I can see, the query plan for 8.14 is using a index scan by 
the field used with max() function with a filter by the field in where 
condition.
Should not the query plan use an index scan by the field in where 
condition (which in my case is a small range) and come up with the max 
value in that range?
 
Is this a bug, am I missing a configuration step or this is how it is 
supposed to work?


You've left out the best details. Post an 'explain analyze' from both 
versions, and don't cut anything out :)


I'm guessing postgres is seeing an index on the table is faster because 
it doesn't think you have many rows in the table. How many are there, 
and have you done an analyze of the table after loading the data in?


--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] setting up foreign keys

2006-08-10 Thread Chris

Sue Fitt wrote:

Hi all,

This is my first post to the performance list, I hope someone can help me.

I'm setting up a table with 2 columns, both of which reference a column 
in another table:


CREATE TABLE headwords_core_lexemes (
core_id int REFERENCES headwords_core(core_id),
lexeme_id int REFERENCES headwords_core(core_id),
);


One problem here is both of these are referencing the same column ;) I'm 
sure that's a typo.


It sounds like you have something blocking or locking the other table. 
Check pg_locks (I think it is), 13,000 rows shouldn't take *that* long.



Make sure there is an index on headwords_core(core_id) and whatever the 
other column should be.


Foreign keys have to check the other table so without those indexes, it 
will be slow(er).
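As a sketch, indexing the referencing columns looks like this (index names 
made up):

create index hcl_core_id_idx   on headwords_core_lexemes (core_id);
create index hcl_lexeme_id_idx on headwords_core_lexemes (lexeme_id);

The referenced column (headwords_core.core_id) should already be covered by 
its own unique/primary key index; the two above are what save you from a seq 
scan every time a referenced row is updated or deleted.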


--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] setting up foreign keys

2006-08-10 Thread Chris

Sue Fitt wrote:

Thanks Chris and Chris, you've solved it.

I had a gui open that connects to the database. It was doing nothing 
(and not preventing me adding to or altering headwords_core via psql), 
but having closed it the table is instantly created. Weird.


BTW, referencing the same column twice is deliberate, it's a 
cross-reference.


The same column and the same table?

Same column different table I could understand but not the same column & 
table ;)


I'm sure there's a reason for it though :)

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] setting up foreign keys

2006-08-10 Thread Chris

Merlin Moncure wrote:

On 8/10/06, Chris <[EMAIL PROTECTED]> wrote:

Sue Fitt wrote:
> Thanks Chris and Chris, you've solved it.
>
> I had a gui open that connects to the database. It was doing nothing
> (and not preventing me adding to or altering headwords_core via psql),
> but having closed it the table is instantly created. Weird.
>
> BTW, referencing the same column twice is deliberate, it's a
> cross-reference.

The same column and the same table?

Same column different table I could understand but not the same column &
table ;)


create table color(color text);

create table person(eye_color text references color(color), hair_color
text references color(color));


lol. Good point :)

*back to the hidey hole!*

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] setting up foreign keys

2006-08-13 Thread Chris

Spiegelberg, Greg wrote:

Sort of on topic, how many foreign keys in a single table is good v.
bad?  I realize it's relative to the tables the FK's reference so here's
an example:

Table A: 300 rows
Table B: 15,000,000 rows
Table C: 100,000 rows
Table E: 38 rows
Table F: 9 rows
Table G: is partitioned on the FK from Table A and has a FK column for
each of the above tables

I'm in the process of normalizing the database and have a schema like
this in mind.  Works wonderfully for SELECT's but haven't gotten the
data import process down just yet so I haven't had a chance to put it
through it's paces.  Depending on the performance of INSERT, UPDATE, and
COPY I may drop the FK constraints since my app could enforce the FK
checks.


As long as both sides of the FK's are indexed I don't think you'll have 
a problem with a particular number of FK's per table.


--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Query tuning

2006-08-22 Thread Chris

Subbiah, Stalin wrote:

Hello All,

This query runs forever and ever. Nature of this table being lots of
inserts/deletes/query, I vacuum it every half hour to keep the holes
reusable and nightly once vacuum analyze to update the optimizer. We've
got index on eventtime only. Running it for current day uses index range
scan and it runs within acceptable time. Below is the explain of the
query. Is the order by sequencenum desc prevents from applying limit
optimization?

explain SELECT *
FROM EVENTLOG 
WHERE EVENTTIME>'07/23/06 16:00:00' 
AND  EVENTTIME<'08/22/06 16:00:00' 
AND  (OBJDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA' 
OR OBJID='tzRh39d0d91luNGT1weIUjLvFIcA' 
OR USERDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA') 
ORDER BY EVENTTIME DESC, SEQUENCENUM DESC LIMIT 500 OFFSET 0;
 
QUERY PLAN
--------------------------------------------------------------------------
 Limit  (cost=15546930.29..15546931.54 rows=500 width=327)
   ->  Sort  (cost=15546930.29..15581924.84 rows=13997819 width=327)
 Sort Key: eventtime, sequencenum
 ->  Seq Scan on eventlog  (cost=0.00..2332700.25 rows=13997819
width=327)
   Filter: ((eventtime > '2006-07-23 16:00:00'::timestamp
without time zone) AND (eventtime < '2006-08-22 16:00:00'::timestamp
without time zone) AND (((objdomainid)::text =
'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((objid)::text =
'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((userdomainid)::text =
'tzRh39d0d91luNGT1weIUjLvFIcA'::text)))
(5 rows)

Thanks,
Stalin
Pg version 8.0.1, suse 64bit.


Firstly you should update to 8.0.8 - because it's in the same stream you 
won't need to do a dump/initdb/reload like a major version change, it 
should be a simple upgrade.


Can you send explain analyze instead of just explain?

It sounds like you're not analyz'ing enough - if you're doing lots of 
updates/deletes/inserts then the statistics postgresql uses to choose 
whether to do an index scan or something else will quickly be outdated 
and so it'll have to go back to a full table scan every time..


Can you set up autovacuum to handle that for you more regularly?
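As a first step, something as simple as this (table name straight from your 
query) is worth doing right before re-running the explain analyze:

-- optionally gather more detailed stats on the timestamp column first
alter table eventlog alter column eventtime set statistics 100;
analyze eventlog;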

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Query tuning

2006-08-22 Thread Chris

Subbiah, Stalin wrote:

Actually these servers will be upgraded to 8.1.4 in couple of months.


even so, you could get some bad data in there.
http://www.postgresql.org/docs/8.0/static/release.html . Go through the 
old release notes and you'll find various race conditions, crashes etc.



Here you go with explain analyze.

# explain analyze SELECT *
FROM EVENTLOG 
WHERE EVENTTIME>'07/23/06 16:00:00' AND  EVENTTIME<'08/22/06 16:00:00' 
AND  (OBJDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA' 
OR OBJID='tzRh39d0d91luNGT1weIUjLvFIcA' 
OR USERDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA') 
ORDER BY EVENTTIME DESC, SEQUENCENUM DESC LIMIT 500 OFFSET 500;
 
QUERY PLAN
--------------------------------------------------------------------------
 Limit  (cost=15583110.14..15583111.39 rows=500 width=327) (actual
time=427771.568..427772.904 rows=500 loops=1)
   ->  Sort  (cost=15583108.89..15618188.88 rows=14031998 width=327)
(actual time=427770.504..427771.894 rows=1000 loops=1)
 Sort Key: eventtime, sequencenum
 ->  Seq Scan on eventlog  (cost=0.00..2334535.17 rows=14031998
width=327) (actual time=10.370..190038.764 rows=7699388 loops=1)
   Filter: ((eventtime > '2006-07-23 16:00:00'::timestamp
without time zone) AND (eventtime < '2006-08-22 16:00:00'::timestamp
without time zone) AND (((objdomainid)::text =
'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((objid)::text =
'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((userdomainid)::text =
'tzRh39d0d91luNGT1weIUjLvFIcA'::text)))
 Total runtime: 437884.134 ms
(6 rows)


If you analyze the table then run this again what plan does it come back 
with?


I can't read explain output properly but I suspect (and I'm sure I'll be 
corrected if need be) that the sort step is way out of whack and so is 
the seq scan because the stats aren't up to date enough.


Do you have an index on objdomainid, objid and userdomainid (one index 
per field) ? I wonder if that will help much.
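i.e. something along these lines, names taken straight from your query:

create index eventlog_objdomainid_idx  on eventlog (objdomainid);
create index eventlog_objid_idx        on eventlog (objid);
create index eventlog_userdomainid_idx on eventlog (userdomainid);

Once you're on 8.1+ the planner can combine those with a bitmap OR for the 
three OR'd conditions instead of seq scanning the whole table.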


--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Forcing index usage without 'enable_hashjoin = FALSE'

2006-08-22 Thread Chris

Dan Langille wrote:

I'm using PostgreSQL 8.1.4 and I'm trying to force the planner to use
an index.  With the index, I get executions times of 0.5 seconds. 
Without, it's closer to 2.5 seconds.


Compare these two sets of results (also provided at 
http://rafb.net/paste/results/ywcOZP66.html

should it appear poorly formatted below):

freshports.org=# \i test2.sql
 
QUERY PLAN
--------------------------------------------------------------------------
 Merge Join  (cost=24030.39..24091.43 rows=3028 width=206) (actual 
time=301.301..355.261 rows=3149 loops=1)

   Merge Cond: ("outer".id = "inner".category_id)
   ->  Sort  (cost=11.17..11.41 rows=97 width=4) (actual 
time=0.954..1.300 rows=95 loops=1)

 Sort Key: c.id
 ->  Seq Scan on categories c  (cost=0.00..7.97 rows=97 
width=4) (actual time=0.092..0.517 rows=97 loops=1)
   ->  Sort  (cost=24019.22..24026.79 rows=3028 width=206) (actual 
time=300.317..314.114 rows=3149 loops=1)

 Sort Key: p.category_id
 ->  Nested Loop  (cost=0.00..23844.14 rows=3028 width=206) 
(actual time=0.082..264.459 rows=3149 loops=1)
   ->  Seq Scan on ports p  (cost=0.00..6141.11 rows=3028 
width=206) (actual time=0.026..133.575 rows=3149 loops=1)

 Filter: (status = 'D'::bpchar)
   ->  Index Scan using element_pkey on element e  
(cost=0.00..5.83 rows=1 width=4) (actual time=0.022..0.026 rows=1 
loops=3149)

 Index Cond: ("outer".element_id = e.id)
 Total runtime: 369.869 ms
(13 rows)

freshports.org=# set enable_hashjoin = true;
SET
freshports.org=# \i test2.sql
   QUERY PLAN
--------------------------------------------------------------------------
 Hash Join  (cost=6156.90..13541.14 rows=3028 width=206) (actual 
time=154.741..2334.366 rows=3149 loops=1)

   Hash Cond: ("outer".category_id = "inner".id)
   ->  Hash Join  (cost=6148.68..13472.36 rows=3028 width=206) 
(actual time=153.801..2288.792 rows=3149 loops=1)

 Hash Cond: ("outer".id = "inner".element_id)
 ->  Seq Scan on element e  (cost=0.00..4766.70 rows=252670 
width=4) (actual time=0.022..1062.626 rows=252670 loops=1)
 ->  Hash  (cost=6141.11..6141.11 rows=3028 width=206) 
(actual time=151.105..151.105 rows=3149 loops=1)
   ->  Seq Scan on ports p  (cost=0.00..6141.11 rows=3028 
width=206) (actual time=0.027..131.072 rows=3149 loops=1)

 Filter: (status = 'D'::bpchar)
   ->  Hash  (cost=7.97..7.97 rows=97 width=4) (actual 
time=0.885..0.885 rows=97 loops=1)
 ->  Seq Scan on categories c  (cost=0.00..7.97 rows=97 
width=4) (actual time=0.076..0.476 rows=97 loops=1)

 Total runtime: 2346.877 ms
(11 rows)

freshports.org=#

Without leaving "enable_hashjoin = false", can you suggest a way to 
force the index usage?


FYI, the query is:

explain analyse
SELECT P.id,
   P.category_id,
   P.version as version,
   P.revisionas revision,
   P.element_id,
   P.maintainer,
   P.short_description,
   to_char(P.date_added - SystemTimeAdjust(), 'DD Mon  
HH24:MI:SS') as date_added,

   P.last_commit_id  as last_change_log_id,
   P.package_exists,
   P.extract_suffix,
   P.homepage,
   P.status,
   P.broken,
   P.forbidden,
   P.ignore,
   P.restricted,
   P.deprecated,
   P.no_cdrom,
   P.expiration_date,
   P.latest_link
  FROM categories C, ports P JOIN element E on P.element_id = E.id
 WHERE P.status  = 'D'
   AND P.category_id = C.id;



I doubt it would make a difference but if you:

...
FROM categories C JOIN ports P on P.category_id=C.id JOIN element E on 
P.element_id = E.id

WHERE P.status  = 'D';

does it change anything?

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] data=writeback

2004-04-08 Thread Chris
> LABEL=/usr/local/pgsql  /usr/local/pgsql  ext3
> noatime,data=writeback  1 2
> LABEL=/usr/local/pgsql  /usr/local/pgsql/wal  ext3
> noatime,data=ordered1 2

The same label mounted on two different mount points is probably a typo?

I'm not sure if data=writeback is ok. I was wondering about the same
thing after reading the "good pc but bad performance,why?" thread.

This is from man mount:

  writeback
  Data ordering is not preserved - data may be written into
  the  main file system after its metadata has been commit-
  ted to the journal.  This is rumoured to be the  highest-
  throughput  option.   It  guarantees internal file system
  integrity, however it can allow old  data  to  appear  in
  files after a crash and journal recovery.

How does this relate to fflush()? Does fflush still guarantee 
all data has been written?

Bye, Chris.



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] data=writeback

2004-04-08 Thread Chris

> > > LABEL=/usr/local/pgsql  /usr/local/pgsql  ext3
> > > noatime,data=writeback  1 2
> > > LABEL=/usr/local/pgsql  /usr/local/pgsql/wal  ext3
> > > noatime,data=ordered1 2
> >
> > The same label mounted on two different mount points is probably a typo?
> 
> 
> No, the same label mounted on two different mount points is not a typo. This
> is the way it is in my /etc/fstab.
> 
> Note that I did not create this file myself, it was created by the RedHat
> Enterprise Linux 3 ES installer. I created different partitions for the data
> directory (/usr/local/pgsql) and the wal directory (/usr/local/pgsql/wal)
> using the installer and this is how the /etc/fstab file ended up.
> 
> Why, is this bad? They use the same label, but use different mount points?
> Can this cause problems?

Mmm... how can the mounter distinguish the two partitions?

Maybe I'm missing a concept here, but I thought labels must uniquely
identify partitions?

Seems suspicious to me...

Does it work? When you give just "mount" at the command line what output
do you get?

Bye, Chris.



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Join Query Perfomance Issue

2008-02-14 Thread Chris


Nested Loop  (cost=0.00..31157.91 rows=3054 width=14) (actual 
time=0.252..149.557 rows=2769 loops=1)
  ->  Index Scan using 
messungen_v_dat_2007_11_12_messpunkt_minute_tag_idx on 
messungen_v_dat_2007_11_12 m  (cost=0.00..5134.28 rows=3054 width=4) 
(actual time=0.085..11.562 rows=2769 loops=1)

Index Cond: ((ganglinientyp = 'M'::bpchar) AND (799 = minute_tag))
  ->  Index Scan using messwerte_mv_nr_idx on messwerte_mv p  
(cost=0.00..8.51 rows=1 width=18) (actual time=0.031..0.035 rows=1 
loops=2769)

Index Cond: (p.nr = m.messpunkt)
Total runtime: 159.703 ms
(6 rows)

Nested Loop is not the best regarding to performance, but there isn't a 
way to avoid it here?


Your own tests have proven it's the right approach for this particular 
query.


Another strange problem occurs when i retry the query after about 12 
hours break without akivity on the database (starting work in the 
morning) :
The query runs incredible slow (~3sec), analyse on the tables doesn't 
change much. But when i switch enable_nestloop to false, retry the query 
(very bad result, > 30sec), then set enable_nestloop back to true, the 
query works amazingly fast again (100ms).


The o/s has cached some of the data so instead of actually hitting the 
disk, it's getting it from the o/s cache.


--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] 7 hrs for a pg_restore?

2008-02-20 Thread Chris



When I said "obfuscating" I meant it. I'm pretty familiar with sh scripting
and I'm not even sure what the && behaviour would do.


It chains commands together so if the first fails the second doesn't happen.

$ echo 1 && echo 2
1
2

$ echo '1234' > /etc/file_that_doesnt_exist && echo 2
-bash: /etc/file_that_doesnt_exist: Permission denied


--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] 7 hrs for a pg_restore?

2008-02-20 Thread Chris

Gregory Stark wrote:

"Chris" <[EMAIL PROTECTED]> writes:


When I said "obfuscating" I meant it. I'm pretty familiar with sh scripting
and I'm not even sure what the && behaviour would do.

It chains commands together so if the first fails the second doesn't happen.


I meant in this case, not in general. That is, does it introduce a subshell?


Ah - my misunderstanding then. No idea about that one.

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] Confirmação de envio / Sending confirmation (captchaid:13266b20536d)

2008-03-05 Thread Chris

petchimuthu lingam wrote:

C5BK4513


Ahh - you are sending this to the wrong address, these are not being 
sent by the postgres mailing list.


Check which address you are replying to next time...

--
Postgresql & php tutorials
http://www.designmagick.com/

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://mail.postgresql.org/mj/mj_wwwusr?domain=postgresql.org&extra=pgsql-performance


Re: [PERFORM] count * performance issue

2008-03-05 Thread Chris

sathiya psql wrote:

count(*) tooks much time...

but with the where clause we can make this to use indexing,... what 
where clause we can use??


Am using postgres 7.4 in Debian OS with 1 GB RAM,

am having a table with nearly 50 lakh records,


Looks suspiciously like a question asked yesterday:

http://archives.postgresql.org/pgsql-performance/2008-03/msg00068.php

--
Postgresql & php tutorials
http://www.designmagick.com/

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://mail.postgresql.org/mj/mj_wwwusr?domain=postgresql.org&extra=pgsql-performance


Re: [PERFORM] migration of 7.4 to 8.1

2008-03-11 Thread Chris

sathiya psql wrote:

This might be a silly question, but ... why 8.1 ?

If you're doing a major upgrade, why not go straight to 8.3? It's been
out long enough that there aren't any obvious nasty bugs, and there have
been a fair few fixes and improvements since prior versions.

Because am using Debian ETCH stable... i need to continue using this... 
In Debian ETCH stable 8.1 is only there


I will be installing my project in other machines where it is having 
DEBIAN ETCH STABLE, i dont want to face any problems... so only...


You can get 8.3 from backports: http://www.backports.org/ - it's a 
debian project to get more up to date versions for existing stable 
releases (they package everything exactly the same way).


--
Postgresql & php tutorials
http://www.designmagick.com/

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] migration of 7.4 to 8.1

2008-03-12 Thread Chris

[EMAIL PROTECTED] wrote:

On Wed, 12 Mar 2008, sathiya psql wrote:


In the home page itself they were saying  testing ... unstable


you are talking about the debian home page right?


then we should not use that for live.

so i prefer 8.1 .


Debian selected the version of Postgres for Etch about a year and a half 
ago. At that point selecting 8.1 was a resonable choice.


Debian has a policy that they will never change the version number of a 
package in a stable release (they will backport many bugfixes, but not 
upgrade the version)


As a result 2 years from now when Postgres is on 8.5 stable (and looking 
at 8.6), Debian Etch will still be on 8.1


I like that with debian I can install multiple postgres versions and it 
handles everything for me :) Changing the default port, config files are 
in different folders, different start up scripts.. all works very nicely :)


Which means you can have 8.1 installed and 8.3 installed at the same 
time - both from packages, no compiling etc necessary - and can switch 
between them very easily.


--
Postgresql & php tutorials
http://www.designmagick.com/

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] question on TRUNCATE vs VACUUM FULL

2008-03-18 Thread Chris




So my question is this:  Shouldn’t VACUUM FULL clean Table C and reclaim 
all its space?


You've got concepts mixed up.

TRUNCATE deletes all of the data from a particular table (and works in 
all dbms's).


http://www.postgresql.org/docs/8.3/interactive/sql-truncate.html



VACUUM FULL is a postgres-specific thing which does work behind the 
scenes to clean up MVCC left-overs. It does not touch any current data 
or records in the table, it's purely behind the scenes work.


http://www.postgresql.org/docs/current/interactive/sql-vacuum.html


The two have completely different uses and nothing to do with each other 
what-so-ever.
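In other words (made-up table name, just to show the two side by side):

truncate table c;   -- removes every row from c and gives the space back immediately

vacuum full c;      -- keeps all the current rows, but compacts away the dead
                    -- row versions left behind by earlier updates and deletes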


--
Postgresql & php tutorials
http://www.designmagick.com/

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] slow pg_connect()

2008-03-24 Thread Chris



* Read about configuring and using persistent database connections
  (http://www.php.net/manual/en/function.pg-pconnect.php) with PHP


Though make sure you understand the ramifications of using persistent 
connections. You can quickly exhaust your connections by using this and 
also cause other issues for your server.


If you do this you'll probably have to adjust postgres to allow more 
connections, which usually means lowering the amount of shared memory 
each connection can use which can also cause performance issues.


I'd probably use pgpool-II and have it handle the connection stuff for 
you rather than doing it through php.


--
Postgresql & php tutorials
http://www.designmagick.com/

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] bulk insert performance problem

2008-04-07 Thread Chris

Craig Ringer wrote:

Christian Bourque wrote:

Hi,

I have a performance problem with a script that does massive bulk
insert in 6 tables. When the script starts the performance is really
good but will degrade minute after minute and take almost a day to
finish!
  
Would I be correct in guessing that there are foreign key relationships 
between those tables, and that there are significant numbers of indexes 
in use?


The foreign key checking costs will go up as the tables grow, and AFAIK 
the indexes get a bit more expensive to maintain too.


If possible you should probably drop your foreign key relationships and 
drop your indexes, insert your data, then re-create the indexes and 
foreign keys. The foreign keys will be rechecked when you recreate them, 
and it's *vastly* faster to do it that way. Similarly, building an index 
from scratch is quite a bit faster than progressively adding to it. Of 
course, dropping the indices is only useful if you aren't querying the 
tables as you build them.


If you are, add "analyze" commands through the import, eg every 10,000 
rows. Then your checks should be a bit faster.


The other suggestion would be to do block commits:

begin;
do stuff for 5000 rows;
commit;

repeat until finished.
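A rough sketch of the whole pattern, with made-up table/column names:

-- before the load: drop the expensive bits
alter table orders drop constraint orders_customer_fk;
drop index orders_created_idx;

-- load in chunks rather than one row per transaction (or one giant transaction)
begin;
insert into orders (id, customer_id, created) values (1, 42, now());
insert into orders (id, customer_id, created) values (2, 43, now());
-- ... a few thousand rows per begin/commit block ...
commit;

-- after the load: rebuild everything and refresh the stats
create index orders_created_idx on orders (created);
alter table orders add constraint orders_customer_fk
      foreign key (customer_id) references customers (id);
analyze orders;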

--
Postgresql & php tutorials
http://www.designmagick.com/

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Creating large database of MD5 hash values

2008-04-11 Thread Chris




1. Which datatype should I use to represent the hash value? UUIDs are
also 16 bytes...


md5's are always 32 characters long so probably varchar(32).
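e.g. something like this, just to show the sizing (made-up table):

create table file_hashes (
    file_id  int8 primary key,
    md5      varchar(32) not null   -- hex-encoded md5 is always 32 chars
);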


2. Does it make sense to denormalize the hash set relationships?


The general rule is normalize as much as possible then only denormalize 
when absolutely necessary.



3. Should I index?


What sort of queries are you going to be running?


4. What other data structure options would it make sense for me to choose?


What sort of other data will you be needing to store?

--
Postgresql & php tutorials
http://www.designmagick.com/

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Trigger is not firing immediately

2008-07-13 Thread Chris
Praveen wrote:
>  
> Hi All,
> I am having a trigger in table, If I update the the table manually
> trigger is firing immediately(say 200ms per row), But if I update the
> table through procedure the trigger is taking time to fire(say 7 to 10
> seconds per row).
>  
> Please tell me what kind of changes can I make so that  trigger  fire
> immediately while updating the table through procedure ?

Sending the same email over and over again isn't going to get you a
response any quicker.

If you send the details of the trigger and the tables/fields it affects
then you might get a more helpful response.

-- 
Postgresql & php tutorials
http://www.designmagick.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] limit clause produces wrong query plan

2008-11-24 Thread Chris

Andrus wrote:

Scott,


And how exactly should it be optimized?  If a query is even moderately
interesting, with a few joins and a where clause, postgresql HAS to
create the rows that come before your offset in order to assure that
it's giving you the right rows.


SELECT ... FROM bigtable ORDER BY intprimarykey OFFSET 100 LIMIT 100

It should scan primary key in index order for 200 first keys and 
skipping first 100 keys.


... which if you have a lot of table joins, unions/intersects/whatever 
else, should be done on which field and how?


For a query like:

select * from t1 join t2 using (id) where t1.id='x' order by t1.id limit 100;

it has to join the tables first (may involve a seq scan) to make sure 
the id's match up, reduce the number of rows to match the where clause 
(may/may not be done first, I don't know) - the limit is applied last.


it can't grab the first 100 entries from t1 - because they might not 
have a matching id in t2, let alone match the where clause.


--
Postgresql & php tutorials
http://www.designmagick.com/


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] left join + case - how is it processed?

2009-01-18 Thread Chris

Hi all,

I have a view that looks like this:

 SELECT
CASE
WHEN r.assetid IS NULL THEN p.assetid
ELSE r.assetid
END AS assetid,
CASE
WHEN r.userid IS NULL THEN p.userid
ELSE r.userid
END AS userid, p.permission, p."granted", p.cascades
   FROM sq_ast_perm p
   LEFT JOIN sq_vw_ast_role r ON r.roleid::text = p.userid::text AND 
r.assetid::text = p.assetid::text;



It was pointed out to me that the first CASE is useless (since r.assetid 
will always be the same as p.assetid because of the left join condition) 
so I'm looking at that to see if it'll make much of a difference and it 
does.


I won't post the whole lot but the first line is the most interesting.

# explain analyze SELECT * from sq_vw_ast_perm where assetid='30748';


 Merge Left Join  (cost=9529.34..13823.76 rows=75721 width=102) (actual 
time=284.371..341.536 rows=1 loops=1)


(The row count is right - it's the total # of rows from sq_ast_perm).


When I change the view to be:

 SELECT p.assetid,
CASE
WHEN r.userid IS NULL THEN p.userid
ELSE r.userid
END AS userid, p.permission, p."granted", p.cascades
   FROM sq_ast_perm p
   LEFT JOIN sq_vw_ast_role r ON r.roleid::text = p.userid::text AND 
r.assetid::text = p.assetid::text;



The Merge left join only returns 3 rows:

# explain analyze SELECT * from sq_vw_ast_perm where assetid='30748';

   Merge Left Join  (cost=9507.18..9508.23 rows=3 width=70) 
(actual time=11.544..11.549 rows=1 loops=1)


I thought the where condition would cut down on the rows returned, then 
the case statement would take effect to do the null check. It seems to 
be doing it in reverse ??


Recently analyzed, only just imported so free of bloat. Running 8.1.11.

Thanks!
--
Postgresql & php tutorials
http://www.designmagick.com/


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] left join + case - how is it processed?

2009-01-18 Thread Chris



The reason why the CASE is affecting your query planning is because
you are using a query that compares assetid to a constant:

SELECT * from sq_vw_ast_perm where assetid='30748';

When PostgreSQL evaluates this statement, assetid gets expanded either
into a case statement (with your first view definition) or into
sq_ast_perm.assetid (with your second view definition).  The latter
definition allows PostgreSQL to make use of the column statistics
(which are pretty accurate) whereas the former is probably leading to
a SWAG, because PostgreSQL isn't very good at estimating the
selectivity of CASE.  The bad selectivity estimate, in turn, is
leading to a poor plan choice...


If I take it out of the view, it's fine:

#  SELECT
# CASE
# WHEN r.assetid IS NULL THEN p.assetid
# ELSE r.assetid
# END AS assetid,
# CASE
# WHEN r.userid IS NULL THEN p.userid
# ELSE r.userid
# END AS userid, p.permission, p."granted", p.cascades
#FROM sq_ast_perm p
#LEFT JOIN sq_vw_ast_role r ON r.roleid::text = p.userid::text AND 
r.assetid::text = p.assetid::text

# where p.assetid='30748';

  QUERY PLAN
--------------------------------------------------------------------------
 Merge Left Join  (cost=9459.89..9463.13 rows=3 width=102) (actual 
time=0.096..0.098 rows=1 loops=1)


In this case I assume the planner is doing the 'WHERE' first to cut down 
the rows, then applying the CASE at the end.


The view it seems to be the opposite - I still don't understand why 
that's the case.



Though I do get the same behaviour as the view when I do it as a subselect.

--
Postgresql & php tutorials
http://www.designmagick.com/


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] left join + case - how is it processed?

2009-01-18 Thread Chris





I thought the where condition would cut down on the rows returned, then the
case statement would take effect to do the null check. It seems to be doing
it in reverse ??

# explain analyze SELECT * from sq_vw_ast_perm where assetid='30748';

It aperas to me that both of your statements have where clauses, but I
believe where isn't that explicit.  I'm not sure the nature of your
problem permits the query optimizer to eliminate rows at all, even
with the where statement. "assetid" is probably not known when the
query optimizer hits, because it is computed based on the nullness of
the columns.  I'd assume that the optimizer *could* more easily
optimize this if you had used coalesce rather than an ad-hoc method
with CASE. My guess is you can exclude rows with WHERE if the the
column used is an run-time computation involving an ad-hoc CASE.


No difference.

Full explain plan here:

http://explain-analyze.info/query_plans/2725-query-plan-1447

I can see it's doing the extra filter step at the start (4th line) which 
is not present without the coalesce/case statement. I just don't 
understand why it's being done at that stage.


--
Postgresql & php tutorials
http://www.designmagick.com/


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] left join + case - how is it processed?

2009-01-19 Thread Chris

Tom Lane wrote:

Chris  writes:
I can see it's doing the extra filter step at the start (4th line) which 
is not present without the coalesce/case statement. I just don't 
understand why it's being done at that stage.


It's not that hard to understand.  With the original view formulation
(or the COALESCE version), the fully expanded form of the query looks
like

select ... from p left join r ...
  where expression_involving_both_p_and_r = constant

If you make the view output be just p.assetid then you have

select ... from p left join r ...
  where p.assetid = constant

In the first case the planner cannot apply the WHERE restriction until
it's formed the p+r join; so you see the condition applied as a filter
on the join node's output.  In the second case, the planner can push the
WHERE restriction down into the scan of p, since the left join doesn't
affect it.  (If a p row doesn't pass the restriction, then no join row
formed from it can either; ergo there is no need to form those join rows
at all.)


So because the CASE is on (some of) the fields I'm joining on, in effect 
it's made part of the join condition. If the fields are outside that 
(r.userid/p.userid), then it's evaluated after.


Thanks!

--
Postgresql & php tutorials
http://www.designmagick.com/


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Any better plan for this query?..

2009-05-06 Thread Chris

Dimitri wrote:

Hi Craig,

yes, you detailed very well the problem! :-)
all those CHAR columns are so just due historical issues :-) as well
they may contains anything else and not only numbers, that's why..
Also, all data inside are fixed, so VARCHAR will not save place, or
what kind of performance issue may we expect with CHAR vs VARCHAR if
all data have a fixed length?..


None in postgres, but the char/varchar thing may or may not bite you at 
some point later - sounds like you have it covered though.



It's 2 times faster on InnoDB, and as it's just a SELECT query no need
to go in transaction details :-)


 Total runtime: 1.442 ms
(10 rows)

You posted a query that's taking 2/1000's of a second. I don't really 
see a performance problem here :)


--
Postgresql & php tutorials
http://www.designmagick.com/


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Can Postgres use an INDEX over an OR?

2009-07-20 Thread Chris

Віталій Тимчишин wrote:



2009/7/20 Robert James >



Hi. I notice that when I do a WHERE x, Postgres uses an index, and
when I do WHERE y, it does so as well, but when I do WHERE x OR y,
it doesn't. Why is this so? 



It's not clever enough.


Of course it is.

I'm running 8.3.7.

create table t1(id int primary key);
insert into t1(id) select a from generate_series(1, 50) as s(a);
analyze t1;

explain analyze select * from t1 where id=5000 or id=25937;
  QUERY PLAN
--------------------------------------------------------------------------
 Bitmap Heap Scan on t1  (cost=8.60..16.44 rows=2 width=4) (actual 
time=0.077..0.083 rows=2 loops=1)

   Recheck Cond: ((id = 5000) OR (id = 25937))
   ->  BitmapOr  (cost=8.60..8.60 rows=2 width=0) (actual 
time=0.063..0.063 rows=0 loops=1)
 ->  Bitmap Index Scan on t1_pkey  (cost=0.00..4.30 rows=1 
width=0) (actual time=0.034..0.034 rows=1 loops=1)

   Index Cond: (id = 5000)
 ->  Bitmap Index Scan on t1_pkey  (cost=0.00..4.30 rows=1 
width=0) (actual time=0.021..0.021 rows=1 loops=1)

   Index Cond: (id = 25937)
 Total runtime: 0.153 ms
(8 rows)

What Robert didn't post was his query, see

http://archives.postgresql.org/pgsql-general/2009-07/msg00767.php

which makes it a lot harder to 'optimize' since they aren't straight 
forward conditions.


--
Postgresql & php tutorials
http://www.designmagick.com/


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Will Postgres ever lock with read only queries?

2009-07-27 Thread Chris

Robert James wrote:
Hi.  I'm seeing some weird behavior in Postgres.  I'm running read only 
queries (SELECT that is - no UPDATE or DELETE or INSERT is happening at 
all).  I can run one rather complicated query and the results come 
back... eventually.  Likewise with another.  But, when I run both 
queries at the same time, Postgres seems to ground to a halt.  Neither 
one completes.  In fact, pgAdmin locks up - I need to cancel them using 
psql.
I'd expect this from MySQL but not Postgres.  Am I doing something 
wrong? Or missing something?


They're probably not blocking each other but more likely you're 
exhausting your servers resources. If they return "eventually" 
individually, then running both at the same time will take at least 
"eventually x2".


As Mark said, what are the queries? What postgres version? What o/s? 
What are your hardware specs (how much memory, disk speeds/types etc)?


--
Postgresql & php tutorials
http://www.designmagick.com/


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Will Postgres ever lock with read only queries?

2009-07-28 Thread Chris

Robert James wrote:
Thanks for the replies.  I'm running Postgres 8.2 on Windows XP, Intel 
Core Duo (though Postgres seems to use only one 1 core).


A single query can only use one core, but it will use both if multiple 
queries come in.



The queries are self joins on very large tables, with lots of nested loops.


If you want help optimizing them, you'll need to send through
- explain analyze
- table definitions
and of course
- the query itself

--
Postgresql & php tutorials
http://www.designmagick.com/

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] load / stress testing

2009-07-30 Thread Chris

Hi,

Everyone says "load test using your app" - out of interest how does 
everyone do that at the database level?


I've tried playr (https://area51.myyearbook.com/trac.cgi/wiki/Playr) but 
haven't been able to get it working properly. I'm not sure what other 
tools are available.


TIA.
--
Postgresql & php tutorials
http://www.designmagick.com/


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Greenplum MapReduce

2009-08-02 Thread Chris

Suvankar Roy wrote:


Hi all,

Has anybody worked on Greenplum MapReduce programming ?


It's a commercial product, you need to contact greenplum.

--
Postgresql & php tutorials
http://www.designmagick.com/


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] improving my query plan

2009-08-20 Thread Chris

Kevin Kempter wrote:

Hi all;


I have a simple query against two very large tables ( > 800million rows 
in theurl_hits_category_jt table and 9.2 million in the url_hits_klk1 
table )



I have indexes on the join columns and I've run an explain.
also I've set the default statistics to 250 for both join columns. I get 
a very high overall query cost:


If you had an extra where condition it might be different, but you're 
just returning results from both tables that match up so doing a 
sequential scan is going to be the fastest way anyway.


--
Postgresql & php tutorials
http://www.designmagick.com/


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimizer + bind variables

2009-11-03 Thread Chris

David Kerr wrote:

On Wed, Nov 04, 2009 at 07:43:16AM +0800, Craig Ringer wrote:
- David Kerr wrote:
- > Does/is it possible for the PG optimizer come up with differnet plans when 
- > you're using bind variables vs when you send static values?
- 
- Yes, if the bind variable form causes your DB access driver to use a

- server-side prepared statement. Pg can't use its statistics to improve
- its query planning if it doesn't have a value for a parameter when it's
- building the query plan.

hmm, that's a little unclear to me.

let's assume that the application is using prepare:

Assuming the database hasn't changed, would:
PREPARE bla1 as SELECT * from users where username = '$1';
explain execute bla1

give the same output as
explain select * from users where username = 'dave';

?


No.

This is explained in the notes here:

http://www.postgresql.org/docs/current/static/sql-prepare.html
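In short: with a parameter the plan gets built without knowing the value. 
You can compare the two yourself with something like this (assuming a users 
table with an indexed username column; note $1 is a parameter, not a quoted 
literal):

prepare bla1 (text) as select * from users where username = $1;

explain execute bla1('dave');                         -- plan built without the value
explain select * from users where username = 'dave'; -- plan built with the literal

deallocate bla1;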

--
Postgresql & php tutorials
http://www.designmagick.com/


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] How to troubleshoot high mem usage by postgres?

2010-02-27 Thread Chris
Hi, I'm having an issue where a postgres process is taking too much
memory when performing many consecutive inserts and updates from a PHP
script (running on the command line). I would like to know what sort
of logging I can turn on to help me determine what is causing memory
to be consumed and not released.

Most PHP scripts are not long-running and properly releasing the
resources using the provided functions in the pgsql PHP extension is
not necessary. However since I do have a long-running script, I have
taken steps to ensure everything is being properly released when it is
no longer needed (I am calling the functions provided, but I don't
know if the pgsql extension is doing the right thing). In spite of
this, the longer the script runs and processes records, the more
memory increases. It increases to the point that system memory is
exhausted and it starts swapping. I killed the process at this point.

I monitored the memory with top. here are the results.. the first is
10 seconds after my script started running. The second is about 26
seconds.

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  DATA COMMAND
17461 postgres  16   0  572m 405m  14m S 20.0 10.7   0:10.65 422m postmaster
17460 root  15   0  136m  14m 4632 S 10.6  0.4   0:06.16  10m php
17462 postgres  15   0  193m  46m 3936 D  3.3  1.2   0:01.77  43m postmaster


  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  DATA COMMAND
17461 postgres  16   0 1196m 980m  17m S 19.0 26.0   0:25.72 1.0g postmaster
17460 root  15   0  136m  14m 4632 R 10.3  0.4   0:14.31  10m php
17462 postgres  16   0  255m 107m 3984 R  3.0  2.9   0:04.19 105m postmaster


If I am indeed doing everything I can to release the resources (and
I'm 95% sure I am) then it looks like the pgsql extension is at fault
here.
Regardless of who/what is at fault, I need to fix it. And to do that I
need to find out what isn't getting released properly. How would I go
about that?

Thanks,
Chris

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] How to troubleshoot high mem usage by postgres?

2010-02-27 Thread Chris
On Sat, Feb 27, 2010 at 3:38 PM, Ben Chobot  wrote:
> In your postgresql.conf file, what are the settings for work_mem and
> shared_buffers?

I have not done any tuning on this db yet (it is a dev box). It is
using defaults.
shared_buffers = 32MB
#work_mem = 1MB


I do appreciate the several quick responses and I will work on
responding to the them.

@Craig Ringer:
select version() reports:
PostgreSQL 8.4.2 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC)
4.1.2 20071124 (Red Hat 4.1.2-42), 64-bit
The system has 4GB of RAM.
The postgres log currently does not show any useful information. Only
thing in there for today is an "Unexpected EOF on client connection"
because I killed the process after it started swapping.

The test input for my PHP script is a csv file with about 450,000
records in it. The php script processes the each csv record in a
transaction, and on average it executes 2 insert or update statements
per record. I don't think the specific statements executed are
relevant (they are just basic INSERT and UPDATE statements).

I will try to come up with a short script that reproduces the problem.

@Tom Lane:
As I mentioned above I am not doing everything in a single
transaction. However I do want to try your suggestion regarding
getting a "memory context map". But I'm afraid I don't know how to do
what you are describing. How can I set the ulimit of postmaster? And
does the postmaster stderr output go to the postgres log file? If not,
where can I find it?

Thanks again,
Chris

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] How to troubleshoot high mem usage by postgres?

2010-02-27 Thread Chris
On Sat, Feb 27, 2010 at 3:38 PM, Craig Ringer
 wrote:
> Can you boil this down to a simple PHP test-case that connects to a dummy
> database and repeats something that causes the backend to grow in memory
> usage? Trying to do this - by progressively cutting things out of your test
> until it stops growing - will help you track down what, exactly, is causing
> the growth.

Thank you for your suggestion. I have done this, and in doing so I
have also discovered why this problem is occurring.

My application uses a class that abstracts away the db interaction, so
I do not normally use the pg_* functions directly. Any time any
statement was executed, it created a new "named" prepared statement. I
wrongly assumed that calling pg_free_result() on the statement
resource would free this prepared statement inside of postgres.

I will simply modify the class to use an empty statement name if there
is no need for it to be named (which I actually need very infrequently
anyway).
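For anyone else who hits this, here's roughly what was happening at the SQL
level - every named prepared statement stays allocated in the backend until
it's deallocated or the connection closes (made-up statement/table names,
sketch only):

prepare stmt_00001 (int) as update items set price = price + 1 where id = $1;
execute stmt_00001(1);
prepare stmt_00002 (int) as update items set price = price + 1 where id = $1;
execute stmt_00002(2);
-- ... hundreds of thousands of these pile up in the backend ...

deallocate stmt_00001;   -- they only go away with an explicit deallocate,
                         -- DEALLOCATE ALL (8.3+), or closing the connection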

I have attached the script I created to test with, for those who are
interested. The first line of the script has the connection string. I
used a db called testdb. run from the command line with:
php -f test3.php

Note my comment in the php file
<<<<<< UNCOMMENT THIS LINE AND MEMORY ISSUE IS FIXED

Thanks for the help everyone.
Chris
<>

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] [HACKERS] full text search index scan query plan changed in 8.4.2?

2010-02-28 Thread Chris

Josh Berkus wrote:

Xufei,

List changed to psql-performance, which is where this discussion belongs.


I am testing the index used by full text search recently.

I have install 8.3.9 and 8.4.2 separately. 


In 8.3.9, the query plan is like:

postgres=# explain SELECT s.name as source , t.name as target FROM element as s, element as t WHERE to_tsvector('testcfg',s.name)  @@ to_tsquery('testcfg',replace(t.name,':','|'));
 QUERY PLAN
Nested Loop  (cost=0.01..259.92 rows=491 width=18)   
 ->  Seq Scan on element t  (cost=0.00..13.01 rows=701 width=9)   
 ->  Index Scan using element_ftsidx_test on element s  (cost=0.01..0.33 rows=1 width=9) 
 Index Cond: (to_tsvector('testcfg'::regconfig, (s.name)::text) @@ to_tsquery('testcfg'::regconfig, replace((t.name)::text, ':'::text, '|'::text)))

(4 rows)

I have index: "element_ftsidx_test" gin (to_tsvector('testcfg'::regconfig, 
name::text))

The same index and query in 8.4.2: 

postgres=# explain SELECT s.name as source , t.name as target FROM element as s, element as t WHERE to_tsvector('testcfg',s.name)  @@ to_tsquery('testcfg',replace(t.name,':','|'));
 QUERY PLAN
Nested Loop  (cost=0.32..3123.51 rows=2457 width=18)
  ->  Seq Scan on element t  (cost=0.00..13.01 rows=701 width=9)
  ->  Bitmap Heap Scan on element s  (cost=0.32..4.36 rows=4 width=9)
        Recheck Cond: (to_tsvector('testcfg'::regconfig, (s.name)::text) @@ to_tsquery('testcfg'::regconfig, replace((t.name)::text, ':'::text, '|'::text)))
        ->  Bitmap Index Scan on element_ftsidx_test  (cost=0.00..0.32 rows=4 width=0)

 Index Cond: (to_tsvector('testcfg'::regconfig, (s.name)::text) @@ 
to_tsquery('testcfg'::regconfig, replace((t.name)::text, ':'::text, '|'::text)))
(6 rows)

Why the query plans are different and why? Thanks!


Because the row estimates changed, since 8.4 improved row estimation for
TSearch.  The 2nd query is probably actually faster, no?  If not, you
may need to increase your stats collection.  Or at least show us a
VACUUM ANALYZE.


I'm sure you mean explain analyze :)

--
Postgresql & php tutorials
http://www.designmagick.com/


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Hardware advice for scalable warehouse db

2011-07-14 Thread chris
Hi list,

My employer will be donated a NetApp FAS 3040 SAN [1] and we want to run
our warehouse DB on it. The pg9.0 DB currently comprises ~1.5TB of
tables, 200GB of indexes, and grows ~5%/month. The DB is not update
critical, but undergoes larger read and insert operations frequently.

My employer is a university with little funds and we have to find a
cheap way to scale for the next 3 years, so the SAN seems a good chance
to us. We are now looking for the remaining server parts to maximize DB
performance with costs <= $4000. I digged out the following
configuration with the discount we receive from Dell:

  1 x Intel Xeon X5670, 6C, 2.93GHz, 12M Cache
  16 GB (4x4GB) Low Volt DDR3 1066Mhz
  PERC H700 SAS RAID controller
  4 x 300 GB 10k SAS 6Gbps 2.5" in RAID 10

I was thinking to put the WAL and the indexes on the local disks, and
the rest on the SAN. If funds allow, we might downgrade the disks to
SATA and add a 50 GB SATA SSD for the WAL (SAS/SATA mixup not possible).

Any comments on the configuration? Any experiences with iSCSI vs. Fibre
Channel for SANs and PostgreSQL? If the SAN setup sucks, do you see a
cheap alternative how to connect as many as 16 x 2TB disks as DAS?

Thanks so much!

Best,
Chris

[1]: http://www.b2net.co.uk/netapp/fas3000.pdf


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] performance hit for replication

2005-04-12 Thread Chris Browne
[EMAIL PROTECTED] ("Joshua D. Drake") writes:
>>So, my question is this: My server currently works great,
>>performance wise.  I need to add fail-over capability, but I'm
>>afraid that introducing a stressful task such as replication will
>>hurt my server's performance. Is there any foundation to my fears? I
>>don't need to replicate the archived log data because I can easily
>>restore that in a separate step from the nightly backup if disaster
>>occurs. Also, my database load is largely selects. My application
>>works great with PostgreSQL 7.3 and 7.4, but I'm currently using
>>7.3.
>>
>>I'm eager to hear your thoughts and experiences,
>>
> Well with replicator you are going to take a pretty big hit
> initially during the full sync but then you could use batch
> replication and only replicate every 2-3 hours.
>
> I am pretty sure Slony has similar capabilities.

Yes, similar capabilities, similar "pretty big hit."

There's a downside to "batch replication" that some of the data
structures grow in size if you have appreciable periods between
batches.
-- 
(format nil "[EMAIL PROTECTED]" "cbbrowne" "acm.org")
http://www.ntlug.org/~cbbrowne/slony.html
Rules of the Evil Overlord #78.  "I will not tell my Legions of Terror
"And he must  be taken alive!" The command will be:  ``And try to take
him alive if it is reasonably practical.''"


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Chris Browne
josh@agliodbs.com (Josh Berkus) writes:
> Bill,
>
>> What about if an out-of-the-ordinary number of rows
>> were deleted (say 75% of rows in the table, as opposed
>> to normal 5%) followed by a 'VACUUM ANALYZE'?  Could
>> things get out of whack because of that situation?
>
> Yes.  You'd want to run REINDEX after and event like that.  As you should now.

Based on Tom's recent comments, I'd be inclined to handle this via
doing a CLUSTER, which has the "triple heroism effect" of:

 a) Reorganizing the entire table to conform with the relevant index order,
 b) Having the effect of VACUUM FULL, and
 c) Having the effect of REINDEX

all in one command.

It has all of the "oops, that blocked me for 20 minutes" effect of
REINDEX and VACUUM FULL, but at least it doesn't have the effect
twice...
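
For illustration, a minimal sketch using the CLUSTER syntax of that era
(the table and index names are placeholders only):

  -- rewrite big_table in the order of its primary-key index;
  -- this also reclaims dead space and rebuilds all of its indexes
  CLUSTER big_table_pkey ON big_table;
  ANALYZE big_table;

The same caveat applies: the table is exclusively locked for the duration
of the rewrite.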
-- 
(format nil "[EMAIL PROTECTED]" "cbbrowne" "acm.org")
http://www.ntlug.org/~cbbrowne/sap.html
Rules of the Evil Overlord #78.  "I will not tell my Legions of Terror
"And he must  be taken alive!" The command will be:  ``And try to take
him alive if it is reasonably practical.''"


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] How can an index be larger than a table

2005-04-21 Thread Chris Browne
josh@agliodbs.com (Josh Berkus) writes:
> David,
>
>> What also seems weird to me is that the control table has some unique
>> indexes created on it, but the data_upate_events table just has a unique
>> constraint.  Will postgres use an index in the background to enforce
>> this constraint?
>
> If you somehow have a unique constraint without a unique index, something is 
> seriously broken.   I suspect hacking of system tables.
>
> Otherwise, it sounds like you have index bloat due to mass deletions.  Run 
> REINDEX, or, preferably, VACUUM FULL and then REINDEX.

There is in a sense no "best order" for this; VACUUM FULL will wind up
further injuring the indices when it reorganizes the table, which
means that whether you do it first or last, you'll have "index injury."

This actually seems a plausible case for CLUSTER.  (And as Douglas &
Douglas says, "You can become a hero by using CLUSTER."  :-))
-- 
(format nil "[EMAIL PROTECTED]" "cbbrowne" "acm.org")
http://www.ntlug.org/~cbbrowne/sap.html
Rules of the Evil Overlord #78.  "I will not tell my Legions of Terror
"And he must  be taken alive!" The command will be:  ``And try to take
him alive if it is reasonably practical.''"


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] batch inserts are "slow"

2005-05-02 Thread Chris Browne
[EMAIL PROTECTED] (Christopher Petrilli) writes:
> On 5/2/05, Tim Terlegård <[EMAIL PROTECTED]> wrote:
>> Howdy!
>> 
>> I'm converting an application to be using postgresql instead of
>> oracle.  There seems to be only one issue left, batch inserts in
>> postgresql seem significant slower than in oracle. I have about 200
>> batch jobs, each consisting of about 14 000 inserts. Each job takes
>> 1.3 seconds in postgresql and 0.25 seconds in oracle. With 200 jobs
>> this means several more minutes to complete the task. By fixing
>> this I think the application using postgresql over all would be
>> faster than when using oracle.
>
> Just as on Oracle you would use SQL*Loader for this application, you
> should use the COPY syntax for PostgreSQL.  You will find it a lot
> faster.  I have used it by building the input files and executing
> 'psql' with a COPY command, and also by using it with a subprocess,
> both are quite effective.

I'd suggest taking a peek at the PGForge project, pgloader.

This is intended to provide somewhat analogous functionality to
SQL*Loader; a particularly useful thing about it is that it will load
those records that it can, and generate a file consisting of just the
failures.

It uses COPY, internally, so it does run reasonably fast.
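
For reference, a minimal sketch of the plain COPY route (table, column and
file names here are made up for illustration):

  -- server-side load; the file must be readable by the server process
  COPY orders (id, customer_id, amount) FROM '/tmp/orders.dat';

  -- or client-side via psql, so the file only needs to exist on the client
  \copy orders (id, customer_id, amount) from 'orders.dat'

Each COPY runs as a single statement, so there is no per-row commit overhead.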

To the extent to which it is inadequate, it would be neat to see some
enhancements...
-- 
(format nil "[EMAIL PROTECTED]" "cbbrowne" "acm.org")
http://www.ntlug.org/~cbbrowne/sap.html
Rules of the Evil Overlord #78.  "I will not tell my Legions of Terror
"And he must  be taken alive!" The command will be:  ``And try to take
him alive if it is reasonably practical.''"


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[PERFORM] Kernel Resources and max_connections

2005-05-03 Thread Chris Hebrard
Hi,
I have postgres 8.0.2 installed on FreeBSD 4.11-RELEASE with 2GB 
of RAM.

When trying to set max_connections=256 I get the following error message:
FATAL:  could not create semaphores: No space left on device
DETAIL:  Failed system call was semget(5432017, 17, 03600).
HINT:  This error does *not* mean that you have run out of disk space.
   It occurs when either the system limit for the maximum number of 
semaphore sets (SEMMNI), or the system wide maximum number of semaphores 
(SEMMNS), would be exceeded.  You need to raise the respective kernel 
parameter.  Alternatively, reduce PostgreSQL's consumption of semaphores 
by reducing its max_connections parameter (currently 256).
   The PostgreSQL documentation contains more information about 
configuring your system for PostgreSQL.
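
As a rough worked example, using the formula from the kernel-resources
documentation (PostgreSQL allocates one set of 17 semaphores per 16 allowed
connections):

   max_connections = 256  ->  ceil(256 / 16) = 16 semaphore sets
   16 sets * 17 semaphores per set = 272 semaphores

so SEMMNI has to cover those 16 sets and SEMMNS those 272 semaphores, in
addition to whatever other software on the machine already consumes.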

I have read through the kernel resources documentation for postgres 8 
and set values accordingly. 
Some settings are not staying after a reboot, even if I place them in 
/boot/loader.conf.

So far I'm able to get max_connections to 250.
Here is  a dump of  kern.ipc values:
kern.ipc.maxsockbuf: 262144
kern.ipc.sockbuf_waste_factor: 8
kern.ipc.somaxconn: 128
kern.ipc.max_linkhdr: 16
kern.ipc.max_protohdr: 60
kern.ipc.max_hdr: 76
kern.ipc.max_datalen: 136
kern.ipc.nmbclusters: 65536
kern.ipc.msgmax: 16384
kern.ipc.msgmni: 40
kern.ipc.msgmnb: 2048
kern.ipc.msgtql: 40
kern.ipc.msgssz: 8
kern.ipc.msgseg: 2048
kern.ipc.semmap: 30
kern.ipc.semmni: 256
kern.ipc.semmns: 272
kern.ipc.semmnu: 30
kern.ipc.semmsl: 60
kern.ipc.semopm: 100
kern.ipc.semume: 10
kern.ipc.semusz: 92
kern.ipc.semvmx: 32767
kern.ipc.semaem: 16384
kern.ipc.shmmax: 33554432
kern.ipc.shmmin: 1
kern.ipc.shmmni: 192
kern.ipc.shmseg: 128
kern.ipc.shmall: 8192
kern.ipc.shm_use_phys: 0
kern.ipc.shm_allow_removed: 0
kern.ipc.mbuf_wait: 32
kern.ipc.mbtypes: 38 551 3 0 0 0 0 0 0 0 0 0 0 0 0 0
kern.ipc.nmbufs: 262144
kern.ipc.nsfbufs: 8704
kern.ipc.nsfbufspeak: 7
kern.ipc.nsfbufsused: 0
kern.ipc.m_clreflimithits: 0
kern.ipc.mcl_pool_max: 0
kern.ipc.mcl_pool_now: 0
kern.ipc.maxsockets: 65536
And boot/loader.conf:
userconfig_script_load="YES"
kern.ipc.nmbclusters="65536"
kern.maxfiles="65536"
kern.maxfilesperproc="65536"
net.inet.tcp.mssdflt="1460"
kern.somaxconn="4096"
kern.ipc.semmns="272"
kern.ipc.semmni="256"
kern.ipc.shmmax="66099200"
kern.ipc.shmmax and kern.ipc.shmmin will not stay to what I set them to.
What am I doing wrong or not doing at all?
Your help is greatly appreciated.
Regards,
Chris.



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Kernel Resources and max_connections

2005-05-03 Thread Chris Hebrard
Mark Kirkwood wrote:
Chris Hebrard wrote:
kern.ipc.shmmax and kern.ipc.shmmin will not stay to what I set them to.
What am I doing wrong or not doing at all?
These need to go in /etc/sysctl.conf. You might need to set shmall as 
well.

(This not-very-clear distinction between what is sysctl'able and what 
is a kernel tunable is a bit of a downer).

cheers
Mark

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Thanks for your reply,
I set the values in etc/sysctl.conf:
# $FreeBSD: src/etc/sysctl.conf,v 1.1.2.3 2002/04/15 00:44:13 dougb Exp $
#
#  This file is read when going to multi-user and its contents piped thru
#  ``sysctl'' to adjust kernel values.  ``man 5 sysctl.conf'' for details.
#
# Added by IMP 2005-05-04
net.inet.tcp.rfc1323=1
kern.ipc.somaxconn=1024
kern.ipc.maxsockbuf=8388608
net.inet.tcp.sendspace=3217968
net.inet.tcp.recvspace=3217968
kern.ipc.semmns="272"
kern.ipc.semmni="256"
kern.ipc.shmmax="66099200"
kern.ipc.shmmin="256"
After a restart both shmmax and shmmin are now 0 and postgres failed to 
start.

kern.ipc.maxsockbuf: 8388608
kern.ipc.sockbuf_waste_factor: 8
kern.ipc.somaxconn: 1024
kern.ipc.max_linkhdr: 16
kern.ipc.max_protohdr: 60
kern.ipc.max_hdr: 76
kern.ipc.max_datalen: 136
kern.ipc.nmbclusters: 65536
kern.ipc.msgmax: 16384
kern.ipc.msgmni: 40
kern.ipc.msgmnb: 2048
kern.ipc.msgtql: 40
kern.ipc.msgssz: 8
kern.ipc.msgseg: 2048
kern.ipc.semmap: 30
kern.ipc.semmni: 10
kern.ipc.semmns: 60
kern.ipc.semmnu: 30
kern.ipc.semmsl: 60
kern.ipc.semopm: 100
kern.ipc.semume: 10
kern.ipc.semusz: 92
kern.ipc.semvmx: 32767
kern.ipc.semaem: 16384
kern.ipc.shmmax: 0
kern.ipc.shmmin: 0
kern.ipc.shmmni: 192
kern.ipc.shmseg: 128
kern.ipc.shmall: 8192
kern.ipc.shm_use_phys: 0
kern.ipc.shm_allow_removed: 0
kern.ipc.mbuf_wait: 32
kern.ipc.mbtypes: 24 550 2 0 0 0 0 0 0 0 0 0 0 0 0 0
kern.ipc.nmbufs: 262144
kern.ipc.nsfbufs: 8704
kern.ipc.nsfbufspeak: 0
kern.ipc.nsfbufsused: 0
kern.ipc.m_clreflimithits: 0
kern.ipc.mcl_pool_max: 0
kern.ipc.mcl_pool_now: 0
kern.ipc.maxsockets: 65536
I'm lost here.
Chris.
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[PERFORM] Kernel Resources Solved

2005-05-03 Thread Chris Hebrard
Problem solved by setting:
kern.ipc.semmni: 280
kern.ipc.semmns: 300
Chris.

Mark Kirkwood wrote:
Chris Hebrard wrote:
kern.ipc.shmmax and kern.ipc.shmmin will not stay to what I set them 
to.

What am I doing wrong or not doing at all?
These need to go in /etc/sysctl.conf. You might need to set shmall as 
well.

(This not-very-clear distinction between what is sysctl'able and what 
is a kernel tunable is a bit of a downer).

cheers
Mark

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

kern.ipc.maxsockbuf: 8388608
kern.ipc.sockbuf_waste_factor: 8
kern.ipc.somaxconn: 1024
kern.ipc.max_linkhdr: 16
kern.ipc.max_protohdr: 60
kern.ipc.max_hdr: 76
kern.ipc.max_datalen: 136
kern.ipc.nmbclusters: 65536
kern.ipc.msgmax: 16384
kern.ipc.msgmni: 40
kern.ipc.msgmnb: 2048
kern.ipc.msgtql: 40
kern.ipc.msgssz: 8
kern.ipc.msgseg: 2048
kern.ipc.semmap: 30
kern.ipc.semmni: 256
kern.ipc.semmns: 272
kern.ipc.semmnu: 30
kern.ipc.semmsl: 60
kern.ipc.semopm: 100
kern.ipc.semume: 10
kern.ipc.semusz: 92
kern.ipc.semvmx: 32767
kern.ipc.semaem: 16384
kern.ipc.shmmax: 66099200
kern.ipc.shmmin: 256
kern.ipc.shmmni: 192
kern.ipc.shmseg: 128
kern.ipc.shmall: 8192
kern.ipc.shm_use_phys: 0
kern.ipc.shm_allow_removed: 0
kern.ipc.mbuf_wait: 32
kern.ipc.mbtypes: 37 552 3 0 0 0 0 0 0 0 0 0 0 0 0 0
kern.ipc.nmbufs: 262144
kern.ipc.nsfbufs: 8704
kern.ipc.nsfbufspeak: 4
kern.ipc.nsfbufsused: 0
kern.ipc.m_clreflimithits: 0
kern.ipc.mcl_pool_max: 0
kern.ipc.mcl_pool_now: 0
kern.ipc.maxsockets: 65536

I've got the values to what I want them to be now, after loading some 
values in loader.conf and others in sysctl.conf.

loader.conf:
userconfig_script_load="YES"
kern.ipc.nmbclusters="65536"
kern.maxfiles="65536"
kern.maxfilesperproc="65536"
net.inet.tcp.mssdflt="1460"
kern.somaxconn="4096"
kern.ipc.semmns="272"
kern.ipc.semmni="256"
sysctl.conf:
net.inet.tcp.rfc1323=1
kern.ipc.somaxconn=1024
kern.ipc.maxsockbuf=8388608
net.inet.tcp.sendspace=3217968
net.inet.tcp.recvspace=3217968
kern.ipc.shmmax=66099200
kern.ipc.shmmin=256
kern.ipc.shmall=16138
and kern.ipc values are now:
kern.ipc.maxsockbuf: 8388608
kern.ipc.sockbuf_waste_factor: 8
kern.ipc.somaxconn: 1024
kern.ipc.max_linkhdr: 16
kern.ipc.max_protohdr: 60
kern.ipc.max_hdr: 76
kern.ipc.max_datalen: 136
kern.ipc.nmbclusters: 65536
kern.ipc.msgmax: 16384
kern.ipc.msgmni: 40
kern.ipc.msgmnb: 2048
kern.ipc.msgtql: 40
kern.ipc.msgssz: 8
kern.ipc.msgseg: 2048
kern.ipc.semmap: 30
kern.ipc.semmni: 256
kern.ipc.semmns: 272
kern.ipc.semmnu: 30
kern.ipc.semmsl: 60
kern.ipc.semopm: 100
kern.ipc.semume: 10
kern.ipc.semusz: 92
kern.ipc.semvmx: 32767
kern.ipc.semaem: 16384
kern.ipc.shmmax: 66099200
kern.ipc.shmmin: 256
kern.ipc.shmmni: 192
kern.ipc.shmseg: 128
kern.ipc.shmall: 16138
kern.ipc.shm_use_phys: 0
kern.ipc.shm_allow_removed: 0
kern.ipc.mbuf_wait: 32
kern.ipc.mbtypes: 7 550 3 0 0 0 0 0 0 0 0 0 0 0 0 0
kern.ipc.nmbufs: 262144
kern.ipc.nsfbufs: 8704
kern.ipc.nsfbufspeak: 6
kern.ipc.nsfbufsused: 0
kern.ipc.m_clreflimithits: 0
kern.ipc.mcl_pool_max: 0
kern.ipc.mcl_pool_now: 0
kern.ipc.maxsockets: 65536
Postgres still refuses to start with 256 max_connections.
Chris.



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] PGSQL Capacity

2005-05-09 Thread Chris Browne
[EMAIL PROTECTED] writes:
> How can I know the capacity of a pg database?
> How many records can my table have?
> I saw in a message that someone has 50 000 records; is that possible in a table?
> (My table has 8 string fields (length 32 chars)).
> Thanks for your response.

The capacity is much more likely to be limited by the size of the disk
drives and filesystems you have available to you than by anything
else.

If your table consists of 8 32-character strings, then each tuple
will consume around 256 bytes of memory, and you will be able to fit
on the order of 30 tuples into each 8K page.

By default, you can extend a single table file to up to 1GB before it
splits off to another piece.  That would mean each file can hold about
3.9M tuples (1GB / 8K is roughly 131,000 pages, times ~30 tuples per
page).  From there, you can have as many 1GB pieces as the disk
will support.  So you could have (plenty * 3.9M tuples), which could
add up to be rather large.

If you're expecting 50K records, that will be no big deal at all.
-- 
(format nil "[EMAIL PROTECTED]" "cbbrowne" "acm.org")
http://www.ntlug.org/~cbbrowne/sap.html
Rules of the Evil Overlord #78.  "I will not tell my Legions of Terror
"And he must  be taken alive!" The command will be:  ``And try to take
him alive if it is reasonably practical.''"


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Prefetch - OffTopic

2005-05-10 Thread Chris Browne
[EMAIL PROTECTED] ("Mohan, Ross") writes:
> for time-series and "insane fast", nothing beats kdB, I believe
>
> www.kx.com

... Which is well and fine if you're prepared to require that all of
the staff that interact with data are skilled APL hackers.  Skilled
enough that they're all ready to leap into Whitney's ASCII-based
variant, K.
-- 
(format nil "[EMAIL PROTECTED]" "cbbrowne" "acm.org")
http://www.ntlug.org/~cbbrowne/functional.html
Rules of the Evil Overlord #78.  "I will not tell my Legions of Terror
"And he must  be taken alive!" The command will be:  ``And try to take
him alive if it is reasonably practical.''"


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] Mount database on RAM disk?

2005-07-07 Thread Chris Browne
[EMAIL PROTECTED] (Stuart Bishop) writes:
> I'm putting together a road map on how our systems can scale as our
> load increases. As part of this, I need to look into setting up some
> fast read only mirrors of our database. We should have more than
> enough RAM to fit everything into memory. I would like to find out
> if I could expect better performance by mounting the database from a
> RAM disk, or if I would be better off keeping that RAM free and
> increasing the effective_cache_size appropriately.

If you were willing to take on a not-inconsiderable risk, I'd think
that storing WAL files on a RAMDISK would be likely to be the fastest
improvement imaginable.

If I could get and deploy some SSD (Solid State Disk) devices that
would make this sort of thing *actually safe,* I'd expect that to be a
pretty fabulous improvement, at least for write-heavy database
activity.

> I'd also be interested in knowing if this is dependant on whether I
> am running 7.4, 8.0 or 8.1.

Behaviour of all three could be somewhat different, as management of
the shared cache has been in flux...
-- 
(format nil "[EMAIL PROTECTED]" "cbbrowne" "acm.org")
http://www.ntlug.org/~cbbrowne/sap.html
Rules of the Evil Overlord #78.  "I will not tell my Legions of Terror
"And he must  be taken alive!" The command will be:  ``And try to take
him alive if it is reasonably practical.''"


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] PostgresSQL vs. Firebird

2005-07-15 Thread Chris Mair

On Thu, 2005-07-14 at 00:19 -0700, Relaxin wrote:
> Before I ask, I don't want to start a war.
> 
> Can someone here give me an honest opinion of how PostgresSQL (PG) is better 
> than Firebird on Windows?

A colleague of mine has made some benchmarks using those two:
http://www.1006.org/pg/postgresql_firebird_win_linux.pdf

He benchmarked inserts done through *his* own Delphi code varying a few
parameters. The servers run on Windows in all tests. The clients
were on Windows or Linux.

The summary is that PG beats FB performance-wise in all tests except
when you do many small transactions (autocommit on) with fsync on.

Bye, Chris.





---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[PERFORM] COPY insert performance

2005-07-25 Thread Chris Isaacson
I need COPY via libpqxx to insert millions of rows into two tables.  One
table has roughly half as many rows and requires half the storage.  In
production, the largest table will grow by ~30M rows/day.  To test the COPY
performance I split my transactions into 10,000 rows.  I insert roughly 5000
rows into table A for every 10,000 rows into table B.

Table A has one unique index:

"order_main_pk" UNIQUE, btree (cl_ord_id)

Table B has 1 unique index and 2 non-unique indexes:

"order_transition_pk" UNIQUE, btree (collating_seq)
"order_transition_ak2" btree (orig_cl_ord_id)
"order_transition_ak3" btree (exec_id)

My testing environment is as follows:
-Postgresql 8.0.1
-libpqxx 2.5.0
-Linux 2.6.11.4-21.7-smp x86_64
-Dual Opteron 246
-System disk (postgres data resides on this SCSI disk) - Seagate (ST373453LC) - 15K, 73 GB (http://www.seagate.com/cda/products/discsales/marketing/detail/0,1081,549,00.html)
-2nd logical disk - 10K, 36GB IBM SCSI (IC35L036UCDY10-0) - WAL resides on this disk
-NO RAID

PostgreSQL
Here are the results of copying in 10M rows as fast as possible:
(10K/transaction)
Total Time:       1129.556 s
Rows/sec:         9899.922
Transaction>1.2s  225
Transaction>1.5s   77
Transaction>2.0s    4
Max Transaction   2.325s

MySQL
I ran a similar test with MySQL 4.1.10a (InnoDB) which produced these
results: (I used MySQL's INSERT INTO x VALUES (1,2,3)(4,5,6)(...,...,...)
syntax) (10K/transaction)
Total Time:       860.000 s
Rows/sec:         11627.91
Transaction>1.2s    0
Transaction>1.5s    0
Transaction>2.0s    0
Max Transaction   1.175s

Considering the configurations shown below, can anyone offer advice to close
the 15% gap and the much worse variability I'm experiencing?  Thanks.

My postgresql.conf has the following non-default values:

# -----------------------------
# PostgreSQL configuration file
# -----------------------------
listen_addresses = '*'        # what IP interface(s) to listen on;
max_connections = 100
#---------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#---------------------------------------------------------------------------
shared_buffers = 65536        # min 16, at least max_connections*2, 8KB each
work_mem = 2048               # min 64, size in KB
maintenance_work_mem = 204800 # min 1024, size in KB
max_fsm_pages = 225           # min max_fsm_relations*16, 6 bytes each
bgwriter_delay = 200          # 10-1 milliseconds between rounds
bgwriter_percent = 10         # 0-100% of dirty buffers in each round
bgwriter_maxpages = 1000      # 0-1000 buffers max per round
#---------------------------------------------------------------------------
# WRITE AHEAD LOG
#---------------------------------------------------------------------------
fsync = false                 # turns forced synchronization on or off
wal_buffers = 64              # min 4, 8KB each
checkpoint_segments = 40      # in logfile segments, min 1, 16MB each
checkpoint_timeout = 600      # range 30-3600, in seconds
#---------------------------------------------------------------------------
# QUERY TUNING
#---------------------------------------------------------------------------
effective_cache_size = 65536  # typically 8KB each
random_page_cost = 2          # units are one sequential page fetch cost
#---------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#---------------------------------------------------------------------------
log_min_duration_statement = 250   # -1 is disabled, in milliseconds.
log_connections = true
log_disconnections = true
log_duration = true
log_line_prefix = '<%r%u%p%t%d%%'  # e.g. '<%u%%%d> '
                              # %u=user name %d=database name
                              # %r=remote host and port
                              # %p=PID %t=timestamp %i=command tag
                              # %c=session id %l=session line number
                              # %s=session start timestamp %x=transaction id
                              # %q=stop here in non-session processes
                              # %%='%'
log_statement = 'none'        # none, mod, ddl, all
#---------------------------------------------------------------------------
# RUNTIME STATISTICS
#---------------------------------------------------------------------------
# - Query/Index Statistics Collector -
stats_start_collector = true
stats_command_string = true
stats_block_level = true
stats_row_level = true
stats_reset_on_server_start = true

My MySQL my.ini has the following non-default values:

innodb_data_home_dir = /var/lib/mysql/
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /var/lib/mysql/
innodb_log_arch_dir = /var/lib/mysql/
# You can set .._buffer_pool_size up to 50 - 80 %
# of RAM but beware of setting memory usage too high
innodb_buffer_pool_size = 512M
innodb_additional_mem_pool_size = 64M
# Set .._log_file_size to 25 % of buffer pool size
innodb_log_file_size = 128M
innodb_log_buffer_size = 64M
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 50
innodb_flush_method = O_DSYNC
max_allowed_packet = 16M
 
 
 


Re: [PERFORM] COPY insert performance

2005-07-26 Thread Chris Isaacson
I do not have any foreign keys and I need the indexes on during the
insert/copy b/c in production a few queries heavily dependent on the
indexes will be issued.  These queries will be infrequent, but must be
fast when issued.

I am using InnoDB with MySQL which appears to enforce true transaction
support.  (http://dev.mysql.com/doc/mysql/en/innodb-overview.html)  If
not, how is InnoDB "cheating"?

Sorry for the confusion, but pg_xlog is currently on a dedicated drive
(10K SCSI, see below).  Would I realize further gains if I had a third
drive and put the indexes on that drive?  

I've played with the checkpoint_segments.  I noticed an enormous
improvement increasing from the default to 40, but negligible improvement
thereafter.  Do you have a recommendation for a value?

My bg_writer adjustments were a last ditch effort.  I found your advice
correct and realized no gain.  I have not tested under a querying load
which is a good next step.  I had not thought of the comparative
degradation of MySQL vs. PostgreSQL.

Thanks for the tip on the RAM usage by indexes.  I was under the
incorrect assumption that shared_buffers would take care of this.  I'll
increase work_mem to 512MB and rerun my test.  I have 1G of RAM, which
is less than we'll be running in production (likely 2G).

-Original Message-
From: John A Meinel [mailto:[EMAIL PROTECTED] 
Sent: Monday, July 25, 2005 6:09 PM
To: Chris Isaacson; Postgresql Performance
Subject: Re: [PERFORM] COPY insert performance


Chris Isaacson wrote:
> I need COPY via libpqxx to insert millions of rows into two tables.  
> One table has roughly half as many rows and requires half the storage.

> In production, the largest table will grow by ~30M rows/day.  To test 
> the COPY performance I split my transactions into 10,000 rows.  I 
> insert roughly 5000 rows into table A for every 10,000 rows into table

> B.
>
> Table A has one unique index:
>
> "order_main_pk" UNIQUE, btree (cl_ord_id)
>
> Table B has 1 unique index and 2 non-unique indexes:
>
> "order_transition_pk" UNIQUE, btree (collating_seq) 
> "order_transition_ak2" btree (orig_cl_ord_id) "order_transition_ak3" 
> btree (exec_id)

Do you have any foreign key references?
If you are creating a table for the first time (or loading a large
fraction of the data), it is common to drop the indexes and foreign keys
first, then insert/copy, and then re-create them afterwards.

Is InnoDB the backend with referential integrity, and true transaction
support? I believe the default backend does not support either (so it is
"cheating" to give you speed, which may be just fine for your needs,
especially since you are willing to run fsync=false).

I think moving pg_xlog to a dedicated drive (set of drives) could help
your performance. As well as increasing checkpoint_segments.

I don't know if you gain much by changing the bg_writer settings, if you
are streaming everything in at once, you probably want to have it
written out right away. My understanding is that bg_writer settings are
for the case where you have mixed read and writes going on at the same
time, and you want to make sure that the reads have time to execute (ie
the writes are not saturating your IO).

Also, is any of this tested under load? Having a separate process issue
queries while you are loading in data. Traditionally MySQL is faster
with a single process inserting/querying for data, but once you have
multiple processes hitting it at the same time, its performance
degrades much faster than postgres.

You also seem to be giving MySQL 512M of ram to work with, while only
giving 2M/200M to postgres. (re)creating indexes uses
maintenance_work_mem, but updating indexes could easily use work_mem.
You may be RAM starved.

John
=:->


>
> My testing environment is as follows:
> -Postgresql 8.0.1
> -libpqxx 2.5.0
> -Linux 2.6.11.4-21.7-smp x86_64
> -Dual Opteron 246
> -System disk (postgres data resides on this SCSI disk) -  Seagate
> (ST373453LC) - 15K, 73 GB
> (http://www.seagate.com/cda/products/discsales/marketing/detail/0,1081
> ,549,00.html)
> -2nd logical disk - 10K, 36GB IBM SCSI (IC35L036UCDY10-0) - WAL reside
> on this disk
> -NO RAID
>
> *PostgreSQL*
> Here are the results of copying in 10M rows as fast as possible:
> (10K/transaction)
> Total Time:       1129.556 s
> Rows/sec:         9899.922
> Transaction>1.2s  225
> Transaction>1.5s   77
> Transaction>2.0s    4
> Max Transaction   2.325s
>
> **MySQL**
> **I ran a similar test with MySQL 4.1.10a (InnoDB) which produced 
> these
> results: (I used MySQL's INSERT INTO x VALUES
> (1,2,3)(4,5,6)(...,...,...) syntax) (10K/transaction)
> Total Time: 860.000 s
> Rows/sec:11627.91
> Transaction>1.2s  0
> Transaction>1.

Re: [PERFORM] COPY insert performance

2005-07-26 Thread Chris Isaacson
John,

(FYI: got a failed to deliver to [EMAIL PROTECTED])

I do not have any foreign keys and I need the indexes on during the
insert/copy b/c in production a few queries heavily dependent on the
indexes will be issued.  These queries will be infrequent, but must be
fast when issued.

I am using InnoDB with MySQL which appears to enforce true transaction
support.  (http://dev.mysql.com/doc/mysql/en/innodb-overview.html)  If
not, how is InnoDB "cheating"?

Sorry for the confusion, but pg_xlog is currently on a dedicated drive
(10K SCSI, see below).  Would I realize further gains if I had a third
drive and put the indexes on that drive?

I've played with the checkpoint_segments.  I noticed an enormous
improvement increasing from the default to 40, but negligible improvement
thereafter.  Do you have a recommendation for a value?

My bg_writer adjustments were a last ditch effort.  I found your advice
correct and realized no gain.  I have not tested under a querying load
which is a good next step.  I had not thought of the comparative
degradation of MySQL vs. PostgreSQL.

Thanks for the tip on the RAM usage by indexes.  I was under the
incorrect assumption that shared_buffers would take care of this.  I'll
increase work_mem to 512MB and rerun my test.  I have 1G of RAM, which
is less than we'll be running in production (likely 2G).

-Chris

-Original Message-
From: John A Meinel [mailto:[EMAIL PROTECTED] 
Sent: Monday, July 25, 2005 6:09 PM
To: Chris Isaacson; Postgresql Performance
Subject: Re: [PERFORM] COPY insert performance


Chris Isaacson wrote:
> I need COPY via libpqxx to insert millions of rows into two tables.  
> One table has roughly half as many rows and requires half the storage.

> In production, the largest table will grow by ~30M rows/day.  To test 
> the COPY performance I split my transactions into 10,000 rows.  I 
> insert roughly 5000 rows into table A for every 10,000 rows into table

> B.
>
> Table A has one unique index:
>
> "order_main_pk" UNIQUE, btree (cl_ord_id)
>
> Table B has 1 unique index and 2 non-unique indexes:
>
> "order_transition_pk" UNIQUE, btree (collating_seq) 
> "order_transition_ak2" btree (orig_cl_ord_id) "order_transition_ak3" 
> btree (exec_id)

Do you have any foreign key references?
If you are creating a table for the first time (or loading a large
fraction of the data), it is common to drop the indexes and foreign keys
first, then insert/copy, and then re-create them afterwards.

Is InnoDB the backend with referential integrity, and true transaction
support? I believe the default backend does not support either (so it is
"cheating" to give you speed, which may be just fine for your needs,
especially since you are willing to run fsync=false).

I think moving pg_xlog to a dedicated drive (set of drives) could help
your performance. As well as increasing checkpoint_segments.

I don't know if you gain much by changing the bg_writer settings, if you
are streaming everything in at once, you probably want to have it
written out right away. My understanding is that bg_writer settings are
for the case where you have mixed read and writes going on at the same
time, and you want to make sure that the reads have time to execute (ie
the writes are not saturating your IO).

Also, is any of this tested under load? Having a separate process issue
queries while you are loading in data. Traditionally MySQL is faster
with a single process inserting/querying for data, but once you have
multiple processes hitting it at the same time, its performance
degrades much faster than postgres.

You also seem to be giving MySQL 512M of ram to work with, while only
giving 2M/200M to postgres. (re)creating indexes uses
maintenance_work_mem, but updating indexes could easily use work_mem.
You may be RAM starved.

John
=:->


>
> My testing environment is as follows:
> -Postgresql 8.0.1
> -libpqxx 2.5.0
> -Linux 2.6.11.4-21.7-smp x86_64
> -Dual Opteron 246
> -System disk (postgres data resides on this SCSI disk) -  Seagate
> (ST373453LC) - 15K, 73 GB
> (http://www.seagate.com/cda/products/discsales/marketing/detail/0,1081
> ,549,00.html)
> -2nd logical disk - 10K, 36GB IBM SCSI (IC35L036UCDY10-0) - WAL reside
> on this disk
> -NO RAID
>
> *PostgreSQL*
> Here are the results of copying in 10M rows as fast as possible:
> (10K/transaction)
> Total Time:       1129.556 s
> Rows/sec:         9899.922
> Transaction>1.2s  225
> Transaction>1.5s   77
> Transaction>2.0s    4
> Max Transaction   2.325s
>
> **MySQL**
> **I ran a similar test with MySQL 4.1.10a (InnoDB) which produced 
> these
> results: (I used MySQL's INSERT INTO x VALUES
> (1,2,3)(4,5,6)(...,...,...) syntax) (10K/transaction)
> Total Time: 860.000 s
> Rows/s

Re: [PERFORM] COPY insert performance

2005-07-26 Thread Chris Isaacson
I need the chunks for each table COPYed within the same transaction
which is why I'm not COPYing concurrently via multiple
threads/processes.  I will experiment w/o OID's and decreasing the
shared_buffers and wal_buffers.
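
A minimal sketch of the OID experiment (assuming the table names behind the
indexes mentioned earlier; adjust as needed):

  ALTER TABLE order_main SET WITHOUT OIDS;
  ALTER TABLE order_transition SET WITHOUT OIDS;
  -- or make it the default for newly created tables in postgresql.conf:
  --   default_with_oids = false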

Thanks,
Chris

-Original Message-
From: Gavin Sherry [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, July 26, 2005 7:12 AM
To: Chris Isaacson
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] COPY insert performance


Hi Chris,

Have you considered breaking the data into multiple chunks and COPYing
each concurrently?

Also, have you ensured that your table isn't storing OIDs?

On Mon, 25 Jul 2005, Chris Isaacson wrote:

> #---------------------------------------------------------------------------
> # RESOURCE USAGE (except WAL)
> #---------------------------------------------------------------------------
> shared_buffers = 65536  # min 16, at least max_connections*2, 8KB each

shared_buffers that high has been shown to affect performance. Try
12000.

> wal_buffers = 64  # min 4, 8KB each

Increasing wal_buffers can also have an effect on performance.

Thanks,

Gavin

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Cheap RAM disk?

2005-07-26 Thread Chris Browne
[EMAIL PROTECTED] (John A Meinel) writes:
> I saw a review of a relatively inexpensive RAM disk over at
> anandtech.com, the Gigabyte i-RAM
> http://www.anandtech.com/storage/showdoc.aspx?i=2480

And the review shows that it's not *all* that valuable for many of the
cases they looked at.

> Basically, it is a PCI card, which takes standard DDR RAM, and has a
> SATA port on it, so that to the system, it looks like a normal SATA
> drive.
>
> The card costs about $100-150, and you fill it with your own ram, so
> for a 4GB (max size) disk, it costs around $500. Looking for solid
> state storage devices, the cheapest I found was around $5k for 2GB.
>
> Gigabyte claims that the battery backup can last up to 16h, which
> seems decent, if not really long (the $5k solution has a built-in
> harddrive so that if the power goes out, it uses the battery power to
> copy the ramdisk onto the harddrive for more permanent storage).
>
> Anyway, would something like this be reasonable as a drive for storing
> pg_xlog? With 4GB you could have as many as 256 checkpoint segments.
>
> I'm a little leary as it is definitely a version 1.0 product (it is
> still using an FPGA as the controller, so they were obviously pushing
> to get the card into production).

What disappoints me is that nobody has tried the CF/RAM answer; rather
than putting a hard drive on the board, you put on some form of flash
device (CompactFlash or such), where if power fails, it pushes data
onto the CF.  That ought to be cheaper (both in terms of hardware cost
and power consumption) than using a hard disk.

> But it seems like this might be a decent way to improve insert
> performance, without setting fsync=false.

That's the case which might prove Ludicrously Quicker than any of the
sample cases in the review.

> Probably it should see some serious testing (as in power spikes/pulled
> plugs, etc). I know the article made some claim that if you actually
> pull out the card it goes into "high consumption mode" which is
> somehow greater than if you leave it in the slot with the power
> off. Which to me seems like a lot of bull, and really means the 16h is
> only under best-case circumstances. But even 1-2h is sufficient to
> handle a simple power outage.

Certainly.

> Anyway, I thought I would mention it to the list, to see if anyone
> else has heard of it, or has any thoughts on the matter. I'm sure
> there are some people who are using more expensive ram disks, maybe
> they have some ideas about what this device is missing. (other than
> costing about 1/10th the price)

Well, if it hits a "2.0" version, it may get interesting...
-- 
(format nil "[EMAIL PROTECTED]" "cbbrowne" "acm.org")
http://www.ntlug.org/~cbbrowne/sap.html
Rules of the Evil Overlord #78.  "I will not tell my Legions of Terror
"And he must  be taken alive!" The command will be:  ``And try to take
him alive if it is reasonably practical.''"


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Cheap RAM disk?

2005-07-26 Thread Chris Browne
[EMAIL PROTECTED] ("Jeffrey W. Baker") writes:
> I haven't tried this product, but the microbenchmarks seem truly
> slow.  I think you would get a similar benefit by simply sticking a
> 1GB or 2GB DIMM -- battery-backed, of course -- in your RAID
> controller.

Well, the microbenchmarks were pretty pre-sophomoric, essentially
trying to express how the device would be useful to a Windows user
that *might* play games...

I'm sure it's hurt by the fact that it's using a SATA ("version 1")
interface rather than something faster.

Mind you, I'd like to see the product succeed, because they might come
up with a "version 2" of it that is what I'd really like...
-- 
(format nil "[EMAIL PROTECTED]" "cbbrowne" "acm.org")
http://www.ntlug.org/~cbbrowne/sap.html
Rules of the Evil Overlord #78.  "I will not tell my Legions of Terror
"And he must  be taken alive!" The command will be:  ``And try to take
him alive if it is reasonably practical.''"


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

