[PERFORM] dblink / synonyms?

2008-09-03 Thread Jan-Peter Seifert
Hello,

we had problems with queries via dblink taking more than three times as long
as the same queries without dblink on the same server. This has been tested
with v8.3.3 as well. Is this still an issue?
http://cunha17.cristianoduarte.pro.br/postgresql/snapshots.en_us.php

"But the two suffer from a severe problem: they bring the whole result from the 
remote query into the local database server. "
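
For illustration, a typical dblink call looks like the sketch below (connection
string, table, and column names are made up): the inner query's complete result
set is materialized on the local server before the outer WHERE clause is applied.

SELECT t.id, t.payload
FROM dblink('host=remotehost dbname=remotedb',
            'SELECT id, payload FROM big_table')
     AS t(id integer, payload text)
WHERE t.id < 100;  -- filter runs locally, after the full remote result arrives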

Is anybody maintaining dblink (and if so, who)?

Are there other solutions for connecting two databases in the works (something like synonyms)?

There is another "problem": it's difficult to compare the performance of the
queries, because the server caches data and dblink gets the same cached query
results as well. Can you 'flush' the results, or prevent the results from being
cached and reused? Is EXPLAIN ANALYZE really 'stable' enough for comparison
purposes?
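
One crude way to at least compare warm-cache behaviour is to repeat each query
under psql's \timing and look only at the later runs (a sketch; the table name
is invented). Clearing the OS page cache has to happen outside PostgreSQL, e.g.
by restarting the server and dropping the kernel cache.

\timing on
SELECT count(*) FROM big_table;  -- first run may hit disk
SELECT count(*) FROM big_table;  -- repeat and compare the warm timings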

Thank you very much,

Peter



[PERFORM] Partitions number limitation ?

2008-09-03 Thread s . caillet
Hi !

To improve our software's insert performance, we have decided to use a
partitioned architecture in our database. It works well for the following volume:
45 main tables with 288 inherited daughter tables each, that is a total of
12960 partitions.
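
For reference, each daughter table in such a layout is declared roughly like
this (a minimal sketch with invented names; one CHECK-constrained child per day):

CREATE TABLE measures (ts timestamp NOT NULL, val integer);
CREATE TABLE measures_d001 (
    CHECK (ts >= '2008-09-01' AND ts < '2008-09-02')
) INHERITS (measures);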

I now have some trouble with another customer's database with the following
volume: 87 main tables with 288 daughter tables each, that is 25056 partitions.

Is there some kind of limit in PostgreSQL on the number of partitions? Do you
know of any tuning in the conf files to improve PostgreSQL's management of so
many tables? I have already used different tablespaces, one for each main table
and its 288 partitions.
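
For what it's worth, these are the settings most often mentioned for databases
with very many tables (a hedged checklist for an 8.3-era server; please verify
against the docs for your version):

SHOW constraint_exclusion;        -- should be 'on' so the planner can skip partitions
SHOW max_fsm_relations;           -- must exceed the total count of tables and indexes
SHOW max_locks_per_transaction;   -- queries touching many partitions take many locks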

I have a lot of data to insert into these tables (tens or hundreds of thousands
of rows every five minutes for each main group). Do you think this can be done
with a hardware RAID 0 SATA array (which is what we have today ...)?

Thank you all for help.

Best regards

Sylvain Caillet



[PERFORM] SAN and full_page_writes

2008-09-03 Thread Nikolas Everett
I have the honor of configuring Postgres backed by a NetApp FAS3020
over fibre.

Does anyone know whether the SAN protects me from breakage due to partial page
writes?

If anyone has any SAN-specific Postgres knowledge, I'd love to hear your
words of wisdom.
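
The setting in question is full_page_writes; a quick way to check it (it
defaults to on, and the value itself can only be changed in postgresql.conf):

SHOW full_page_writes;  -- disabling it is only considered safe if the storage
                        -- stack guarantees atomic 8 kB block writes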

For reference:
[EMAIL PROTECTED] bonnie]$ ~neverett/bonnie++-1.03a/bonnie++
Writing with putc()...done
Writing intelligently...done
Rewriting...done
Reading with getc()...done
Reading intelligently...done
start 'em...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
localhost.lo 32104M 81299  94 149848  30 42747   8 45465  61 55528   4 495.5   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++


Re: [PERFORM] Partitions number limitation ?

2008-09-03 Thread Scott Marlowe
On Wed, Sep 3, 2008 at 4:00 AM,  <[EMAIL PROTECTED]> wrote:
> Hi !
>
> To improve our software's insert performance, we have decided to use a
> partitioned architecture in our database. It works well for the following
> volume: 45 main tables with 288 inherited daughter tables each, that is a
> total of 12960 partitions.
>
> I now have some trouble with another customer's database with the following
> volume: 87 main tables with 288 daughter tables each, that is 25056 partitions.
>
> Is there some kind of limit in PostgreSQL on the number of partitions? Do you
> know of any tuning in the conf files to improve PostgreSQL's management of so
> many tables? I have already used different tablespaces, one for each main
> table and its 288 partitions.

What do you mean by PostgreSQL management of the partitions: triggers,
rules, or application-based partitioning?  Rules aren't fast enough with
this many partitions, and if you can easily add the partitioning logic to
your application and write directly to the right child table, it can be
much faster.

How big a disk array you need depends very much on how quickly you
saturate your CPUs, either in the app layer or in the database layer,
maintaining your partitioning.  If an app needs to insert 100,000 rows into
ONE partition, and it knows which one, it's likely to be far faster to
have the app do the routing instead of pgsql.  The app has to think once,
the database 100,000 times.
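
In other words, something like the sketch below on the application side (table
names invented): the app computes the target partition once, so no trigger or
rule has to fire for each of the 100,000 rows.

INSERT INTO measures_d001 (ts, val)
SELECT ts, val FROM staging_batch;  -- straight to the child, bypassing the parent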



Re: [PERFORM] limit clause breaks query planner?

2008-09-03 Thread Matt Smiley
"Tom Lane" <[EMAIL PROTECTED]> writes:
>  "Matt Smiley" <[EMAIL PROTECTED]> writes:
>  >  So an Index Scan is always going to have a higher cost estimate than
>  >  an equivalent Seq Scan returning the same result rows (unless
>  >  random_page_cost is < 1).  That's why I think the planner is always
>  >  preferring the plan that uses a Seq Scan.
>  
>  If that were the case, we'd never choose an indexscan at all...

You're right, that was a silly guess.

>  It's true that a plain indexscan is not preferred for queries that will
>  return a large fraction of the table.  However, it should be willing to
>  use a bitmap scan for this query, given default cost settings (the
>  default cost settings will cause it to prefer bitmap scan for retrieving
>  up to about a third of the table, in my experience).  I too am confused
>  about why it doesn't prefer that choice in the OP's example.

It looks like the bitmap scan has a higher cost estimate because the entire 
bitmap index must be built before the heap scan can begin returning rows up 
the pipeline.  The row-count limit can't be pushed down below the bitmap heap 
scan the way it can be for a plain index scan.

test_8_3_3=# set enable_seqscan = false ;
SET

test_8_3_3=# set enable_indexscan = false ;
SET

test_8_3_3=# explain analyze select * from my_table where a is null and b = 3 limit 15 ;
                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=17070.22..17071.02 rows=15 width=8) (actual time=606.902..607.086 rows=15 loops=1)
   ->  Bitmap Heap Scan on my_table  (cost=17070.22..69478.96 rows=988217 width=8) (actual time=606.892..606.983 rows=15 loops=1)
         Recheck Cond: (b = 3)
         Filter: (a IS NULL)
         ->  Bitmap Index Scan on idx_b  (cost=0.00..16823.17 rows=109 width=0) (actual time=592.657..592.657 rows=100 loops=1)
               Index Cond: (b = 3)
 Total runtime: 607.340 ms
(7 rows)


>  It would be interesting to alter the random_page_cost setting and see if
>  he gets different results.

Using an unmodified postgresql.conf, the cost estimate for an index scan was 
so much higher than for a seqscan that random_page_cost had to be set below 0.2 
before the index scan was preferred.  However, it looks like this was mainly 
because effective_cache_size was too small.  The planner thought the cache was 
only 128 MB, while the complete table plus index was (39492 + 21946) pages * 
8 KB/page = about 480 MB.  It makes sense for the cost estimate to be so much 
higher if blocks are expected to be repeatedly re-fetched from disk.  I wonder 
if David's effective_cache_size is too small.

test_8_3_3=# reset all ;
RESET

test_8_3_3=# explain analyze select * from my_table where a is null and b = 3 limit 15 ;
                                                       QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..2.50 rows=15 width=8) (actual time=0.036..0.239 rows=15 loops=1)
   ->  Seq Scan on my_table  (cost=0.00..164492.74 rows=988217 width=8) (actual time=0.028..0.138 rows=15 loops=1)
         Filter: ((a IS NULL) AND (b = 3))
 Total runtime: 0.338 ms
(4 rows)

test_8_3_3=# set enable_seqscan = false ;
SET

test_8_3_3=# show random_page_cost ;
 random_page_cost
------------------
 4
(1 row)

test_8_3_3=# explain analyze select * from my_table where a is null and b = 3 limit 15 ;
                                                           QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..45.99 rows=15 width=8) (actual time=0.051..0.200 rows=15 loops=1)
   ->  Index Scan using idx_b on my_table  (cost=0.00..3029924.36 rows=988217 width=8) (actual time=0.043..0.100 rows=15 loops=1)
         Index Cond: (b = 3)
         Filter: (a IS NULL)
 Total runtime: 0.308 ms
(5 rows)

test_8_3_3=# set random_page_cost = 0.19 ;
SET
test_8_3_3=# explain analyze select * from my_table where a is null and b = 3 limit 15 ;
                                                          QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..2.45 rows=15 width=8) (actual time=0.050..0.201 rows=15 loops=1)
   ->  Index Scan using idx_b on my_table  (cost=0.00..161190.65 rows=988217 width=8) (actual time=0.042..0.097 rows=15 loops=1)
         Index Cond: (b = 3)
         Filter: (a IS NULL)
 Total runtime: 0.307 ms
(5 rows)


Now fix effective_cache_size and try again.

test_8_3_3=# reset all ;
RESET

test_8_3_3=# set effective_cache_size = '500MB' ;
SET
test_8_3_3=# set enable_seqscan = false ;
SET
test_8_3_3=# explain analyze select * from my_table where a is null and b =