Re: [PERFORM] Databases Vs. Schemas

2004-03-26 Thread CoL
hi

Josh Berkus wrote:

Stalin,


We are evaluating the options for having multiple databases vs. schemas on a
single database cluster for a custom-built app that we developed. Each app
installs the same set of tables for each service, and the services could easily
number in the thousands. So is it better to have 1000 databases or 1000 schemas
in a database cluster? What is the performance overhead of having multiple
databases vs. schemas (if any)? I'm leaning towards schemas rather than
databases, but I would like to get others' opinions on this. Appreciate
your reply.


No performance difference AFAIK.   The real question is whether you have to
have queries joining several "databases".   If yes, use schemas; if no, use
databases.

Don't forget pg_hba.conf :) You would need 1000 declarations there. There was
a thread about this before, titled: performance problem - 10.000 databases
Check this:
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&threadm=1068039213.28814.116.camel%40franki-laptop.tpi.pl&rnum=10&prev=/groups%3Fq%3D1000%2Bdatabase%2Bgroup:comp.databases.postgresql.*%26hl%3Den%26lr%3D%26ie%3DUTF-8%26group%3Dcomp.databases.postgresql.*%26selm%3D1068039213.28814.116.camel%2540franki-laptop.tpi.pl%26rnum%3D10
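If you do go the 1000-databases route, keeping those pg_hba.conf declarations in sync is easy enough to script. A minimal sketch (the service_N/user_N names and the address range are hypothetical, and the exact pg_hba.conf column layout depends on your PostgreSQL version):

```python
# Sketch: emit one pg_hba.conf entry per service database.
# The database/user names and network below are hypothetical examples.
lines = [
    f"host  service_{n}  user_{n}  192.168.0.0/16  md5"
    for n in range(1, 1001)
]
print(len(lines))   # 1000 entries to maintain
print(lines[0])
```

The point is not the script itself but the maintenance burden it hints at: one declaration per database, times a thousand.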

C.

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Looking for ideas on how to speed up warehouse loading

2004-04-26 Thread CoL
hi,
Sean Shanny wrote, On 4/22/2004 23:56:
SELECT t1.id, t2.url FROM referral_temp t2 LEFT OUTER JOIN d_referral t1 
ON t2.url = t1.referral_raw_url ORDER BY t1.id
An index on url (text) makes little sense here. Try adding an md5 column
(char(32)) containing the MD5 hash of the url field, and join on that instead.
An index on this fixed-width 32-character field is much more compact.
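The idea is that MD5 yields a fixed 32-character key no matter how long the URL is, so index entries stay small and comparisons cheap. A minimal sketch in Python's standard hashlib (the url_key name is made up; in the database you would populate the char(32) column at load time, e.g. with PostgreSQL's own md5() function where available):

```python
import hashlib

def url_key(url: str) -> str:
    """Fixed-length 32-char hex MD5 key, suitable for a char(32) join column."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()

long_url = "http://example.com/path?" + "x=1&" * 200
key = url_key(long_url)
print(len(key))   # 32, regardless of URL length
```

Index the hash column on both tables and join on it instead of on the raw URL text.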

Do not forget to ANALYZE the tables after the data load, and you can fine-tune
your postgresql.conf (default_statistics_target for better index statistics,
among others).
Check these info pages:
http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

C.
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Mysterious performance of query because of plsql function in

2004-07-07 Thread CoL
hi,
Peter Alberer wrote:
Hi there,
I have a problem with a query that uses the result of a plsql function in
the WHERE clause:
SELECT
   assignments.assignment_id,
   assignments.package_id AS package_id,
   assignments.title AS title,
   COUNT(*) AS Count
FROM
   assignments INNER JOIN submissions ON
   (assignments.assignment_id=submissions.assignment_id)
WHERE
   package_id=949589 AND
   submission_status(submissions.submission_id)='closed'
GROUP BY
   assignments.assignment_id, assignments.package_id, assignments.title
ORDER BY
   assignments.title;
Postgres seems to execute the function "submission_status" for every row
of the submissions table (~1500 rows).
What is submission_status actually? Check its definition with:
\df submission_status
Is the function submission_status declared STABLE? If not, Postgres must
assume it is VOLATILE and call it for every row.
C.
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] The black art of postgresql.conf tweaking

2004-08-10 Thread CoL
hi,
Paul Serby wrote:
Can anyone give a good reference site/book for getting the most out of 
your postgres server.

All I can find are contradictory theories on how to work out your settings.
This is what I followed to setup our db server that serves our web 
applications.

http://www.phpbuilder.com/columns/smith20010821.php3?page=2
We have a Dell Poweredge with the following spec.
CPU: Intel(R) Xeon(TM) CPU 3.06GHz (512 KB Cache)
CPU: Intel(R) Xeon(TM) CPU 3.06GHz (512 KB Cache)
CPU: Intel(R) Xeon(TM) CPU 3.06GHz (512 KB Cache)
CPU: Intel(R) Xeon(TM) CPU 3.06GHz (512 KB Cache)
Physical Memory: 2077264 kB
Swap Memory: 2048244 kB
Apache on the Web server can take up to 300 connections, and PHP is using
pg_pconnect.

Postgres is set with the following.
max_connections = 300
shared_buffers = 38400
sort_mem = 12000
But Apache is still maxing out the non-superuser connection limit.
The machine is under no load and I would like to up the max_connections 
but I would like to know more about what you need to consider before 
doing so.
One more thing: in php.ini, set pgsql.max_persistent lower than 300:
; Maximum number of persistent links.  -1 means no limit.
pgsql.max_persistent = -1   <-- change this from the unlimited default
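A rough sanity check on the numbers above (a sketch; the figure of 2 reserved superuser connections is an assumed superuser_reserved_connections value, not something from the post): with pg_pconnect, each Apache child can hold one persistent link per unique connection string, so 300 children can pin 300 backends.

```python
# Rough connection budget for Apache + pg_pconnect (figures from the post above).
apache_max_clients = 300   # Apache connection limit
max_connections = 300      # PostgreSQL max_connections
superuser_reserved = 2     # assumed reserved superuser slots

# Worst case: one persistent link per Apache child (single connection string).
worst_case_links = apache_max_clients * 1
available = max_connections - superuser_reserved

print(worst_case_links, available)   # 300 298
print(worst_case_links > available)  # True -> non-superuser limit exhausted
```

So either cap pgsql.max_persistent below the available backend count or raise max_connections with headroom to spare.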
C.
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] Which plattform do you recommend I run PostgreSQL for best

2004-10-19 Thread CoL
hi,
[EMAIL PROTECTED] wrote:
Hello
I am doing a comparison between MySQL and PostgreSQL.
In the MySQL manual it says that MySQL performs best with Linux 2.4 with
ReiserFS on x86. Can anyone official, or in the know, give similar
information regarding PostgreSQL?
Also, any links to benchmarking tests available on the internet between
MySQL and PostgreSQL would be appreciated.
http://www.potentialtech.com/wmoran/postgresql.php
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
http://candle.pha.pa.us/main/writings/pgsql/hw_performance/
http://database.sarang.net/database/postgres/optimizing_postgresql.html
C.
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [PERFORM] Select max(foo) and select count(*) optimization

2004-01-08 Thread CoL
Hi,

Shridhar Daithankar wrote:

select relpages, reltuples from pg_class where relname='<tablename>';

Assuming the stats are recent enough, it would be much faster and accurate..
This needs an ANALYZE <tablename>; before selecting from pg_class, because
pg_class is only updated after an ANALYZE (or VACUUM).

C.

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] COUNT & Pagination

2004-01-16 Thread CoL
Hi,

David Shadovitz wrote, On 1/11/2004 7:10 PM:

I understand that COUNT queries are expensive.  So I'm looking for advice on 
displaying paginated query results.

I display my query results like this:

  Displaying 1 to 50 of 2905.
  1-50 | 51-100 | 101-150 | etc.
I do this by executing two queries.  One is of the form:

  SELECT <columns> FROM <table> WHERE <conditions> LIMIT m OFFSET n

The other is identical except that I replace the select list with COUNT(*).
Yes, you need two queries. Or get both from one:
select *, (select count(*) from table) as count from table...
PostgreSQL will optimize this and evaluate the uncorrelated count subquery
only once.
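Once the total is in hand, the pagination header and the page links are plain arithmetic; a minimal sketch (the page_links name and the 50-row page size are illustrative):

```python
def page_links(total: int, page_size: int = 50) -> list[str]:
    """Build the '1-50 | 51-100 | ...' page ranges for a result of `total` rows."""
    return [
        f"{start}-{min(start + page_size - 1, total)}"
        for start in range(1, total + 1, page_size)
    ]

total = 2905  # the COUNT(*) result
print(f"Displaying 1 to 50 of {total}.")
print(" | ".join(page_links(total)[:3]) + " | ...")
```

The last range is clamped to the total, so a 2905-row result ends with "2901-2905".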

And an unrelated question:
I'm running PG 7.2.2 and want to upgrade to 7.4.1.  I've never upgraded PG 
before and I'm nervous.  Can I simply run pg_dumpall, install 7.4.1, and then 
feed the dump into psql?  I'm planning to use pg_dumpall rather than pg_dump 
because I want to preserve the users I've defined.  My database is the only one 
on the system.
Yes: dump with the old version's pg_dumpall, install 7.4.1, run initdb, and
feed the dump into psql. But check the FAQ and the manual for a more detailed
explanation.

C.

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[PERFORM] subquery and table join, index not use for table

2004-01-16 Thread CoL
Hi, I have the following select:

set enable_seqscan = on;
set enable_indexscan =on;
select a.levelno,a.id from (select 1 as levelno,42 as id) a, menutable b 
where  b.site_id='21' and a.id=b.id;

menutable:
id bigint,
site_id bigint
Indexes: menutable_pkey primary key btree (site_id, id),

The explain analyze shows:

 QUERY PLAN

 Nested Loop  (cost=0.00..13.50 rows=1 width=34) (actual 
time=0.04..0.43 rows=1 loops=1)
   Join Filter: ("outer".id = "inner".id)
   ->  Subquery Scan a  (cost=0.00..0.01 rows=1 width=0) (actual 
time=0.01..0.01 rows=1 loops=1)
 ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual 
time=0.00..0.00 rows=1 loops=1)
   ->  Seq Scan on menutable b  (cost=0.00..13.01 rows=38 width=22) 
(actual time=0.02..0.38 rows=38 loops=1)
 Filter: (site_id = 21::bigint)
 Total runtime: 0.47 msec

setting set enable_seqscan = off;

QUERY PLAN
--
 Nested Loop  (cost=0.00..29.85 rows=1 width=34) (actual 
time=0.07..0.18 rows=1 loops=1)
   Join Filter: ("outer".id = "inner".id)
   ->  Subquery Scan a  (cost=0.00..0.01 rows=1 width=0) (actual 
time=0.01..0.01 rows=1 loops=1)
 ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual 
time=0.00..0.00 rows=1 loops=1)
   ->  Index Scan using menutable_pkey on menutable b 
(cost=0.00..29.36 rows=38 width=22) (actual time=0.02..0.12 rows=38 loops=1)
 Index Cond: (site_id = 21::bigint)
 Total runtime: 0.22 msec

I ran ANALYZE and VACUUM FULL ANALYZE on the table, but nothing changed. The
plan is the same with explicit JOIN syntax.

version: PostgreSQL 7.3.3  and PostgreSQL 7.3.4

Any idea?
thx
C.

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Feature request: smarter use of conditional indexes

2004-03-09 Thread CoL
hi,

John Siracusa wrote, On 3/3/2004 20:56:

Given an index like this:

CREATE UNIQUE INDEX i1 ON t1 (c1) WHERE c1 IS NOT NULL;

and a query like this:

SELECT * FROM t1 WHERE c1 = 123;

I'd like the planner to be smart enough to use an index scan using i1.  Yes,
I can change the query to this:
SELECT * FROM t1 WHERE c1 = 123 AND c1 IS NOT NULL;

In which case the index will be used, but I shouldn't have to.  More
practically, since a lot of my SQL is auto-generated, it's difficult to make
this query change just in the cases where I need it.  And I'm loath to
change every "column = value" pair in my auto-generated SQL into a double
pair of "(column = value AND column IS NOT NULL)".  It's redundant and looks
pretty silly, IMO.
How about just: CREATE UNIQUE INDEX i1 ON t1 (c1);
What is the point of the WHERE c1 IS NOT NULL condition in this case? NULLs
never violate a unique constraint, so you do not need it, and the plain unique
index can be used by your original query.
C.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster