[PERFORM] Unused table of view

2004-06-05 Thread Laurent Martelli
Hello,

I'm using postgresql 7.4.2, and I have this view:
slooze=# \d userpictures
           View "public.userpictures"
   Column    |           Type           | Modifiers
-------------+--------------------------+-----------
 pictureid   | integer                  |
 rollid      | character varying(64)    |
 frameid     | character varying(64)    |
 description | character varying(255)   |
 filename    | character varying(255)   |
 owner       | integer                  |
 entrydate   | timestamp with time zone |
 date        | timestamp with time zone |
 nbclick     | integer                  |
 nbrates     | integer                  |
 maxrate     | smallint                 |
 minrate     | smallint                 |
 averagerate | real                     |
 sumrates    | integer                  |
 userid      | integer                  |
View definition:
 SELECT DISTINCT ON (permissions.pictureid, userid) pictures.pictureid, rollid, 
frameid, description, filename, "owner", entrydate, date, nbclick, nbrates, maxrate, 
minrate, averagerate, sumrates, userid
   FROM permissions
   JOIN groupsdef USING (groupid)
   JOIN pictures USING (pictureid)
  WHERE groupsdef.groupid = permissions.groupid
  ORDER BY permissions.pictureid, userid;


Now consider this query:

SELECT count(*) FROM userpictures JOIN topicscontent using(PictureID) WHERE 
TopicID=137 and UserID=2;

The pictures table is scanned, but it's not needed (see the plan at the end).

I believe it's not needed because my tables are as follows:

CREATE TABLE pictures (
PictureID serial PRIMARY KEY,
RollID character varying(64) NOT NULL REFERENCES rolls,
FrameID character varying(64) NOT NULL,
Description character varying(255),
Filename character varying(255),
Owner integer NOT NULL REFERENCES users,
EntryDate timestamp with time zone DEFAULT now(),
Date timestamp with time zone,
NbClick integer DEFAULT 0,
NbRates integer DEFAULT 0,
MaxRate int2,
MinRate int2,
AverageRate float4 DEFAULT 5,
SumRates integer DEFAULT 0);

CREATE TABLE permissions (
GroupID integer NOT NULL REFERENCES groups ON DELETE cascade,
PictureID integer NOT NULL REFERENCES pictures ON DELETE cascade,
UNIQUE (GroupID, PictureID));

CREATE TABLE groupsdef (
UserID integer REFERENCES users,
GroupID integer REFERENCES groups,
PRIMARY KEY (UserID,GroupID));

CREATE TABLE topicscontent (
TopicID integer REFERENCES topics ON DELETE cascade,
PictureID integer REFERENCES pictures ON DELETE cascade,
Direct boolean NOT NULL,
PRIMARY KEY (TopicID,PictureID) );

So obviously, the join on pictures cannot change the number of rows:
permissions.PictureID references pictures.PictureID, which is the
primary key, so every permissions row matches exactly one pictures row.

I can work around this with a second view:

slooze=# \d userpictures2
   View "public.userpictures2"
  Column   |  Type   | Modifiers
-----------+---------+-----------
 pictureid | integer |
 userid    | integer |
View definition:
 SELECT DISTINCT pictureid, userid
   FROM permissions
   JOIN groupsdef USING (groupid)
  WHERE groupsdef.groupid = permissions.groupid
  ORDER BY pictureid, userid;

But it would be better if PostgreSQL could figure this out by itself. Is
there currently a way to avoid the second view?
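
For reference, one way to avoid defining a second view is to inline the
permission check as a subquery; a sketch that should be equivalent
(untested here):

SELECT count(*)
  FROM (SELECT DISTINCT pictureid
          FROM permissions
          JOIN groupsdef USING (groupid)
         WHERE userid = 2) AS up
  JOIN topicscontent USING (pictureid)
 WHERE topicid = 137;
-- the pictures table never appears, so it cannot be scanned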

QUERY PLAN for SELECT count(*) FROM userpictures JOIN topicscontent using(PictureID) 
WHERE TopicID=137 and UserID=2;

 Aggregate  (cost=1195.15..1195.15 rows=1 width=0) (actual time=89.252..89.253 rows=1 loops=1)
   ->  Merge Join  (cost=1096.05..1194.98 rows=66 width=0) (actual time=84.574..89.202 rows=8 loops=1)
         Merge Cond: ("outer".pictureid = "inner".pictureid)
         ->  Subquery Scan userpictures  (cost=995.78..1081.47 rows=4897 width=4) (actual time=84.386..88.530 rows=841 loops=1)
               ->  Unique  (cost=995.78..1032.50 rows=4897 width=105) (actual time=84.377..87.803 rows=841 loops=1)
                     ->  Sort  (cost=995.78..1008.02 rows=4897 width=105) (actual time=84.369..84.786 rows=1433 loops=1)
                           Sort Key: permissions.pictureid, groupsdef.userid
                           ->  Hash Join  (cost=371.82..695.65 rows=4897 width=105) (actual time=23.328..56.498 rows=5076 loops=1)
                                 Hash Cond: ("outer".pictureid = "inner".pictureid)
                                 ->  Index Scan using pictures_pkey on pictures  (cost=0.00..164.87 rows=2933 width=97) (actual time=0.015..4.591 rows=2933 loops=1)
                                 ->  Hash  (cost=359.58..359.58 rows=4897 width=8) (actual time=23.191..23.191 rows=0 loops=1)
                                       ->  Merge Join  (cost=10.16..359.58 rows=4897 width=8) (actual time=0.110..19.365 rows=5076 loops=1)

Re: [PERFORM] Unused table of view

2004-06-05 Thread Tom Lane
Laurent Martelli <[EMAIL PROTECTED]> writes:
> The pictures table is scanned, but it's not needed.

Yes it is.  For example, if pictures is empty then the view yields
zero rows.  Omitting the join to pictures could give a different result.
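
To see why the planner must assume the general case: in 7.4 it does not
use foreign keys to prove a join redundant.  A contrived sketch with the
foreign key omitted, where dropping the join changes the answer:

CREATE TABLE p (id integer PRIMARY KEY);
CREATE TABLE perm (pid integer);          -- note: no REFERENCES p
INSERT INTO perm VALUES (1);

SELECT count(*) FROM perm JOIN p ON p.id = perm.pid;  -- returns 0
SELECT count(*) FROM perm;                            -- returns 1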

regards, tom lane



Re: [PERFORM] Unused table of view

2004-06-05 Thread Laurent Martelli
> "Tom" == Tom Lane <[EMAIL PROTECTED]> writes:

  Tom> Laurent Martelli <[EMAIL PROTECTED]> writes:
  >> The pictures table is scanned, but it's not needed.

  Tom> Yes it is.  For example, if pictures is empty then the view
  Tom> yields zero rows.  Omitting the join to pictures could give a
  Tom> different result.

Since permissions is defined like this:

CREATE TABLE permissions (
GroupID integer NOT NULL REFERENCES groups ON DELETE cascade,
PictureID integer NOT NULL REFERENCES pictures ON DELETE cascade,
UNIQUE (GroupID, PictureID));

if the pictures table is empty, then so is permissions, because
permissions.PictureID references pictures.  So the empty-table case
cannot produce a different result.
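
Concretely, under the foreign key plus the primary key, every
permissions row matches exactly one pictures row, so the two counts
below must always be equal (an illustrative check):

-- FK guarantees a match exists; PK guarantees it is unique
SELECT count(*) FROM permissions;
SELECT count(*) FROM permissions JOIN pictures USING (pictureid);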

-- 
Laurent Martelli
[EMAIL PROTECTED]       Java Aspect Components
http://www.aopsys.com/  http://jac.objectweb.org




[PERFORM] Using a COPY...FROM through JDBC?

2004-06-05 Thread Steve Wampler

I've got a simple database (no indices, 6 columns) that I need to
write data into quickly, through multiple simultaneous JDBC
connections in a distributed environment.  (This is going to be a
message-logging service for software-generated messages.)

Using a PreparedStatement, I can get about 400 inserts/s.  If I (on
the Java side) buffer up the entries and dump them in large
transaction blocks, I can push this up to about 1200/s.  I'd like to
go faster.  One approach that I think might be promising would be to
try using a COPY command instead of an INSERT, but I don't have a
file for input, I have a Java collection, so COPY isn't quite right.
Is there any way to use COPY efficiently without having to create a
file (remember that the Java apps are distributed on a LAN and aren't
running on the DB server)?  Is this a dead end because of the way
COPY is implemented to only use a file?
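
For reference, COPY is not limited to server-side files: the protocol
also supports COPY ... FROM STDIN, where the client streams the rows
itself.  A sketch of what such a stream looks like in tab-separated
text format (the table and column names here are made up):

COPY log_messages (logged_at, host, facility, level, source, body) FROM STDIN;
2004-06-05 12:00:01	node17	app	4	scheduler	job started
2004-06-05 12:00:02	node17	app	3	scheduler	queue depth high
\.

Whether this is reachable from Java depends on the JDBC driver
exposing the COPY part of the protocol; if it is not, batching many
INSERTs per transaction, as you are already doing, is the usual
fallback.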

Is there something else I can do?  Ultimately, this will end
up on a machine running 1+0 RAID, so I expect that will give
me some performance boost as well, but I'd like to push it
up as best I can with my current hardware setup.

Thanks for any advice!
-Steve
-- 
Steve Wampler -- [EMAIL PROTECTED]
The gods that smiled on your birth are now laughing out loud.



Re: [PERFORM] Slow in morning hours

2004-06-05 Thread ohp
Have you tried VACUUM ANALYZE at least once a day?
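
For example, a nightly job run from cron after office hours (a sketch;
the schedule and database are yours to fill in):

-- issued via psql, or via the vacuumdb --analyze wrapper
VACUUM ANALYZE;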

Regards
On Fri, 20 Feb 2004 [EMAIL PROTECTED] wrote:

> Date: Fri, 20 Feb 2004 14:46:15 +0530
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: [PERFORM] Slow in morning hours
>
> Hi All,
>
> I am using Linux 7.2 and PostgreSQL 7.2.
>
> Our office hours end at 6pm, but we keep our server running 24 hours
> a day.  The next morning, our PostgreSQL server is very slow.
>
> After about an hour of continuous use, it gradually starts responding
> faster!  This has become an everyday routine!
>
> Do you have any idea what could be causing this?  Is there anything
> else I need to check?
>
> Any ideas for relieving this daily morning problem would be
> appreciated!
>
> Thanks,
> Vishal
>

-- 
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou   +33-5-61-50-97-01 (Fax)
31190 AUTERIVE   +33-6-07-63-80-64 (GSM)
FRANCE  Email: [EMAIL PROTECTED]
--
Make your life a dream, make your dream a reality. (St Exupery)



Re: [PERFORM] General performance questions about postgres on Apple

2004-06-05 Thread scott.marlowe
On Fri, 20 Feb 2004, Sean Shanny wrote:

> max_connections = 100
> 
> # - Memory -
>  
> shared_buffers = 16000  # min 16, at least max_connections*2, 8KB each
> sort_mem = 256000   # min 64, size in KB

You might wanna drop sort_mem somewhat and just set it during your imports 
to something big like 512000 or larger.  That way with 100 users during 
the day you won't have to worry about swap storms, and when you run your 
updates, you get all that sort_mem.
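
Since sort_mem can be changed per backend, a sketch of that approach
(the 512000 figure is just the example above):

-- postgresql.conf keeps a modest daytime default; then,
-- in the import session only:
SET sort_mem = 512000;   -- size in KB, as above
-- ... run the bulk imports/updates ...
RESET sort_mem;          -- back to the configured default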

> Actual row count in the temp table:
> 
> select count(*) from referral_temp ;
>   502347
> 
> Actual row count in d_referral table:
> 
> select count(*) from d_referral ;
>   27908024
>  
> 
> Note that an ANALYZE had not been performed on the referral_temp
> table prior to the EXPLAIN ANALYZE run.
> 
> explain analyze  SELECT t1.id, t2.md5, t2.url from referral_temp t2 LEFT OUTER JOIN d_referral t1 ON t2.md5 = t1.referral_md5;
>
> Nested Loop Left Join  (cost=0.00..3046.00 rows=1001 width=68) (actual time=136.513..6440616.541 rows=502347 loops=1)
>   ->  Seq Scan on referral_temp t2  (cost=0.00..20.00 rows=1000 width=64) (actual time=21.730..10552.421 rows=502347 loops=1)
>   ->  Index Scan using d_referral_referral_md5_key on d_referral t1  (cost=0.00..3.01 rows=1 width=40) (actual time=12.768..14.022 rows=1 loops=502347)
>         Index Cond: ("outer".md5 = t1.referral_md5)
>  Total runtime: 6441969.698 ms
> (5 rows)
>
> Thanks.
>
> --sean
> 
> 
> Here is an explain analyze after the analyze was done.  Unfortunately I 
> think a lot of the data was still in cache when I did this again :-(
> 
> explain analyze  SELECT t1.id, t2.md5, t2.url from referral_temp t2 LEFT OUTER JOIN d_referral t1 ON t2.md5 = t1.referral_md5;
>
> Nested Loop Left Join  (cost=0.00..1468759.69 rows=480082 width=149) (actual time=69.576..3226854.850 rows=502347 loops=1)
>   ->  Seq Scan on referral_temp t2  (cost=0.00..16034.81 rows=480081 width=145) (actual time=11.206..4003.521 rows=502347 loops=1)
>   ->  Index Scan using d_referral_referral_md5_key on d_referral t1  (cost=0.00..3.01 rows=1 width=40) (actual time=6.396..6.402 rows=1 loops=502347)
>         Index Cond: ("outer".md5 = t1.referral_md5)
>  Total runtime: 3227830.752 ms

Hmmm.  It looks like postgresql is still picking a nested loop when it
should be choosing something faster.  Try doing a "set enable_nestloop =
off" and see what you get.

If that makes it faster, you may want to adjust the costs of the cpu_* 
stuff higher to see if that can force it to do the right thing.
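
A sketch of that experiment (session-local settings only; the cost
value below is illustrative, not a recommendation):

SET enable_nestloop = off;
EXPLAIN ANALYZE SELECT t1.id, t2.md5, t2.url
  FROM referral_temp t2
  LEFT OUTER JOIN d_referral t1 ON t2.md5 = t1.referral_md5;
RESET enable_nestloop;

-- if the hash or merge plan wins, try nudging the cpu_* costs
-- higher, as suggested above (the 7.4 default cpu_tuple_cost is 0.01):
SET cpu_tuple_cost = 0.05;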

Looking at the amount of time taken by the nested loop, it looks like the 
problem to me.

And why are you doing a left join of ONE row from one table against the
whole temp table?  Do you really need to do that?  Since there's only one
row in the source table, and I'd guess it only matches one or a few rows
from the temp table, you're gonna have that one row and a bunch of
null-filled rows to go with it.

