FW: RE: [PERFORM] pg_dump slow - Solution

2006-01-23 Thread Franklin Haut


Hi,


Finally I found the solution to the slow backup/restore: all I had to do was
install Windows 2000 Service Pack 4... :)

Thanks to all


Franklin
 

-Original Message-
From: Richard Huxton [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 30, 2005 14:28
To: Franklin Haut
Cc: 'Ron'; pgsql-performance@postgresql.org
Subject: Re: RES: [PERFORM] pg_dump slow

Franklin Haut wrote:
> Hi,
> 
> Yes, my problem is that pg_dump takes 40 secs to complete under 
> WinXP and 50 minutes under W2K! The same database, the same hardware, 
> only different operating systems.
> 
> The hardware is: 
>Pentium 4 HT 3.2 GHz
>1024 MB memory
>120 GB SATA HD

There have been reports of very slow network performance on Win2k systems
with the default configuration. You'll have to check the archives for
details, I'm afraid. This might apply to you.

If you're happy that doesn't affect you then I'd look at the disk system
- perhaps XP has newer drivers than Win2k.

What do the MS performance charts show happening? Specifically, CPU and
disk I/O.

--
   Richard Huxton
   Archonet Ltd




Re: [PERFORM] [PERFORMANCE] Stored Procedures

2006-01-23 Thread Marcos
On Fri, 2006-01-20 at 15:34 -0600, Jim C. Nasby wrote:
> On Fri, Jan 20, 2006 at 08:38:23PM +0100, Rikard Pavelic wrote:
> > This would solve problems with prepare, which is per-session, so for a
> > prepared function to be optimal one must use the same connection.
> 
> If you're dealing with something that's performance critical you're not
> going to be constantly re-connecting anyway, so I don't see what the
> issue is.

That was my doubt: perhaps this is true for desktop applications, but not
for web applications, where re-connecting is constant :(.

So a prepared statement offers little advantage there, because its lifetime
is only the session.
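
For illustration, a minimal sketch of the session-scoped behaviour in
question (the table, statement, and parameter names here are hypothetical):

PREPARE get_user (integer) AS
  SELECT * FROM users WHERE user_id = $1;
EXECUTE get_user(42);   -- reuses the prepared plan within this session

-- The next HTTP request opens a fresh connection:
EXECUTE get_user(42);   -- ERROR: prepared statement "get_user" does not exist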

Marcos.




Re: [PERFORM] Suspending SELECTs

2006-01-23 Thread Alessandro Baretta

August Zajonc wrote:

Alessandro Baretta wrote:

Alessandro,

I've very much enjoyed reading your thoughts and the problem you're facing
and everyone's responses.


Thank you for your interest, August.


Since you control the middle layer, could you not use a cookie to keep a
cursor open on the middle layer and tie into it on subsequent queries?


I do. The AS/Xcaml calls it a "session key". It is usually passed in a cookie
for websites, and elsewhere--query string or URL--for Intranet/Extranet web
applications. The session information kept by the AS/Xcaml virtual machine
includes cached query results and state information for servlets. I
could--although not terribly easily--also use the Xcaml session manager to
handle allocation of DB connections from the pool, thus allocating one
connection per active session. The session garbage collector would then also
have to manage the recycling of stale DB connections.
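
As a SQL-level sketch of what such a per-session cursor looks like (names are
hypothetical; note that a holdable cursor survives COMMIT but still dies with
its connection, which is why stale connections must be recycled):

BEGIN;
DECLARE session_cur SCROLL CURSOR WITH HOLD FOR
  SELECT doc_id, title FROM documents ORDER BY title;
COMMIT;                     -- WITH HOLD keeps the cursor alive past the commit

FETCH 20 FROM session_cur;  -- first page, on the first request
-- a later request served by the same connection:
FETCH 20 FROM session_cur;  -- next page
CLOSE session_cur;          -- when the session is garbage collected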


If you are concerned with too many connections open, you could timeout
the sessions quickly and recreate the cursor if someone came back. If
they waited 5 minutes to make the next query, certainly they could wait
a few extra seconds to offset and reacquire a cursor?


Yes, I could. Actually, there are quite a few means of handling this issue. The
possible strategies for garbage collecting resources allocated to a remote peer
are collectively called "failure-detection algorithms" in the theory of
distributed computing. In most cases an "eventually weak failure detector" is
necessary and sufficient to guarantee a number of desirable properties in
asynchronous systems: termination of execution, bounded open connections, and
others.

Yet, it is a theorem that no asynchronous algorithm can be devised to implement
an eventually weak failure detector. This, in turn, implies that no distributed
asynchronous system--i.e. a web application--possesses the above-mentioned
desirable properties. Hence, from a theoretical standpoint, we can either
relax the axioms of the system to allow synchronicity--a failure detector
based on a timeout explicitly requires the notion of time--or, as I would
prefer, eliminate the need for termination of execution--i.e. explicit
client logout--and bounded open connections, by delegating to the client the
responsibility of maintaining all relevant state information. Under these
assumptions we can live happily in a perfectly asynchronous stateless world like
that of HTTP.

Now, neither of the two solutions is perfect. In an Intranet/Extranet context, I
want to store a significant amount of state information server-side, including
cached query results, thus entailing the need for a synchronous
failure detector--let's call it an "implicit logout detector"--to garbage collect
the stale session files generated by the application. In an open website--no
login--I do not usually want to use sessions, so I prefer to implement the
application so that all relevant state information is maintained by the client.
This model is perfect until we reach the issue of "suspending SELECTs", that is,
limiting the cardinality of the record set to a configurable "page size" and
allowing the user to page through a vast query result.
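
One stateless way to obtain that paging effect, with the client carrying the
position itself, is a keyset-style query (hypothetical names; a sketch, not
necessarily how the AS/Xcaml does it):

-- First page:
SELECT item_id, title FROM items ORDER BY title, item_id LIMIT 50;

-- Next page: the client sends back the last (title, item_id) it received:
SELECT item_id, title FROM items
WHERE title > $1 OR (title = $1 AND item_id > $2)
ORDER BY title, item_id
LIMIT 50;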



The hitlist idea was also a nice one if the size of the data returned is
not overwhelming and does not need to track the underlying db at all
(ie, no refresh).


In an open website, immediate refresh is not critical, so long as I can
guarantee some decent property of data consistency. Full consistency cannot be
achieved, as Tom pointed out. I cordially disagree with Tom on the commandment
that "Thou shalt have no property of consistency other than me". Just as we
have two different transaction isolation levels, guaranteeing different degrees
of ACIDity, we could conceivably wish to formalize a weaker notion of
consistency and implement functionality to match it, which would not be
possible under the stronger definition.


Mark had a number of good general suggestions though, and I'd like to
echo the materialized views as an option that I could see a lot of uses
for (and have worked around in the past with SELECT INTO's and like).


I already use materialized views. The database design layer of the AS/Xcaml
allows the definition of fragmented materialized views: the view is split into
fragments, that is, equivalence classes of the record set with respect to the
operation of projecting the view signature onto a (small) subset of its
columns. Yet, this actually makes the original problem worse, for materialized
view fragments must be garbage collected at some point, thus presenting much
the same conceptual difficulties as the cursor-pooling strategy.
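
As a plain-SQL sketch of what one such fragment amounts to (hypothetical
names, using the SELECT INTO technique mentioned above):

-- Materialize the fragment of the view keyed by one projection value:
SELECT item_id, sum(amount) AS total_amount
INTO mv_totals_2005
FROM movements
WHERE year = 2005
GROUP BY item_id;

-- A stale fragment must eventually be garbage collected:
DROP TABLE mv_totals_2005;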

Alex


--
http://www.barettadeit.com/
Baretta DE&IT
A division of Baretta SRL

tel. +39 02 370 111 55
fax. +39 02 370 111 54

Our technology:

The Application System/Xcaml (AS/Xcaml)


The FreerP Project


Re: [PERFORM] [PERFORMANCE] Stored Procedures

2006-01-23 Thread Markus Schaber
Hi, Marcos,

Marcos wrote:

> That was my doubt: perhaps this is true for desktop applications, but
> not for web applications, where re-connecting is constant :(.

If this is true, then you have a much bigger performance problem than
query plan preparation.

You really should consider using a connection pool (most web application
servers provide pooling facilities) or some other means to keep the
connection open across several HTTP requests.

Worried,
Markus

-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org



Re: [PERFORM] [PERFORMANCE] Stored Procedures

2006-01-23 Thread Marcos
Hi Markus

> You really should consider using a connection pool (most web application
> servers provide pooling facilities) or some other means to keep the
> connection open across several HTTP requests.

Yes, I'm looking into connection pooling. I found pgpool, but I don't yet
understand how it works; I'm going to read more about it.

Thanks

Marcos




Re: [PERFORM] [PERFORMANCE] Stored Procedures

2006-01-23 Thread Dave Dutcher
I don't think pgpool is what you need.  If I understand pgpool
correctly, pgpool lets you pool multiple postgres servers together.  You
are just looking for database connection pooling.  

A simple connection pool is basically just an application-wide list of
connections.  When a client needs a connection, you just request a
connection from the pool.  If there is an unused connection in the pool,
it is given to the client and removed from the unused pool.  If there is
no unused connection in the pool, then a new connection is opened.  When
the client is done with it, the client releases it back into the pool.

You can google for 'database connection pool' and you should find a
bunch of stuff.  It's probably a good idea to find one already written.
If you write your own you have to make sure it can deal with things like
dead connections, synchronization, and maximum numbers of open
connections.

Dave




Re: [PERFORM] [PERFORMANCE] Stored Procedures

2006-01-23 Thread Frank Wiles
On Mon, 23 Jan 2006 10:23:17 -0600
"Dave Dutcher" <[EMAIL PROTECTED]> wrote:

> I don't think pgpool is what you need.  If I understand pgpool
> correctly, pgpool lets you pool multiple postgres servers together.
> You are just looking for database connection pooling.  

  While pgpool can let you pool together multiple backend servers,
  it also functions well as just a connection pooling device with
  only one backend. 

 -
   Frank Wiles <[EMAIL PROTECTED]>
   http://www.wiles.org
 -




Re: [PERFORM] Slow queries consisting inner selects and order bys & hack to speed up

2006-01-23 Thread Tom Lane
Ümit Öztosun <[EMAIL PROTECTED]> writes:
> Our application uses typical queries similar to following (very simplified):

> SELECT
>  part_id,
>  part_name,
>  (SELECT
>   SUM(amount) FROM part_movements M
>   WHERE P.part_id = M.part_id
>  ) as part_amount
> FROM parts P
> ORDER BY part_name
> LIMIT 50

> Postgres seems to compute all possible rows and then sort the
> results, which nearly renders the paging meaningless.

Yeah.  The general rule is that sorting happens after computing the
SELECT values --- this is more or less required for cases where the
ORDER BY refers to a SELECT-list item.  You'd probably have better
results by writing a sub-select:

SELECT
 part_id,
 part_name,
 (SELECT
  SUM(amount) FROM part_movements M
  WHERE P.part_id = M.part_id
 ) as part_amount
FROM
 (SELECT part_id, part_name FROM parts P
  WHERE whatever ORDER BY whatever LIMIT n) as P;

This will do the part_movements stuff only for rows that make it out of
the sub-select.

Another approach is to make sure the ORDER BY is always on an indexed
column; in cases where the ORDER BY is done by an indexscan instead
of a sort, calculation of the unwanted SELECT-list items does not
happen.  However, this only works as long as LIMIT+OFFSET is fairly
small.
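
For the example above, that could look like the following (assuming no
suitable index exists yet):

CREATE INDEX parts_part_name_idx ON parts (part_name);

-- An index scan on part_name can now satisfy ORDER BY part_name LIMIT 50
-- directly, so part_amount is computed only for the 50 rows returned.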

Lastly, are you on a reasonably current Postgres version (performance
complaints about anything older than 8.0 will no longer be accepted
with much grace), and are your statistics up to date?  The EXPLAIN ANALYZE
output shows rowcount estimates that seem suspiciously far off:

->  Seq Scan on scf_stokkart stok  (cost=0.00..142622.54 rows=25 width=503) (actual time=4.726..19324.633 rows=4947 loops=1)
      Filter: (upper((durum)::text) = 'A'::text)

This is important because, if the planner realized that the SELECT-list
items were going to be evaluated 5000 times not 25, it might well choose
a different plan.
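
If the statistics are stale, refreshing them for the table in the plan above
is a one-liner:

ANALYZE scf_stokkart;  -- recompute planner row-count statistics for this table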

regards, tom lane



[PERFORM] unsubscribe

2006-01-23 Thread Adnan HOTMAIL

unsubscribe