Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Matt Clark
> On Sat, Oct 04, 2003 at 12:29:55AM +0100, Matt Clark wrote:
> > My real world experience on a *very* heavily updated OLTP type
> DB, following
> > advice from this list (thanks guys!), is that there is
> essentially zero cost
> > to going ahead and vacuuming as often as you feel like it.  Go
> crazy, and
> > speed up your DB!
>
> That's not quite true.  If vacuums start running into each other, you
> can very easily start eating up all your I/O bandwidth.  Even if you
> gots lots of it.

Very true, which is why all my scripts write a lockfile and delete it when
they're finished, to prevent that happening.  I should have mentioned that.

> Also, a vacuum pretty much destroys your shared buffers, so you have
> to be aware of that trade-off too.  Vacuum is not free.  It's _way_
> cheaper than it used to be, though.

That's _very_ interesting.  I've never been quite clear what's in shared
buffers apart from scratch space for currently running transactions.  Also
the docs imply that vacuum uses it's own space for working in.  Do you have
more info on how it clobbers shared_buffers?

M


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Andrew Sullivan
On Sun, Oct 05, 2003 at 12:14:24PM +0100, Matt Clark wrote:
> more info on how it clobbers shared_buffers?

Vacuum is like a seqscan.  It touches everything on a table.  So it
doesn't clobber them, but that's the latest data.  It's unlikely your
buffers are big enough to hold your database, unless your database is
small.  So you'll end up expiring potentially useful data in the
buffer.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias CanadaToronto, Ontario Canada
<[EMAIL PROTECTED]>  M2P 2A8
 +1 416 646 3304 x110


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Matt Clark
> On Sun, Oct 05, 2003 at 12:14:24PM +0100, Matt Clark wrote:
> > more info on how it clobbers shared_buffers?
>
> Vacuum is like a seqscan.  It touches everything on a table.  So it
> doesn't clobber them, but that's the latest data.  It's unlikely your
> buffers are big enough to hold your database, unless your database is
> small.  So you'll end up expiring potentially useful data in the
> buffer.

OK I'm definitely missing something here.  I thought that the FSM was there
to keep track of potentially free pages, and that all VACUUM did was double
check and then write that info out for all to see?  The promise being that a
VACUUM FULL will walk all pages on disk and do a soft-shoe-shuffle to
aggresively recover space, but a simple VACUUM won't (merely confirming
pages as available for reuse).

As for buffers, my understanding is that they are *not* meant to be big
enough to hold the DB, as PG explicitly leaves caching up to the underlying
OS.  'buffers' here meaning shared memory between PG processes, and 'cache'
meaning OS cache.  'buffers' only need to be big enough to hold the
intermediate calcs and the results for any current transactions?

M


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Tom Lane
"Matt Clark" <[EMAIL PROTECTED]> writes:
> OK I'm definitely missing something here.

The point is that a big seqscan (either VACUUM or a plain table scan)
hits a lot of pages, and thereby tends to fill your cache with pages
that aren't actually likely to get hit again soon, perhaps pushing out
pages that will be needed again soon.  This happens at both the
shared-buffer and kernel-disk-cache levels of caching.

It would be good to find some way to prevent big seqscans from
populating cache, but I don't know of any portable way to tell the OS
that we don't want it to cache a page we are reading.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Matt Clark
> The point is that a big seqscan (either VACUUM or a plain table scan)
> hits a lot of pages, and thereby tends to fill your cache with pages
> that aren't actually likely to get hit again soon, perhaps pushing out
> pages that will be needed again soon.  This happens at both the
> shared-buffer and kernel-disk-cache levels of caching.

OK, I had thought (wrongly it seems, as usual, but this is how we learn!)
that a plain VACUUM did not incur a read of all pages.  I still don't
understand *why* it does, but I'll take your word for it.

Clearly if it distorts the 'normal' balance of pages in any caches, PG's or
the OS's, that's a _bad thing_.  I am currently in the nice position of
having a DB that (just about) fits in RAM, so I pretty much don't care about
read performance, but I will have to soon as it grows beyond 3GB :-(  These
conversations are invaluable in planning for that dread time...

> It would be good to find some way to prevent big seqscans from
> populating cache, but I don't know of any portable way to tell the OS
> that we don't want it to cache a page we are reading.

Quite.  The only natural way would be to read those pages through some
special device, but then you might as well do raw disk access from the
get-go.  Portability vs. Performance, the age old quandary.  FWIW I and many
others stand back in pure amazement at the sheer _quality_ of PostgreSQL.


Rgds,

Matt



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Tom Lane
"Matt Clark" <[EMAIL PROTECTED]> writes:
> OK, I had thought (wrongly it seems, as usual, but this is how we learn!)
> that a plain VACUUM did not incur a read of all pages.  I still don't
> understand *why* it does, but I'll take your word for it.

Mainly 'cause it doesn't know where the dead tuples are till it's
looked.  Also, VACUUM is the data collector for the free space map,
and so it is also charged with finding out how much free space exists
on every page.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Matt Clark
> Mainly 'cause it doesn't know where the dead tuples are till it's
> looked.

At this point I feel very stupid...

> Also, VACUUM is the data collector for the free space map,
> and so it is also charged with finding out how much free space exists
> on every page.

Ah, now I just feel enlightened!  That makes perfect sense.  I think I had
been conflating free pages with free space, without understanding what the
difference was.  Of course I still don't really understand, but at least I
now _know_ I don't.

Many thanks

Matt


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] count(*) slow on large tables

2003-10-05 Thread Josh Berkus
Bruce,

> OK, I beefed up the TODO:
>
>   * Use a fixed row count and a +/- count with MVCC visibility rules
> to allow fast COUNT(*) queries with no WHERE clause(?)
>
> I can always give the details if someone asks.  It doesn't seem complex
> enough for a separate TODO.detail item.

Hmmm ... this doesn't seem effort-worthy to me.   How often does anyone do 
COUNT with no where clause, except GUIs that give you a record count?  (of 
course, as always, if someone wants to code it, feel free ...)

And for those GUIs, wouldn't it be 97% as good to run an ANALYZE and give the 
approximate record counts for large tables?

As for counts with a WHERE clause, this is obviously up to the user.  Joe 
Conway and I tested using a C trigger to track some COUNT ... GROUP BY values 
for large tables based on additive numbers.   It worked fairly well for 
accuracy, but the performance penalty on data writes was significant ... 8% 
to 25% penalty for UPDATES, depending on the frequency and batch size (> 
frequency > batch size -->  > penalty)

It's possible that this could be improved through some mechanism more tightly 
integrated with the source code.   However,the coding effort would be 
significant ( 12-20 hours ) and it's possible that there would be no 
improvement, which is why we didn't do it.

We also discussed an asynchronous aggregates collector that would work 
something like the statistics collector, and keep pre-programmmed aggregate 
data, updating during "low-activity" periods.  This would significantly 
reduce the performance penalty, but at the cost of accuracy ... that is, a 
1%-5% variance on high-activity tables would be unavoidable, and all cached 
aggregates would have to be recalculated on database restart, significantly 
slowing down startup.   Again, we felt that the effort-result payoff was not 
worthwhile.

Overall, I think the stuff we already have planned ... the hash aggregates in 
7.4 and Tom's suggestion of adding an indexable flag to pg_aggs ... are far 
more likely to yeild useful fruit than any caching plan.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] count(*) slow on large tables

2003-10-05 Thread Rod Taylor
> And for those GUIs, wouldn't it be 97% as good to run an ANALYZE and give the 
> approximate record counts for large tables?

Interfaces which run a COUNT(*) like that are broken by design. They
fail to consider the table may really be a view which of course could
not be cached with results like that and may take days to load a full
result set (we had some pretty large views in an old billing system).


signature.asc
Description: This is a digitally signed message part


Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Christopher Browne
After a long battle with technology,[EMAIL PROTECTED] ("Matt Clark"), an earthling, 
wrote:
>> The point is that a big seqscan (either VACUUM or a plain table scan)
>> hits a lot of pages, and thereby tends to fill your cache with pages
>> that aren't actually likely to get hit again soon, perhaps pushing out
>> pages that will be needed again soon.  This happens at both the
>> shared-buffer and kernel-disk-cache levels of caching.
>
> OK, I had thought (wrongly it seems, as usual, but this is how we learn!)
> that a plain VACUUM did not incur a read of all pages.  I still don't
> understand *why* it does, but I'll take your word for it.

How does it know what to do on any given page if it does not read it
in?  It has to evaluate whether tuples can be thrown away or not, and
that requires looking at the tuples.  It may only be looking at a
small portion of the page, but that still requires reading each page.

No free lunch, unfortunately...
-- 
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','cbbrowne.com').
http://www3.sympatico.ca/cbbrowne/sgml.html
"End users  are just test loads  for verifying that  the system works,
kind of like resistors in an electrical circuit."
-- Kaz Kylheku in c.o.l.d.s

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Neil Conway
On Sat, 2003-10-04 at 11:22, Andrew Sullivan wrote:
> Also, a vacuum pretty much destroys your shared buffers, so you have
> to be aware of that trade-off too.

True, although there is no reason that this necessary needs to be the
case (at least, as far as the PostgreSQL shared buffer goes). As has
been pointed out numerous times on -hackers and in the literature, using
LRU for a DBMS shared buffer cache is far from optimal, and better
algorithms have been proposed (e.g. LRU-K, ARC). We could even have the
VACUUM command inform the bufmgr that the pages it is in the process of
reading in are part of a seqscan, and so are unlikely to be needed in
the immediate future.

-Neil



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Tom Lane
Neil Conway <[EMAIL PROTECTED]> writes:
> ... We could even have the
> VACUUM command inform the bufmgr that the pages it is in the process of
> reading in are part of a seqscan, and so are unlikely to be needed in
> the immediate future.

This would be relatively easy to fix as far as our own buffering is
concerned, but the thing that's needed to make it really useful is
to prevent caching of seqscan-read pages in the kernel disk buffers.
I don't know any portable way to do that :-(

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Neil Conway
On Sun, 2003-10-05 at 19:43, Tom Lane wrote:
> This would be relatively easy to fix as far as our own buffering is
> concerned, but the thing that's needed to make it really useful is
> to prevent caching of seqscan-read pages in the kernel disk buffers.

True.

> I don't know any portable way to do that :-(

For the non-portable way of doing this, are you referring to O_DIRECT?

Even if it isn't available everywhere, it might be worth considering
this at least for the platforms on which it is supported.

-Neil



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Andrew Sullivan
On Sun, Oct 05, 2003 at 07:32:47PM -0400, Neil Conway wrote:

> been pointed out numerous times on -hackers and in the literature, using
> LRU for a DBMS shared buffer cache is far from optimal, and better
> algorithms have been proposed (e.g. LRU-K, ARC). We could even have the
> VACUUM command inform the bufmgr that the pages it is in the process of
> reading in are part of a seqscan, and so are unlikely to be needed in
> the immediate future.

Hey, when that happens, you'll find me first in line to praise the
implementor; but until then, it's important that people not get the
idea that vacuum is free.

It is _way_ imporved, and on moderately loaded boxes, it'salmost
unnoticable.  But under heavy load, you need to be _real_ careful
about calling vacuum.  I think one of the biggest needs in the AVD is
some sort of intelligence about current load on the postmaster, but I
haven't the foggiest idea how to give it such intelligence.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias CanadaToronto, Ontario Canada
<[EMAIL PROTECTED]>  M2P 2A8
 +1 416 646 3304 x110


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] count(*) slow on large tables

2003-10-05 Thread Shridhar Daithankar
Bruce Momjian wrote:
OK, I beefed up the TODO:

* Use a fixed row count and a +/- count with MVCC visibility rules
  to allow fast COUNT(*) queries with no WHERE clause(?)
I can always give the details if someone asks.  It doesn't seem complex
enough for a separate TODO.detail item.
May I propose alternate approach for this optimisation?

- Postgresql allows to maintain user defined variables in shared memory.
- These variables obey transactions but do not get written to disk at all.
- There should be a facility to detect whether such a variable is initialized or 
not.

How it will help? This is in addition to trigger proposal that came up earlier. 
With  triggers it's not possible to make values visible across backends unless 
trigger updates a table, which eventually leads to vacuum/dead tuples problem.

1. User creates a trigger to check updates/inserts for certain conditions.
2. It updates the count as and when required.
3. If the trigger detects the count is not initialized, it would issue the same 
query first time. There is no avoiding this issue.

Besides providing facility of resident variables could be used imaginatively as 
well.

Does this make sense? IMO this is more generalised approach over all.

Just a thought.

 Shridhar



---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] Postgres low end processing.

2003-10-05 Thread Shridhar Daithankar
Stef wrote:

On Fri, 03 Oct 2003 12:32:00 -0400
Tom Lane <[EMAIL PROTECTED]> wrote:
=> What exactly is failing?  And what's the platform, anyway?

Nothing is really failing atm, except the funds for better 
hardware. JBOSS and some other servers need to be 
run on these machines, along with linux, which will be 
a minimal RH >= 7.2 with kernel 2.4.21
(Any better suggestions here?)

In this case, whatever is the least amount of memory
postgres can run on, is what is needed. So this is still
a kind of feasibility study. Of course, it will still be thoroughly
tested, if it turns out to be possible. (Which I know it is, but not how)
If you mean to say that postgresql should use just 8 MB of RAM rather than 
running it on a 8MB machine, then that is impossible given how much postgresql 
relies upon OS cache.

You may configure postgresql with 8MB shared memory or the old holy default of 
512K, but if your database is 100MB and OS is caching half of it on behalf of 
postgresql, your goal is already missed..

 Shridhar

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] COUNT(*) again (was Re: [HACKERS] Index/Function organized

2003-10-05 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> I think that's not happening, conditionally or otherwise.  The atomicity
> >> problems alone are sufficient reason why not, even before you look at
> >> the performance issues.
> 
> > What are the atomicity problems of adding a create/expire xid to the
> > index tuples?
> 
> You can't update a tuple's status in just one place ... you have to
> update the copies in the indexes too.

But we don't update the tuple status for a commit, we just mark the xid
as committed.  We do have lazy status bits that prevent later lookups in
pg_clog, but we have those in the index already also.

What am I missing?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 8: explain analyze is your friend