Bruce Momjian wrote:
> > Bruce Momjian <[EMAIL PROTECTED]> writes:
> > > Only shared memory gives us near-zero cost for write/read. 99% of
> > > backends will not be using stats, so it has to be cheap.
> >
> > Not with a circular buffer it's not cheap, because you need interlocking
> > on writes. Your claim that you can get away without that is simply
> > false. You won't just get lost messages, you'll get corrupted messages.
>
> How do I get corrupt messages if they are all five bytes? If I write
> five bytes, and another backend does the same, I guess the assembler could
> intersperse the writes so the oid gets to be a corrupt value. Any cheap
> way around this, perhaps by skipping/clearing the write on a collision?
>
> >
> > > The collector program can read the shared memory stats and keep hashed
> > > values of accumulated stats. It uses the "Loops" variable to know if it
> > > has read the current information in the buffer.
> >
> > And how does it sleep until the counter has been advanced? Seems to me
> > it has to busy-wait (bad) or sleep (worse; if the minimum sleep delay
> > is 10 ms then it's guaranteed to miss a lot of data under load).
>
> I figured it could just wake up every few seconds and check. It will
> remember the loop counter and current pointer, and read any new
> information. I was thinking of a 20k buffer, which could cover about 4k
> events.
Here I wonder what your EVENT is. With an Oid as identifier
plus a one-byte event code (even if it were another 32-bit
value), how many messages would you have to generate to
collect these statistics (a sketch of possible event codes
follows the list):
- Number of sequential scans done per table.
- Number of tuples returned via sequential scans per table.
- Number of buffer cache lookups done through sequential scans per table.
- Number of buffer cache hits for sequential scans per table.
- Number of tuples inserted per table.
- Number of tuples updated per table.
- Number of tuples deleted per table.
- Number of index scans done per index.
- Number of index tuples returned per index.
- Number of buffer cache lookups done due to scans per index.
- Number of buffer cache hits per index.
- Number of valid heap tuples returned via index scan per index.
- Number of buffer cache lookups done for heap fetches via index scan per index.
- Number of buffer cache hits for heap fetches via index scan per index.
- Number of buffer cache lookups not accountable for any of the above.
- Number of buffer cache hits not accountable for any of the above.
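Just to make the volume concrete: all sixteen counters fit
comfortably into one event byte, so the encoding itself is no
problem. The enum below is purely illustrative (the names are
my invention); the point is that nearly every line above means
one 5-byte message per tuple or per buffer access:

typedef enum StatEvent
{
    EV_SEQSCAN_START,           /* sequential scan on a table */
    EV_SEQSCAN_TUPLE,           /* tuple returned by a seqscan */
    EV_SEQSCAN_BUF_LOOKUP,      /* buffer cache lookup during seqscan */
    EV_SEQSCAN_BUF_HIT,         /* buffer cache hit during seqscan */
    EV_TUPLE_INSERT,            /* tuple inserted */
    EV_TUPLE_UPDATE,            /* tuple updated */
    EV_TUPLE_DELETE,            /* tuple deleted */
    EV_IDXSCAN_START,           /* index scan on an index */
    EV_IDXSCAN_TUPLE,           /* index tuple returned */
    EV_IDXSCAN_BUF_LOOKUP,      /* buffer cache lookup during index scan */
    EV_IDXSCAN_BUF_HIT,         /* buffer cache hit during index scan */
    EV_IDX_HEAP_TUPLE,          /* valid heap tuple fetched via index */
    EV_IDX_HEAP_BUF_LOOKUP,     /* buffer lookup for heap fetch via index */
    EV_IDX_HEAP_BUF_HIT,        /* buffer hit for heap fetch via index */
    EV_OTHER_BUF_LOOKUP,        /* cache lookup not covered above */
    EV_OTHER_BUF_HIT            /* cache hit not covered above */
} StatEvent;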
What I see is that the two of us want different things from
the statistics. You're talking about looking at the actual
query string and the like. That information is useful for
someone watching a live server, to see what a particular
backend is doing right now. But on my notebook a parallel
regression test (containing >4,000 queries) completes in
under 1:30, which is more than 40 queries per second, so
watching individual queries scroll past doesn't tell me much.
What I'm after is to collect the above data over a week or so
and then generate a report that identifies the hot spots of
the schema: which tables/indices cause the most disk I/O, and
what the average percentage of tuples returned per scan is
(not per query; I mean per single scan inside of the joins).
That's the information I need to know where to look for
possibly better qualifications, for useless indices that
aren't worth maintaining, and the like.
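Once the events arrive, the aggregation on the collector side
is trivial; something like this (hash-table plumbing omitted,
names again my invention, building on the enum above):

/* Per-relation counters the collector accumulates, keyed by Oid. */
typedef struct StatCounters
{
    uint32_t    oid;
    uint64_t    counts[16];        /* one slot per StatEvent above */
} StatCounters;

/*
 * Example report figure: average number of tuples a sequential
 * scan of this table returned over the collection period.
 */
static double
avg_tuples_per_seqscan(const StatCounters *c)
{
    if (c->counts[EV_SEQSCAN_START] == 0)
        return 0.0;
    return (double) c->counts[EV_SEQSCAN_TUPLE]
         / (double) c->counts[EV_SEQSCAN_START];
}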
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== [EMAIL PROTECTED] #