> >> This bothers me a bit, because in
> >> fact the effects, if any, of the tested query would have been rolled
> >> back. Not sure we have any choice though. If we expose the error
> >> then we'll have problems with clients not showing the EXPLAIN
> >> results.
>
> > I think we should leave it in top level
> "MvO" == Martijn van Oosterhout writes:
MvO> What we want is just a monotonically increasing counter that can
MvO> be read quickly and consistently, we're not majorly fussed if it
MvO> doesn't match real time. This puts us back to CPU cycle counters,
MvO> but they have drawbacks of their own.
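A minimal sketch of the cycle-counter idea, assuming x86 and the
GCC/Clang __rdtsc() intrinsic (illustrative only, not from any patch in
this thread). Reads are cheap and monotonic per core, but the drawbacks
include frequency scaling changing the tick rate and skew between the
counters on different cores:

/* read the TSC around a block of work; result is CPU cycles, not time */
#include <stdio.h>
#include <x86intrin.h>

int main(void)
{
    unsigned long long start = __rdtsc();

    volatile long sink = 0;
    for (long i = 0; i < 1000000; i++)  /* stand-in for the timed work */
        sink += i;

    unsigned long long elapsed = __rdtsc() - start;
    printf("elapsed: %llu cycles (not wall-clock time)\n", elapsed);
    return 0;
}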
Greg Stark <[EMAIL PROTECTED]> writes:
> There seem to be two types of overhead going on. There's the amount of time
> spent in gettimeofday itself which is pretty consistent.
That is a fact not in evidence. The impression I had from that
linux-kernel discussion was that the problematic kernel co
Martijn van Oosterhout writes:
> The interesting thing about this is that they obviously are gearing
> gettimeofday() to be accurate, rather than just considering it a
> counter that is somewhat close to real time. At the expense of speed.
Not sure that that's an accurate description. What I thi
Martijn van Oosterhout writes:
> Right now I'm confused though. I was under the impression the changes
> were going to be ripped out because it was decided to be unworkable. I
> think improvements can be made but I'm unsure if there's any
> interest...
I've reverted the current patch because it c
On Fri, Jun 09, 2006 at 03:55:28PM -0400, Tom Lane wrote:
> Greg Stark <[EMAIL PROTECTED]> writes:
> > There seem to be two types of overhead going on. There's the amount of time
> > spent in gettimeofday itself which is pretty consistent.
>
> That is a fact not in evidence. The impression I had
Tom Lane <[EMAIL PROTECTED]> writes:
> Greg Stark <[EMAIL PROTECTED]> writes:
> > To avoid user confusion it would be reasonable to print out a line at the
> > bottom:
>
> > Explain analyze profiling overhead removed: xxx ms
>
> > That also gives the user feedback on how precise their explain analyze
> > results are.
Greg Stark <[EMAIL PROTECTED]> writes:
> To avoid user confusion it would be reasonable to print out a line at the bottom:
> Explain analyze profiling overhead removed: xxx ms
> That also gives the user feedback on how precise their explain analyze results
> are. If they see that the overhead bein
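For illustration, the proposed line might sit at the bottom of the
normal output like this (hypothetical plan and numbers, loosely echoing
the 39 ms vs. 280 ms FreeBSD report elsewhere in the thread):

                            QUERY PLAN
 ---------------------------------------------------------------------
  Aggregate  (cost=2890.00..2890.01 rows=1 width=0)
             (actual time=38.510..38.511 rows=1 loops=1)
    ->  Seq Scan on accounts  (actual time=0.012..24.543 rows=100000 loops=1)
  Explain analyze profiling overhead removed: 241.300 ms
  Total runtime: 38.702 ms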
Tom Lane <[EMAIL PROTECTED]> writes:
> Greg Stark <[EMAIL PROTECTED]> writes:
> > Are we still going to subtract out the gettimeofday overhead?
>
> > I was always more excited about that than the sampling aspect. I've run into
> > queries where EXPLAIN ANALYZE results were deceptive due to the
> > gettimeofday overhead
Greg Stark <[EMAIL PROTECTED]> writes:
> Are we still going to subtract out the gettimeofday overhead?
> I was always more excited about that than the sampling aspect. I've run into
> queries where EXPLAIN ANALYZE results were deceptive due to the gettimeofday
> overhead but I've never run into a
Tom Lane <[EMAIL PROTECTED]> writes:
> I was intending to push forward with the idea of being able to get
> numbers out of a canceled EXPLAIN. That will allow you to get some
> information even when the underlying query runs longer than you're
> willing to tolerate. I still say that the number o
On Fri, Jun 09, 2006 at 11:22:11AM -0400, A.M. wrote:
> It would be nice to keep the gettimeofday()s wherever they are most useful
> on hardware/software where they are cheap. Perhaps a compile-time option?
I think making it optional at compile time is worse, because then at
support time you might
"Zeugswetter Andreas DCP SD" <[EMAIL PROTECTED]> writes:
>> This bothers me a bit, because in
> >> fact the effects, if any, of the tested query would have been
>> rolled back. Not sure we have any choice though. If we
>> expose the error then we'll have problems with clients not
> >> showing the EXPLAIN results.
On Fri, 2006-06-09 at 10:56 -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > I propose we revert the sampling patch (sorry Martijn)
>
> yah ...
>
> > and go with the
> > patch to have an explain_analyze_timing parameter (default=on).
>
> This I'm unexcited about. EXPLAIN out
Martijn van Oosterhout writes:
> On Fri, Jun 09, 2006 at 10:00:20AM -0400, Tom Lane wrote:
>> I had thought we were applying an order-of-ten-percent correction by
>> subtracting SampleOverhead, not an order-of-10x correction :-(
> Eh? The whole point is to call gettimeofday() much less often. If
It would be nice to keep the gettimeofday()s wherever they are most useful
on hardware/software where they are cheap. Perhaps a compile-time option?
On Fri, June 9, 2006 11:18 am, Martijn van Oosterhout wrote:
> On Fri, Jun 09, 2006 at 10:00:20AM -0400, Tom Lane wrote:
>
>> To tell you the truth, this information makes me even less pleased with
>> the sampling-gettimeofday patch than I was before.
On Fri, Jun 09, 2006 at 10:00:20AM -0400, Tom Lane wrote:
> To tell you the truth, this information makes me even less pleased with
> the sampling-gettimeofday patch than I was before. If gettimeofday() in
> itself increases the runtime of a node by a factor of 10, then just
> trying to subtract o
Simon Riggs <[EMAIL PROTECTED]> writes:
> I propose we revert the sampling patch (sorry Martijn)
yah ...
> and go with the
> patch to have an explain_analyze_timing parameter (default=on).
This I'm unexcited about. EXPLAIN output isn't all that transparent
anyway, and losing the extra cue of se
On Fri, 2006-06-09 at 10:00 -0400, Tom Lane wrote:
> I think the bottom line here is that there are some machines out there
> where gettimeofday() is fast enough for our purposes, and some where
> it is nowhere near fast enough. And that kernel-level fixes may be
> possible for some of the latter
"Larry Rosenman" writes:
> Does my post from yesterday (39ms without explain analyze, 280+ with explain
> analyze) on FreeBSD/amd64, dual Xeons in HTT mode help?
Well, it confirms that FreeBSD is subject to the same problem ;-).
I think the bottom line here is that there are some machines out there
where gettimeofday() is fast enough for our purposes, and some where it
is nowhere near fast enough.
Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > Hmm, I take that back. I updated to the current CVS tip on this machine
> > and now I don't see the problem. Without EXPLAIN ANALYZE, the times are
> > around 115-117 ms. With EXPLAIN ANALYZE, 149-152ms.
>
> At the moment, CVS HEAD still has the sampling patch
Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
>> Hmm, I take that back. I updated to the current CVS tip on this
>> machine and now I don't see the problem. Without EXPLAIN ANALYZE,
>> the times are around 115-117 ms. With EXPLAIN ANALYZE, 149-152ms.
>
> At the moment, CVS HEAD still has the sampling patch
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Hmm, I take that back. I updated to the current CVS tip on this machine
> and now I don't see the problem. Without EXPLAIN ANALYZE, the times are
> around 115-117 ms. With EXPLAIN ANALYZE, 149-152ms.
At the moment, CVS HEAD still has the sampling patch
Simon Riggs wrote:
> On Thu, 2006-06-08 at 17:21 -0400, Tom Lane wrote:
> > Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > > Tried on two machines. The first (Turion64 laptop) gives 44-45 ms for
> > > the SELECT, and 50-51 ms for the EXPLAIN ANALYZE.
> >
> > > The second machine, desktop Celeron
Alvaro Herrera wrote:
> The second machine, desktop Celeron 533, gives 197-200 ms for the SELECT
> and 788-790 for the EXPLAIN ANALYZE. I guess this is the reproduction
> you were looking for.
Hmm, I take that back. I updated to the current CVS tip on this machine
and now I don't see the problem.
On Thu, 2006-06-08 at 17:21 -0400, Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > Tried on two machines. The first (Turion64 laptop) gives 44-45 ms for
> > the SELECT, and 50-51 ms for the EXPLAIN ANALYZE.
>
> > The second machine, desktop Celeron 533, gives 197-200 ms for the SELECT
> This bothers me a bit, because in
> fact the effects, if any, of the tested query would have been
> rolled back. Not sure we have any choice though. If we
> expose the error then we'll have problems with clients not
> showing the EXPLAIN results.
I think we should leave it in top level, thro
It's worth noting that on Darwin (on Apple hardware) gettimeofday is
never a syscall whereas on Linux (AFAIK), it always is.
On Jun 8, 2006, at 7:58 PM, Mark Kirkwood wrote:
Tom Lane wrote:
Alvaro Herrera <[EMAIL PROTECTED]> writes:
Wow, that is slow. Maybe a problem in the kernel? Perhaps something
similar to this:
http://www.ussg.iu.edu/hypermail/linux/kernel/0603.2/index.html#1282
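A quick way to check the per-call cost on a given platform (a rough
sketch, not from the thread; the loop count is arbitrary):

/* time n back-to-back gettimeofday() calls and report the per-call cost */
#include <stdio.h>
#include <sys/time.h>

int main(void)
{
    struct timeval start, end, tv;
    const int n = 1000000;

    gettimeofday(&start, NULL);
    for (int i = 0; i < n; i++)
        gettimeofday(&tv, NULL);
    gettimeofday(&end, NULL);

    double elapsed_us = (end.tv_sec - start.tv_sec) * 1e6
                      + (end.tv_usec - start.tv_usec);
    printf("%.1f ns per gettimeofday() call\n", elapsed_us * 1000.0 / n);
    return 0;
}

Where the call avoids a full kernel entry (as on Darwin) this tends to
be cheap; where it is a true syscall or hits a slow clock source it can
be an order of magnitude or two worse.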
Tom Lane wrote:
Alvaro Herrera <[EMAIL PROTECTED]> writes:
Wow, that is slow. Maybe a problem in the kernel? Perhaps something
similar to this:
http://www.ussg.iu.edu/hypermail/linux/kernel/0603.2/index.html#1282
Yeah, that's a pretty interesting thread. I came across something
similar on a
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Wow, that is slow. Maybe a problem in the kernel? Perhaps something
> similar to this:
> http://www.ussg.iu.edu/hypermail/linux/kernel/0603.2/index.html#1282
Yeah, that's a pretty interesting thread. I came across something
similar on a Red Hat inter
On Thu, Jun 08, 2006 at 05:32:36PM -0400, Tom Lane wrote:
> "Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> > The powerbook tests were not very repeatable at 100,000 rows, so I
> > bumped up to 1M. The results still aren't very repeatable...
>
> Hmm, I suspect you pushed it into the realm where it's doing actual
> I/O ... which we don't want for this particular test.
Just tested 2 Windows boxes, both running 8.1.3 and XP SP2
(\timing without / with EXPLAIN ANALYZE):
P4 2.8G desktop        47 ms    297 ms
Pentium M 2G laptop    40 ms    240 ms
--
Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasi
"Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> The powerbook tests were not very repeatable at 100,000 rows, so I
> bumped up to 1M. The results still aren't very repeatable...
Hmm, I suspect you pushed it into the realm where it's doing actual
I/O ... which we don't want for this particular test.
Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > Tried on two machines. The first (Turion64 laptop) gives 44-45 ms for
> > the SELECT, and 50-51 ms for the EXPLAIN ANALYZE.
>
> > The second machine, desktop Celeron 533, gives 197-200 ms for the SELECT
> and 788-790 for the EXPLAIN ANALYZE. I guess this is the reproduction
> you were looking for.
On Thu, Jun 08, 2006 at 04:58:07PM -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > So the timing is clearly responsible for the additional time I'm
> > personally experiencing and very likely to be that for others also.
>
> Well, that's certainly unsurprising, but the question
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Tried on two machines. The first (Turion64 laptop) gives 44-45 ms for
> the SELECT, and 50-51 ms for the EXPLAIN ANALYZE.
> The second machine, desktop Celeron 533, gives 197-200 ms for the SELECT
> and 788-790 for the EXPLAIN ANALYZE. I guess this is the reproduction
> you were looking for.
Tom Lane wrote:
> If anyone else is still following this discussion, could you try the
> test case Simon gave a bit earlier (select count(*) from 10-row
> table, check \timing with and without explain analyze)? If we could get
> a clearer picture of which platforms show the problem and which
Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
>> So the timing is clearly responsible for the additional time I'm
>> personally experiencing and very likely to be that for others also.
>
> Well, that's certainly unsurprising, but the question is why it's such
> a large overhead for you
Simon Riggs wrote:
> postgres=# explain analyze select count(*) from accounts;
>QUERY PLAN
>
> Aggregate (cost=2890.00
Simon Riggs <[EMAIL PROTECTED]> writes:
> So the timing is clearly responsible for the additional time I'm
> personally experiencing and very likely to be that for others also.
Well, that's certainly unsurprising, but the question is why it's such
a large overhead for you when it's not on other ap
On Thu, 2006-06-08 at 12:56 -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > So far we have myself, Kevin, Martijn and Luke all saying there is a
> > distortion or a massive overhead caused by EXPLAIN ANALYZE.
> > http://archives.postgresql.org/pgsql-hackers/2006-03/msg00954.php
Tom Lane wrote:
> A possible objection to this is that running a query inside a
> subtransaction might have different/worse performance than running it
> at top level. I don't recall any severe bottlenecks of that kind but
> that doesn't mean there aren't any (Alvaro, any comments?)
Nope, nothing.
Tom Lane <[EMAIL PROTECTED]> writes:
> Gregory Stark <[EMAIL PROTECTED]> writes:
> > Tom Lane <[EMAIL PROTECTED]> writes:
> > Would it be possible to make a whole new protocol message for EXPLAIN
> > results?
>
> I'm really unwilling to get into that. For one thing, that would
> absolutely positively break *all* use of EXPLAIN from un-fixed clients
Gregory Stark <[EMAIL PROTECTED]> writes:
> Tom Lane <[EMAIL PROTECTED]> writes:
> Would it be possible to make a whole new protocol message for EXPLAIN results?
I'm really unwilling to get into that. For one thing, that would
absolutely positively break *all* use of EXPLAIN from un-fixed clients
Tom Lane <[EMAIL PROTECTED]> writes:
> After we've printed the results, we have a bit of a problem: if
> ExplainCancelPending is set, we now need to abort the transaction. It
> would not do at all to allow an incompletely executed UPDATE to commit.
> I experimented with throwing an elog() at the
I wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
>> A full EXPLAIN ANALYZE is always desirable - we agree on that. The
>> question is what we do when one is not available.
> The least bad alternative I've heard is to let EXPLAIN ANALYZE print
> out stats-so-far if the query is canceled by contro
Simon Riggs <[EMAIL PROTECTED]> writes:
> So far we have myself, Kevin, Martijn and Luke all saying there is a
> distortion or a massive overhead caused by EXPLAIN ANALYZE.
> http://archives.postgresql.org/pgsql-hackers/2006-03/msg00954.php
> http://archives.postgresql.org/pgsql-patches/2006-05/msg
On Thu, 2006-06-08 at 10:27 -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > On Wed, 2006-06-07 at 17:28 -0400, Tom Lane wrote:
> >> The overhead seems to be on the order of a couple tens of percent usually.
> >> I don't see how that makes the difference between an EXPLAIN ANALYZE
> >> you can run and one you can't.
Simon Riggs <[EMAIL PROTECTED]> writes:
> On Wed, 2006-06-07 at 17:28 -0400, Tom Lane wrote:
>> The overhead seems to be on the order of a couple tens of percent usually.
>> I don't see how that makes the difference between an EXPLAIN ANALYZE you
>> can run and one you can't.
> Well, that's not my
On Wed, 2006-06-07 at 17:28 -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > On Wed, 2006-06-07 at 16:56 -0400, Tom Lane wrote:
> >> Certainly the removal of timing
> >> is not going to convert an intolerable EXPLAIN ANALYZE runtime into an
> >> acceptable one;
>
> > I disagree, as have others.
On Wed, Jun 07, 2006 at 05:05:23PM -0400, Tom Lane wrote:
> The other thing that was bothering me was whether disk I/O might be
> undercounted by an interrupt-driven method. I kinda doubt that any
> kernel will save up N interrupts that occur while the process is blocked
> on a slow read() ... you
Simon Riggs <[EMAIL PROTECTED]> writes:
> On Wed, 2006-06-07 at 16:56 -0400, Tom Lane wrote:
>> Certainly the removal of timing
>> is not going to convert an intolerable EXPLAIN ANALYZE runtime into an
>> acceptable one;
> I disagree, as have others.
The overhead seems to be on the order of a couple tens of percent
usually.
On Wed, 2006-06-07 at 16:56 -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > On Wed, 2006-06-07 at 15:32 -0400, Tom Lane wrote:
> >> I propose we revert this patch and think about an interrupt-driven
> >> sampling method instead.
>
> > I don't have much more faith in crazy scheme No.2 either. (Mine or
> > yours...)
On Wed, Jun 07, 2006 at 04:04:33PM -0400, Tom Lane wrote:
> Martijn van Oosterhout writes:
> > Anyway, as a test, if you take the approach that the measurement at
> > item X only applies to the tuples immediately preceding it, for the
> data you posted you get a result of 0.681148 seconds. How long did that
> query run that produced that data?
"Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> On Wed, Jun 07, 2006 at 11:34:30AM -0400, Tom Lane wrote:
>> timer interrupt routine does this once every few milliseconds:
> One issue is that on very fast queries, you'd get absolutely no data
> this way.
Yeah. Of course, on "very fast" queries you
Simon Riggs <[EMAIL PROTECTED]> writes:
> On Wed, 2006-06-07 at 15:32 -0400, Tom Lane wrote:
>> I propose we revert this patch and think about an interrupt-driven
>> sampling method instead.
> I don't have much more faith in crazy scheme No.2 either. (Mine or
> yours...)
> Can we just have an opt
On Wed, 2006-06-07 at 15:32 -0400, Tom Lane wrote:
> Ah-hah, I've sussed it.
> so I'm
> afraid this approach to sampling EXPLAIN ANALYZE is a failure.
Hmmm, I thought we already got that bit...sorry to sound negative.
> I propose we revert this patch and think about an interrupt-driven
> sampling method instead.
On Wed, Jun 07, 2006 at 11:34:30AM -0400, Tom Lane wrote:
> timer interrupt routine does this once every few milliseconds:
One issue is that on very fast queries, you'd get absolutely no data
this way. A possible solution would be to start with an extremely short
timer interval, and ramp it up if
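A minimal sketch of the interrupt-driven idea, assuming
setitimer(ITIMER_PROF)/SIGPROF as the mechanism (the thread doesn't fix
one). A real version would attribute each tick to the plan node
currently executing, and the ramping idea above would stretch
it_interval from inside the handler:

#include <signal.h>
#include <stdio.h>
#include <sys/time.h>

static volatile sig_atomic_t ticks = 0;

static void on_tick(int signum)
{
    (void) signum;
    ticks++;            /* real code: charge the currently running node */
}

int main(void)
{
    /* fire every 10 ms of consumed CPU time */
    struct itimerval t = { {0, 10000}, {0, 10000} };

    signal(SIGPROF, on_tick);
    setitimer(ITIMER_PROF, &t, NULL);

    volatile long sink = 0;
    for (long i = 0; i < 400000000L; i++)   /* CPU-bound stand-in query */
        sink += i;

    printf("%d samples (~%d ms of CPU)\n", (int) ticks, (int) ticks * 10);
    return 0;
}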
Martijn van Oosterhout writes:
> Anyway, as a test, if you take the approach that the measurement at
> item X only applies to the tuples immediately preceding it, for the
> data you posted you get a result of 0.681148 seconds. How long did that
> query run that produced that data?
I didn't save t
On Wed, Jun 07, 2006 at 03:32:35PM -0400, Tom Lane wrote:
> On reflection it's easy to imagine other cases where some iterations
> take much longer than others in a not-very-predictable way. For
> instance, a join where only a subset of the outer tuples have a match
> is going to act that way. I
Ah-hah, I've sussed it. The faulty assumption can actually be stated
as: "all the executions, except maybe the first, will take approximately
the same amount of time". The failing test case I've been looking at
is one where the system decides to use a "batched" hash join, and in
this plan type so
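Hypothetical numbers for that failure mode: suppose 990 of a node's
1000 executions take 10 us each, but batch switches make the other 10
take 5 ms each. The true total is 990 * 10 us + 10 * 5 ms = 59.9 ms. A
sampler that happens to time only cheap executions and extrapolates
per-iteration cost * 1000 reports about 10 ms; one that lands on the
expensive calls overshoots just as badly. The extrapolation is only
sound when the iterations are interchangeable.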
Greg Stark <[EMAIL PROTECTED]> writes:
> Tom Lane <[EMAIL PROTECTED]> writes:
>> Nope, that is certainly not the explanation, because the hash table is
>> loaded in the (single) call of the Hash node at the start of the query.
> >> It is static all through the sampled-and-not executions of the Hash Join
> >> node, which is where our problem is.
Tom Lane <[EMAIL PROTECTED]> writes:
> Nope, that is certainly not the explanation, because the hash table is
> loaded in the (single) call of the Hash node at the start of the query.
> It is static all through the sampled-and-not executions of the Hash Join
> node, which is where our problem is.
Greg Stark <[EMAIL PROTECTED]> writes:
> Martijn van Oosterhout writes:
>> This would mean that we wouldn't be assuming that tuples near the end
>> take as long as tuples near the beginning. Except we're now dealing
> >> with smaller numbers, so I'm worried about error accumulation.
> Hm, that would
Martijn van Oosterhout writes:
> This would mean that we wouldn't be assuming that tuples near the end
> take as long as tuples near the beginning. Except we're now dealing
> with smaller numbers, so I'm worried about error accumulation.
Hm, that would explain why Hash joins suffer from this esp
Martijn van Oosterhout writes:
> On Wed, Jun 07, 2006 at 09:53:32AM -0400, Tom Lane wrote:
>> If we do have to revert, I'd propose that we pursue the notion of
>> interrupt-driven sampling like gprof uses.
> How would that work? You could then estimate how much time was spent in
> each node, but
On Wed, Jun 07, 2006 at 09:53:32AM -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > Do we agree the idea can't go further? What next?
>
> It still needs investigation; I'm no longer convinced that the TLB-flush
> theory is correct. See rest of thread. We may well have to revert the
> current patch, but I'd like to be sure we understand why.
Simon Riggs <[EMAIL PROTECTED]> writes:
> Do we agree the idea can't go further? What next?
It still needs investigation; I'm no longer convinced that the TLB-flush
theory is correct. See rest of thread. We may well have to revert the
current patch, but I'd like to be sure we understand why.
If
On Tue, 2006-06-06 at 16:50 -0400, Tom Lane wrote:
> I have a theory about this, and it's not pleasant at all. What I
> think is that we have a Heisenberg problem here: the act of invoking
> gettimeofday() actually changes what is measured.
> If this theory is correct, then the entire notion of
Greg Stark <[EMAIL PROTECTED]> writes:
> Tom Lane <[EMAIL PROTECTED]> writes:
>> And two, that upper plan nodes seem much more affected than lower
>> ones. That makes sense because the execution cycle of an upper node
>> will involve touching more userspace data than a lower node, and
>> therefore
Tom Lane <[EMAIL PROTECTED]> writes:
> And two, that upper plan nodes seem much more affected than lower
> ones. That makes sense because the execution cycle of an upper node
> will involve touching more userspace data than a lower node, and
> therefore more of the flushed TLB entries will need
Martijn van Oosterhout writes:
> This doesn't make any sense at all. How can a sampling run that only
> sampled 7% of the actual tuples end up with an actual measured time
> that's more than 50% of the actual final runtime?
AFAICS, the only conclusion is that the sampled executions are in fact
taking longer than the unsampled ones.
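Spelling out the arithmetic behind that conclusion: if a fraction f =
0.07 of executions are sampled and a sampled execution costs k times an
unsampled one, the sampled share of runtime is 0.07k / (0.07k + 0.93).
For that share to exceed 50%, k has to exceed 0.93 / 0.07, i.e. about
13; the sampled executions must be running roughly an order of
magnitude slower than the unsampled ones.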
On Tue, Jun 06, 2006 at 04:50:28PM -0400, Tom Lane wrote:
> But that still leaves me with a problem because my machine is clearly
> overestimating the correction needed. I added some printouts and got
>
> raw totaltime = 0.370937
> per_iter = 0.000156913, SampleOverhead = 3.28e-06
> adj totaltime
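Reading those printouts, on the assumption that the patch's correction
subtracts SampleOverhead once per sampled iteration: 3.28e-06 /
1.56913e-04 is about 2%, i.e. the intended adjustment is on the order
of a couple of percent of per_iter. That is consistent with the
"order-of-ten-percent correction, not an order-of-10x correction"
remark elsewhere in the thread, so the large misestimate has to come
from somewhere other than the subtraction itself.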
"Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> If that's the case, then maybe a more sophisticated method of measuring
> the overhead would work.
I think you missed the point: the time spent in gettimeofday() itself
is not the major overhead of EXPLAIN ANALYZE. At least it appears that
this is the
On Tue, Jun 06, 2006 at 04:50:28PM -0400, Tom Lane wrote:
> I have a theory about this, and it's not pleasant at all. What I
> think is that we have a Heisenberg problem here: the act of invoking
> gettimeofday() actually changes what is measured. That is, the
> runtime of the "second part" of Ex
I wrote:
> The problem looks to be an underestimation of SampleOverhead, and on
> reflection it's clear why: what CalculateSampleOverhead is measuring
> isn't the total overhead, but the time between the two gettimeofday
> calls. Which is probably about half the true overhead.
On further thought,
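A standalone sketch of the two ways of measuring the overhead
(illustrative, not the CalculateSampleOverhead code). Timing the gap
between a pair of calls only sees the tail of the first call plus the
head of the second; timing n back-to-back calls charges each call its
full cost:

#include <stdio.h>
#include <sys/time.h>

static double us(struct timeval a, struct timeval b)
{
    return (b.tv_sec - a.tv_sec) * 1e6 + (b.tv_usec - a.tv_usec);
}

int main(void)
{
    struct timeval t0, t1, start, end;
    const int n = 1000000;

    /* "gap" method: time between two adjacent calls, averaged over n pairs */
    double gap = 0;
    for (int i = 0; i < n; i++) {
        gettimeofday(&t0, NULL);
        gettimeofday(&t1, NULL);
        gap += us(t0, t1);
    }

    /* "full cost" method: n calls back to back */
    gettimeofday(&start, NULL);
    for (int i = 0; i < n; i++)
        gettimeofday(&end, NULL);

    printf("gap estimate: %.3f us, full estimate: %.3f us\n",
           gap / n, us(start, end) / n);
    return 0;
}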
Martijn van Oosterhout writes:
> Weird. Can you get the output of *instr in each call of
> InstrEndLoop? Preferably after it does the calculation but before it
> clears the values. So we get an idea of number of samples and what it
> guesses. SampleOverhead would be good too.
The problem looks to be an underestimation of SampleOverhead, and on
reflection it's clear why.
On Mon, Jun 05, 2006 at 11:02:33PM -0400, Tom Lane wrote:
> Just got this rather surprising result:
> The "Total runtime" is correct AFAICT, which puts the top node's "actual
> time" rather far out in left field. This is pretty repeatable on my old
> slow HPPA machine. A new Xeon shows less of
Just got this rather surprising result:
regression=# \timing
Timing is on.
regression=# explain analyze select * from tenk1 a join tenk1 b on a.unique1 =
b.unique2;
QUERY PLAN