Re: [PERFORM] poor execution plan because column dependence

2011-04-14 Thread Václav Ovsík
On Wed, Apr 13, 2011 at 12:24:06PM -0400, Tom Lane wrote:
> > Interestingly, the original index tickets5 is still used for
> > int4eq(main.effectiveid, main.id); no need to build a different one.
> 
> Well, no, it won't be.  This hack is entirely dependent on the fact that
> the optimizer mostly works with operator expressions, and is blind to
> the fact that the underlying functions are really the same thing.
> (Which is something I'd like to see fixed someday, but in the meantime
> it gives you an escape hatch.)  If you use the int4eq() construct in a
> context where you'd like to see it transformed into an index qual, it
> won't be.  For this particular case that doesn't matter because there's
> no use in using an index for that clause anyway.  But you'll need to be
> very careful that your changes in the query generator don't result in
> using int4eq() in any contexts other than the "main.EffectiveId=main.id"
> check.

Sorry, I'm not certain I understand your paragraph completely...

I perfectly understand that the change from

    A = B    into    int4eq(A, B)

stopped the bad estimate and corrected the execution plan, and also that
this may change someday in the future.
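
For concreteness, a minimal sketch of the two spellings (a throwaway demo
table, not the real RT schema):

    -- hypothetical demo table, just to show both forms of the clause
    CREATE TABLE tickets_demo (id int4 PRIMARY KEY, effectiveid int4);

    -- operator form: estimated by the regular equality-operator machinery,
    -- which misestimates correlated columns like these
    EXPLAIN SELECT * FROM tickets_demo WHERE effectiveid = id;

    -- function form: opaque to that machinery, so a default selectivity
    -- (the ~33% mentioned below) is used instead
    EXPLAIN SELECT * FROM tickets_demo WHERE int4eq(effectiveid, id);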

I'm not certain about your sentence concerning int4eq() and the index.
The execution plan shown in my previous mail contains information about
the use of index tickets5:

...
   ->  Index Scan using tickets5 on tickets main
       (cost=0.00..4.38 rows=1 width=162) (actual time=0.006..0.006 rows=0 loops=15593)
         Index Cond: (main.id = transactions_1.objectid)
         Filter: (((main.status)::text <> 'deleted'::text) AND
           (main.lastupdated > '2008-12-31 23:00:00'::timestamp without time zone) AND
           (main.created > '2005-12-31 23:00:00'::timestamp without time zone) AND
           int4eq(main.effectiveid, main.id) AND (main.queue = 15) AND
           ((main.type)::text = 'ticket'::text) AND
           ((main.status)::text = 'resolved'::text))
...


The filter condition contains int4eq(main.effectiveid, main.id), and
tickets5 is: "tickets5" btree (id, effectiveid)

That means the tickets5 index was used for int4eq(main.effectiveid, main.id).
Is that right, or am I missing something?

Well, the index will probably not be used in general, because of the
selectivity of int4eq() you mention (33%). The planner then thinks it is
better to use a seq scan. I tried this just now.

I made the hack apply to this particular case only:


diff --git a/local/lib/DBIx/SearchBuilder.pm b/local/lib/DBIx/SearchBuilder.pm
index f3ee1e1..9e3a6a6 100644
--- a/local/lib/DBIx/SearchBuilder.pm
+++ b/local/lib/DBIx/SearchBuilder.pm
@@ -1040,7 +1040,9 @@ sub _CompileGenericRestrictions {
             $result .= ' '. $entry . ' ';
         }
         else {
-            $result .= join ' ', @{$entry}{qw(field op value)};
+            my $term = join ' ', @{$entry}{qw(field op value)};
+            $term =~ s/^(main|Tickets_\d+)\.(EffectiveId) = (\1)\.(id)$/int4eq($1.$2, $3.$4)/i;
+            $result .= $term;
         }
     }
     $result .= ')';


It works as expected.
Thanks
Best Regards
-- 
Zito



Re: [PERFORM] Performance

2011-04-14 Thread tv
> On Thu, Apr 14, 2011 at 1:26 AM, Tomas Vondra  wrote:
>> Workload A: Touches just a very small portion of the database, to the
>> 'active' part actually fits into the memory. In this case the cache hit
>> ratio can easily be close to 99%.
>>
>> Workload B: Touches large portion of the database, so it hits the drive
>> very often. In this case the cache hit ratio is usually around RAM/(size
>> of the database).
>
> You've answered it yourself without even realizing it.
>
> This particular factor is not about an abstract and opaque "Workload"
> the server can't know about. It's about cache hit rate, and the server
> can indeed measure that.

OK, so it's not a matter of tuning random_page_cost/seq_page_cost? Because
tuning based on cache hit ratio is something completely different (IMHO).

Anyway, I'm not an expert in this field, but AFAIK something like this
already happens - btw, that's the purpose of effective_cache_size. But I'm
afraid there may be serious failure cases where the current model works
better, e.g. when you ask for data that's completely uncached (inactive
for a long time). But if you have an idea on how to improve this, great -
start a discussion on the hackers list and let's see.
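
For reference, the (shared-buffers) cache hit ratio in question can be
read from the stats collector - note the OS page cache is invisible to
Postgres, so this shows only part of the picture:

    SELECT datname,
           round(100.0 * blks_hit / nullif(blks_hit + blks_read, 0), 1)
             AS buffer_hit_pct
      FROM pg_stat_database;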

regards
Tomas




Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-14 Thread Florian Weimer
* Jesper Krogh:

> If you have a 1-socket system, all of your data can be fetched from
> "local" RAM as seen from your CPU; on a 2-socket system, 50% of your
> accesses will be "way slower", and a 4-socket system is even worse.

There are non-NUMA multi-socket systems, so this doesn't apply in all
cases.  (The E5320-based system is likely non-NUMA.)

Speaking of NUMA, do you know of any non-invasive tools which can be
used to monitor page migration and off-node memory accesses?

-- 
Florian Weimer
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99



Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-14 Thread Cédric Villemain
2011/4/14 Florian Weimer :
> * Jesper Krogh:
>
>> If you have a 1-socket system, all of your data can be fetched from
>> "local" RAM as seen from your CPU; on a 2-socket system, 50% of your
>> accesses will be "way slower", and a 4-socket system is even worse.
>
> There are non-NUMA multi-socket systems, so this doesn't apply in all
> cases.  (The E5320-based system is likely non-NUMA.)
>
> Speaking about NUMA, do you know if there are some non-invasive tools
> which can be used to monitor page migration and off-node memory
> accesses?

I am unsure whether it is exactly what you are looking for, but Linux
does provide access to counters in:
/sys/devices/system/node/node*/numastat

I also find it useful to check meminfo per node (in the same sysfs
directory) instead of via /proc.
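
For reference, each of those numastat files holds six counters in "name
value" form (the values below are made up):

    numa_hit 10052410
    numa_miss 0
    numa_foreign 0
    interleave_hit 12189
    local_node 10043064
    other_node 9346

Note that these count page allocations, not individual memory accesses,
so they only approximate off-node traffic.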


-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support



Re: [PERFORM] Performance

2011-04-14 Thread Cédric Villemain
2011/4/14 Tom Lane :
> Nathan Boley  writes:
>> FWIW, awhile ago I wrote a simple script to measure this and found
>> that the *actual* random_page / seq_page cost ratio was much higher
>> than 4/1.
>
> That 4:1 ratio is based on some rather extensive experimentation that
> I did back in 2000.  In the interim, disk transfer rates have improved
> quite a lot more than disk seek times have, and the CPU cost to process
> a page's worth of data has also improved compared to the seek time.
> So yeah, you'd likely get a higher number if you redid those experiments
> on modern hardware (at least assuming it was rotating media and not SSD).
> On the other hand, the effects of caching push the numbers in the other
> direction, and modern machines also have a lot more RAM to cache in than
> was typical ten years ago.  I'm not sure how much point there is in
> trying to improve the default number in the abstract --- we'd really
> need to have a more robust model of cache effects before I'd trust any
> automatic tuning procedure to set the value for me.

Well, in my spare time I am doing a POC with "ANALYZE OSCACHE relation;";
pg stats are updated accordingly with the new data (it is not finished
yet): at least the percentage in the OS cache, maybe the number of groups
in cache and/or the distribution.

Anyway, the idea is to allow the planner to apply random and seq page
cost only to the part that is not in cache, without replacing the
algorithm that uses effective_cache_size. The planner could have one
other GUC like 'mem_page_cost' to set a cost on access from cache, and
use it while estimating the cost...
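
In other words (my reading of the idea, with f_cached being the fraction
reported by ANALYZE OSCACHE):

    page_cost ~= f_cached * mem_page_cost + (1 - f_cached) * random_page_cost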

A side effect is that random page cost and seq page cost should be more
stable and easier to set with a script, because they would no longer mix
disk and memory sources - only the disk access cost would remain (if
ANALYZE OSCACHE is good enough).
-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support



Re: [PERFORM] poor execution plan because column dependence

2011-04-14 Thread Tom Lane
Václav Ovsík writes:
> I'm not certain about your sentence touching int4eq() and index. The
> execution plan as show in my previous mail contains information about
> using index tickets5:

> ->  Index Scan using tickets5 on tickets main
>     (cost=0.00..4.38 rows=1 width=162) (actual time=0.006..0.006 rows=0 loops=15593)
>       Index Cond: (main.id = transactions_1.objectid)
>       Filter: (((main.status)::text <> 'deleted'::text) AND
>         (main.lastupdated > '2008-12-31 23:00:00'::timestamp without time zone) AND
>         (main.created > '2005-12-31 23:00:00'::timestamp without time zone) AND
>         int4eq(main.effectiveid, main.id) AND (main.queue = 15) AND
>         ((main.type)::text = 'ticket'::text) AND
>         ((main.status)::text = 'resolved'::text))

> That means the tickets5 index was used for int4eq(main.effectiveid, main.id).
> Is that right, or am I missing something?

No, the clause that's being used with the index is
main.id = transactions_1.objectid
The "filter condition" is just along for the ride --- it doesn't matter
what sort of expressions are in there, so long as they only use
variables available at this point in the plan.  But if you had coded
that clause as
int4eq(main.id, transactions_1.objectid)
it would have been unable to create this plan at all.
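
A minimal sketch of the difference (hypothetical tables, not the RT schema):

    -- hypothetical tables to illustrate index quals vs. filters
    CREATE TABLE t  (id int4 PRIMARY KEY, effectiveid int4);
    CREATE TABLE tx (objectid int4);

    -- operator form: matches the btree operator class, so the planner may
    -- use it as "Index Cond: (t.id = tx.objectid)"
    EXPLAIN SELECT * FROM tx JOIN t ON t.id = tx.objectid;

    -- function form: not recognized as the index's "=" operator, so it can
    -- only ever appear as a filter, never as an index qual
    EXPLAIN SELECT * FROM tx JOIN t ON int4eq(t.id, tx.objectid);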

regards, tom lane



Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-14 Thread Scott Carey


On 4/13/11 9:23 PM, "Greg Smith"  wrote:

>Scott Carey wrote:
>> If postgres is memory bandwidth constrained, what can be done to reduce
>> its bandwidth use?
>>
>> Huge Pages could help some, by reducing page table lookups and making
>> overall access more efficient.
>> Compressed pages (snappy / lzo) in memory can help trade CPU cycles for
>> memory usage for certain memory segments/pages -- this could potentially
>> save a lot of I/O too if more pages fit in RAM as a result, and also
>> make caches more effective.
>
>The problem with a lot of these ideas is that they trade the memory
>problem for increased disruption to the CPU L1 and L2 caches.  I don't
>know how much that moves the bottleneck forward.  And not every workload
>is memory constrained, either, so those that aren't might suffer from
>the same optimizations that help in this situation.

Compression has this problem, but I'm not sure where the plural "a lot of
these ideas" comes from.

Huge Pages helps caches.
Dual-Pivot quicksort is more cache friendly and is _always_ equal to or
faster than traditional quicksort (it's a provably improved algorithm).
Smaller hash tables help caches.

>
>I just posted my slides from my MySQL conference talk today at
>http://projects.2ndquadrant.com/talks , and those include some graphs of
>recent data collected with stream-scaling.  The current situation is
>really strange in both Intel and AMD's memory architectures.  I'm even
>seeing situations where lightly loaded big servers are actually
>outperformed by small ones running the same workload.  The 32 and 48
>core systems using server-class DDR3/1333 just don't have the bandwidth
>to a single core that, say, an i7 desktop using triple-channel DDR3-1600
>does.  The trade-offs here are extremely hardware and workload
>dependent, and it's very easy to tune for one combination while slowing
>another.
>
>-- 
>Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
>PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
>"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
>




Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-14 Thread Claudio Freire
On Thu, Apr 14, 2011 at 10:05 PM, Scott Carey  wrote:
> Huge Pages helps caches.
> Dual-Pivot quicksort is more cache friendly and is _always_ equal to or
> faster than traditional quicksort (it's a provably improved algorithm).

If you want a cache-friendly sorting algorithm, you need mergesort.

I don't know of any algorithm as friendly to caches as mergesort.

Quicksort could be better only when the sorting buffer is guaranteed
to fit in the CPU's cache, and that's usually just a few 4 KB pages.



Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-14 Thread Scott Carey

On 4/14/11 1:19 PM, "Claudio Freire"  wrote:

>On Thu, Apr 14, 2011 at 10:05 PM, Scott Carey  wrote:
>> Huge Pages helps caches.
>> Dual-Pivot quicksort is more cache friendly and is _always_ equal to or
>> faster than traditional quicksort (it's a provably improved algorithm).
>
>If you want a cache-friendly sorting algorithm, you need mergesort.
>
>I don't know any algorithm as friendly to caches as mergesort.
>
>Quicksort could be better only when the sorting buffer is guaranteed
>to fit in the CPU's cache, and that's usually just a few 4 KB pages.

Of the mergesort variants, Timsort is a recent general-purpose one favored
by many since it is sub-O(n log n) on partially sorted data.

Which one works best under which circumstances depends a lot on the size
of the data, the size of the elements, the cost of the compare function,
whether you're sorting the data directly or sorting pointers, and other
factors.

Mergesort may be more cache friendly (?) but might use more memory
bandwidth.  I'm not sure.

I do know that dual-pivot quicksort provably causes fewer swaps (but the
same number of compares) than the usual single-pivot quicksort.  And swaps are a
lot slower than you would expect due to the effects on processor caches.
Therefore it might help with multiprocessor scalability by reducing
memory/cache pressure.




Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-14 Thread Claudio Freire
On Fri, Apr 15, 2011 at 12:42 AM, Scott Carey  wrote:
> I do know that dual-pivot quicksort provably causes fewer swaps (but the
> same # of compares) as the usual single-pivot quicksort.  And swaps are a
> lot slower than you would expect due to the effects on processor caches.
> Therefore it might help with multiprocessor scalability by reducing
> memory/cache pressure.

I agree, and it's quite non-disruptive - i.e., a drop-in replacement for
quicksort - whereas mergesort or timsort both require bigger changes and
heavier profiling.
