Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2015-12-18 Thread Amit Kapila
On Thu, Dec 17, 2015 at 8:44 PM, Andres Freund  wrote:
>
> On 2015-12-17 09:47:57 -0500, Robert Haas wrote:
> > On Tue, Dec 15, 2015 at 7:25 AM, Andres Freund wrote:
> > > I'd consider using a LWLock instead of a spinlock here. I've seen this
> > > contended in a bunch of situations, and the queued behaviour, combined
> > > with directed wakeups on the OS level, ought to improve the worst case
> > > behaviour measurably.
> >
> > Amit had the idea a while back of trying to replace the HASHHDR mutex
> > with something based on atomic ops.  It seems hard to avoid the
> > attendant A-B-A problems but maybe there's a way.
>
> I'd really like to see it being replaced by a queuing lock
> (i.e. lwlock) before we go there. And then maybe partition the freelist,
> and make nentries an atomic.  Just doing those might already be good
> enough and should be a lot easier.
>

That makes sense to me, but I think we should also try the group leader idea
used for the ProcArrayLock optimisation; during those tests, I found that
it gives better results than partitioning.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2015-12-18 Thread Aleksander Alekseev
> Oops, 3.5-4 _times_ more TPS, i.e. 2206 TPS vs 546 TPS.

In fact those numbers are for a similar but slightly different
benchmark (same schema, without the checks on child tables). Here are the
exact numbers for the benchmark described above.



Before:

$ pgbench -j 64 -c 64 -f pgbench.sql -T 30 my_database
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 64
number of threads: 64
duration: 30 s
number of transactions actually processed: 20433
latency average: 93.966 ms
tps = 679.698439 (including connections establishing)
tps = 680.353897 (excluding connections establishing)

$ pgbench -j 64 -c 64 -f pgbench.sql -T 30 my_database
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 64
number of threads: 64
duration: 30 s
number of transactions actually processed: 19111
latency average: 100.466 ms
tps = 635.763523 (including connections establishing)
tps = 636.112682 (excluding connections establishing)

$ pgbench -j 64 -c 64 -f pgbench.sql -T 30 my_database
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 64
number of threads: 64
duration: 30 s
number of transactions actually processed: 19218
latency average: 99.906 ms
tps = 639.506848 (including connections establishing)
tps = 639.838757 (excluding connections establishing)


After:

$ pgbench -j 64 -c 64 -f pgbench.sql -T 30 my_database
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 64
number of threads: 64
duration: 30 s
number of transactions actually processed: 95900
latency average: 20.021 ms
tps = 3194.142762 (including connections establishing)
tps = 3196.091843 (excluding connections establishing)

$ pgbench -j 64 -c 64 -f pgbench.sql -T 30 my_database
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 64
number of threads: 64
duration: 30 s
number of transactions actually processed: 96837
latency average: 19.827 ms
tps = 3225.822355 (including connections establishing)
tps = 3227.762847 (excluding connections establishing)

$ pgbench -j 64 -c 64 -f pgbench.sql -T 30 my_database
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 64
number of threads: 64
duration: 30 s
number of transactions actually processed: 96143
latency average: 19.970 ms
tps = 3202.637126 (including connections establishing)
tps = 3204.070466 (excluding connections establishing)


Ratio:

$ python
>>> min(3194.0, 3225.0, 3202.0) / max(679.0, 635.0, 639.0)
4.703976435935199


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2015-12-18 Thread Amit Kapila
On Thu, Dec 17, 2015 at 9:33 PM, Aleksander Alekseev <
a.aleks...@postgrespro.ru> wrote:
>
> > I'd really like to see it being replaced by a queuing lock
> > (i.e. lwlock) before we go there. And then maybe partition the
> > freelist, and make nentries an atomic.
>
> I believe I just implemented something like this (see attachment). The
> idea is to partition PROCLOCK hash table manually into NUM_LOCK_
> PARTITIONS smaller and non-partitioned hash tables. Since these tables
> are non-partitioned spinlock is not used and there is no lock
> contention.
>

This idea can improve the situation with the PROCLOCK hash table, but
IIUC what Andres is suggesting would reduce the contention
around the dynahash freelist and could be helpful in many more situations,
including the BufMapping locks.
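The manual partitioning described in the quoted text can be sketched as a toy model — the class and method names below are invented for illustration and are not the actual dynahash API:

```python
NUM_LOCK_PARTITIONS = 16  # PostgreSQL's default partition count

class PartitionedHash:
    """Toy model: N independent, non-partitioned hash tables. Each key is
    routed to one sub-table, so in the real implementation only that
    sub-table's lock (and freelist) would ever be touched — the single
    shared HASHHDR spinlock disappears."""

    def __init__(self, nparts=NUM_LOCK_PARTITIONS):
        self.parts = [dict() for _ in range(nparts)]

    def _part(self, key):
        # Route each key to exactly one sub-table by hash value.
        return self.parts[hash(key) % len(self.parts)]

    def insert(self, key, value):
        self._part(key)[key] = value

    def lookup(self, key):
        return self._part(key).get(key)

h = PartitionedHash()
h.insert(('lock', 42), 'PROCLOCK entry')
print(h.lookup(('lock', 42)))  # PROCLOCK entry
```

The trade-off Robert points out later in the thread applies here: with unshared sub-tables, one sub-table can run out of entries while others still have free ones.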


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] psql --dry-run option

2015-12-18 Thread Shulgin, Oleksandr
On Thu, Dec 17, 2015 at 9:13 PM, Tom Lane  wrote:

>
> Whether we really need a feature like that isn't clear though; it's not
> like it's hard to test things that way now.  Stick in a BEGIN with no
> COMMIT, you're there.  The problem only comes in if you start expecting
> the behavior to be bulletproof.  Maybe I'm being too pessimistic about
> what people would believe a --dry-run switch to be good for ... but
> I doubt it.
>

I'm of the same opinion: BEGIN/ROLLBACK requires trivial effort, while a
--dry-run option might give a false sense of security: it cannot
possibly roll back side effects of user functions that modify the filesystem
or interact with the outside world in some other way.

--
Alex


Re: [HACKERS] parallel joins, and better parallel explain

2015-12-18 Thread Dilip Kumar
On Fri, Dec 18, 2015 at 7:59 AM, Robert Haas wrote:

> Uh oh.  That's not supposed to happen.  A GatherPath is supposed to
> have parallel_safe = false, which should prevent the planner from
> using it to form new partial paths.  Is this with the latest version
> of the patch?  The plan output suggests that we're somehow reaching
> try_partial_hashjoin_path() with inner_path being a GatherPath, but I
> don't immediately see how that's possible, because
> create_gather_path() sets parallel_safe to false unconditionally, and
> hash_inner_and_outer() never sets cheapest_safe_inner to a path unless
> that path is parallel_safe.

Yes, you are right that create_gather_path() sets parallel_safe to false
unconditionally, but whenever we build a non-partial path we should carry
the parallel_safe state forward to its parent, and it seems that part is
missing here.


I have done a quick hack in create_nestloop_path and the error is gone with
this change:

create_nestloop_path
{
    ...
    pathnode->path.param_info =
        get_joinrel_parampathinfo(root,
                                  joinrel,
                                  outer_path,
                                  inner_path,
                                  sjinfo,
                                  required_outer,
                                  &restrict_clauses);
    pathnode->path.parallel_aware = false;
    /* joinrel may be parallel-safe even when this particular path is
     * unsafe, so take the flag from inner_path and outer_path instead of
     * joinrel->consider_parallel: if either child path is parallel-unsafe,
     * mark the parent parallel-unsafe too. */
    pathnode->path.parallel_safe = inner_path->parallel_safe &&
                                   outer_path->parallel_safe;
    ...
}

The new plan is attached to this mail.

With this change the execution time has also halved (maybe because the
warning messages are gone).


> Do you have a self-contained test case that reproduces this, or any
> insight as to how it's happening here?

This is the TPC-H benchmark case; it can be set up like this:
1. git clone https://tkej...@bitbucket.org/tkejser/tpch-dbgen.git
2. compile using make
3. ./dbgen -v -s 5
4. ./qgen


Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

On Fri, Dec 18, 2015 at 7:59 AM, Robert Haas  wrote:

> On Thu, Dec 17, 2015 at 12:33 AM, Amit Kapila 
> wrote:
> > While looking at plans of Q5 and Q7, I have observed that Gather is
> > pushed below another Gather node for which we don't have appropriate
> > way of dealing.  I think that could be the reason why you are seeing
> > the errors.
>
> Uh oh.  That's not supposed to happen.  A GatherPath is supposed to
> have parallel_safe = false, which should prevent the planner from
> using it to form new partial paths.  Is this with the latest version
> of the patch?  The plan output suggests that we're somehow reaching
> try_partial_hashjoin_path() with inner_path being a GatherPath, but I
> don't immediately see how that's possible, because
> create_gather_path() sets parallel_safe to false unconditionally, and
> hash_inner_and_outer() never sets cheapest_safe_inner to a path unless
> that path is parallel_safe.
>
> Do you have a self-contained test case that reproduces this, or any
> insight as to how it's happening here?
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


q7_parallel_new.out
Description: Binary data



Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2015-12-18 Thread Aleksander Alekseev
> This idea can improve the situation with ProcLock hash table, but I
> think IIUC what Andres is suggesting would reduce the contention
> around dynahash freelist and can be helpful in many more situations
> including BufMapping locks.

I agree. But as I understand it, the PostgreSQL community doesn't generally
approve big changes that affect the whole system, especially if the original
problem was only in one particular place. Therefore for now I suggest
only a small change. Naturally, if it is accepted there is no
reason not to apply the same changes to BufMapping, or even to dynahash
itself with a corresponding PROCLOCK hash refactoring.

BTW, could you (or anyone?) please help me find the thread regarding
BufMapping, or perhaps provide a benchmark? I would like to reproduce
this issue but I can't find anything relevant in the mailing list. It also
seems like a good idea to compare the alternative approaches that were
mentioned (atomic ops, group leader). Are there any discussions,
benchmarks or patches regarding this topic?

Frankly, I have serious doubts about atomic ops, since they will most
likely create the same contention that a spinlock does. But perhaps
there is a patch that works differently than I imagine.




Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2015-12-18 Thread Andres Freund
On 2015-12-18 11:40:58 +0300, Aleksander Alekseev wrote:
> $ pgbench -j 64 -c 64 -f pgbench.sql -T 30 my_database

What's in pgbench.sql?

Greetings,

Andres Freund




Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2015-12-18 Thread Aleksander Alekseev
> What's in pgbench.sql?

It's from the first message of this thread:

http://www.postgresql.org/message-id/20151211170001.78ded9d7@fujitsu




Re: [HACKERS] [patch] Proposal for \rotate in psql

2015-12-18 Thread Pavel Stehule
2015-12-17 21:33 GMT+01:00 Pavel Stehule :

>
>
> 2015-12-14 23:09 GMT+01:00 Daniel Verite :
>
>> Pavel Stehule wrote:
>>
>> > postgres=# \crosstabview 4 +month label
>> >
>> > Maybe using optional int order column instead label is better - then
>> you can
>> > do sort on client side
>> >
>> > so the syntax can be "\crosstabview VCol [+/-]HCol [[+-]HOrderCol]
>>
>> In the meantime I've followed a different idea: allowing the
>> vertical header to be sorted too, still server-side.
>>
>> That's because to me, the first impulse for a user noticing that
>> it's not sorted vertically would be to write
>>  \crosstabview +customer month
>> rather than figure out the
>>  \crosstabview customer +month_number month_name
>> invocation.
>> But both ways aren't even mutually exclusive. We could support
>>  \crosstabview [+|-]colV[:labelV] [+|-]colH[:labelH]
>> it's more complicated to understand, but not  harder to implement.
>>
>> Also, a non-zero FETCH_COUNT is supported by this version of the patch,
>> if the first internal FETCH retrieves less than FETCH_COUNT rows.
>> Otherwise a specific error is emitted.
>>
>> Also there are minor changes in arguments and callers following
>> recent code changes for \o
>>
>> Trying to crosstab with 10k+ distinct values vertically, I've noticed
>> that the current code is too slow, spending too much time
>> sorting.  I'm currently replacing its simple arrays of distinct values
>> with AVL binary trees, which I expect to be much more efficient for
>> this.
>>
>
> I played with the last version and it looks good. I have only one remark,
> but it is subjective, so it can be ignored if you don't like it.
>
> The symbol 'X' in two-column mode should be centred; now it is aligned to
> the left, which is not nice. For the unicode line style I prefer some unicode
> symbol - your chr(10003) is nice.
>
>
I checked the code and I have only one note: the name "sortColumns" is no
longer accurate - maybe ServerSideSort or something similar would be
better. The error message "Unexpected value when sorting horizontal
headers" is obsolete too.

Regards

Pavel




> Regards
>
> Pavel
>
>
>
>>
>> Best regards,
>> --
>> Daniel Vérité
>> PostgreSQL-powered mailer: http://www.manitou-mail.org
>> Twitter: @DanielVerite
>>
>
>


Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2015-12-18 Thread Amit Kapila
On Fri, Dec 18, 2015 at 2:50 PM, Aleksander Alekseev <
a.aleks...@postgrespro.ru> wrote:
>
> > This idea can improve the situation with ProcLock hash table, but I
> > think IIUC what Andres is suggesting would reduce the contention
> > around dynahash freelist and can be helpful in many more situations
> > including BufMapping locks.
>
> I agree. But as I understand PostgreSQL community doesn't generally
> approve big changes that affects whole system. Especially if original
> problem was only in one particular place. Therefore for now I suggest
> only a small change. Naturally if it will be accepted there is no
> reason not to apply same changes for BufMapping or even dynahash itself
> with corresponding PROCLOCK hash refactoring.
>
> BTW could you (or anyone?) please help me find this thread regarding
> BufMapping or perhaps provide a benchmark?
>

You can find that in the thread below:
http://www.postgresql.org/message-id/CAA4eK1+U+GQDc2sio4adRk+ux6obFYRPxkY=ch5bknabtoo...@mail.gmail.com

That thread even contains the original idea and an initial patch for replacing
the spinlocks with atomic ops.  I mentioned the A-B-A problem
a few mails down in that thread and gave a link to a paper suggesting how
it can be solved.  It needs more work, but it is doable.
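For context, the A-B-A hazard with a CAS-based freelist pop, and the classic version-counter workaround, can be sketched in a toy single-threaded model. The names below are invented, and a real implementation would need a double-width atomic compare-exchange; this only illustrates the idea:

```python
class VersionedHead:
    """Stand-in for a (pointer, version) pair updated by one atomic
    compare-exchange. The CAS succeeds only if BOTH fields match, so a
    concurrent pop/push cycle that restores the same head pointer (the
    A-B-A scenario) is still detected via the bumped version counter."""

    def __init__(self):
        self.head = None
        self.version = 0

    def cas(self, expect_head, expect_version, new_head):
        if self.head is expect_head and self.version == expect_version:
            self.head, self.version = new_head, self.version + 1
            return True
        return False

class Node:
    def __init__(self, value):
        self.value, self.next = value, None

def push(top, node):
    while True:                       # retry loop, as with a real CAS
        h, v = top.head, top.version
        node.next = h
        if top.cas(h, v, node):
            return

def pop(top):
    while True:
        h, v = top.head, top.version
        if h is None:
            return None
        if top.cas(h, v, h.next):     # without v, A-B-A could corrupt the list
            return h

top = VersionedHead()
push(top, Node(1)); push(top, Node(2))
print(pop(top).value, pop(top).value)  # 2 1
```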

> I would like to reproduce
> this issue but I can't find anything relevant in a mailing list. Also
> it seems to be a good idea to compare alternative approaches that were
> mentioned (atomics ops, group leader). Are there any discussions,
> benchmarks or patches regarding this topic?
>

You can find the discussion and patch related to the group leader approach
in this thread:
http://www.postgresql.org/message-id/caa4ek1jbx4fzphignt0jsaz30a85bpjv+ewhk+wg_o-t6xu...@mail.gmail.com
That patch is already committed.

> Frankly I have serious doubts regarding atomics ops since they will more
> likely create the same contention that a spinlock does. But perhaps
> there is a patch that works not the way I think it could work.
>

I think it is difficult to say without implementing it.  If we want to
evaluate multiple approaches, then what we can do here is: I can help with
writing a patch using LWLocks, you can evaluate the atomic-ops approach,
and we already have your current patch; then we can see what works out
best.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


[HACKERS] Costing foreign joins in postgres_fdw

2015-12-18 Thread Ashutosh Bapat
Hi All,
Costs for foreign queries are either obtained from the foreign server using
EXPLAIN (if use_remote_estimate is ON) or cooked up locally based on the
statistics available.  For joins we have to do the same: if
use_remote_estimate [1] is ON, we can get the costs from the foreign
server.  The rest of this mail discusses approaches for estimating the
costs when use_remote_estimate is OFF.

1. Unlike base relations, where the table data "has to be" fetched from the
foreign server, a join doesn't "have to be" fetched from the foreign
server.  So while we do try to estimate something locally even when
use_remote_estimate is OFF for a base relation, for a join that's not
compulsory: we can choose not to estimate anything and not push the join
down when use_remote_estimate is OFF.  Whether we do that depends upon how
well we can estimate the join cost in that case.

2. Locally estimating the cost of a join that will be performed on the
foreign server is difficult, because we do not know which join strategy the
foreign server is going to use; that in turn depends upon the availability
of indexes, work memory, statistics about the joining expressions, etc.
One way to do this is to use the cost of the cheapest local join path built
upon foreign outer and inner paths to estimate the cost of executing the
join at the foreign server.  The startup and run-time costs of sending,
parsing and planning the query at the foreign server, as well as the cost
to fetch the tuples, need to be adjusted so that they don't get counted
twice.  We may assume that the cost of the foreign join will be some factor
of the adjusted cost, as we have done for estimating the cost of sort
pushdown.  The reason we choose the cheapest path with foreign inner and
outer paths is that it is likely to be closer to the real estimate than a
path which does not have foreign inner and outer paths.  In the absence of
such a path we should probably not push the join down, since no local path
has found pushing inner and outer down to be cheaper, and it's likely
(though certainly not a rule) that pushing the join in question down will
not be cheaper than the local paths.

The first option is easy but sounds too restrictive; the second option is
liberal but complex.

Any other ideas as to how we can estimate the cost of a foreign join when
use_remote_estimate is OFF?

[1]
http://www.postgresql.org/message-id/CAFjFpRepSC2e3mZ1uYSopJD6R19fOZ0dNNf9Z=gnyksb6wg...@mail.gmail.com
-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] Costing foreign joins in postgres_fdw

2015-12-18 Thread Albe Laurenz
Ashutosh Bapat wrote:
> Costs for foreign queries are either obtained from the foreign server using 
> EXPLAIN (if
> use_remote_estimate is ON) otherwise they are cooked up locally based on the 
> statistics available. For
> joins as well, we have to do the same. If use_remote_estimates [1] is ON, we 
> can get the costs from
> the foreign server. Rest of the mail discusses approaches for estimating the 
> costs when
> use_remote_estimates is OFF.
> 
> 
> 1. Unlike base relations where the table data "has to be" fetched from the 
> foreign server, a join
> doesn't "have to be" fetched from the foreign server. So, even if 
> use_remote_estimate is OFF for a
> base relation, we do try to estimate something locally. But for a join that's 
> not compulsory, so we
> can choose not to estimate anything and not push down the join if 
> use_remote_estimate is OFF. Whether
> we do that depends upon how well we can estimate the join cost when 
> use_remote_estimate is OFF.
> 
> 2. Locally estimating the cost of join that will be performed on the foreign 
> server is difficult
> because we do not know which join strategy the foreign server is going to 
> use, which in turn depends
> upon the availability of indexes, work memory, statistics about joining 
> expressions etc. One way to do
> this is to use the cost of cheapest local join path built upon foreign outer 
> and inner paths, to
> estimate the cost of executing the join at the foreign server The startup and 
> run time costs for
> sending, parsing and planning query at the foreign server as well as the cost 
> to fetch the tuples need
> to be adjusted, so that it doesn't get counted twice. We may assume that the 
> cost for the foreign join
> will be some factor of the adjusted cost, like we have done for estimating 
> cost of sort pushdown. The
> reason we choose cheapest path with foreign inner and outer paths is because 
> that's likely to be a
> closer to the real estimate than the path which does not have foreign inner 
> and outer paths. In the
> absence of such path, we should probably not push the join down since no 
> local path has found pushing
> inner and outer to be cheaper and it's likely (certainly not a rule) that 
> pushing the join in question
> down is not going to be cheaper than the local paths.
> 
> 
> 1st option is easy but it sounds too restrictive. 2nd option liberal but is 
> complex.

My gut feeling is that for a join where all join predicates can be pushed down, 
it
will usually be a win to push the join to the foreign server.

So in your first scenario, I'd opt for always pushing down the join
if possible if use_remote_estimate is OFF.

Your second scenario is essentially to estimate that a pushed down join will
always be executed as a nested loop join, which will in most cases produce
an unfairly negative estimate.

What about using local statistics to come up with an estimated row count for
the join and using that as the basis for an estimate?  My idea here is that
it is always a win to push down a join unless the result set is so large
that transferring it becomes the bottleneck.
Maybe, to come up with something remotely realistic, a formula like

sum of the locally estimated sequential-scan costs of the base tables
plus the estimated result row count (times a factor)
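As a rough sketch, the suggested formula could look like the following; the function name and the per-row transfer factor are invented placeholders, not PostgreSQL constants:

```python
def foreign_join_cost(seqscan_costs, est_rows, transfer_cost_per_row=0.01):
    """Sketch of the proposal above: charge the locally estimated
    sequential-scan cost of each base table, plus a per-row cost for
    transferring the estimated join result over the wire.  The default
    0.01 factor is purely illustrative."""
    return sum(seqscan_costs) + est_rows * transfer_cost_per_row

# Two base tables costing 1000 and 2000 units, join estimated at 50000 rows:
print(foreign_join_cost([1000.0, 2000.0], 50000))  # 3500.0
```

The key property is that the estimate grows with the transferred result size, so a join whose result is huge stops looking like a win to push down.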

Yours,
Laurenz Albe





Re: [HACKERS] statistics for array types

2015-12-18 Thread Alexander Korotkov
On Wed, Sep 16, 2015 at 8:01 PM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:

> On Mon, Aug 24, 2015 at 8:26 PM, Jeff Janes  wrote:
>
>> On Thu, Aug 20, 2015 at 6:00 PM, Tomas Vondra <
>> tomas.von...@2ndquadrant.com> wrote:
>>
>>> Hi,
>>>
>>> On 08/11/2015 04:38 PM, Jeff Janes wrote:
>>>
 When reviewing some recent patches, I decided the statistics gathered
  for arrays had some pre-existing shortcomings.

 The main one is that when the arrays contain rare elements there is
 no histogram to fall back upon when the MCE array is empty, the way
 there is for scalar stats.  So it has to punt completely and resort
 to saying that it is 0.5% selectivity without recourse to any data at
 all.

 The rationale for applying the threshold before things are eligible
 for inclusion in the MCE array seems to be that this puts some
 theoretical bound on the amount of error we are likely to have in
 that element.  But I think it is better to exceed that theoretical
 bound than it is to have no data at all.

 The attached patch forces there to be at least one element in MCE,
 keeping the one element with the highest predicted frequency if the
 MCE would otherwise be empty.  Then any other element queried for is
 assumed to be no more common than this most common element.

>>>
>>> We only really need the frequency, right? So do we really need to keep
>>> the actual MCV element? I.e. most_common_elem_freqs does not have the
>>> same number of values as most_common_elems anyway:
>>>
>>>   A list of the frequencies of the most common element values, i.e., the
>>>   fraction of rows containing at least one instance of the given value.
>>>   Two or three additional values follow the per-element frequencies;
>>>   these are the minimum and maximum of the preceding per-element
>>>   frequencies, and optionally the frequency of null elements.
>>>   (Null when most_common_elems is.)
>>>
>>> So we might modify it so that it's always defined - either it tracks the
>>> same values as today (when most_common_elems is defined), or the
>>> frequency of the most common element (when most_common_elems is NULL).
>>>
>>
>> I had also considered that.  It requires more changes to make it happen,
>> and it seems to create a more complex contract on what those columns mean,
>> but without giving a corresponding benefit.
>>
>>
>>>
>>> This way we can keep the current theoretical error-bound on the MCE
>>> frequencies, and if that's not possible we can have at least the new
>>> value without confusing existing code.
>>
>>
>> But if the frequency of the most common element was grossly wrongly, then
>> whatever value we stick in there is still going to be grossly wrong.
>> Removing the value associated with it isn't going to stop it from being
>> wrong.  When we do query with the (incorrectly thought) first most common
>> element, either it will find and use the wrong value from slot 1, or it
>> will find nothing and fall back on the same wrong value from slot 3.
>>
>
> Hmm, I think we should store cutoff_freq / nonnull_cnt as minfreq when we
> collect no MCEs. Moreover, I think we should store it even when num_mcelem
> >= track_len and we haven't cut MCEs we find. In this case we can get more
> precise estimation for rare element using the knowledge that all MCEs which
> exceed the threshold are present (assuming their frequencies could be much
> higher than threshold).
>
> When there are no MCEs then we should use assumption that there are no
> elements more frequent than cutoff_freq / nonnull_cnt. Using lower values
> wouldn't be statistically correct.
>

The patch implementing my idea above is attached. In your example it
gives the following result.

# explain (analyze) select * from foobar where foo @> '{567}';
  QUERY PLAN
---
 Seq Scan on foobar  (cost=0.00..2387.00 rows=30 width=61) (actual
time=28.691..28.691 rows=0 loops=1)
   Filter: (foo @> '{567}'::integer[])
   Rows Removed by Filter: 10
 Planning time: 0.044 ms
 Execution time: 28.707 ms
(5 rows)

In this particular example it gives a less accurate estimate. However, I
believe it is a safer estimate in general.

I've faced difficulties with storing an empty mcelements array:
update_attstats turns an empty array into NULL, and get_attstatsslot
throws an error when trying to read a NULL array. I've changed
get_attstatsslot so that it returns an empty array when it encounters NULL.
I'm not sure about this solution.
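The fallback rule proposed in the quoted text — treat cutoff_freq / nonnull_cnt as an upper bound on the frequency of any element absent from the MCE list — can be sketched as follows; the names are illustrative, not the actual array_typanalyze variables:

```python
def rare_element_selectivity(mce_freqs, cutoff_freq, nonnull_cnt):
    """Selectivity estimate for an element NOT found in the
    most-common-elements list.  If MCEs were collected, their recorded
    minimum frequency bounds the unseen element's frequency; if none
    were collected, assume no element is more frequent than the
    collection cutoff (cutoff_freq / nonnull_cnt) rather than falling
    back to an arbitrary constant like 0.5%."""
    if mce_freqs:
        return min(mce_freqs)
    return cutoff_freq / nonnull_cnt

# No MCEs collected, cutoff count 2 over 1000 non-null rows:
print(rare_element_selectivity([], 2, 1000))  # 0.002
```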

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


array_typanalyze_0_mce.patch
Description: Binary data



Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2015-12-18 Thread Teodor Sigaev

> Oh, that's an interesting idea.  I guess the problem is that if the
> freelist is unshared, then users might get an error that the lock
> table is full when some other partition still has elements remaining.


Could we split the one freelist in the hash into NUM_LOCK_PARTITIONS
freelists? Each partition would have its own freelist, and if its freelist
is empty the partition would search for an entry in the freelists of the
other partitions. To prevent concurrent access we'd need to add one LWLock
to the hash: each partition would take the LWLock in share mode to work
with its own freelist and in exclusive mode to work with other partitions'
freelists.
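A toy model of this scheme, with the locking omitted — this sketches the data-structure behaviour only, using invented names rather than the actual dynahash code:

```python
NUM_LOCK_PARTITIONS = 16  # PostgreSQL's default partition count

class PartitionedFreelists:
    """One freelist per partition; a partition whose own freelist is
    empty borrows an entry from another partition, so the table is
    only reported full when every freelist is exhausted.  In the real
    proposal the shared LWLock would be taken in share mode for the
    fast path and exclusive mode for the borrowing path."""

    def __init__(self, entries_per_part=4):
        self.freelists = [list(range(i * entries_per_part,
                                     (i + 1) * entries_per_part))
                          for i in range(NUM_LOCK_PARTITIONS)]

    def get_entry(self, partition):
        if self.freelists[partition]:             # fast path: own freelist
            return self.freelists[partition].pop()
        for other in range(NUM_LOCK_PARTITIONS):  # slow path: borrow
            if self.freelists[other]:
                return self.freelists[other].pop()
        return None                               # table is genuinely full

fl = PartitionedFreelists(entries_per_part=1)
fl.get_entry(0)                 # drains partition 0's own freelist
print(fl.get_entry(0) is None)  # False: borrowed from another partition
```

The borrowing path addresses Robert's objection above: no spurious "lock table full" errors while other partitions still hold free entries.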


Actually, I'd like to improve all partitioned hashes instead of improving
only this one case.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/




Re: [HACKERS] Function and view to retrieve WAL receiver status

2015-12-18 Thread Michael Paquier
On Fri, Dec 18, 2015 at 8:39 AM, Robert Haas  wrote:
> On Mon, Dec 14, 2015 at 7:23 PM, Michael Paquier
>  wrote:
>> On Tue, Dec 15, 2015 at 5:27 AM, Gurjeet Singh wrote:
>>> The function, maybe. But emitting an all-nulls row from a view seems
>>> counter-intuitive, at least when looking at it in context of relational
>>> database.
>>
>> OK, noted. Any other opinions?
>
> I wouldn't bother with the view.  If we're going to do it, I'd say
> just provide the function and let people SELECT * from it if they want
> to.

OK, I took some time to write a patch for that, as attached, and added it
to the next CF here:
https://commitfest.postgresql.org/8/447/
I am fine with switching to an SRF depending on other opinions here; it
just seems like overkill given that there is only one WAL receiver per
server.

I ended up with both a function and a system view; this is more in line
with existing things like pg_stat_archiver, and it also makes the
documentation clearer - at least that was my feeling while hacking on it.
Regards,
-- 
Michael


0001-Add-system-view-and-function-to-report-WAL-receiver-.patch
Description: binary/octet-stream



Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2015-12-18 Thread Aleksander Alekseev
> Could we split one freelist in hash to NUM_LOCK_PARTITIONS freelists?
> Each partition will have its own freelist and if freelist is empty
> then partition should search an entry in freelists of other
> partitions. To prevent concurrent access it's needed to add one
> LWLock to hash, each partition should lock LWlock in share mode to
> work with its own freelist and exclusive to work with other freelists.
> 
> Actually, I'd like to improve all partitioned hashes instead of
> improve only one case.

This seems to be the most promising direction to pursue for now. I will
send a patch and benchmark results soon.




Re: [HACKERS] A question regarding LWLock in ProcSleep

2015-12-18 Thread Amit Kapila
On Thu, Dec 17, 2015 at 1:51 PM, Kenan Yao  wrote:

> Hi there,
>
> In function ProcSleep, after the process has been waken up, either with
> lock granted or deadlock detected, it would re-acquire the lock table's
> partition LWLock.
>
> The code episode is here:
>
> /*
>  * Re-acquire the lock table's partition lock.  We have to do this to hold
>  * off cancel/die interrupts before we can mess with lockAwaited (else we
>  * might have a missed or duplicated locallock update).
>  */
> LWLockAcquire(partitionLock, LW_EXCLUSIVE);
>
> /*
>  * We no longer want LockErrorCleanup to do anything.
>  */
> lockAwaited = NULL;
>
> /*
>  * If we got the lock, be sure to remember it in the locallock table.
>  */
> if (MyProc->waitStatus == STATUS_OK)
> GrantAwaitedLock();
>
> /*
>  * We don't have to do anything else, because the awaker did all the
>  * necessary update of the lock table and MyProc.
>  */
> return MyProc->waitStatus;
>
> ​
> Questions are:
>
> (1) The comment says that "we might have a missed or duplicated locallock
> update"; in what cases would we hit this without holding the LWLock?
>
> (2) The comment says "we have to do this to hold off cancel/die
> interrupts", then:
>
>- why use LWLockAcquire instead of HOLD_INTERRUPTS directly?
>- From the handlers of SIGINT and SIGTERM, it seems nothing serious would
>be processed here, since no CHECK_FOR_INTERRUPTS is called before releasing
>this LWLock. Why should we hold off cancel/die interrupts here?
>
> (3) Before releasing this LWLock, the only shared-memory access is
> MyProc->waitStatus; since the process has been granted the lock or removed
> from the lock's waiting list because of deadlock, is it possible that some
> other process would access this field? If not, then why do we need the
> LWLock here? What does this lock protect?
>
>
I think the other thing which needs the protection of the LWLock is the
access to the proclock that is done in the caller
(LockAcquireExtended).




With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Remove array_nulls?

2015-12-18 Thread Robert Treat
On Thu, Dec 17, 2015 at 4:31 PM, Robert Haas  wrote:
> On Wed, Dec 16, 2015 at 10:48 PM, Jim Nasby  wrote:
>> IIUC, that means supporting backwards compat. GUCs for 10 years, which seems
>> a bit excessive. Granted, that's about the worst-case scenario for what I
>> proposed (ie, we'd still be supporting 8.0 stuff right now).
>
> Not to me.  GUCs like array_nulls don't really cost much - there is no
> reason to be in a hurry about removing them that I can see.
>

Perhaps not with rock-solid consistency, but we've certainly used the
argument of the "not a major major version release" to shoot down
introducing incompatible features / improvements (protocol changes
come to mind), which further lends credence to Jim's point about
people expecting backwards-incompatible breakage to come in major
major version changes.

Given that the overhead from a development standpoint is low, what's the
better user experience: delaying removal for as long as possible (~10
years) to narrow the likelihood of people being affected, or making such
changes as visible as possible (~6+ years) so that people have clear
expectations / lines of demarcation?

Robert Treat
play: xzilla.net
work: omniti.com




Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2015-12-18 Thread Aleksander Alekseev
> This idea can improve the situation with ProcLock hash table, but I
> think IIUC what Andres is suggesting would reduce the contention
> around dynahash freelist and can be helpful in many more situations
> including BufMapping locks.

I agree. But as I understand it, the PostgreSQL community doesn't generally
approve of big changes that affect the whole system, especially if the
original problem was in only one particular place. Therefore for now I
suggest only a small change. Naturally, if it is accepted, there is no
reason not to apply the same changes to BufMapping or even to dynahash
itself, with corresponding PROCLOCK hash refactoring.

BTW, could you (or anyone) please help me find the thread regarding
BufMapping, or perhaps provide a benchmark? I would like to reproduce
this issue but I can't find anything relevant in the mailing list. It
also seems like a good idea to compare the alternative approaches that
were mentioned (atomic ops, group leader). Are there any discussions,
benchmarks or patches regarding this topic?

Frankly, I have serious doubts regarding atomic ops, since they will more
likely create the same contention that a spinlock does. But perhaps
there is a patch that works differently from what I imagine.




Re: [HACKERS] parallel joins, and better parallel explain

2015-12-18 Thread Robert Haas
On Fri, Dec 18, 2015 at 3:54 AM, Dilip Kumar  wrote:
> On Fri, Dec 18, 2015 at 7:59 AM Robert Haas  wrote:
>> Uh oh.  That's not supposed to happen.  A GatherPath is supposed to
>> have parallel_safe = false, which should prevent the planner from
>> using it to form new partial paths.  Is this with the latest version
>> of the patch?  The plan output suggests that we're somehow reaching
>> try_partial_hashjoin_path() with inner_path being a GatherPath, but I
>> don't immediately see how that's possible, because
>> create_gather_path() sets parallel_safe to false unconditionally, and
>> hash_inner_and_outer() never sets cheapest_safe_inner to a path unless
>> that path is parallel_safe.
>
> Yes, you are right that create_gather_path() sets parallel_safe to false
> unconditionally, but whenever we build a non-partial path we should carry
> the parallel_safe state forward to its parent, and it seems like that part
> is missing here.

Ah, right.  Woops.  I can't exactly replicate your results, but I've
attempted to fix this in a systematic way in the new version attached
here (parallel-join-v3.patch).

>> Do you have a self-contained test case that reproduces this, or any
>> insight as to how it's happening here?
>
> This is TPC-H benchmark case:
> we can setup like this..
> 1. git clone https://tkej...@bitbucket.org/tkejser/tpch-dbgen.git
> 2. complie using make
> 3. ./dbgen -v -s 5
> 4. ./qgen

Thanks.  After a bit of fiddling I was able to get this to work.  I'm
attaching two other patches that seem to help this case quite
considerably.  The first (parallel-reader-order-v1) causes Gather to
read from the same worker repeatedly until it can't get another tuple
from that worker without blocking, and only then move on to the next
worker.  With 4 workers, this seems to be drastically more efficient
than what's currently in master - I saw the time for Q5 drop from over
17 seconds to about 6 (this was an assert-enabled build running with
EXPLAIN ANALYZE, though, so take those numbers with a grain of salt).
The second (gather-disuse-physical-tlist.patch) causes Gather to force
underlying scan nodes to project, which is a good idea here for
reasons very similar to why it's a good idea for the existing node
types that use disuse_physical_tlist: forcing extra data through the
Gather node is bad.  That shaved another half second off this query.
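The difference between round-robin reading and the reader-order change can be modelled with a toy queue (a hypothetical sketch, greatly simplified from the real Gather/TupleQueue machinery; all names are made up):

```c
#include <stddef.h>

/*
 * Toy model: each worker has a queue of tuples; gather_next() keeps
 * draining the current worker's queue and only advances to the next
 * worker when the current one has nothing more to offer (in the real
 * executor, when reading from it would block).
 */
#define NWORKERS   2
#define MAX_TUPLES 4

static int	queues[NWORKERS][MAX_TUPLES];
static int	qlen[NWORKERS];
static int	qpos[NWORKERS];
static int	current_worker = 0;

/* Returns 1 and stores a tuple in *out, or 0 when all queues are empty. */
static int
gather_next(int *out)
{
	int			tries;

	for (tries = 0; tries < NWORKERS; tries++)
	{
		if (qpos[current_worker] < qlen[current_worker])
		{
			*out = queues[current_worker][qpos[current_worker]++];
			return 1;			/* stay on this worker next time */
		}
		current_worker = (current_worker + 1) % NWORKERS;
	}
	return 0;
}
```

With workers holding {1,2} and {3,4}, this policy yields 1,2,3,4; a strict round-robin would yield 1,3,2,4, touching each shared queue far more often per tuple.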

The exact query I was using for testing was:

explain (analyze, verbose) select n_name, sum(l_extendedprice * (1 -
l_discount)) as revenue from customer, orders, lineitem, supplier,
nation, region where c_custkey = o_custkey and l_orderkey = o_orderkey
and l_suppkey = s_suppkey and c_nationkey = s_nationkey and
s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name =
'EUROPE' and o_orderdate >= date '1995-01-01' and o_orderdate < date
'1995-01-01' + interval '1' year group by n_name order by revenue
desc;

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From 088a96363231b441fe6aab744b04a522d01fbc17 Mon Sep 17 00:00:00 2001
From: Robert Haas 
Date: Thu, 19 Nov 2015 20:28:34 -0500
Subject: [PATCH 1/3] Gather pushdown for child tables, hash joins, nested
 loops.

Cost model fix for parallel seq scan.

Fixed this so it propagates the parallel_safe flag up the plan tree.
---
 src/backend/executor/execParallel.c |  66 +++---
 src/backend/nodes/outfuncs.c|   4 +-
 src/backend/optimizer/README|  55 -
 src/backend/optimizer/path/allpaths.c   | 164 +++
 src/backend/optimizer/path/costsize.c   |  32 +--
 src/backend/optimizer/path/joinpath.c   | 253 +-
 src/backend/optimizer/path/joinrels.c   |   3 +-
 src/backend/optimizer/plan/createplan.c |   2 +-
 src/backend/optimizer/plan/planmain.c   |   3 +-
 src/backend/optimizer/util/pathnode.c   | 361 +---
 src/backend/optimizer/util/relnode.c|   2 +
 src/include/nodes/relation.h|   4 +-
 src/include/optimizer/cost.h|   2 +-
 src/include/optimizer/pathnode.h|  12 +-
 src/include/optimizer/paths.h   |   2 +
 15 files changed, 845 insertions(+), 120 deletions(-)

diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 30e6b3d..5bc8eef 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -167,25 +167,25 @@ ExecParallelEstimate(PlanState *planstate, ExecParallelEstimateContext *e)
 	e->nnodes++;
 
 	/* Call estimators for parallel-aware nodes. */
-	switch (nodeTag(planstate))
+	if (planstate->plan->parallel_aware)
 	{
-		case T_SeqScanState:
-			ExecSeqScanEstimate((SeqScanState *) planstate,
-e->pcxt);
-			break;
-		default:
-			break;
+		switch (nodeTag(planstate))
+		{
+			case T_SeqScanState:
+ExecSeqScanEstimate((SeqScanState *) planstate,
+	e->pcxt);
+break;
+			default:
+break;
+		}
 	}
 
 	return planstate_tree_walker(planstate, 

Re: [HACKERS] Remove array_nulls?

2015-12-18 Thread Robert Haas
On Fri, Dec 18, 2015 at 9:52 AM, Robert Treat  wrote:
> On Thu, Dec 17, 2015 at 4:31 PM, Robert Haas  wrote:
>> On Wed, Dec 16, 2015 at 10:48 PM, Jim Nasby  wrote:
>>> IIUC, that means supporting backwards compat. GUCs for 10 years, which seems
>>> a bit excessive. Granted, that's about the worst-case scenario for what I
>>> proposed (ie, we'd still be supporting 8.0 stuff right now).
>>
>> Not to me.  GUCs like array_nulls don't really cost much - there is no
>> reason to be in a hurry about removing them that I can see.
>>
>
> Perhaps not with rock-solid consistency, but we've certainly used the
> argument of the "not a major major version release" to shoot down
> introducing incompatible features / improvements (protocol changes
> come to mind), which further lends credence to Jim's point about
> people expecting backwards-incompatible breakage to come in major
> major version changes.

My memory is that Tom usually argues pretty vigorously against the
idea that there's anything special about a first-digit bump in terms
of incompatibilities.  But your memory may have a longer reach than
mine.

> Given that the overhead from a development standpoint is low, what's the
> better user experience: delaying removal for as long as possible (~10
> years) to narrow the likelihood of people being affected, or making such
> changes as visible as possible (~6+ years) so that people have clear
> expectations / lines of demarcation?

IMHO, it's almost hopeless to expect users to prepare for incompatible
changes we want to make.  When we try to force it, as we did with
standard_conforming_strings or the 8.3-vintage casting changes, we
cause a lot of user pain and that's about it.  People don't say "ah,
these changes are coming, I need to adjust my app to be safe in this
new world"; instead, they say "crap, I can't upgrade, PostgreSQL
hackers suck".  We spend our days and nights worrying about this
stuff, but real users don't.  They just get knocked over when the
change hits.  Or the people who understand that the problem is coming
are in a different group than the people who have to fix it, so it
doesn't get fixed.  Or whatever.

My experience is that it is very common for users to upgrade across a
whole series of releases at the same time.  People don't upgrade from
8.3 to 8.4 and then to 9.0, or even from 8.3 to 9.0 to 9.2.  I mean,
some do.  But people doing things like 8.2 -> 9.3 are not that
uncommon, at least in my experience with EnterpriseDB customers.
That's why we support releases for five full years, right?  So that
people don't necessarily have to upgrade more than about that often.
Ideally they'd upgrade about every 4 years, so that they get off the
oldest supported release before it actually drops out of support, but
for various reasons it often takes longer than that, and their current
version is out of support before they get around to upgrading.  We can
wag our tongues and cluck at those people, but when we're quick to
pull the trigger on removing backward compatibility hacks, we actually
make the problem worse, not better.  Now people stay on the old
versions even longer, because they can't upgrade until they fix their
app.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2015-12-18 Thread Robert Haas
On Fri, Dec 18, 2015 at 8:46 AM, Teodor Sigaev  wrote:
>> Oh, that's an interesting idea.  I guess the problem is that if the
>> freelist is unshared, then users might get an error that the lock
>> table is full when some other partition still has elements remaining.
>
> Could we split the single freelist in the hash into NUM_LOCK_PARTITIONS
> freelists? Each partition would have its own freelist, and if that freelist
> is empty the partition would search for an entry in the freelists of other
> partitions. To prevent concurrent access we would need to add one LWLock to
> the hash; each partition would lock the LWLock in share mode to work with
> its own freelist and in exclusive mode to work with other freelists.
>
> Actually, I'd like to improve all partitioned hashes instead of improving
> only one case.

Yeah.  I'm not sure that should be an LWLock rather than a spinlock,
but we can benchmark it both ways.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Remove array_nulls?

2015-12-18 Thread Alvaro Herrera
Robert Haas wrote:
> On Fri, Dec 18, 2015 at 9:52 AM, Robert Treat  wrote:

> > Perhaps not with rock-solid consistency, but we've certainly used the
> > argument of the "not a major major version release" to shoot down
> > introducing incompatible features / improvements (protocol changes
> > come to mind), which further lends credence to Jim's point about
> > people expecting backwards-incompatible breakage to come in major
> > major version changes.
> 
> My memory is that Tom usually argues pretty vigorously against the
> idea that there's anything special about a first-digit bump in terms
> of incompatibilities.  But your memory may have a longer reach than
> mine.

We haven't made a lot of first-digit changes, and if I recall correctly
it has been mostly a matter of PR, based on sets of particular features,
rather than objective technical criteria (such as changes in backwards
compatibility or such).  For instance we changed from 7 to 8 mostly
because of adding the Windows port and PITR, and from 8 to 9 because of
replication -- you could think of those as major steering changes, with
large influence in what came afterwards.

I don't know what would be a good reason to change from 9 to 10, but
certainly we shouldn't do it just to remove a couple of GUCs -- much
less do it for no reason at all (which would be what "but 9.6 is too
close to 9.10 already" would boil down to.)  I sure hope we're gonna
find some good reason to do it before 9.10 actually comes around.

> > Given that the overhead from a development standpoint is low, what's the
> > better user experience: delaying removal for as long as possible (~10
> > years) to narrow the likelihood of people being affected, or making such
> > changes as visible as possible (~6+ years) so that people have clear
> > expectations / lines of demarcation?
> 
> IMHO, it's almost hopeless to expect users to prepare for incompatible
> changes we want to make.  When we try to force it, as we did with
> standard_conforming_strings or the 8.3-vintage casting changes, we
> cause a lot of user pain and that's about it.  People don't say "ah,
> these changes are coming, I need to adjust my app to be safe in this
> new world"; instead, they say "crap, I can't upgrade, PostgreSQL
> hackers suck".  We spend our days and nights worrying about this
> stuff, but real users don't.  They just get knocked over when the
> change hits.

Agreed on this.  (As anecdote, I remember people relying on
add_missing_from, and never fixing the apps, until they absolutely had
no choice because the GUC was removed, even though I warned years in
advance.)

On the other hand, while I agree with you that we should strive to
maintain backwards compatible options for a long time, and that in this
particular case I see no reason not to wait a few more releases since it
doesn't hurt anything, I don't think we should make this an iron-clad
rule: I imagine there might be cases where there will be good reasons to
break it sooner than those 10 years if maintenance becomes a serious
problem otherwise.  We will need to discuss each case individually as it
comes up.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Remove array_nulls?

2015-12-18 Thread Tom Lane
Robert Haas  writes:
> My experience is that it is very common for users to upgrade across a
> whole series of releases at the same time.  People don't upgrade from
> 8.3 to 8.4 and then to 9.0, or even from 8.3 to 9.0 to 9.2.  I mean,
> some do.  But people doing things like 8.2 -> 9.3 is not that
> uncommon, at least in my experience with EnterpriseDB customers.
> That's why we support releases for five full years, right?  So that
> people don't necessarily have to upgrade more than about that often.

Yeah.  For a recent example, see yesterday's thread with someone inquiring
about known bugs in 8.1 ... and they did not sound like they had any
intention of getting off that soon.  But when they do, they're going to
have eight or ten years' worth of incompatibilities to deal with.

I did a quick troll through the commit log looking for previous cases
where we have removed backwards-compatibility GUCs.  I could find only
two:


commit ab61df9e527dcedbd3bbefbcb8b634b0b72f2ad5
Author: Tom Lane 
Date:   Wed Oct 21 20:38:58 2009 +

Remove regex_flavor GUC, so that regular expressions are always "advanced"
style by default.  Per discussion, there seems to be hardly anything that
really relies on being able to change the regex flavor, so the ability to
select it via embedded options ought to be enough for any stragglers.
Also, if we didn't remove the GUC, we'd really be morally obligated to
mark the regex functions non-immutable, which'd possibly create performance
issues.

commit 289e2905c82fc37f8b82b088bb823742aad4bb68
Author: Tom Lane 
Date:   Wed Oct 21 20:22:38 2009 +

    Remove add_missing_from GUC and associated parser support for "implicit RTEs".
    Per recent discussion, add_missing_from has been deprecated for long enough to
    consider removing, and it's getting in the way of planned parser refactoring.
    The system now always behaves as though add_missing_from were OFF.


So in both those cases, there was an active reason to remove the GUC,
not just "it seems like this is too old for anyone to want it anymore".

Also worthy of remark is that add_missing_from and regex_flavor were both
added in 2003, so their useful lifespan was a lot shorter than what's
being suggested in this thread.

Not entirely sure what to make of this.  It occurs to me that the "it
breaks immutability" argument might apply to array_nulls, though I've
not done any legwork to confirm or disprove that.  If it doesn't apply,
though, I'm leaning to the position that there's no reason to remove
array_nulls.

regards, tom lane




Re: [HACKERS] Remove array_nulls?

2015-12-18 Thread Robert Haas
On Fri, Dec 18, 2015 at 11:08 AM, Alvaro Herrera
 wrote:
> I don't know what would be a good reason to change from 9 to 10, but
> certainly we shouldn't do it just to remove a couple of GUCs -- much
> less do it for no reason at all (which would be what "but 9.6 is too
> close to 9.10 already" would boil down to.)  I sure hope we're gonna
> find some good reason to do it before 9.10 actually comes around.

I don't want to toot my own horn too much here, but I think parallel
query might be an appropriate milestone.  If we don't get any more
than what we have now done for this release, well, no, probably not.
But what if we get the parallel join stuff I've posted and the work
David Rowley and Haribabu Kommi are doing on parallel aggregate done?
At that point I think it starts to get pretty interesting.  Sure,
there will be plenty of work left to do, but that's often true: 9.0
introduced streaming replication and Hot Standby, but they got a lot
more usable in 9.1.

> On the other hand, while I agree with you that we should strive to
> maintain backwards compatible options for a long time, and that in this
> particular case I see no reason not to wait a few more releases since it
> doesn't hurt anything, I don't think we should make this an iron-clad
> rule: I imagine there might be cases where there will be good reasons to
> break it sooner than those 10 years if maintenance becomes a serious
> problem otherwise.  We will need to discuss each case individually as it
> comes up.

Right.  If a particular backward-compatibility flag is preventing
important improvements, that's a good argument for phasing it out
sooner.  But that's certainly not the case with array_nulls so, uh,
who cares?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2015-12-18 Thread Robert Haas
On Fri, Dec 18, 2015 at 1:16 AM, Amit Kapila  wrote:
> 1. At scale factor 300, there is gain of 11% at 128-client count and
> 27% at 256 client count with Patch-1. At 4 clients, the performance with
> Patch is 0.6% less (which might be a run-to-run variation or there could
> be a small regression, but I think it is too less to be bothered about)
>
> 2. At scale factor 1000, there is no visible difference, and at lower
> client counts there is a <1% regression which could be due to the
> I/O-bound nature of the test.
>
> 3. On these runs, Patch-2 is mostly always worse than Patch-1, but
> the difference between them is not significant.

Hmm, that's interesting.  So the slots don't help.  I was concerned
that with only a single slot, you might have things moving quickly
until you hit the point where you switch over to the next clog
segment, and then you get a bad stall.  It sounds like that either
doesn't happen in practice, or more likely it does happen but the
extra slot doesn't eliminate the stall because there's I/O at that
point.  Either way, it sounds like we can forget the slots idea for
now.

>> Some random comments:
>>
>> - TransactionGroupUpdateXidStatus could do just as well without
>> add_proc_to_group.  You could just say if (group_no >= NUM_GROUPS)
>> break; instead.  Also, I think you could combine the two if statements
>> inside the loop.  if (nextidx != INVALID_PGPROCNO &&
>> ProcGlobal->allProcs[nextidx].clogPage == proc->clogPage) break; or
>> something like that.
>>
>> - memberXid and memberXidstatus are terrible names.  Member of what?
>
> How about changing them to clogGroupMemberXid and
> clogGroupMemberXidStatus?

What we've currently got for group XID clearing for the ProcArray is
clearXid, nextClearXidElem, and backendLatestXid.  We should try to
make these things consistent.  Maybe rename those to
procArrayGroupMember, procArrayGroupNext, procArrayGroupXid and then
start all of these identifiers with clogGroup as you propose.

>> That's going to be clear as mud to the next person looking at the
>> definition of PGPROC.
>
> I understand that you don't like the naming convention, but using
> such harsh language could sometimes hurt others.

Sorry.  If I am slightly frustrated here I think it is because this
same point has been raised about three times now, by me and also by
Andres, just with respect to this particular technique, and also on
other patches.  But you are right - that is no excuse for being rude.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Costing foreign joins in postgres_fdw

2015-12-18 Thread Robert Haas
On Fri, Dec 18, 2015 at 8:09 AM, Albe Laurenz  wrote:
> My gut feeling is that for a join where all join predicates can be pushed 
> down, it
> will usually be a win to push the join to the foreign server.
>
> So in your first scenario, I'd opt for always pushing down the join
> if possible if use_remote_estimate is OFF.
>
> Your second scenario is essentially to estimate that a pushed down join will
> always be executed as a nested loop join, which will in most cases produce
> an unfairly negative estimate.

+1 to all that.  Whatever we do here for costing in detail, it should
be set up so that the pushed-down join wins unless there's some pretty
tangible reason to think, in a particular case, that it will lose.

> What about using local statistics to come up with an estimated row count for
> the join and use that as the basis for an estimate?  My idea here is that it
> will always be a win to push down a join unless the result set is so large that
> transferring it becomes the bottleneck.

This also sounds about right.

> Maybe, to come up with something remotely realistic, a formula like
>
> sum of locally estimated costs of sequential scan for the base table
> plus count of estimated result rows (times a factor)

Was this meant to say "the base tables", plural?

I think whatever we do here should try to extend the logic in
postgres_fdw's estimate_path_cost_size() to foreign tables in some
reasonably natural way, but I'm not sure exactly what that should look
like.  Maybe do what that function currently does for single-table
scans, and then add all the values up, or something like that.  I'm a
little worried, though, that the planner might then view a query that
will be executed remotely as a nested loop with inner index-scan as
not worth pushing down, because in that case the join actually will
not touch every row from both tables, as a hash or merge join would.
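Laurenz's formula, as quoted above, can be written down directly. This is a sketch under stated assumptions: seqscan_costs, est_join_rows, and fdw_tuple_cost are made-up names for illustration, not postgres_fdw's actual fields or API:

```c
/*
 * Rough cost for a pushed-down foreign join, per the formula quoted
 * above: the sum of the locally estimated sequential-scan costs of the
 * base tables, plus the estimated number of result rows times a
 * per-row transfer factor (transferring the result set is assumed to
 * be the potential bottleneck).
 */
static double
pushed_down_join_cost(const double *seqscan_costs, int nrels,
					  double est_join_rows, double fdw_tuple_cost)
{
	double		cost = 0.0;
	int			i;

	for (i = 0; i < nrels; i++)
		cost += seqscan_costs[i];	/* local estimate per base table */
	cost += est_join_rows * fdw_tuple_cost;	/* shipping the result */
	return cost;
}
```

One property worth noting: because the estimate never depends on the remote join strategy, it sidesteps the "assume a nested loop" pessimism, at the price of possibly overcharging joins the remote server could evaluate with an inner index scan.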

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [COMMITTERS] pgsql: Handle policies during DROP OWNED BY

2015-12-18 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
> This appears to address one of the open items at
> https://wiki.postgresql.org/wiki/PostgreSQL_9.5_Open_Items -- if so,
> please update that page.

Done (and re-done with the wiki restore).

No open items remain against 9.5.

Thanks!

Stephen




Re: [HACKERS] [COMMITTERS] pgsql: Handle policies during DROP OWNED BY

2015-12-18 Thread Robert Haas
On Fri, Dec 18, 2015 at 11:40 AM, Stephen Frost  wrote:
> * Robert Haas (robertmh...@gmail.com) wrote:
>> This appears to address one of the open items at
>> https://wiki.postgresql.org/wiki/PostgreSQL_9.5_Open_Items -- if so,
>> please update that page.
>
> Done (and re-done with the wiki restore).
>
> No open items remain against 9.5.

Woohoo.  And also, phew.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-12-18 Thread Robert Haas
On Thu, Dec 17, 2015 at 2:55 AM, Peter Geoghegan  wrote:
> On Wed, Dec 16, 2015 at 11:44 PM, Peter Geoghegan  wrote:
>>> In any case, at this point 9.5 is really aimed to be stabilized, so
>>> targeting only master is a far saner approach IMO for this patch.
>>> Pushing that in 9.5 a couple of months back may have given enough
>>> reason to do so... But well life is life.
>>
>> No, this really isn't an optimization at all.
>
> I should add: I think that the chances of this patch destabilizing the
> code are very slim, once it receives the proper review. Certainly, I
> foresee no possible downside to not inserting the doomed IndexTuple,
> since it's guaranteed to have its heap tuple super-deleted immediately
> afterwards.
>
> That's the only real behavioral change proposed here. So, I would
> prefer it if we got this in before the first stable release of 9.5.

No, it's far too late to be pushing this into 9.5.  We are at RC1 now
and hoping to cut a final release right after Christmas.  I think it's
quite wrong to argue that these changes have no risk of destabilizing
9.5.  Nobody is exempt from having bugs in their code - not me, not
you, not Tom Lane.  But quite apart from that, there seems to be no
compelling benefit to having these changes in 9.5.  You say that the
branches will diverge needlessly, but the whole point of having
branches is that we do need things to diverge.  The question isn't
"why shouldn't these go into 9.5?" but "do these fix something that is
clearly broken in 9.5 and must be fixed to avoid hurting users?".
Andres has said clearly that he doesn't think so, and Heikki didn't
seem convinced that we wanted the changes at all.  I've read over the
thread and I think that even if all the good things you say about this
patch are 100% true, it doesn't amount to a good reason to back-patch.
Code that does something possibly non-sensical or sub-optimal isn't a
reason to back-patch in the absence of a clear, user-visible
consequence.

I think it's a shame that we haven't gotten this patch dealt with just
because when somebody submits a patch in June, it's not very nice for
it to still be pending in December, but since this stuff is even
further outside my area of expertise than the sorting stuff, and since
me and my split personalities only have so many hours in the day, I'm
going to have to leave it to somebody else to pick up anyhow.  But
that's a separate issue from whether this should be back-patched.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Remove array_nulls?

2015-12-18 Thread Andres Freund
On 2015-12-16 19:01:40 -0500, Robert Haas wrote:
> Yeah, there's something to be said for that, although to be honest in
> most cases I'd prefer to wait longer.   I wonder about perhaps
> planning to drop things after two lifecycles.

I don't really give a damn in this specific case. Seems to cost pretty
much nothing to continue having the GUC.

But in the more general case, which Tom seems to have brought up
as a point of policy, I think this is far too conservative. Yes, we owe
it to our users not to break their applications gratuitously. But we also
owe it to ourselves to keep development timeframes realistic, and not pay
overly much heed to people using seriously bad development and
maintenance practices.

Delaying things that long doesn't really benefit users much, either.
Usually the migration costs of fixing code previously kept working by a
GUC increase over time, not decrease.




Re: [HACKERS] Typo in the comment above heap_prepare_freeze_tuple()

2015-12-18 Thread Robert Haas
On Fri, Dec 18, 2015 at 1:25 AM, Amit Langote
 wrote:
> I think the following may be a typo:
>
>   * Caller is responsible for ensuring that no other backend can access the
>   * storage underlying this tuple, either by holding an exclusive lock on the
> - * buffer containing it (which is what lazy VACUUM does), or by having it by
> + * buffer containing it (which is what lazy VACUUM does), or by having it be
>   * in private storage (which is what CLUSTER and friends do).
>
> If so, attached is the patch.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Remove array_nulls?

2015-12-18 Thread Robert Haas
On Fri, Dec 18, 2015 at 12:02 PM, Andres Freund  wrote:
> On 2015-12-16 19:01:40 -0500, Robert Haas wrote:
>> Yeah, there's something to be said for that, although to be honest in
>> most cases I'd prefer to wait longer.   I wonder about perhaps
>> planning to drop things after two lifecycles.
>
> I don't really give a damn in this specific case. Seems to cost pretty
> much nothing to continue having the GUC.
>
> But I think in the more general case, which Tom seems to have brought up
> as a point of policy, I think this is far too conservative. Yes, we owe
> our users to not break their applications gratuitously. But we also owe
> it to ourselves to keep development timeframes realistic, and not pay
> overly much heed to people using seriously bad development and
> maintenance practices.

Well, Tom, Alvaro, and I all pretty much said that removing things
when it's blocking further development makes sense, but that there's
no hurry to remove anything else.  That sounds like what you are
saying, too.  So what's the actual disagreement here?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Freeze avoidance of very large table.

2015-12-18 Thread Robert Haas
On Thu, Dec 17, 2015 at 1:17 AM, Michael Paquier
 wrote:
> I am not really getting the meaning of this sentence. Shouldn't this
> be reworded something like:
> "Freezing occurs on the whole table once all pages of this relation require 
> it."

That statement isn't remotely true, and I don't think this patch
changes that.  Freezing occurs on the whole table once relfrozenxid is
old enough that we think there might be at least one page in the table
that requires it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Remove array_nulls?

2015-12-18 Thread Andres Freund
On 2015-12-18 12:06:43 -0500, Robert Haas wrote:
> Well, Tom, Alvaro, and I all pretty much said that removing things
> when it's blocking further development makes sense, but that there's
> no hurry to remove anything else.  That sounds like what you are
> saying, too.  So what's the actual disagreement here?

I'm saying that 10 year deprecation periods don't make sense. Either we
decide to remove the compat switch because we dislike it for $reasons,
in which case it should be removed sooner. Or we decide to keep the
switch indefinitely.




Re: [HACKERS] Remove array_nulls?

2015-12-18 Thread Tom Lane
I wrote:
> Not entirely sure what to make of this.  It occurs to me that the "it
> breaks immutability" argument might apply to array_nulls, though I've
> not done any legwork to confirm or disprove that.  If it doesn't apply,
> though, I'm leaning to the position that there's no reason to remove
> array_nulls.

OK, I went and looked.  Array_nulls is consulted only in array_in(),
which is marked stable (and would need to be so marked even without
this consideration, since the array element type's input function
might only be stable).  So it's not breaking any rules.

regards, tom lane




Re: [HACKERS] Freeze avoidance of very large table.

2015-12-18 Thread Robert Haas
On Thu, Dec 17, 2015 at 2:26 AM, Andres Freund  wrote:
> On 2015-12-17 16:22:24 +0900, Michael Paquier wrote:
>> On Thu, Dec 17, 2015 at 4:10 PM, Andres Freund  wrote:
>> > On 2015-12-17 15:56:35 +0900, Michael Paquier wrote:
>> >> On Thu, Dec 17, 2015 at 3:44 PM, Simon Riggs  
>> >> wrote:
>> >> > For me, rewriting the visibility map is a new data loss bug waiting to
>> >> > happen. I am worried that the group is not taking seriously the 
>> >> > potential
>> >> > for catastrophe here.
>> >>
>> >> FWIW, I'm following this line and merging the vm file into a single
>> >> unit looks like a ticking bomb.
>> >
>> > And what are those risks?
>>
>> Incorrect vm file rewrite after a pg_upgrade run.
>
> If we can't manage to rewrite a file, replacing a binary b1 with a b10,
> then we shouldn't be working on a database. And if we screw up, recovery
> is an rm *_vm away. I can't imagine that this is going to be the
> actually complicated part of this feature.

Yeah.  If that part of this feature isn't right, the chances that the
rest of the patch is robust enough to commit seem extremely low.
That is, as Andres says, not the hard part.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Remove array_nulls?

2015-12-18 Thread Robert Haas
On Fri, Dec 18, 2015 at 12:10 PM, Andres Freund  wrote:
> On 2015-12-18 12:06:43 -0500, Robert Haas wrote:
>> Well, Tom, Alvaro, and I all pretty much said that removing things
>> when it's blocking further development makes sense, but that there's
>> no hurry to remove anything else.  That sounds like what you are
>> saying, too.  So what's the actual disagreement here?
>
> I'm saying that 10 year deprecation periods don't make sense. Either we
> decide to remove the compat switch because we dislike it for $reasons,
> in which case it should be removed sooner. Or we decide to keep the
> switch indefinitely.

Forever is an awfully long time.  I think that it's OK to remove
backward-compatibility features at some point even if they're not
really harming anything.  I think the time before we do that should be
long, but I don't think it needs to be forever.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Remove array_nulls?

2015-12-18 Thread Joshua D. Drake

On 12/18/2015 09:12 AM, Robert Haas wrote:

> On Fri, Dec 18, 2015 at 12:10 PM, Andres Freund  wrote:
>> On 2015-12-18 12:06:43 -0500, Robert Haas wrote:
>>> Well, Tom, Alvaro, and I all pretty much said that removing things
>>> when it's blocking further development makes sense, but that there's
>>> no hurry to remove anything else.  That sounds like what you are
>>> saying, too.  So what's the actual disagreement here?
>>
>> I'm saying that 10 year deprecation periods don't make sense. Either we
>> decide to remove the compat switch because we dislike it for $reasons,
>> in which case it should be removed sooner. Or we decide to keep the
>> switch indefinitely.
>
> Forever is an awfully long time.  I think that it's OK to remove
> backward-compatibility features at some point even if they're not
> really harming anything.  I think the time before we do that should be
> long, but I don't think it needs to be forever.


Why not just keep it at the same rate as our support policy? The feature 
gets 5 years, then it is removed.


JD



--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.




Re: [HACKERS] Remove array_nulls?

2015-12-18 Thread Robert Haas
On Fri, Dec 18, 2015 at 12:19 PM, Joshua D. Drake  
wrote:
> On 12/18/2015 09:12 AM, Robert Haas wrote:
>>
>> On Fri, Dec 18, 2015 at 12:10 PM, Andres Freund 
>> wrote:
>>>
>>> On 2015-12-18 12:06:43 -0500, Robert Haas wrote:

 Well, Tom, Alvaro, and I all pretty much said that removing things
 when it's blocking further development makes sense, but that there's
 no hurry to remove anything else.  That sounds like what you are
 saying, too.  So what's the actual disagreement here?
>>>
>>>
>>> I'm saying that 10 year deprecation periods don't make sense. Either we
>>> decide to remove the compat switch because we dislike it for $reasons,
>>> in which case it should be removed sooner. Or we decide to keep the
>>> switch indefinitely.
>>
>>
>> Forever is an awfully long time.  I think that it's OK to remove
>> backward-compatibility features at some point even if they're not
>> really harming anything.  I think the time before we do that should be
>> long, but I don't think it needs to be forever.
>
>
> Why not just keep it at the same rate as our support policy? The feature
> gets 5 years, then it is removed.

I did discuss that exact question in several previous postings to this thread...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Remove array_nulls?

2015-12-18 Thread Tom Lane
Robert Haas  writes:
> On Fri, Dec 18, 2015 at 12:10 PM, Andres Freund  wrote:
>> I'm saying that 10 year deprecation periods don't make sense. Either we
>> decide to remove the compat switch because we dislike it for $reasons,
>> in which case it should be removed sooner. Or we decide to keep the
>> switch indefinitely.

> Forever is an awfully long time.  I think that it's OK to remove
> backward-compatibility features at some point even if they're not
> really harming anything.  I think the time before we do that should be
> long, but I don't think it needs to be forever.

Maybe I shouldn't put words in Andres' mouth, but I don't think that by
"indefinitely" he meant "forever".  I read that more as "until some
positive reason to remove it arrives".  I could imagine that at some point
we decide to do a wholesale cleanup of backwards-compatibility GUCs, and
then we'd zap this one along with others.

By itself, though, array_nulls seems about as harmless as such things get.
The sum total of the code simplification we'd get from removing it is
that the first segment of this if-test would go away:

if (Array_nulls && !hasquoting &&
pg_strcasecmp(itemstart, "NULL") == 0)

So there's no plausible argument that it's causing development problems.

I am mindful of Josh's frequent complaint that we have too many GUCs,
which is a legitimate concern; but removing just one won't do much
for that.

regards, tom lane




Re: [HACKERS] [PATCH] Copy-pasteo in logical decoding

2015-12-18 Thread Robert Haas
On Thu, Dec 17, 2015 at 11:46 PM, Craig Ringer  wrote:
> Trivial fix for a copy-and-paste error in a logical decoding error callback.

Committed and back-patched to 9.5.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Remove array_nulls?

2015-12-18 Thread David G. Johnston
On Fri, Dec 18, 2015 at 10:25 AM, Tom Lane  wrote:

> Robert Haas  writes:
> > On Fri, Dec 18, 2015 at 12:10 PM, Andres Freund 
> wrote:
> >> I'm saying that 10 year deprecation periods don't make sense. Either we
> >> decide to remove the compat switch because we dislike it for $reasons,
> >> in which case it should be removed sooner. Or we decide to keep the
> >> switch indefinitely.
>
> > Forever is an awfully long time.  I think that it's OK to remove
> > backward-compatibility features at some point even if they're not
> > really harming anything.  I think the time before we do that should be
> > long, but I don't think it needs to be forever.
>
> Maybe I shouldn't put words in Andres' mouth, but I don't think that by
> "indefinitely" he meant "forever".  I read that more as "until some
> positive reason to remove it arrives".  I could imagine that at some point
> we decide to do a wholesale cleanup of backwards-compatibility GUCs, and
> then we'd zap this one along with others.
>

Hand-waving from me, but I see a "positive reason" being that someone wants
to write and commit a patch that does not play nicely with the old
behavior.  That patch can then do away with giving the user an option as
long as the GUC itself was introduced in a now unsupported release.

I do not have a feel for how much grief having these GUCs in the code
causes but if the concern is for the end-user then simply removing them
from (or tucking them into a dark corner of) the documentation seems like
it would be the most useful means of "removing" while still being friendly to
users that just haven't wanted to update their application code to the new
way of doing things.

David J.


Re: [HACKERS] Re: Reusing abbreviated keys during second pass of ordered [set] aggregates

2015-12-18 Thread Robert Haas
On Thu, Dec 17, 2015 at 11:43 PM, Peter Geoghegan  wrote:
> On Wed, Dec 16, 2015 at 12:04 PM, Peter Geoghegan  wrote:
>>> What kind of state is that?  Can't we define this in terms of what it
>>> is rather than how it gets that way?
>>
>> It's zeroed.
>>
>> I guess we can update everything, including existing comments, to reflect 
>> that.

Thanks, this looks much easier to understand from here.

> Attached revision updates both the main commit (the optimization), and
> the backpatch commit (updated the contract).

-   /* abbreviation is possible here only for by-reference types */
+   /*
+* Abbreviation is possible here only for by-reference types.
Note that a
+* pass-by-value representation for abbreviated values is forbidden, but
+* that's a distinct, generic restriction imposed by the SortSupport
+* contract.

I think that you have not written what you meant to write here.  I
think what you mean is "Note that a pass-by-REFERENCE representation
for abbreviated values is forbidden...".

+   /*
+* If we produced only one initial run (quite likely if the total data
+* volume is between 1X and 2X workMem), we can just use that
tape as the
+* finished output, rather than doing a useless merge.  (This obvious
+* optimization is not in Knuth's algorithm.)
+*/
+   if (state->currentRun == 1)
+   {
+   state->result_tape = state->tp_tapenum[state->destTape];
+   /* must freeze and rewind the finished output tape */
+   LogicalTapeFreeze(state->tapeset, state->result_tape);
+   state->status = TSS_SORTEDONTAPE;
+   return;
+   }

I don't understand the point of moving this code.  If there's some
reason to do this after rewinding the tapes rather than beforehand, I
think we should articulate that reason in the comment block.

The last hunk in your 0001 patch properly belongs in 0002.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Bug in TupleQueueReaderNext() ?

2015-12-18 Thread Robert Haas
On Wed, Dec 16, 2015 at 1:09 AM, Rushabh Lathia
 wrote:
> TupleQueueReaderNext() always pass true for the nowait into
> shm_mq_receive() call. I think here it need to pass the nowait
> which is passed by the caller of TupleQueueReaderNext.
>
> This is usefull if the caller want TupleQueueReaderNext() to wait
> until it gets the tuple from the particular queue.

Boy, that's an embarrassing mistake.  *blushes*

Thanks for the report and fix.  Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Schedule plans for upcoming Postgres releases

2015-12-18 Thread Tom Lane
Just so everyone's on the same page: the release team is currently
assuming that we'll release 9.5.0 the first week of January (ie
wrap on Monday 4 Jan for public announcement Thursday 7 Jan).
And we're thinking that we'll do a round of back-branch updates,
including 9.5.1, the second week of February (wrap on Feb 8).

Discovery of dire bugs might cause these dates to move, but that's
the plan at the moment.

regards, tom lane




Re: [HACKERS] Remove array_nulls?

2015-12-18 Thread Tom Lane
"David G. Johnston"  writes:
> On Fri, Dec 18, 2015 at 10:25 AM, Tom Lane  wrote:
>> Maybe I shouldn't put words in Andres' mouth, but I don't think that by
>> "indefinitely" he meant "forever".  I read that more as "until some
>> positive reason to remove it arrives".  I could imagine that at some point
>> we decide to do a wholesale cleanup of backwards-compatibility GUCs, and
>> then we'd zap this one along with others.

> Hand-waving from me but I see a "positive reason" being that someone wants
> to write and commit a patch that does not play nicely with the old
> behavior.

Sure, that's also possible.  But no such patch is on the table now.

regards, tom lane




Re: [HACKERS] A typo in syncrep.c

2015-12-18 Thread Robert Haas
On Wed, Dec 16, 2015 at 3:33 AM, Kyotaro HORIGUCHI
 wrote:
> Hello, I think I found a typo in a comment of syncrep.c.
>
>>   * acknowledge the commit nor raise ERROR or FATAL.  The latter would
>> - * lead the client to believe that that the transaction aborted, which
>>   * is not true: it's already committed locally. The former is no good
>
> The 'that' looks duplicate.

Agreed.

> And it might be better to put a
> be-verb before the 'aborted'.
>
>> + * lead the client to believe that the transaction is aborted, which

No, that's correct the way it is.  What you're proposing wouldn't
exactly be wrong, but it's a little less clear and direct.

Committed the part of your patch that removes the extra "that".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Patch: ResourceOwner optimization for tables with many partitions

2015-12-18 Thread Robert Haas
On Mon, Dec 14, 2015 at 6:47 AM, Aleksander Alekseev
 wrote:
> Here is my fix for item 4.

I don't know, I'm still not very comfortable with this.  And Tom
didn't like dictating that hash_any() must be no-fail, though I'm not
sure why.

Let's wait to see what others think.  I kind of hope there's a way of
getting the benefits we want here without so much code churn.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Using quicksort for every external sort run

2015-12-18 Thread Robert Haas
On Sat, Dec 12, 2015 at 5:28 PM, Peter Geoghegan  wrote:
> On Sat, Dec 12, 2015 at 12:10 AM, Jeff Janes  wrote:
>> I have a question about the terminology used in this patch.  What is a
>> tuple proper?  What is it in contradistinction to?  I would think that
>> a tuple which is located in its own palloc'ed space is the "proper"
>> one, leaving a tuple allocated in the bulk memory pool to be
>> called...something else.  I don't know what the
>> non-judgmental-sounding antonym of postpositive "proper" is.
>
> "Tuple proper" is a term that appears 5 times in tuplesort.c today. As
> it says at the top of that file:
>
> /*
>  * The objects we actually sort are SortTuple structs.  These contain
>  * a pointer to the tuple proper (might be a MinimalTuple or IndexTuple),
>  * which is a separate palloc chunk --- we assume it is just one chunk and
>  * can be freed by a simple pfree().  SortTuples also contain the tuple's
>  * first key column in Datum/nullflag format, and an index integer.

I see only three.  In each case, "the tuple proper" could be replaced
by "the tuple itself" or "the actual tuple" without changing the
meaning, at least according to my understanding of the meaning.  If
that's causing confusion, perhaps we should just change the existing
wording.

Anyway, I agree with Jeff that this terminology shouldn't creep into
function and structure member names.

I don't really like the term "memory pool" either.  We're growing a
bunch of little special-purpose allocators all over the code base
because of palloc's somewhat dubious performance and memory usage
characteristics, but if any of those are referred to as memory pools
it has thus far escaped my notice.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Additional LWLOCK_STATS statistics

2015-12-18 Thread Robert Haas
On Wed, Dec 16, 2015 at 5:02 AM, Jesper Pedersen
 wrote:
> On 09/16/2015 12:44 PM, Jesper Pedersen wrote:
>>
>> So, I think there is some value in keeping this information separate.
>>
>
> Just a rebased patch after the excellent LWLockTranche work.
>
> And a new sample report with -c/-j 200 -M prepared.

Is this just for informational purposes, or is this something you are
looking to have committed?  I originally thought the former, but now
I'm wondering if I misinterpreted your intent.  I have a hard time
getting excited about committing something that would, unless I'm
missing something, pretty drastically increase the overhead of running
with LWLOCK_STATS...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] extend pgbench expressions with functions

2015-12-18 Thread Robert Haas
On Wed, Dec 16, 2015 at 12:54 AM, Michael Paquier
 wrote:
> On Wed, Dec 16, 2015 at 6:10 AM, Robert Haas  wrote:
>> On Mon, Dec 14, 2015 at 7:25 AM, Michael Paquier  
>> wrote:
>>> I have looked for now at the first patch and finished with the
>>> attached while looking at it. Perhaps a committer could look already
>>> at that?
>>
>> It looks fine to me except that I think we should spell out "param" as
>> "parameter" throughout, instead of abbreviating.
>
> Fine for me. I have updated the first patch as attached (still looking
> at the second).

Committed.  I was on the fence about whether to back-patch this
considering how close we are to release, but ended up deciding to go
for it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Parallel Aggregate

2015-12-18 Thread Robert Haas
On Wed, Dec 16, 2015 at 5:59 AM, David Rowley
 wrote:
> One thing I noticed is that you're only enabling Parallel aggregation when
> there's already a Gather node in the plan. Perhaps this is fine for a proof
> of concept, but I'm wondering how we can move forward from this to something
> that can be committed.

As far as that goes, I think the infrastructure introduced by the
parallel join patch will be quite helpful here.  That introduces the
concept of a "partial path" - that is, a path that needs a Gather node
in order to be completed.  And that's exactly what you need here:
after join planning, if there's a partial path available for the final
rel, then you can consider
FinalizeAggregate->Gather->PartialAggregate->[the best partial path].
Of course, whether a partial path is available or not, you can
consider Aggregate->[the best regular old path].

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Patch: ResourceOwner optimization for tables with many partitions

2015-12-18 Thread Tom Lane
Robert Haas  writes:
> I don't know, I'm still not very comfortable with this.  And Tom
> didn't like dictating that hash_any() must be no-fail, though I'm not
> sure why.

What I definitely didn't like was assuming at a distance that it would
be no-fail.  If we're to depend on that, the patch had better attach
a comment saying so to the header comments of the function(s) it's
assuming that about.  Otherwise, somebody could hack up hashfunc.c
in a way that breaks the assumption, without any clue that some code
in a very-far-away module is critically reliant on it.

> Let's wait to see what others think.

A few observations:

* This bit is too cute by half, if not three-quarters:

+   uint32  itemsizelg:2;   /* sizeof one item log 2 */
+   uint32  capacity:30;/* capacity of array */

Is there a good reason to assume that the only things we'll ever store
in these arrays are of size no more than 8 bytes?  Are we so desperate
to save space that we cannot spare two separate words for itemsize and
capacity?  (ISTM it's a good bet that the extra code for accessing these
bitfields occupies more space than would be saved, considering how few
ResourceOwners typically exist at one time.)  Let's just make it a couple
of ints and be done.  Actually, maybe nitems and capacity should be
size_t, just in case.

* An alternative design would be to forget itemsizelg altogether and insist
that everything stored in the resource arrays be a Datum, which could then
be coerced to/from some form of integer or some form of pointer as
appropriate.  That would waste some space in the int case, but it would
considerably simplify both the ResourceArray code and the APIs to it,
which might be worth the price of assuming we'll never store anything
bigger than 8 bytes.  It also would make this look more like some
existing APIs such as the on_exit callbacks.

* A lot of the code churn comes from the insistence on defining callbacks,
which I'm dubious that we need.  We could instead have a function that is
"get any convenient one of the array elements" and revise the loops in
ResourceOwnerReleaseInternal to be like

while ((item = getconvenientitem(resourcearray)))
{
drop item in exactly the same way as before
}

I find that preferable to the proposed ResourceArrayRemoveAll

+   while (resarr->nitems > 0)
+   {
+   releasecb(resarr->itemsarr, isCommit);
+   }

which certainly looks like it's an infinite loop; it's assuming (again
with no documentation) that the callback function will cause the array
to get smaller somehow.  With the existing coding, it's much more clear
why we think the loops will terminate.

* The reason that ResourceOwnerReleaseInternal was not horribly
inefficient was that its notion of "any convenient one" of the items
to be deleted next was in fact the one that the corresponding Forget
function would examine first, thus avoiding an O(N^2) cost to
re-identify the item to be dropped.  I think we should make an effort
to be more explicit about that connection in any rewrite.  In particular,
it looks to me like when a hash array is in use, things will get slower
not faster because we'll be adding a hash lookup step to each forget
operation.  Maybe we should consider adjusting the APIs so that that
can be avoided.  Or possibly we could have internal state in the
ResourceArrays that says "we expect this item to be dropped in a moment,
check that before going to the trouble of a hash lookup".

* Actually, I'm not convinced that the proposed reimplementation of
ResourceArrayRemove isn't horribly slow much of the time.  It sure
looks like it could degrade to a linear search very easily.

* I still say that the assumption embodied as RESOURCE_ARRAY_ZERO_ELEMENT
(ie that no valid entry is all-zero-bits) is pretty unacceptable.  It
might work for pointers, but I don't like it for resources represented
by integer indexes.

regards, tom lane




Re: [HACKERS] A Typo in regress/sql/privileges.sql

2015-12-18 Thread Robert Haas
On Wed, Dec 16, 2015 at 11:51 PM, Tatsuro Yamada
 wrote:
> I found typos in privileges.sql and privileges.out
> Please find attached a patch.

Thanks, good catch.  But even aside from this particular issue, isn't
that comment in need of a little more love?  An inference means a
deduction, or something you can figure out from something else.  ON
CONFLICT (four) is not an inference.  Why don't we say:

-- Check that the columns in the ON CONFLICT clause require select privileges

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Using quicksort for every external sort run

2015-12-18 Thread Peter Geoghegan
On Fri, Dec 18, 2015 at 10:12 AM, Robert Haas  wrote:
> Anyway, I agree with Jeff that this terminology shouldn't creep into
> function and structure member names.

Okay.

> I don't really like the term "memory pool" either.  We're growing a
> bunch of little special-purpose allocators all over the code base
> because of palloc's somewhat dubious performance and memory usage
> characteristics, but if any of those are referred to as memory pools
> it has thus far escaped my notice.

It's a widely accepted term: https://en.wikipedia.org/wiki/Memory_pool

But, sure, I'm not attached to it.

-- 
Peter Geoghegan




Re: [HACKERS] A Typo in regress/sql/privileges.sql

2015-12-18 Thread Andres Freund
On 2015-12-18 13:50:34 -0500, Robert Haas wrote:
> On Wed, Dec 16, 2015 at 11:51 PM, Tatsuro Yamada
>  wrote:
> > I found typos in privileges.sql and privileges.out
> > Please find attached a patch.
> 
> Thanks, good catch.  But even aside from this particular issue, isn't
> that comment in need of a little more love?  An inference means a
> deduction, or something you can figure out from something else.  ON
> CONFLICT (four) is not an inference.

It's the index(es) that are inferred, from the ON(columns) and the ON
CONFLICT's WHERE clause. If we want to get rid of that terminology we'd
need to start elsewhere, and it'd be a bigger patch.

Andres




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-12-18 Thread Peter Geoghegan
On Fri, Dec 18, 2015 at 8:58 AM, Robert Haas  wrote:
> No, it's far too late to be pushing this into 9.5.  We are at RC1 now
> and hoping to cut a final release right after Christmas.  I think it's
> quite wrong to argue that these changes have no risk of destabilizing
> 9.5.  Nobody is exempt from having bugs in their code - not me, not
> you, not Tom Lane.  But quite apart from that, there seems to be no
> compelling benefit to having these changes in 9.5.  You say that the
> branches will diverge needlessly, but the whole point of having
> branches is that we do need things to diverge.  The question isn't
> "why shouldn't these go into 9.5?" but "do these fix something that is
> clearly broken in 9.5 and must be fixed to avoid hurting users?".
> Andres has said clearly that he doesn't think so, and Heikki didn't
> seem convinced that we wanted the changes at all.

It isn't true that Heikki was not basically in favor of this. This
should have been committed as part of the original patch, really.

I hope to avoid needless confusion about the documented (by the
official documentation) AM interface. Yes, that is

> I think it's a shame that we haven't gotten this patch dealt with just
> because when somebody submits a patch in June, it's not very nice for
> it to still be pending in December, but since this stuff is even
> further outside my area of expertise than the sorting stuff, and since
> me and my split personalities only have so many hours in the day, I'm
> going to have to leave it to somebody else to pick up anyhow.  But
> that's a separate issue from whether this should be back-patched.

Note that I've already proposed a compromise, even though I don't
think my original position was at all unreasonable. There'd be zero
real changes (only the addition of the new constant name,
documentation updates, comment updates, etc) under that compromise (as
against one change).

-- 
Peter Geoghegan




Re: [HACKERS] Re: Reusing abbreviated keys during second pass of ordered [set] aggregates

2015-12-18 Thread Peter Geoghegan
On Fri, Dec 18, 2015 at 9:35 AM, Robert Haas  wrote:
>> Attached revision updates both the main commit (the optimization), and
>> the backpatch commit (updated the contract).
>
> -   /* abbreviation is possible here only for by-reference types */
> +   /*
> +* Abbreviation is possible here only for by-reference types.
> Note that a
> +* pass-by-value representation for abbreviated values is forbidden, 
> but
> +* that's a distinct, generic restriction imposed by the SortSupport
> +* contract.
>
> I think that you have not written what you meant to write here.  I
> think what you mean is "Note that a pass-by-REFERENCE representation
> for abbreviated values is forbidden...".

You're right. Sorry about that.

> +   /*
> +* If we produced only one initial run (quite likely if the total data
> +* volume is between 1X and 2X workMem), we can just use that
> tape as the
> +* finished output, rather than doing a useless merge.  (This obvious
> +* optimization is not in Knuth's algorithm.)
> +*/
> +   if (state->currentRun == 1)
> +   {
> +   state->result_tape = state->tp_tapenum[state->destTape];
> +   /* must freeze and rewind the finished output tape */
> +   LogicalTapeFreeze(state->tapeset, state->result_tape);
> +   state->status = TSS_SORTEDONTAPE;
> +   return;
> +   }
>
> I don't understand the point of moving this code.  If there's some
> reason to do this after rewinding the tapes rather than beforehand, I
> think we should articulate that reason in the comment block.

I thought that was made clear by the 0001 commit message. Think of
what happens when we don't disable abbreviated keys in the final
TSS_SORTEDONTAPE phase should the "if (state->currentRun == 1)" path
have been taken (but *not* the path that also ends in
TSS_SORTEDONTAPE, when caller requires randomAccess but we spill to
tape, or any other case). What happens is: The code in 0002 gets
confused, and attempts to pass back a pointer value as an "abbreviated
key". That's a bug.

> The last hunk in your 0001 patch properly belongs in 0002.

You could certainly argue that the last hunk of 0001 belongs in 0002.
I only moved it to 0001 when I realized that we might as well keep the
branches in sync, since the ordering is insignificant from a 9.5
perspective (although it might still be tidier), and there is a need
to backpatch anyway. I'm not insisting on doing it that way.

-- 
Peter Geoghegan




Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2015-12-18 Thread Artur Zakirov

On 18.12.2015 22:43, Artur Zakirov wrote:

Hello.

PostgreSQL has a contrib module named pg_trgm. It is used to the fuzzy 
text search. It provides some functions and operators for determining 
the similarity of the given texts using trigram matching.



Sorry, I have forgotten to mark previous message with [PROPOSAL].
I registered the patch in commitfest:
https://commitfest.postgresql.org/8/448/

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company





Re: [HACKERS] Using quicksort for every external sort run

2015-12-18 Thread Peter Geoghegan
On Fri, Dec 18, 2015 at 10:12 AM, Robert Haas  wrote:
> I don't really like the term "memory pool" either.  We're growing a
> bunch of little special-purpose allocators all over the code base
> because of palloc's somewhat dubious performance and memory usage
> characteristics, but if any of those are referred to as memory pools
> it has thus far escaped my notice.

BTW, I'm not necessarily determined to make the new special-purpose
allocator work exactly as proposed. It seemed useful to prioritize
simplicity, so currently there is one big "huge palloc()" with
which we blow our memory budget, and that's it. However, I could
probably be more clever about "freeing ranges" initially preserved for
a now-exhausted tape. That kind of thing.

With the on-the-fly merge memory patch, I'm improving locality of
access (for each "tuple proper"/"tuple itself"). If I also happen to
improve the situation around palloc() fragmentation at the same time,
then so much the better, but that's clearly secondary.

-- 
Peter Geoghegan




Re: [HACKERS] [patch] Proposal for \rotate in psql

2015-12-18 Thread Daniel Verite
Pavel Stehule wrote:

> The symbol 'X' in two column mode should be centred - now it is aligned to
> left, what is not nice

Currently print.c does not support centered alignment, only left and right.
Should we add it, it would have to work for all output formats
(except obviously for "unaligned"):
- aligned
- wrapped
- html
- latex
- latex-longtable
- troff-ms
- asciidoc

Because of this, I believe that adding support for a 'c' alignment
might be a significant patch by itself, and that it should be considered
separately.

I agree that if it existed, the crosstabview command should use it
as you mention, but I'm not volunteering to implement it myself, at
least not in the short term.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite




Re: [HACKERS] Making tab-complete.c easier to maintain

2015-12-18 Thread Tom Lane
Michael Paquier  writes:
> OK, I am marking that as ready for committer. Let's see what happens next.

I'll pick this up, as penance for not having done much in this commitfest.
I think it's important to get it pushed quickly so that Thomas doesn't
have to keep tracking unrelated changes in tab-complete.c.

regards, tom lane




Re: [HACKERS] Making tab-complete.c easier to maintain

2015-12-18 Thread Tom Lane
Thomas Munro  writes:
> [ tab-complete-macrology-v11.patch.gz ]

A couple of stylistic reactions after looking through the patch for the
first time in a long time:

1. It seems inconsistent that all the new macros are named in CamelCase
style, whereas there is still plenty of usage of the existing macros like
COMPLETE_WITH_LIST.  It looks pretty jarring IMO.  I think we should
either rename the new macros back to all-upper-case style, or rename the
existing macros in CamelCase style.

I slightly favor the latter option; we're already pretty much breaking any
hope of tab-complete fixes applying backwards over this patch, so changing
the code even more doesn't seem like a problem.  Either way, it's a quick
search-and-replace.  Thoughts?

2. Why does MatchAnyExcept use "'" as the inversion flag, rather than
say "!" or "~" ?  Seems pretty random.

regards, tom lane




Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-12-18 Thread Robert Haas
On Thu, Dec 17, 2015 at 3:32 AM, Ashutosh Bapat
 wrote:
> On Wed, Dec 9, 2015 at 12:14 AM, Robert Haas  wrote:
>> On Wed, Dec 2, 2015 at 6:45 AM, Rushabh Lathia 
>> wrote:
>> > Thanks Ashutosh.
>> >
>> > Re-reviewed and Re-verified the patch, pg_sort_all_pd_v5.patch
>> > looks good to me.
>>
>> This patch needs a rebase.
>
> Done.

Thanks.

>> It's not going to work to say this is a patch proposed for commit when
>> it's still got a TODO comment in it that obviously needs to be
>> changed.   And the formatting of that long comment is pretty weird,
>> too, and not consistent with other functions in that same file (e.g.
>> get_remote_estimate, ec_member_matches_foreign, create_cursor).
>>
>
> The TODO was present in v4 but not in v5 and is not present in v6 attached
> here.. Formatted comment according estimate_path_cost_size(),
> convert_prep_stmt_params().

Hrm, I must have been looking at the wrong version somehow.  Sorry about that.

>> Aside from that, I think before we commit this, somebody should do
>> some testing that demonstrates that this is actually a good idea.  Not
>> as part of the test case set for this patch, but just in general.
>> Merge joins are typically going to be relevant for large tables, but
>> the examples in the regression tests are necessarily tiny.  I'd like
>> to see some sample data and some sample queries that get appreciably
>> faster with this code.  If we can't find any, we don't need the code.
>>
>
> I tested the patch on my laptop with two types of queries, a join between
> two foreign tables on different foreign servers (pointing to the same self
> server) and a join between one foreign and one local table. The foreign
> tables and servers are created using sort_pd_setup.sql attached. Foreign
> tables pointed to table with index useful for join clause. Both the joining
> tables had 10M rows. The execution time of query was measured for 100 runs
> and average and standard deviation were calculated (using function
> query_execution_stats() in script sort_pd.sql) and are presented below.

OK, cool.

I went over this patch in some detail today and did a lot of cosmetic
cleanup.  The results are attached.  I'm fairly happy with this
version, but let me know what you think.  Of course, feedback from
others is more than welcome also.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 866a09b..f10752d 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -343,6 +343,76 @@ SELECT 'fixed', NULL FROM ft1 t1 WHERE c1 = 1;
  fixed| 
 (1 row)
 
+-- Test forcing the remote server to produce sorted data for a merge join.
+SET enable_hashjoin TO false;
+SET enable_nestloop TO false;
+-- inner join; expressions in the clauses appear in the equivalence class list
+EXPLAIN (VERBOSE, COSTS false)
+	SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") OFFSET 100 LIMIT 10;
+ QUERY PLAN 
+
+ Limit
+   Output: t1.c1, t2."C 1"
+   ->  Merge Join
+ Output: t1.c1, t2."C 1"
+ Merge Cond: (t1.c1 = t2."C 1")
+ ->  Foreign Scan on public.ft2 t1
+   Output: t1.c1
+   Remote SQL: SELECT "C 1" FROM "S 1"."T 1" ORDER BY "C 1" ASC
+ ->  Index Only Scan using t1_pkey on "S 1"."T 1" t2
+   Output: t2."C 1"
+(10 rows)
+
+SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") OFFSET 100 LIMIT 10;
+ c1  | C 1 
+-+-
+ 101 | 101
+ 102 | 102
+ 103 | 103
+ 104 | 104
+ 105 | 105
+ 106 | 106
+ 107 | 107
+ 108 | 108
+ 109 | 109
+ 110 | 110
+(10 rows)
+
+-- outer join; expressions in the clauses do not appear in equivalence class
+-- list but no output change as compared to the previous query
+EXPLAIN (VERBOSE, COSTS false)
+	SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") OFFSET 100 LIMIT 10;
+ QUERY PLAN 
+
+ Limit
+   Output: t1.c1, t2."C 1"
+   ->  Merge Left Join
+ Output: t1.c1, t2."C 1"
+ Merge Cond: (t1.c1 = t2."C 1")
+ ->  Foreign Scan on public.ft2 t1
+   Output: t1.c1
+   Remote SQL: SELECT "C 1" FROM "S 1"."T 1" ORDER BY "C 1" ASC
+ ->  Index Only Scan using t1_pkey on "S 1"."T 1" t2
+   Output: t2."C 1"
+(10 rows)
+
+SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") OFFSET 100 LIMIT 10;
+ c1  | C 1 
+-+-
+ 101 | 101
+ 102 | 102
+ 103 | 103
+ 104 | 104
+ 105 | 105
+ 106 | 106
+ 107 | 107
+ 108 | 108
+ 109 | 109
+ 110 | 110
+(10 rows)
+
+RESET enable_hashj

Re: [HACKERS] Using quicksort for every external sort run

2015-12-18 Thread Robert Haas
On Fri, Dec 18, 2015 at 2:57 PM, Peter Geoghegan  wrote:
> On Fri, Dec 18, 2015 at 10:12 AM, Robert Haas  wrote:
>> I don't really like the term "memory pool" either.  We're growing a
>> bunch of little special-purpose allocators all over the code base
>> because of palloc's somewhat dubious performance and memory usage
>> characteristics, but if any of those are referred to as memory pools
>> it has thus far escaped my notice.
>
> BTW, I'm not necessarily determined to make the new special-purpose
> allocator work exactly as proposed. It seemed useful to prioritize
> simplicity, and currently so there is one big "huge palloc()" with
> which we blow our memory budget, and that's it. However, I could
> probably be more clever about "freeing ranges" initially preserved for
> a now-exhausted tape. That kind of thing.

What about the case where we think that there will be a lot of data
and have a lot of work_mem available, but then the user sends us 4
rows because of some mis-estimation?

> With the on-the-fly merge memory patch, I'm improving locality of
> access (for each "tuple proper"/"tuple itself"). If I also happen to
> improve the situation around palloc() fragmentation at the same time,
> then so much the better, but that's clearly secondary.

I don't really understand this comment.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] A Typo in regress/sql/privileges.sql

2015-12-18 Thread Robert Haas
On Fri, Dec 18, 2015 at 1:57 PM, Andres Freund  wrote:
> On 2015-12-18 13:50:34 -0500, Robert Haas wrote:
>> On Wed, Dec 16, 2015 at 11:51 PM, Tatsuro Yamada
>>  wrote:
>> > I found typos in privileges.sql and privileges.out
>> > Please find attached a patch.
>>
>> Thanks, good catch.  But even aside from this particular issue, isn't
>> that comment in need of a little more love?  An inference means a
>> deduction, or something you can figure out from something else.  ON
>> CONFLICT (four) is not an inference.
>
> It's the index(es) that are inferred, from the ON(columns) and the ON
> CONFLICT's WHERE clause. If we want to get rid of that terminology we'd
> need to start elsewhere, and it'd be a bigger patch.

It might be an inference specification, but there is no way that it is
an inference.  If we use that terminology in other places, it's wrong
there, too.

Mind you, I don't think "inference specification" is very good
terminology, but what's there right now is just wrong.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-12-18 Thread Robert Haas
On Fri, Dec 18, 2015 at 2:04 PM, Peter Geoghegan  wrote:
> It isn't true that Heikki was not basically in favor of this. This
> should have been committed as part of the original patch, really.

Maybe he wasn't against the whole thing, but he's posted two messages
to this thread and they can't be read as unequivocally in favor of
these changes.  He clearly didn't like at least some of it.

> I hope to avoid needless confusion about the documented (by the
> official documentation) AM interface. Yes, that is

Something maybe got cut off here?

>> I think it's a shame that we haven't gotten this patch dealt with just
>> because when somebody submits a patch in June, it's not very nice for
>> it to still be pending in December, but since this stuff is even
>> further outside my area of expertise than the sorting stuff, and since
>> me and my split personalities only have so many hours in the day, I'm
>> going to have to leave it to somebody else to pick up anyhow.  But
>> that's a separate issue from whether this should be back-patched.
>
> Note that I've already proposed a compromise, even though I don't
> think my original position was at all unreasonable. There'd be zero
> real changes (only the addition of the new constant name,
> documentation updates, comment updates, etc) under that compromise (as
> against one change).

I only see one patch version on the thread.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Using quicksort for every external sort run

2015-12-18 Thread Peter Geoghegan
On Fri, Dec 18, 2015 at 12:50 PM, Robert Haas  wrote:
>> BTW, I'm not necessarily determined to make the new special-purpose
>> allocator work exactly as proposed. It seemed useful to prioritize
>> simplicity, so currently there is one big "huge palloc()" with
>> which we blow our memory budget, and that's it. However, I could
>> probably be more clever about "freeing ranges" initially preserved for
>> a now-exhausted tape. That kind of thing.
>
> What about the case where we think that there will be a lot of data
> and have a lot of work_mem available, but then the user sends us 4
> rows because of some mis-estimation?

The memory patch only changes the final on-the-fly merge phase. There
is no estimate involved there.

I continue to use whatever "slots" (memtuples) are available for the
final on-the-fly merge. However, I allocate all remaining memory that
I have budget for at once. My remarks about the efficient use of that
memory was only really about each tape's use of their part of that
over time.

Again, to emphasize, this is only for the final on-the-fly merge phase.
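A rough sketch of that layout: the whole remaining budget allocated once,
divided evenly into per-tape ranges, with each tape bump-allocating
sequentially from its own range. Illustrative only; the names and structure
here are mine, not the patch's:

```c
#include <stdlib.h>

/* Divide one big buffer evenly among ntapes; each tape then bump-
 * allocates from its own slice, giving sequential access per tape. */
typedef struct TapeSpace
{
    char   *start;  /* this tape's slice of the big buffer */
    size_t  used;   /* bytes handed out so far */
    size_t  avail;  /* slice size */
} TapeSpace;

static TapeSpace *
setup_merge_memory(char *big_buffer, size_t budget, int ntapes)
{
    TapeSpace  *tapes = malloc(ntapes * sizeof(TapeSpace));
    size_t      per_tape = budget / ntapes;

    for (int i = 0; i < ntapes; i++)
    {
        tapes[i].start = big_buffer + (size_t) i * per_tape;
        tapes[i].used = 0;
        tapes[i].avail = per_tape;
    }
    return tapes;
}

/* Bump allocation within one tape's range; returns NULL when the range
 * is exhausted (a real implementation would need some fallback). */
static void *
tape_alloc(TapeSpace *t, size_t len)
{
    void   *result;

    if (t->used + len > t->avail)
        return NULL;
    result = t->start + t->used;
    t->used += len;
    return result;
}
```

Because each tape reads its run back in order, consecutive tuples from the
same tape land next to each other in its slice, which is where the locality
win comes from.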

>> With the on-the-fly merge memory patch, I'm improving locality of
>> access (for each "tuple proper"/"tuple itself"). If I also happen to
>> improve the situation around palloc() fragmentation at the same time,
>> then so much the better, but that's clearly secondary.
>
> I don't really understand this comment.

I just mean that I wrote the memory patch with memory locality in
mind, not palloc() fragmentation or other overhead.

-- 
Peter Geoghegan




[HACKERS] [sqlsmith] Failing assertions in spgtextproc.c

2015-12-18 Thread Andreas Seltenreich
I do see two assertions in spgtextproc.c fail on occasion when testing
with sqlsmith:

TRAP: FailedAssertion([...], File: "spgtextproc.c", Line: 424)
TRAP: FailedAssertion([...], File: "spgtextproc.c", Line: 564)

I can't reproduce it reliably but looking at the coredumps, the failing
part of the expression is always

in->level == 0 && DatumGetPointer(in->reconstructedValue) == NULL

In all of the dumps I looked at, in->reconstructedValue contains a
zero-length text instead of the asserted NULL, and the tuples fed to
leaf_consistent()/inner_consistent() look like the one below.

,----
| (gdb) p *in
| $1 = {scankeys = 0x60a3ee0, nkeys = 1, reconstructedValue = 101373680,
|   level = 0, returnData = 1 '\001', allTheSame = 1 '\001',
|   hasPrefix = 0 '\000', prefixDatum = 0, nNodes = 8,
|   nodeLabels = 0x37b6768}
| (gdb) x ((text *)in->reconstructedValue)->vl_len_
| 0x60ad6f0:    0x0010
| (gdb) p *(text *)in->scankeys[0]->sk_argument
| $2 = {vl_len_ = "0\000\000", vl_dat = 0x855950c "sqlsmith~", '\177' , "\020 "}
| (gdb) p in->nodeLabels[0]
| $3 = 65535
`----

Maybe these assertions are just too strict?  I don't see the code
misbehaving when relaxing them to

reconstrValue != NULL && VARSIZE_ANY_EXHDR(reconstrValue) == in->level
  || in->level == 0
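Spelled out as standalone predicates (with plain C stand-ins for the
reconstructed datum and the VARSIZE_ANY_EXHDR() macro — the names here are
mine, for illustration), the strict and relaxed conditions compare like
this:

```c
#include <stdbool.h>
#include <stddef.h>

/* Current assertion: at level 0 the reconstructed value must be absent
 * entirely; otherwise its payload length must equal the level. */
static bool
reconstruction_ok_strict(const char *value, size_t exhdr_len, int level)
{
    return (level == 0) ? (value == NULL)
                        : (value != NULL && exhdr_len == (size_t) level);
}

/* Proposed relaxation: a zero-length but non-NULL value is also
 * acceptable at level 0 (the case the coredumps show). */
static bool
reconstruction_ok_relaxed(const char *value, size_t exhdr_len, int level)
{
    return (value != NULL && exhdr_len == (size_t) level) ||
           level == 0;
}
```

The zero-length text from the dumps satisfies the relaxed form but trips
the strict one, which matches the observed TRAPs.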

regards,
Andreas




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-12-18 Thread Peter Geoghegan
On Fri, Dec 18, 2015 at 12:55 PM, Robert Haas  wrote:
> On Fri, Dec 18, 2015 at 2:04 PM, Peter Geoghegan  wrote:
>> It isn't true that Heikki was not basically in favor of this. This
>> should have been committed as part of the original patch, really.
>
> Maybe he wasn't against the whole thing, but he's posted two messages
> to this thread and they can't be read as unequivocally in favor of
> these changes.  He clearly didn't like at least some of it.

The issues were very trivial.

> I only see one patch version on the thread.

I'm not going to post a revision until I thrash out the tiny issues
with Heikki. He kind of trailed off. So maybe that kills it
immediately, which is a shame.

-- 
Peter Geoghegan




Re: [HACKERS] extend pgbench expressions with functions

2015-12-18 Thread Fabien COELHO


Hello Michael,

It was definitely useful to debug the double/int type stuff within 
expressions when writing a non trivial pgbench script. It is probably less 
interesting if there are only integers.


After looking again at the code, I remembered why doubles are useful: they 
are needed for random exponential & gaussian because the last parameter is 
a double.


I do not care about the sqrt, but doubles must be allowed to keep those, and 
the randoms are definitely useful for a pgbench script. Now the patch may 
just keep double constants, but that would look awkward, and the doc would 
have to explain why 1.3 and 1+2 are okay, but not 1.3 + 2.4.


So I'm less keen on removing double expressions, because it removes a key 
feature. If it is a blocker I'll go for just the constant, but this looks 
to me like a stupid compromise.


--
Fabien.




Re: [HACKERS] pg_tables bug?

2015-12-18 Thread Gaetano Mendola
From the documentation about "CREATE DATABASE name WITH TABLESPACE =
tablespace_name":

tablespace_name
The name of the tablespace that will be associated with the new database,
or DEFAULT to
use the template database's tablespace. This tablespace will be the default
tablespace used
for objects created in this database. See CREATE TABLESPACE for more
information.

I'm sure that my tables are created in the tablespace, but they are not
reported in pg_tables, in pg_dump, or by \d.

Look as this:

kalman@kalman-VirtualBox:~$ mkdir tablespace_XXX
kalman@kalman-VirtualBox:~$ sudo chown postgres.postgres tablespace_XXX
kalman@kalman-VirtualBox:~$ psql template1
psql (9.4.5)
Type "help" for help.

template1=# create tablespace XXX LOCATION '/home/kalman/tablespace_XXX';
CREATE TABLESPACE
template1=# create database db_test with tablespace = XXX;
CREATE DATABASE
template1=# \q

kalman@kalman-VirtualBox:~$ psql db_test
psql (9.4.5)
Type "help" for help.

db_test=# create table t_test ( a integer, b numeric);
CREATE TABLE
db_test=# \d+ t_test
Table "public.t_test"
 Column |  Type   | Modifiers | Storage | Stats target | Description
+-+---+-+--+-
 a  | integer |   | plain   |  |
 b  | numeric |   | main|  |

db_test=# select * from pg_tables where tablename = 't_test';
 schemaname | tablename | tableowner | tablespace | hasindexes | hasrules | hastriggers
------------+-----------+------------+------------+------------+----------+-------------
 public     | t_test    | kalman     |            | f          | f        | f
(1 row)

db_test=# select oid from pg_database where datname = 'db_test';
  oid  
-------
 80335
(1 row)

db_test=# select relfilenode from pg_class where relname = 't_test';
 relfilenode
-------------
       80336
(1 row)

Unfortunately contrary to what postgres is showing me the table test is in
/home/kalman/tablespace_:

root@kalman-VirtualBox:~# file
/home/kalman/tablespace_XXX/PG_9.4_201409291/80335/80336
/home/kalman/tablespace_XXX/PG_9.4_201409291/80335/80336: empty

as you can see the CREATE DATABASE documentation is honored but the system
is failing to give me the right tablespace location for that table.


Regards





On Thu, 17 Dec 2015 at 15:36 Tom Lane  wrote:

> Gaetano Mendola  writes:
> > I'm playing around with tablespace (postgresq 9.4) and I found out what I
> > believe is a bug in pg_tables.
> > Basically if you create a database in a table space X and then you
> create a
> > table on the database the table is created correctly on the tablespace X
> (
> > I did a check on the filesystem) however if you do a select on pg_tables
> > the column tablespace for that table is empty and even worst if you dump
> > the DB there is no reporting about the the database or table being on
> that
> > tablespace.
> > Even \d doesn't report that the table is in the tablespace X.
>
> An empty entry in that column means that the table is in the default
> tablespace for the database.  Which it sounds like is what you have
> here.  I think it's operating as designed, though you might quibble
> with the decision that showing default tablespaces explicitly would
> have been clutter.
>
> regards, tom lane
>


Re: [HACKERS] pg_tables bug?

2015-12-18 Thread Gaetano Mendola
On Thu, 17 Dec 2015 at 15:36 Tom Lane  wrote:

> Gaetano Mendola  writes:
> > I'm playing around with tablespace (postgresq 9.4) and I found out what I
> > believe is a bug in pg_tables.
> > Basically if you create a database in a table space X and then you
> create a
> > table on the database the table is created correctly on the tablespace X
> (
> > I did a check on the filesystem) however if you do a select on pg_tables
> > the column tablespace for that table is empty and even worst if you dump
> > the DB there is no reporting about the the database or table being on
> that
> > tablespace.
> > Even \d doesn't report that the table is in the tablespace X.
>
> An empty entry in that column means that the table is in the default
> tablespace for the database.  Which it sounds like is what you have
> here.  I think it's operating as designed, though you might quibble
> with the decision that showing default tablespaces explicitly would
> have been clutter.
>

Now it's clear thank you.


Re: [HACKERS] [sqlsmith] Failing assertions in spgtextproc.c

2015-12-18 Thread Peter Geoghegan
On Fri, Dec 18, 2015 at 1:23 PM, Andreas Seltenreich  wrote:
> I do see two assertions in spgtextproc.c fail on occasion when testing
> with sqlsmith:
>
> TRAP: FailedAssertion([...], File: "spgtextproc.c", Line: 424)
> TRAP: FailedAssertion([...], File: "spgtextproc.c", Line: 564)
>
> I can't reproduce it reliably but looking at the coredumps, the failing
> part of the expression is always
>
> in->level == 0 && DatumGetPointer(in->reconstructedValue) == NULL
>
> In all of the dumps I looked at, in->reconstructedValue contains a
> zero-length text instead of the asserted NULL, and the tuples fed to
> leaf_consistent()/inner_consistent() look like the one below.

Can you do this?:

(gdb) p debug_query_string

It's a global variable, often useful in these situations.

-- 
Peter Geoghegan




Re: [HACKERS] pg_tables bug?

2015-12-18 Thread Andrew Dunstan





On 12/18/2015 05:18 PM, Gaetano Mendola wrote:
From the documentation about "CREATE DATABASE name WITH TABLESPACE = 
tablespace_name":


tablespace_name
The name of the tablespace that will be associated with the new 
database, or DEFAULT to
use the template database's tablespace. This tablespace will be the 
default tablespace used
for objects created in this database. See CREATE TABLESPACE for more 
information.


I'm sure that my tables are created in the tablespace, but they are 
not reported in pg_tables, in pg_dump, or by \d.


1. Please don't top-post on the PostgreSQL lists. See 



2. The system is working as designed and as documented - see the 
comments in the docs on pg_tables. If nothing is shown for the table's 
tablespace then it will be in the default tablespace for the database. 
That's what you're seeing. You appear to be assuming incorrectly that it 
means that the table will be in the system's default tablespace.



cheers

andrew




Re: [HACKERS] Remove array_nulls?

2015-12-18 Thread Jim Nasby

On 12/18/15 11:44 AM, Tom Lane wrote:

"David G. Johnston"  writes:

>On Fri, Dec 18, 2015 at 10:25 AM, Tom Lane  wrote:

>>Maybe I shouldn't put words in Andres' mouth, but I don't think that by
>>"indefinitely" he meant "forever".  I read that more as "until some
>>positive reason to remove it arrives".  I could imagine that at some point
>>we decide to do a wholesale cleanup of backwards-compatibility GUCs, and
>>then we'd zap this one along with others.

>​Hand-waving from me but I see a "positive reason" being that someone wants
>to write and commit a patch that does not play nicely with the old
>behavior.

Sure, that's also possible.  But no such patch is on the table now.


Someone (Tom?) mentioned upthread that if we wanted to do cleanup it 
should be more than just one GUC, and I agree with that, and I'm willing 
to investigate when the current compat GUCs went in and create a patch 
to remove the really old ones. My inclination would be to just do this 
as part of 10.0. (And I agree with Robert's comments about parallel 
being the most likely driver for bumping to 10.0).
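For context, a sketch of the behavior the array_nulls GUC controls, as I
understand it from the documentation (output comments are illustrative):

```sql
-- Default since 8.2: an unquoted NULL in array input is a SQL null.
SET array_nulls = on;
SELECT '{a,NULL,c}'::text[];  -- second element is a SQL NULL

-- Pre-8.2 compatibility mode: NULL is taken as the literal string "NULL".
SET array_nulls = off;
SELECT '{a,NULL,c}'::text[];  -- second element is the word "NULL"
```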

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] [patch] Proposal for \rotate in psql

2015-12-18 Thread Pavel Stehule
2015-12-18 21:21 GMT+01:00 Daniel Verite :

> Pavel Stehule wrote:
>
> > The symbol 'X' in two-column mode should be centered - currently it is
> > aligned to the left, which does not look nice
>
> Currently print.c does not support centered alignment, only left and right.
> Should we add it, it would have to work for all output formats
> (except obviously for "unaligned"):
> - aligned
> - wrapped
> - html
> - latex
> - latex-longtable
> - troff-ms
> - asciidoc
>
> Because of this, I believe that adding support for a 'c' alignment
> might be a significant patch by itself, and that it should be considered
> separately.
>

ok


>
> I agree that if it existed, the crosstabview command should use it
> as you mention, but I'm not volunteering to implement it myself, at
> least not in the short term.
>

I'll look into how much work it is

Regards

Pavel


>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
>


Re: [HACKERS] [sqlsmith] Failing assertions in spgtextproc.c

2015-12-18 Thread Andreas Seltenreich
Peter Geoghegan writes:

> Can you do this?:
>
> (gdb) p debug_query_string

output below.  Since sqlsmith is no longer restricted to read-only
statements, the chances of reproducing it are low :-/.

select
  pg_catalog.pg_stat_get_buf_written_backend() as c0,
  subq_1.c0 as c1,
  subq_1.c0 as c2,
  subq_1.c0 as c3
from
  (select
     (select ordinal_position from information_schema.parameters
      limit 1 offset 12) as c0,
     ref_2.t as c1
   from
     public.radix_text_tbl as ref_2
     inner join pg_catalog.pg_stat_activity as ref_3
       on (ref_2.t = ref_3.application_name)
   where ref_2.t @@ cast(coalesce(ref_2.t, ref_3.client_hostname) as text)
   limit 111) as subq_1,
  lateral (select
     subq_1.c0 as c0,
     subq_2.c2 as c1,
     56 as c2,
     cast(coalesce(cast(coalesce((select pop from public.real_city
                                  limit 1 offset 34),
                                 subq_1.c0) as integer),
                   subq_2.c0) as integer) as c3,
     74 as c4,
     (select unique1 from public.onek2 limit 1 offset 17) as c5
   from
     (select
        (select ordinal_position from information_schema.parameters
         limit 1 offset 27) as c0,
        sample_2.umoptions as c1,
        sample_2.umserver as c2
      from
        pg_catalog.pg_user_mapping as sample_2 tablesample system (6.2)
      where 49 is NULL) as subq_2
   where cast(coalesce(subq_1.c0, subq_2.c0) as integer) is not NULL
   limit 105) as subq_3
where ((select x from public.tt0 limit 1 offset 12) <= subq_3.c0)
  and (subq_3.c4 <= 30);

