Re: MCV lists for highly skewed distributions

2018-03-17 Thread Dean Rasheed
> On 03/11/2018 10:23 AM, Dean Rasheed wrote:
>> I'm moving this back to a status of "Needs review" partly because the
>> code has changed significantly, but also because I want to do more
>> testing, particularly with larger datasets.
>>

John, Tomas,

Thanks for looking at this, and sorry for my delayed reply. The
demands of my day job keep getting in the way. I'm feeling pretty good
about this patch though. There may still be some fine tuning to do,
but it's looking like a definite improvement over HEAD, so I'm
optimistic that I'll be able to finish it soonish.

I'll post back more detailed responses to your specific points shortly.

Regards,
Dean



Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key

2018-03-17 Thread Amit Kapila
On Mon, Mar 12, 2018 at 6:33 PM, amul sul  wrote:
> On Mon, Mar 12, 2018 at 11:45 AM, amul sul  wrote:
>> On Sat, Mar 10, 2018 at 5:25 PM, Amit Kapila  wrote:
>>> complete CTID.
>>
>> Sure, will do that.
>
> I did the aforementioned changes in the attached patch, thanks.
>

--- a/src/include/storage/itemptr.h
+++ b/src/include/storage/itemptr.h
@@ -23,7 +23,9 @@
  * This is a pointer to an item within a disk page of a known file
  * (for example, a cross-link from an index to its parent table).
  * blkid tells us which block, posid tells us which entry in the linp
- * (ItemIdData) array we want.
+ * (ItemIdData) array we want.  blkid is marked InvalidBlockNumber when
+ * a tuple is moved to another partition relation due to an update of
+ * the partition key.

I think instead of updating this description in itemptr.h, you should
update it in htup_details.h, where we already have a description of
t_ctid.  After this patch, the t_ctid column value in the heap_page_items
function will show up as InvalidBlockNumber, and in the documentation
we have given the reference to htup_details.h.  Other than that, the
latest version looks good to me.

I have marked this patch as RFC as this is a small change, hope you
can update the patch soon.
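
To make the convention concrete, here is a minimal standalone sketch of
the check a reader of t_ctid can now make (the types are re-declared
locally so the example compiles on its own; the helper name is made up
for illustration and is not part of the patch):

#include <stdbool.h>
#include <stdint.h>

typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/*
 * A heap reader that finds InvalidBlockNumber in a tuple's t_ctid can
 * conclude that the tuple was moved to another partition by an UPDATE
 * of the partition key, instead of following the ctid chain.
 */
static bool
tuple_moved_to_another_partition(BlockNumber ctid_blkid)
{
    return ctid_blkid == InvalidBlockNumber;
}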

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



[GSOC 18] Perf Farm - Description of patch

2018-03-17 Thread Hongyuan Ma
Hi,
This email is a description of 0001-add-apps-directory.patch.
In this patch, I created the apps directory and created two applications,
test_item and user_operation.
The model classes in the test_item application are: TestBranch, TestItem.
The model classes in the user_operation application are: Alias, UserMachine,
TestResult.


The relationship between User and UserMachine is 1 : m.
The relationship between TestResult and UserMachine is n : 1.
The relationship between TestItem and TestResult is 1 : m.


I also created the db_tools directory. The scripts in this directory make it 
easy for developers to import simulation data during development.


Here are some of my ideas:
Since django-rest-framework does not support Django 1.8, I am eager, in
the next patch, to upgrade Django to the LTS version, which is now 1.11.


I'm not sure if there is an existing alias database. If not, maybe I should
write a web crawler that collects aliases (such as plants' names) from web
pages (https://en.wikipedia.org/wiki/List_of_garden_plants).


In addition, can you please provide the PGAUTH_REDIRECT and PGAUTH_KEY
constants for use in development? I plan to change the return type of the
auth module to JSON format.

Re: SSL passphrase prompt external command

2018-03-17 Thread Peter Eisentraut
On 3/16/18 12:38, Daniel Gustafsson wrote:
>> Maybe this is a bit too cute.  We could instead add another setting
>> "ssl_passphrase_command_support_reload".
> I think that's a good idea; this feels like an easy thing to be confused about
> and get wrong (which I might have done with the above).
> 
> Either way, there is a clear path forward on this (the new setting or just
> ignoring me due to being confused) so I am going to mark this ready for
> committer as the remaining work is known and small.

Committed with that change.  Thanks!

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)

2018-03-17 Thread Tomas Vondra
Hi,

I happened to be updating our machine running our buildfarm animals, and
I noticed something quite strange - the machine was unexpectedly running
out of disk space, which is rather suspicious as it's running just the
regression tests :-/

After a bit of investigation, I found this:

# pwd
/mnt/fulmar/build/HEAD/pgsql.build/src/pl/plpgsql/src/results
# ls -l
total 57089152
-rw-r--r--. 1 pgbuild pgbuild 714 Jan  1 00:27 plpgsql_call.out
-rw-r--r--. 1 pgbuild pgbuild 58458515597 Mar 17 13:54
plpgsql_control.out

Right - the plpgsql_control.out output is about 58GB, which is somewhat
excessive, I guess. The reason is fairly simple:

NOTICE:  1..3: i = 1
NOTICE:  1..3: i = 2
NOTICE:  1..3: i = 3
NOTICE:  1..10 by 3: i = 1
NOTICE:  1..10 by 3: i = 4
NOTICE:  1..10 by 3: i = 7
NOTICE:  1..10 by 3: i = 10
NOTICE:  1..11 by 3: i = 1
NOTICE:  1..11 by 3: i = 4
NOTICE:  1..11 by 3: i = 7
NOTICE:  1..11 by 3: i = 10
NOTICE:  reverse 10..0 by 3: i = 10
NOTICE:  reverse 10..0 by 3: i = 7
NOTICE:  reverse 10..0 by 3: i = 4
NOTICE:  reverse 10..0 by 3: i = 1
NOTICE:  2147483620..2147483647 by 10: i = 2147483620
NOTICE:  2147483620..2147483647 by 10: i = 2147483630
NOTICE:  2147483620..2147483647 by 10: i = 2147483640
NOTICE:  2147483620..2147483647 by 10: i = -2147483646
NOTICE:  2147483620..2147483647 by 10: i = -2147483636
NOTICE:  2147483620..2147483647 by 10: i = -2147483626
... many more NOTICE messages ...


Looking at the plpgsql_call.out timestamp, this seems to be running
since January 1, which also correlates with the last results reported to
pgbuildfarm.

Does that remind anyone of a known issue? It seems a bit like an
undetected overflow, but I don't see why the behavior should change
suddenly (AFAICS there were no updates to the machine just before
January 1st).

I wonder if this might be related to ICC 14.0.3, but again - it has been
like that for a very long time.

Ideas?

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Problems with Error Messages wrt Domains, Checks

2018-03-17 Thread john frazer
Today I realized a number of points where PostgreSQL (v10.3) is rather
lackluster and sparse in its error messages.

The first point is illustrated by this code:

drop schema if exists X cascade; create schema X;
create domain X.an_illegal_regex as text check ( value ~ '(' );
create table X.table_with_illegal_constraint (
  a text,
  constraint "column a must have a bogus value" check (
a::X.an_illegal_regex = a ) );
select * from X.table_with_illegal_constraint;
insert into X.table_with_illegal_constraint values
  ( 'xxx' ),
  -- ( 'xxx' ),
  ( 'foo' ),
  ( 'xyx' );

This code will throw with

psql:db/experiments/pg-error-fail-illegal-regex.sql:17: ERROR:
invalid regular expression: parentheses () not balanced

There are several problems with this error message:

FAILURE: the error is really in line 5 where a syntactically invalid RegEx
is created; the fact that it is a RegEx and not a general string is obvious
from the semantics of the ~ (tilde) operator at that point in time.

FAILURE: the offending RegEx is not referred to and not quoted in the error
message. As such, it could be anywhere in my many, many kLOCs big DB
definition. I cannot even search the RegEx with a RegEx because all I know
is some parenthesis is missing, somewhere: RegExes cannot match
parentheses, and PG RegExes do not have a unique syntactic marker to them.

FAILURE: before the insert statement, everything runs dandy. We could have
built an entire data warehouse application on top of a table definition
that can never be syntactically processed but which will only fail when
someone accidentally tries to insert a line.

FAILURE: I can select from a table with a syntactically invalid definition.

The second point is related:

drop schema if exists X cascade; create schema X;
create domain X.a_legal_regex as text check ( value ~ '^x' );
create table X.table_with_constraints (
  a text,
  constraint "column a must start with x" check ( a::X.a_legal_regex = a ),
  constraint "field b must have 3 characters" check (
character_length( a ) = 3 ) );
insert into X.table_with_constraints values
  ( 'xxx' ),
  ( 'foo' ),   /* A: violates first constraint */
  -- ( '' ),   /* B: violates second constraint */
  ( 'xyx' );

With only line B active, this gives:

psql:db/experiments/pg-error-fail-no-constraint-name.sql:16:
ERROR:  new row for relation "table_with_constraints" violates check
constraint "field b must have 3 characters"
DETAIL:  Failing row contains ().

SUCCESS: we get the name of the relation *and* the name of the violated
rule.

SUCCESS: the offending piece of data is quoted.

FAILURE: we don't get the full name of the relation, which is
"X"."table_with_constraints". Neither do we get the name of the column that
received the offending value.

Lastly, with only line A (not line B) active:

psql:db/experiments/pg-error-fail-no-constraint-name.sql:16:
ERROR:  value for domain x.a_legal_regex violates check constraint
"a_legal_regex_check"

FAILURE: no reference to the affected table, column is made.

FAILURE: no reference to the offending piece of data is made.

FAILURE: no reference to the offended constraint is made ("column a must
start with x").

What are the best practices or workarounds for the above shortcomings? I've
been trying for several hours to figure out what causes an error message a
la value for domain xxx violates check constraint "xxx_check" by rewriting
table definitions, inserting data row by row and so on, to no avail. What I
need is a full chain of the objects (column -> table -> constraint ->
domain -> check) that are involved in the error.
I'm writing this to the developers' list because I see the above
observations as serious shortcomings in an otherwise great piece of
software that can probably not be fixed by using client-side code only.


Re: disable SSL compression?

2018-03-17 Thread Peter Eisentraut
On 3/11/18 12:07, Magnus Hagander wrote:
> I think it's worth mentioning in the docs around "it's now considered
> insecure" that it's still an option to use if compression is the main
> thing one is looking for, rather than security. As in, it doesn't make
> it any less secure than no ssl at all. (obviously not those words)
> 
> +1 otherwise. 

Committed like that.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



ECPG oracle mode test program patch

2018-03-17 Thread Shinoda, Noriyoshi
Hi, Hackers

The attached small patch is a modification to the test program for the feature
recently added to ECPG:
[Add Oracle like handling of char arrays.]

https://git.postgresql.org/pg/commitdiff/3b7ab4380440d7b14ee390fabf39f6d87d7491e2

When the char_array program is executed, the following error occurs at the
point where the CURSOR is closed.

[NO_PID]: ecpg_check_PQresult on line 54: bad response - ERROR: cursor "cstr" 
does not exist
[NO_PID]: sqlca: code: 0, state: 0
[NO_PID]: raising sqlstate 34000 (sqlcode -400): cursor "cstr" does not exist 
on line 54

The attached patch corrects the cursor name.

Regards,

Noriyoshi Shinoda




char_array.patch
Description: char_array.patch


Re: Re: Re: [GSOC 18] Performance Farm Project - Initialization Project

2018-03-17 Thread Magnus Hagander
On Wed, Mar 14, 2018 at 10:33 PM, Dave Page  wrote:

> Hi
>
> On Tue, Mar 13, 2018 at 11:31 PM, Hongyuan Ma 
> wrote:
>
>> Hi Dave,
>> I am willing to use React to re-create the front-end application. Since
>> I plan to use separate front-end and back-end development methods, this
>> means that there will be no html files in Django applications. Front-end
>> and back-end applications will interact via restful api. I will use
>> react to rewrite some existing html files if needed. Before initializing
>> the react application, I want to learn more about your tendency toward
>> front-end technology.
>>
>>  - About React: Which version of React (15.x or 16) and react-router
>> (2.x or 4) do you tend to use?
>>
>
> pgAdmin is using React 16.2.
>
>
>>  - About Package Manager: Do you tend to use yarn or npm?
>>
>
> Yarn for pgAdmin, though npm would probably be a little easier for a
> project running on pginfra.
>

If at all possible, please stick to things that are packaged in debian
stable (currently stretch).

Anything else will increase the maintenance burden by orders of magnitude,
and make it a lot less likely to be accepted.

(That is for anything that needs to run on the deploy of course -- if a
tool is needed to update the contents of the repo before deploy or such,
then it's of course no problem)

//Magnus


Re: disable SSL compression?

2018-03-17 Thread Peter Eisentraut
On 3/11/18 13:28, Tom Lane wrote:
>> My proposal is the attached patch that sets the default in libpq to off
>> and adjusts the documentation a bit so it doesn't sound like we have
>> missed the news altogether.
> 
> Seems reasonable as far as it goes, but do we need to make corresponding
> server-side changes?

We could add a setting to disable SSL compression on the server, as some
web servers have been doing, but the niche of version combinations where
that would be applicable seems pretty small.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)

2018-03-17 Thread Tom Lane
Tomas Vondra  writes:
> I happened to be updating our machine running our buildfarm animals, and
> I noticed something quite strange - the machine was unexpectedly running
> out of disk space, which is rather suspicious as it's running just the
> regression tests :-/

> After a bit of investigation, I found this:
> Right - the plpgsql_control.out output is about 58GB, which is somewhat
> excessive I guess. The reason is fairly simple:

> NOTICE:  2147483620..2147483647 by 10: i = 2147483620
> NOTICE:  2147483620..2147483647 by 10: i = 2147483630
> NOTICE:  2147483620..2147483647 by 10: i = 2147483640
> NOTICE:  2147483620..2147483647 by 10: i = -2147483646
> NOTICE:  2147483620..2147483647 by 10: i = -2147483636
> NOTICE:  2147483620..2147483647 by 10: i = -2147483626
> ... many more NOTICE messages ...

> Looking at the plpgsql_call.out timestamp, this seems to be running
> since January 1, which also correlates with the last results reported to
> pgbuildfarm.

Ouch.  That test is in fact new as of 31 Dec, and what this seems to
prove is that plpgsql's handling of loop-variable overflow doesn't
work on fulmar.

regards, tom lane



Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)

2018-03-17 Thread Tomas Vondra


On 03/17/2018 03:32 PM, Tom Lane wrote:
> Tomas Vondra  writes:
>> I happened to be updating our machine running our buildfarm animals, and
>> I noticed something quite strange - the machine was unexpectedly running
>> out of disk space, which is rather suspicious as it's running just the
>> regression tests :-/
> 
>> After a bit of investigation, I found this:
>> Right - the plpgsql_control.out output is about 58GB, which is somewhat
>> excessive I guess. The reason is fairly simple:
> 
>> NOTICE:  2147483620..2147483647 by 10: i = 2147483620
>> NOTICE:  2147483620..2147483647 by 10: i = 2147483630
>> NOTICE:  2147483620..2147483647 by 10: i = 2147483640
>> NOTICE:  2147483620..2147483647 by 10: i = -2147483646
>> NOTICE:  2147483620..2147483647 by 10: i = -2147483636
>> NOTICE:  2147483620..2147483647 by 10: i = -2147483626
>> ... many more NOTICE messages ...
> 
>> Looking at the plpgsql_call.out timestamp, this seems to be running
>> since January 1, which also correlates with the last results reported to
>> pgbuildfarm.
> 
> Ouch. That test is in fact new as of 31 Dec, and what this seems to 
> prove is that plpgsql's handling of loop-variable overflow doesn't 
> work on fulmar.
> 

Yeah :-( I'll try restarting the tests on fulmar, to see if it's
reproducible. If yes, I'll see if a newer version of ICC fixes it.

If not, I'll temporarily disable fulmar, so that the two other animals
on the same machine (magpie, treepie) are not stuck behind it. That
should tell us if the issue is limited to ICC, or if there's something
else wrong with that system.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)

2018-03-17 Thread Tom Lane
I wrote:
> Ouch.  That test is in fact new as of 31 Dec, and what this seems to
> prove is that plpgsql's handling of loop-variable overflow doesn't
> work on fulmar.

Some of the other icc-using critters haven't reported in since
December, either :-(

Looking at the code, we do this like so:

/*
 * Increase/decrease loop value, unless it would overflow, in which
 * case exit the loop.
 */
if (stmt->reverse)
{
if ((int32) (loop_value - step_value) > loop_value)
break;
loop_value -= step_value;
}
else
{
if ((int32) (loop_value + step_value) < loop_value)
break;
loop_value += step_value;
}

I imagine what's happening is that the compiler is assuming no overflow
occurs (due to lacking any equivalent of -fwrapv), then deducing that the
if-tests are no-ops and throwing them away.
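
(For illustration, the pattern is easy to reproduce outside plpgsql. With
any compiler that exploits signed-overflow undefined behavior at -O2, the
check below can legally be folded to "step < 0"; this is a toy example,
not PostgreSQL code:)

#include <stdint.h>

/* UB-reliant check for whether v + step wraps past INT32_MAX; because
 * signed overflow is undefined, the compiler may assume it never
 * happens and simplify the comparison away. */
int
wraps(int32_t v, int32_t step)
{
    return (int32_t) (v + step) < v;
}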

We could avoid the dependency on -fwrapv with something like

if (stmt->reverse)
{
if (loop_value < (PG_INT32_MIN + step_value))
break;
loop_value -= step_value;
}
else
{
if (loop_value > (PG_INT32_MAX - step_value))
break;
loop_value += step_value;
}

which is safe because we enforce step_value > 0.  It's kind of ugly
because it hard-codes knowledge of what the limits are, but we may not
have much choice.
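
As a standalone sanity check of that logic (plain C outside plpgsql; the
function name is made up for the example), note that the bound test never
itself executes a signed overflow, because step_value > 0 guarantees that
INT32_MAX - step_value stays in range:

#include <stdint.h>
#include <stdio.h>

/* true if loop_value + step_value would exceed INT32_MAX */
static int
add_would_overflow(int32_t loop_value, int32_t step_value)
{
    return loop_value > INT32_MAX - step_value;
}

int
main(void)
{
    printf("%d\n", add_would_overflow(INT32_MAX - 5, 10));  /* 1: would wrap */
    printf("%d\n", add_would_overflow(INT32_MAX - 10, 10)); /* 0: exactly fits */
    return 0;
}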

regards, tom lane



Re: Problems with Error Messages wrt Domains, Checks

2018-03-17 Thread David G. Johnston
On Sat, Mar 17, 2018 at 6:14 AM, john frazer 
wrote:

> Today I realized a number of points where PostgreSQL (v10.3) is rather
> lackluster and sparse in its error messages.
>

You may find the following thread and its predecessors enlightening.

https://www.postgresql.org/message-id/CAD3a31WR8X1TpjR_MoZxuz4S0BO3ZkPAeLoQ9rPxKHG%3D728eoQ%40mail.gmail.com

Basically, the fundamental problem is that type input is performed in a
relatively isolated fashion, since there is no requirement that a table or
column of said type even exist.

> psql:db/experiments/pg-error-fail-illegal-regex.sql:17: ERROR:
> invalid regular expression: parentheses () not balanced
>
> There are several problems with this error message:
>
> FAILURE: the error is really in line 5 where a syntactically invalid RegEx
> is created; the fact that it is a RegEx and not a general string is obvious
> from the semantics of the ~ (tilde) operator at that point in time.
>

Yeah, the fact that we don't "compile" expressions is unfortunate.  Not
sure if there are any plans to do so, or what limitations there are.


> FAILURE: the offending RegEx is not referred to and not quoted in the
> error message.
>
This seems like an easy enough oversight to correct.  In all the
discussion about how challenging it is to identify the location, I don't
recall seeing anything about why we aren't at least showing the offending
input value.

> As such, it could be anywhere in my many, many kLOCs big DB definition. I
> cannot even search the RegEx with a RegEx because all I know is some
> parenthesis is missing, somewhere:
>
Well, the error does point to the first statement in the chain of issues -
working backward a couple of steps is possible.

> RegExes cannot match parentheses,
>
Sure they can.

> and PG RegExes do not have a unique syntactic marker to them.
>
True.

> FAILURE: before the insert statement, everything runs dandy. We could
> have built an entire data warehouse application on top of a table
> definition that can never be syntactically processed but which will only
> fail when someone accidentally tries to insert a line.
>
Since this is going to fail every single time you add a record, I'm lacking
sympathy here.  "Accidentally tries to insert a line" - if the table wasn't
meant to be used, why does it exist in the first place?  And if it is
intended to be used, then functional testing should quickly point out
something like this.

> FAILURE: I can select from a table with a syntactically invalid definition.
>
You're stretching here if you think this is an important failure point.
Since the table cannot have valid data, there would be nothing to
select.  Checking constraints during selection is undesirable - they
should be, and are, only checked during insertion or when the constraint
itself changes.

> With only line B active, this gives:
>
> psql:db/experiments/pg-error-fail-no-constraint-name.sql:16:
> ERROR:  new row for relation "table_with_constraints" violates check
> constraint "field b must have 3 characters"
> DETAIL:  Failing row contains ().
>
> SUCCESS: we get the name of the relation *and* the name of the violated
> rule.
>
> SUCCESS: the offending piece of data is quoted.
>
> FAILURE: we don't get the full name of the relation, which is
> "X"."table_with_constraints". Neither do we get the name of the column that
> received the offending value.
>
No, you get the check constraint name, "field b must have 3 characters",
along with the owning table.  You've defined a table constraint, so there is
no directly attached column to report - the expression as a whole fails, and
we don't report which boolean aspects of the expression were true and false.
You do get the input value, which makes manually resolving the expression
possible.  The lack of schema-qualification on the table identifier seems
like an oversight, but at the moment I'm not willing to go find evidence in
support of or opposition to that thought.

> Lastly, with only line A (not line B) active:
>
> psql:db/experiments/pg-error-fail-no-constraint-name.sql:16:
> ERROR:  value for domain x.a_legal_regex violates check constraint 
> "a_legal_regex_check"
>
> FAILURE: no reference to the affected table, column is made.
>
> FAILURE: no reference to the offending piece of data is made.
>
Repeat of the first example, same explanations apply.  Hopefully this gets
improved eventually.


> FAILURE: no reference to the offended constraint is made ("column a must
> start with x").
>
It never got that far into the validation process.  It couldn't even form
a value of the correct type for the constraint expression to evaluate.  I
suppose this is just another aspect of the first problem - the isolation of
type conversion and the absence of keeping and reporting a stack-trace.
Someone more knowledgeable than I would need to expound on the similarities
and differences.

> What are the best practices or workarounds for the above shortcomings?
> I've been trying for several hours to figure out what causes an error
> mes

Re: Google Summer of Code: Potential Applicant

2018-03-17 Thread Christos Maris
I just submitted my draft proposal.

Could you please take a look at it and give me any feedback on how to
improve it?

Thanks a lot in advance!

On Wed, Mar 14, 2018 at 10:12 AM, Aleksander Alekseev <
a.aleks...@postgrespro.ru> wrote:

> Hello Stephen,
>
> > > Protobuf is fine too, but unfortunately we don't have any
> > > Protobuf-related projects this time.
> >
> > Just to be clear, the list on the wiki is just a set of suggestions-
> > students are welcome to propose their own projects as well.
>
> Oh, I didn't know that. In this case I will be happy to see any sorts of
> projects related to Protobuf or any other binary format - Thrift, Avro,
> Cap'n'Proto, MessagePack, etc!
>
> --
> Best regards,
> Aleksander Alekseev
>


Re: MCV lists for highly skewed distributions

2018-03-17 Thread Dean Rasheed
On 13 March 2018 at 08:39, John Naylor  wrote:
>> Also, this is common enough that in fact that distribution
>> can be reasonably approximated by a normal distribution.
>
> For the archives, because it's typically seen 10 times in the sample,
> per the rule of thumb mentioned upthread?
>

Actually, I think that the rule of thumb of at least 10 instances in
the sample for a normal approximation is probably too strict.

Looking around, values as low as 5 seem to be quite commonly used. Of
course there is no right answer here, it depends on what you're doing
with it, and how close you want the normal distribution to be to the
hypergeometric one. For our purposes, I don't think we really need it
to be that close. We just want to be able to justify statements like
"the value is *probably* more common than X, and it was unlikely that
that was just random sampling error".


>> Making use of
>> the non-MCV-based estimate allows us to do better -- if the MCV-based
>> estimate is statistically significantly higher than the non-MCV-based
>> estimate, then it would almost certainly be better to include that
>> value in the MCV list, even if its spread is quite large.
>
> Because we can treat the sample as normal, testing for > 2 standard
> deviations means that we're "~95% sure" it's more common in the table
> than the non-MCV selectivity would suggest, right? (I realize I'm not
> using technically accurate language)
>

Actually, it's more like 97.5% (remember the normal distribution is
symmetric, so there's around 2.5% below the 2-stddev interval, around
95% in it, and another 2.5% above it), but that's just nit-picking.

There are several nice online calculators for playing around with
hypergeometric distributions, such as
http://keisan.casio.com/exec/system/1180573201

Consider this distribution:

N = 1000000
n = 10000
K = 500
Mean = n * K / N = 5
Variance = (K / N) * (1 - K / N) * n * (N - n) / (N - 1) = 4.9
Stddev = sqrt(Variance) = 2.2

Using the calculator above, you can see that the distribution is
fairly normal-like, but with a noticeable positive skew. The 2-stddev
interval is 0.6 to 9.4, and according to the calculator the
probability of the value being less than or equal to 1 is in the
ballpark of the 2.5% figure expected. So even with just 5 occurrences
in the sample, it's fairly close to a normal distribution.
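
For anyone wanting to reproduce the arithmetic without the calculator, a
few lines of throwaway C (not PostgreSQL code) compute the same figures
from the parameters quoted above:

#include <math.h>
#include <stdio.h>

int
main(void)
{
    double N = 1000000, n = 10000, K = 500;
    double mean = n * K / N;
    double variance = (K / N) * (1 - K / N) * n * (N - n) / (N - 1);

    /* prints mean = 5, variance ~= 4.95, stddev ~= 2.22 */
    printf("mean = %g, variance = %g, stddev = %g\n",
           mean, variance, sqrt(variance));
    return 0;
}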

Regards,
Dean



Re: ECPG oracle mode test program patch

2018-03-17 Thread Michael Meskes
> The attached patch corrects the cursor name.

Fixed, thanks for the patch.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael at xmpp dot meskes dot org
VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL



Re: MCV lists for highly skewed distributions

2018-03-17 Thread Dean Rasheed
On 17 March 2018 at 17:16, Dean Rasheed  wrote:
> Using the calculator above, you can see that the distribution is
> fairly normal-like, but with a noticeable positive skew. The 2-stddev
> interval is 0.6 to 9.4, and according to the calculator the
> probability of the value being less than or equal to 1 is in the
> ballpark of the 2.5% figure expected. So even with just 5 occurrences
> in the sample, it's fairly close to a normal distribution.
>

One thing this does illustrate is that the hypergeometric distribution
is a discrete distribution and there can be quite large jumps in the
probability from one value to the next, so care may be needed when
approximating it with a continuous distribution. The standard
technique used to handle this is to apply what is known as a
continuity correction factor. Suppose that X is a random variable with
a discrete hypergeometric distribution, and Y is a continuous normal
distribution, with the same mean and variance, being used to
approximate X. Then P(X>i) for some integer i is the same as
P(X>=i+1), because X can only be integer-valued. The idea is then that
you can use P(Y>i+0.5) to get a fairly good approximation to P(X>i).
That would correspond to adding 0.5 to the right-hand side of the
current test, i.e.,

if (mcv_counts[num_mcv - 1] > selec * samplerows + 2 * stddev + 0.5)
=> Common enough to be included in MCV-list

A lot of the time that won't make much difference, except when dealing
with the smaller counts at the tail end of the MCV list, where it
might help avoid the too-many-mcvs problem, so I think it's worth
trying out.
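
Putting those pieces together, a rough sketch of the corrected test (purely
illustrative - this is not the analyze.c code, and the function and
parameter names are made up):

#include <math.h>
#include <stdbool.h>

/*
 * Hypergeometric stddev of the sample count of a value occupying
 * fraction p of a table of totalrows rows, when samplerows rows are
 * sampled (formula from earlier in the thread).
 */
static double
sample_count_stddev(double p, double samplerows, double totalrows)
{
    return sqrt(p * (1.0 - p) * samplerows *
                (totalrows - samplerows) / (totalrows - 1.0));
}

/* Continuity-corrected version of the inclusion test quoted above. */
static bool
common_enough_for_mcv(double count, double selec,
                      double samplerows, double totalrows)
{
    double stddev = sample_count_stddev(selec, samplerows, totalrows);

    return count > selec * samplerows + 2.0 * stddev + 0.5;
}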

Apparently, in textbooks, an interval like the mean +/- 2*stddev is
known as a Wald confidence interval, and the mean +/- (2*stddev+0.5)
is the continuity-corrected Wald interval, so it's probably worth
mentioning that in the comments. They are generally regarded as quite
crude approximations for hypergeometric distributions, and there's
quite a bit of research around establishing more accurate confidence
intervals for this kind of distribution, but the formulae involved
tend to be quite complicated, whereas the Wald interval has the
advantage of being very simple. In this context, I don't think we need
to establish a particularly accurate confidence interval. We just want
to be able to say that the value is probably more common than the
non-mcv values, without being too rigorous about what "probably"
means, as long as it works in practice to discount values that just
happen to be a bit more common in the sample, but aren't actually more
common in the table as a whole.

Regards,
Dean



Re: [PATCH] pg_hba.conf : new auth option : clientcert=verify-full

2018-03-17 Thread Magnus Hagander
On Fri, Mar 9, 2018 at 2:11 PM, Julian Markwort <
julian.markw...@uni-muenster.de> wrote:

> Hello Magnus,
>
> > I think this makes a lot of sense, and can definitely be a useful
> > option.
>
> I was hesitant to write a long and elaborate patch as I wasn't certain
> if there was any interest in such an addition, but I'm thankful for
> your input.
>
> > However, the patch is completely lacking documentation, which
> > obviously make it a no-starter.
>
> I'll write the missing documentation shortly.
>

Great!



> > Also if I read it right, if the CN is not correct, it will give the
> > error message "certificate authentication failed for user ...". I
> > realize this comes from the re-use of the code, but I don't think
> > this makes it very useful. We  need to separate these two things.
>
> The error message "certificate authentication failed for user XYZ:
> client certificate contains no user name" is the result of calling
> CheckCertAuth when the user presented a certificate without a CN in it.
>

That is arguably wrong, since it's actually password authentication that
fails. That is the authentication type that was picked in pg_hba.conf. It's
more "certificate validation" that failed.


> The error message that is presented to the user upon trying to connect
> with a certificate containing a CN other than the username is:
>
> -
> psql: FATAL: password authentication failed for user "nottestuser"
> -
>
> The server's log contains the lines:
>
> -
> 2018-03-09 13:06:43.111 CET [3310] LOG:  provided user name
> (nottestuser) and authenticated user name (testuser) do not match
> 2018-03-09 13:06:43.111 CET [3310] FATAL:  password authentication
> failed for user "nottestuser"
> 2018-03-09 13:06:43.111 CET [3310] DETAIL:  Connection matched
> pg_hba.conf line 97: "hostssl all nottestuser 127.0.0.1/32 password
> clientcert=verify-full"
> -
>
> I'd argue that the message in the log file is consistent and useful,
> however the message given by psql (or any libpq application for that
> matter) leaves uncertainty regarding the correctness of a provided
> password, for example.

> I could attach the log message of CheckCertAuth to the logdetail,
> however then I'd have issues if there is already something written to
> the logdetail.
> I could also use an additional ereport() call whenever clientcert was
> set to verify-full and the user name didn't match the CN.
>

It's hard to do much about the message sent to the client when failing,
without leaking too much. It's pretty common that you have to actually look
in the server logfile to figure out what actually went wrong. I think that
message is fine.

I agree that the log message is useful. Though it could be good to clearly
indicate that it was caused specifically because of the verify-full, to
differentiate it from other cases of the same message.

For example, what about the scenario where I use GSSAPI authentication and
clientcert=verify-full. Then we suddenly have three usernames (gssapi,
certificate and specified) -- how is the user supposed to know which one
came from the cert and which one came from gssapi for example?

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)

2018-03-17 Thread Andres Freund


On March 17, 2018 7:56:40 AM PDT, Tom Lane  wrote:
>I wrote:
>> Ouch.  That test is in fact new as of 31 Dec, and what this seems to
>> prove is that plpgsql's handling of loop-variable overflow doesn't
>> work on fulmar.
>
>Some of the other icc-using critters haven't reported in since
>December, either :-(
>
>Looking at the code, we do this like so:
>
>/*
> * Increase/decrease loop value, unless it would overflow, in which
> * case exit the loop.
> */
>if (stmt->reverse)
>{
>if ((int32) (loop_value - step_value) > loop_value)
>break;
>loop_value -= step_value;
>}
>else
>{
>if ((int32) (loop_value + step_value) < loop_value)
>break;
>loop_value += step_value;
>}
>
>I imagine what's happening is that the compiler is assuming no overflow
>occurs (due to lacking any equivalent of -fwrapv), then deducing that
>the
>if-tests are no-ops and throwing them away.
>
>We could avoid the dependency on -fwrapv with something like
>
>if (stmt->reverse)
>{
>if (loop_value < (PG_INT32_MIN + step_value))
>break;
>loop_value -= step_value;
>}
>else
>{
>if (loop_value > (PG_INT32_MAX - step_value))
>break;
>loop_value += step_value;
>}
>
>which is safe because we enforce step_value > 0.  It's kind of ugly
>because it hard-codes knowledge of what the limits are, but we may not
>have much choice.

On the current branch just using the new overflow safe functions in int.h 
should work. But unless we are OK leaving this broken in the back branches, or 
want to backport the functionality, that's probably not sufficient.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)

2018-03-17 Thread Tomas Vondra


On 03/17/2018 06:27 PM, Andres Freund wrote:
> 
> 
> On March 17, 2018 7:56:40 AM PDT, Tom Lane  wrote:
>> I wrote:
>>> Ouch.  That test is in fact new as of 31 Dec, and what this seems to
>>> prove is that plpgsql's handling of loop-variable overflow doesn't
>>> work on fulmar.
>>
>> Some of the other icc-using critters haven't reported in since
>> December, either :-(
>>
>> Looking at the code, we do this like so:
>>
>>/*
>> * Increase/decrease loop value, unless it would overflow, in which
>> * case exit the loop.
>> */
>>if (stmt->reverse)
>>{
>>if ((int32) (loop_value - step_value) > loop_value)
>>break;
>>loop_value -= step_value;
>>}
>>else
>>{
>>if ((int32) (loop_value + step_value) < loop_value)
>>break;
>>loop_value += step_value;
>>}
>>
>> I imagine what's happening is that the compiler is assuming no overflow
>> occurs (due to lacking any equivalent of -fwrapv), then deducing that
>> the
>> if-tests are no-ops and throwing them away.
>>
>> We could avoid the dependency on -fwrapv with something like
>>
>>if (stmt->reverse)
>>{
>>if (loop_value < (PG_INT32_MIN + step_value))
>>break;
>>loop_value -= step_value;
>>}
>>else
>>{
>>if (loop_value > (PG_INT32_MAX - step_value))
>>break;
>>loop_value += step_value;
>>}
>>
>> which is safe because we enforce step_value > 0.  It's kind of ugly
>> because it hard-codes knowledge of what the limits are, but we may not
>> have much choice.
> 
> On the current branch just using the new overflow safe functions in
> int.h should work. But unless we are OK leaving this broken in the back
> branches, or want to backport the functionality, that's probably not
> sufficient.
> 

Not sure, but the backbranches seem to be working fine, and the commit
that triggers the issue is from December 31. Maybe the issue was there
but we were lucky not to trip on it before.

Anyway, I can confirm that the fix suggested by Tom does the trick
(well, at least on Fulmar, which is running icc 14.0.3). I've also
disassembled exec_stmt_fori both with and without the patch - reading
assembly is not my strength, but if you're interested it's attached. The
interesting part seems to be the last ~50 lines or so.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Dump of assembler code for function exec_stmt_fori:
   0x7fb85bbcaf16 <+0>:    push   %rbp
   0x7fb85bbcaf17 <+1>:    mov    %rsp,%rbp
   0x7fb85bbcaf1a <+4>:    sub    $0x120,%rsp
   0x7fb85bbcaf21 <+11>:   mov    %rbx,-0x10(%rbp)
   0x7fb85bbcaf25 <+15>:   mov    %rdi,-0xd0(%rbp)
   0x7fb85bbcaf2c <+22>:   mov    %rsi,-0xc8(%rbp)
=> 0x7fb85bbcaf33 <+29>:   movb   $0x0,-0x120(%rbp)
   0x7fb85bbcaf3a <+36>:   movl   $0x0,-0x11c(%rbp)
   0x7fb85bbcaf44 <+46>:   mov    -0xd0(%rbp),%rax
   0x7fb85bbcaf4b <+53>:   mov    -0xc8(%rbp),%rdx
   0x7fb85bbcaf52 <+60>:   mov    0x10(%rdx),%rdx
   0x7fb85bbcaf56 <+64>:   mov    0x4(%rdx),%edx
   0x7fb85bbcaf59 <+67>:   movslq %edx,%rdx
   0x7fb85bbcaf5c <+70>:   shl    $0x3,%rdx
   0x7fb85bbcaf60 <+74>:   add    0x70(%rax),%rdx
   0x7fb85bbcaf64 <+78>:   mov    (%rdx),%rax
   0x7fb85bbcaf67 <+81>:   mov    %rax,-0xc0(%rbp)
   0x7fb85bbcaf6e <+88>:   mov    -0xd0(%rbp),%rax
   0x7fb85bbcaf75 <+95>:   mov    -0xc8(%rbp),%rdx
   0x7fb85bbcaf7c <+102>:  mov    0x18(%rdx),%rdx
   0x7fb85bbcaf80 <+106>:  lea    -0x11f(%rbp),%rcx
   0x7fb85bbcaf87 <+113>:  lea    -0x118(%rbp),%rbx
   0x7fb85bbcaf8e <+120>:  lea    -0x114(%rbp),%rsi
   0x7fb85bbcaf95 <+127>:  mov    %rax,%rdi
   0x7fb85bbcaf98 <+130>:  mov    %rsi,-0x80(%rbp)
   0x7fb85bbcaf9c <+134>:  mov    %rdx,%rsi
   0x7fb85bbcaf9f <+137>:  mov    %rcx,%rdx
   0x7fb85bbcafa2 <+140>:  mov    %rbx,%rcx
   0x7fb85bbcafa5 <+143>:  mov    -0x80(%rbp),%rax
   0x7fb85bbcafa9 <+147>:  mov    %rax,%r8
   0x7fb85bbcafac <+150>:  callq  0x7fb85bbd4746
   0x7fb85bbcafb1 <+155>:  mov    %rax,-0xb0(%rbp)
   0x7fb85bbcafb8 <+162>:  mov    -0xb0(%rbp),%rax
   0x7fb85bbcafbf <+169>:  mov    %rax,-0xb8(%rbp)
   0x7fb85bbcafc6 <+176>:  add    $0xfff0,%rsp
   0x7fb85bbcafca <+180>:  mov    -0xd0(%rbp),%rax
   0x7fb85bbcafd1 <+187>:  mov    -0xb8(%rbp),%rdx
   0x7fb85bbcafd8 <+194>:  lea    -0x11f(%rbp),%rcx
   0x7fb85bbcafdf <+201>:  mov    -0x118(%rbp),%ebx
   0x7fb85bbcafe5 <+207>:  mov    -0x114(%rbp),%esi
   0x7fb85bbcafeb <+213>:  mov    -0xc0(%rbp),%rdi
   0x7fb85bbcaff2 <+220>:  mov    0x20(%rdi),%rdi
   0x7fb85bbcaff6 <+224>:  mov    0x8(%rdi),%edi
   0x7fb85bb

Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)

2018-03-17 Thread Andres Freund
Hi,

On 2018-03-17 18:55:11 +0100, Tomas Vondra wrote:
> Not sure, but the backbranches seem to be working fine, and the commit
> that triggers the issue is from December 31.

Well, that added the test.  Are you saying that if you execute similar
code on an older branch it doesn't fail?


> Anyway, I can confirm that the fix suggested by Tom does the trick

Could you try the attached patch, too?


I'm a bit afraid that we'll have to go a lot further if we can't make
icc do safe signed overflow in all cases. This case is easy enough to
fix, but we were lucky, in a way, to find it.  Given that we now know
that ICC also optimizes based on overflow semantics, it seems we
should really work towards making all of the backend safe with that :(.


- Andres
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index 68da7cf669e..8874fb9416e 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -22,6 +22,7 @@
 #include "access/tupconvert.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
+#include "common/int.h"
 #include "executor/execExpr.h"
 #include "executor/spi.h"
 #include "executor/spi_priv.h"
@@ -2566,15 +2567,13 @@ exec_stmt_fori(PLpgSQL_execstate *estate, PLpgSQL_stmt_fori *stmt)
 		 */
 		if (stmt->reverse)
 		{
-			if ((int32) (loop_value - step_value) > loop_value)
+			if (pg_sub_s32_overflow(loop_value, step_value, &loop_value))
 break;
-			loop_value -= step_value;
 		}
 		else
 		{
-			if ((int32) (loop_value + step_value) < loop_value)
+			if (pg_add_s32_overflow(loop_value, step_value, &loop_value))
 break;
-			loop_value += step_value;
 		}
 	}
 
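
For context on why the int.h route works: pg_add_s32_overflow() and
pg_sub_s32_overflow() compile down to a single overflow-checking builtin
where the compiler provides __builtin_add_overflow()/__builtin_sub_overflow(),
and otherwise fall back to doing the arithmetic in a wider type. Roughly
(paraphrased from memory, not a verbatim copy of common/int.h, and using
PostgreSQL's int32/int64/PG_INT32_* definitions):

static inline bool
pg_add_s32_overflow(int32 a, int32 b, int32 *result)
{
#if defined(HAVE__BUILTIN_OP_OVERFLOW)
	return __builtin_add_overflow(a, b, result);
#else
	int64		res = (int64) a + (int64) b;

	if (res > PG_INT32_MAX || res < PG_INT32_MIN)
		return true;
	*result = (int32) res;
	return false;
#endif
}

Either way, no wrapping signed arithmetic is ever executed, so there is
nothing for icc to optimize away.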


Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)

2018-03-17 Thread Tom Lane
Tomas Vondra  writes:
> Not sure, but the backbranches seem to be working fine, and the commit
> that triggers the issue is from December 31. Maybe the issue was there
> but we were lucky not to trip on it before.

Yeah, we were simply not testing that overflow-detection code before.
Undoubtedly it would fail in the back branches too if we tested it.

> Anyway, I can confirm that the fix suggested by Tom does the trick
> (well, at least on Fulmar, which is running icc 14.0.3). I've also
> disassembled exec_stmt_fori both with and without the patch - reading
> assembly in not my strength, but if you're interested it's attached. The
> interesting part seems to be the last ~50 lines or so.

Hm, did you get the "master" and "patched" versions backwards?  The
allegedly-patched version does the !reverse case like this:

   0x7f71219457ae <+2200>:  mov    -0x108(%rbp),%eax
   0x7f71219457b4 <+2206>:  test   %eax,%eax
   0x7f71219457b6 <+2208>:  jl     0x7f71219457cf
   0x7f71219457b8 <+2210>:  mov    -0x108(%rbp),%eax
   0x7f71219457be <+2216>:  add    -0x110(%rbp),%eax
   0x7f71219457c4 <+2222>:  mov    %eax,-0x110(%rbp)
   0x7f71219457ca <+2228>:  jmpq   0x7f7121945573

so that it's apparently optimized

if ((int32) (loop_value + step_value) < loop_value)
break;

into

if (step_value < 0)
break;

which of course never exits the loop.  That's slightly different
(and stupider) than what I'd been hypothesizing, but it's a valid
transformation if you ignore the possibility of integer overflow.

It might be worth studying the icc manual to see if it has an
equivalent of -fwrapv.  Although we can and probably should fix
this case by changing the code, I'm worried about what other bugs
may exist only in icc builds.  I know Andres would like to get
rid of the need for -fwrapv but I suspect that's not really going
to happen soon.

regards, tom lane



Re: MCV lists for highly skewed distributions

2018-03-17 Thread Dean Rasheed
On 16 March 2018 at 15:26, Tomas Vondra  wrote:
> Actually, one question - when deciding whether to keep the item in the
> MCV list, analyze_mcv_list only compares it's frequency with an average
> of the rest. But as we're removing items from the MCV list, the average
> frequency of the non-MCV items is growing (we're removing items with
> higher and higher frequencies). That means the estimates for the least
> common items will get higher and higher - essentially overestimates. So,
> couldn't/shouldn't analyze_mcv_list consider this too?
>

Yes, that's the intention. At the start, sumcount is set to the count
of all but the last (least common) MCV item, so it can estimate the
frequency of the non-MCV items if the last MCV item were to be
removed. Then each time through the loop, sumcount is decreased by the
removed item's count, increasing the estimated frequency of the
non-MCV items accordingly.
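
In outline, the loop has this shape (a simplified standalone sketch of the
bookkeeping described above, not the actual analyze_mcv_list() code; the
significance test is reduced to an arbitrary stand-in heuristic):

#include <stdbool.h>

/* stand-in for the real statistical test, made up for this sketch */
static bool
significantly_more_common(int count, double rest_freq, int samplerows)
{
	return count > 1.25 * rest_freq * samplerows;
}

static int
prune_mcv_tail(const int *mcv_counts, int num_mcv, int samplerows)
{
	double		sumcount = 0.0;
	int			i;

	/* count of all but the last (least common) MCV item */
	for (i = 0; i < num_mcv - 1; i++)
		sumcount += mcv_counts[i];

	while (num_mcv > 0)
	{
		/*
		 * Frequency the non-MCV items would have if the current last
		 * MCV item were removed from the list.
		 */
		double		rest_freq = (samplerows - sumcount) / (double) samplerows;

		if (significantly_more_common(mcv_counts[num_mcv - 1],
									  rest_freq, samplerows))
			break;				/* keep this item and everything before it */

		num_mcv--;				/* drop the last item ... */
		if (num_mcv > 0)
			sumcount -= mcv_counts[num_mcv - 1];	/* ... shrink sumcount */
	}

	return num_mcv;
}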

Regards,
Dean



Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)

2018-03-17 Thread Tom Lane
Andres Freund  writes:
> On the current branch just using the new overflow safe functions in
> int.h should work. But unless we are OK leaving this broken in the back
> branches, or want to backport the functionality, that's probably not
> sufficient.

Yeah ... I don't like either of the last two things, so probably we should
go with the patch as I had it.  Yours might perform a shade better on
compilers with the built-in, but it'll be a lot worse on those without.

What I was wondering about was whether to back-patch a test case.
It doesn't seem really necessary, and we'd have to put it someplace
else than where it is in HEAD, so I'm leaning against.

regards, tom lane



Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)

2018-03-17 Thread Andres Freund
Hi,

On 2018-03-17 14:20:26 -0400, Tom Lane wrote:
> It might be worth studying the icc manual to see if it has an
> equivalent of -fwrapv.

Yes.

A *quick* look through https://software.intel.com/en-us/node/522795
unfortunately didn't show anything.


> Although we can and probably should fix this case by changing the
> code, I'm worried about what other bugs may exist only in icc builds.

Yea, I know that I can produce a number of "bugs" today by removing
-fwrapv for gcc, and found a few more by manual inspection.  I'm sure
icc can trigger at least some of them.


> I know Andres would like to get rid of the need for -fwrapv but I
> suspect that's not really going to happen soon.

And definitely not in anything released or close to it.  I do want to
get rid of it because various compilers don't have comparable flags, and
because it causes slowdowns. But I really would like to do it without
running headfirst into a wall, and that'll mean going a bit slower.

I think it'd be good practice to get rid of the known overflow hazards
by using int.h, but I don't want to drop fwrapv immediately.

Greetings,

Andres Freund



Re: MCV lists for highly skewed distributions

2018-03-17 Thread Tomas Vondra


On 03/17/2018 07:28 PM, Dean Rasheed wrote:
> On 16 March 2018 at 15:26, Tomas Vondra  wrote:
>> Actually, one question - when deciding whether to keep the item in the
>> MCV list, analyze_mcv_list only compares it's frequency with an average
>> of the rest. But as we're removing items from the MCV list, the average
>> frequency of the non-MCV items is growing (we're removing items with
>> higher and higher frequencies). That means the estimates for the least
>> common items will get higher and higher - essentially overestimates. So,
>> couldn't/shouldn't analyze_mcv_list consider this too?
>>
> 
> Yes, that's the intention. At the start, sumcount is set to the count
> of all but the last (least common) MCV item, so it can estimate the
> frequency of the non-MCV items if the last MCV item were to be
> removed. Then each time through the loop, sumcount is decreased by the
> removed item's count, increasing the estimated frequency of the
> non-MCV items accordingly.
> 

I know it's updated like that, but that's not quite what I meant. Sorry
for not making the question clearer; let me rephrase it.

Currently, analyze_mcv_list only checks if the frequency of the current
item is significantly higher than the non-MCV selectivity. My question
is if it shouldn't also consider if removing the item from MCV would not
increase the non-MCV selectivity too much.

But perhaps that would be a silly idea, because the non-MCV items may
also be seen as a random noise.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)

2018-03-17 Thread Tomas Vondra


On 03/17/2018 07:20 PM, Tom Lane wrote:
> Tomas Vondra  writes:
>> Not sure, but the backbranches seem to be working fine, and the commit
>> that triggers the issue is from December 31. Maybe the issue was there
>> but we were lucky not to trip on it before.
> 
> Yeah, we were simply not testing that overflow-detection code before.
> Undoubtedly it would fail in the back branches too if we tested it.
> 
>> Anyway, I can confirm that the fix suggested by Tom does the trick
>> (well, at least on Fulmar, which is running icc 14.0.3). I've also
>> disassembled exec_stmt_fori both with and without the patch - reading
>> assembly is not my strength, but if you're interested it's attached. The
>> interesting part seems to be the last ~50 lines or so.
> 
> Hm, did you get the "master" and "patched" versions backwards?  The
> allegedly-patched version does the !reverse case like this:
> 
>0x7f71219457ae <+2200>:  mov    -0x108(%rbp),%eax
>0x7f71219457b4 <+2206>:  test   %eax,%eax
>0x7f71219457b6 <+2208>:  jl     0x7f71219457cf
> 
>0x7f71219457b8 <+2210>:  mov    -0x108(%rbp),%eax
>0x7f71219457be <+2216>:  add    -0x110(%rbp),%eax
>0x7f71219457c4 <+2222>:  mov    %eax,-0x110(%rbp)
>0x7f71219457ca <+2228>:  jmpq   0x7f7121945573
> 
> 
> so that it's apparently optimized
> 
> if ((int32) (loop_value + step_value) < loop_value)
> break;
> 
> into
> 
> if (step_value < 0)
> break;
> 
> which of course never exits the loop.  That's slightly different
> (and stupider) than what I'd been hypothesizing, but it's a valid
> transformation if you ignore the possibility of integer overflow.
> 

Yeah, it seems I've mixed up the files by accident. Sorry.


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)

2018-03-17 Thread Andres Freund


On March 17, 2018 11:32:36 AM PDT, Tom Lane  wrote:
>Andres Freund  writes:
>> On the current branch just using the new overflow safe functions in
>> int.h should work. But unless we are OK leaving this broken in the
>back
>> branches, or want to backport the functionality, that's probably not
>> sufficient.
>
>Yeah ... I don't like either of the last two things, so probably we
>should
>go with the patch as I had it.  Yours might perform a shade better on
>compilers with the built-in, but it'll be a lot worse on those without.

I don't think performance is a prime driver here, or shouldn't be at least. 
Obviousness / grepability seem much more important.  I'd vote for using my 
version in master, and yours in the back branches.  I can do that, if you want.

I'm OK with skipping the test for now.

Andres

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)

2018-03-17 Thread Tom Lane
Andres Freund  writes:
> I don't think performance is a prime driver here, or shouldn't be at least. 
> Obviousness / grepability seem much more important.  I'd vote for using my 
> version in master, and yours in the back branches.  I can do that, if you 
> want.

I dunno, I think the code as I had it is quite obvious.  It's just that
I don't really like hard-coding references to INT_MIN/MAX, which I guess
is a personal style thing rather than anything I can defend well.

> I'm OK with skipping the test for now.

If we're not putting a test into the back branches, then we darn well
better be using the same code there as in HEAD, else we won't know that
it actually solves the problem.

regards, tom lane



Re: MCV lists for highly skewed distributions

2018-03-17 Thread Dean Rasheed
On 17 March 2018 at 18:40, Tomas Vondra  wrote:
> Currently, analyze_mcv_list only checks if the frequency of the current
> item is significantly higher than the non-MCV selectivity. My question
> is if it shouldn't also consider if removing the item from MCV would not
> increase the non-MCV selectivity too much.
>

Oh, I see what you're saying. In theory, each MCV item we remove is
not significantly more common than the non-MCV items at that point, so
removing it shouldn't significantly increase the non-MCV selectivity.
It's possible the cumulative effect of removing multiple items might
start to add up, but I think it would necessarily be a slow effect,
and I think it would keep getting slower and slower as more items are
removed -- isn't this equivalent to constructing a sequence of numbers
where each number is a little greater than the average of all the
preceding numbers, and ends up virtually flat-lining?
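
(A quick toy illustration of that claim, in throwaway C; the 5% margin is
arbitrary. Build such a sequence and watch its average creep up ever more
slowly:)

#include <stdio.h>

int
main(void)
{
	double		sum = 1.0;
	int			n = 1;

	for (int i = 0; i < 10000; i++)
	{
		double		next = 1.05 * (sum / n);	/* a bit above the average */

		sum += next;
		n++;
		if (n % 2000 == 0)
			printf("n = %5d, avg = %f\n", n, sum / n);
	}
	return 0;
}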

Regards,
Dean



Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)

2018-03-17 Thread Andres Freund


On March 17, 2018 12:25:57 PM PDT, Tom Lane  wrote:
>Andres Freund  writes:
>> I don't think performance is a prime driver here, or shouldn't be at
>least. Obviousness / grepability seem much more important.  I'd vote
>for using my version in master, and yours in the back branches.  I can
>do that, if you want.
>
>I dunno, I think the code as I had it is quite obvious.  It's just that
>I don't really like hard-coding references to INT_MIN/MAX, which I
>guess
>is a personal style thing rather than anything I can defend well.

Certainly harder to grep for. There's lots of other uses of the min/max macros. 
And the logic to get it right depends on an earlier piece of code ensuring the 
step is positive...

>> I'm OK with skipping the test for now.
>
>If we're not putting a test into the back branches, then we darn well
>better be using the same code there as in HEAD, else we won't know that
>it actually solves the problem.

I was thinking of committing your version everywhere and then revising in 
master after a bf cycle.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)

2018-03-17 Thread Tom Lane
Andres Freund  writes:
> On March 17, 2018 12:25:57 PM PDT, Tom Lane  wrote:
>> If we're not putting a test into the back branches, then we darn well
>> better be using the same code there as in HEAD, else we won't know that
>> it actually solves the problem.

> I was thinking of committing your version everywhere and then revising in 
> master after a bf cycle.

Meh.  I don't really feel a need to change it at all, but if we do,
I think it should be "after a release cycle" not just a few days.
We need to see whether that regression test will expose any problems
on a wider variety of compilers than exist in the buildfarm.

Anyway, pushed for now --- Tomas, if you blocked fulmar, it should
be safe to unblock.

regards, tom lane



Fwd: Problems with Error Messages wrt Domains, Checks

2018-03-17 Thread john frazer
-- Forwarded message --
From: john frazer 
Date: Sat, Mar 17, 2018 at 6:28 PM
Subject: Re: Problems with Error Messages wrt Domains, Checks
To: "David G. Johnston" 




Thanks for your consideration,

I'll try to be brief.


> As such, it could be anywhere in my many, many kLOCs big DB
> definition. I cannot even search the RegEx with a RegEx because all I know
> is some parenthesis is missing, somewhere:
>
> Well, the error does point to the first statement in the chain of issues
> - working backward a couple of steps is possible.

In this particular case that is possible, and I did manage to do it. The
point is that in the
general case the faulty regular expression could be anywhere, and there's
no clue given at all.

> RegExes cannot match parentheses,
> Sure they can.
> and PG RegExes do not have a unique syntactic marker to them.
> True.

I meant to say they can't detect matching parentheses or lack thereof.

> FAILURE: before the insert statement, everything runs dandy. We could
> have built an entire data warehouse application on top of a table
> definition that can never be syntactically processed but which will only
> fail when someone accidentally tries to insert a line.
>
> Since this is going to fail every single time you add a record, I'm
> lacking sympathy here.  "Accidentally tries to insert a line" - if the
> table wasn't meant to be used, why does it exist in the first place?  And
> if it is intended to be used, then functional testing should quickly point
> out something like this.

But there clearly can be tables that are used only now and then, and might
get checked for absence of rows. Regardless, I think the point stands that
ideally you shouldn't be able to successfully declare nonsensical objects
and only be told so, under some kinds of usage patterns, by runtime errors
(with defective contexts) - and in most cases, pgSQL does keep that promise.

> FAILURE: I can select from a table with a syntactically invalid
> definition.
>
> You're stretching here if you think this is an important failure point.
> Since the table cannot have valid data, there would be nothing to
> select.  Checking constraints during selection is undesirable - they
> should be, and are, only checked during insertion or when the constraint
> itself changes.

To clarify, I do not suggest checking constraints during `select`; I suggest
tighter checks at table creation time. I should not be able to construct
any kind of object that says (in SQL, or maybe even in plpgsql) something
to the effect of `x ~ '('`, because that string, in a syntactic context
where it must be a RegexLiteral, is syntactically bogus. (Yes, operators
can be redefined, but the only thing that counts here is the definition of
the `~` operator at the point in time that the table gets created, and that
expression is constant with a literal, so there's not much difference
between this and checking against, say, `x > 0`, which could and would
fail for illegal literals and non-matching types.)

> I suppose the best practice when dealing with a lack of information in
the error handling code flows is to limit the amount of context that is in
scope by unit testing.  And while these are absolutely shortcomings,
overcoming them has costs in terms of both developer effort and, more
importantly, runtime performance.

I'm afraid no amount of unit testing of the DDL code can do this for me.
Yes, in the first reported case (the invalid RegExp), I can make sure I
use each expression at least once, so unsyntactic ones will show
themselves. But in the other two cases, the production environment in
which this came up has an insert statement that loads data from a largish
source into the target table (20k rows out of altogether >2m rows), and I
*can't* unit test that data.

FWIW the workaround that I've found is this:

create table X.table_with_constraints (
  my_column text,
  constraint "my_column must start with 'x'" check ( Q.starts_with_x( my_column ) ),
  constraint "my_column must have 3 chrs"    check ( Q.has_3_characters( my_column ) ) );

In other words, I dispense with domains and use (small, boolean) functions
(defined as `select` one-liners), because only then do I get told what
piece of data comes down the wrong way, and where. It's a shame, because
this is essentially what I would expect to do in a language like
JavaScript.

On Sat, Mar 17, 2018 at 4:20 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Sat, Mar 17, 2018 at 6:14 AM, john frazer 
> wrote:
>
>> Today I realized a number of points where PostgreSQL (v10.3) is rather
>> lackluster and sparse in its error messages.
>>
> You may find the following thread and its predecessors enlightening.
>
> https://www.postgresql.org/message-id/CAD3a31WR8X1TpjR_MoZxuz4S0BO3ZkPAeLoQ9rPxKHG%3D728eoQ%40mail.gmail.com
>
> Basically, the fundamental problem is type input is performed in a
> relatively isolated fashion since there is no requirement that a table or
> column of said type even exist

Re: MCV lists for highly skewed distributions

2018-03-17 Thread Tomas Vondra
On 03/17/2018 08:32 PM, Dean Rasheed wrote:
> On 17 March 2018 at 18:40, Tomas Vondra  wrote:
>> Currently, analyze_mcv_list only checks if the frequency of the
>> current item is significantly higher than the non-MCV selectivity.
>> My question is if it shouldn't also consider if removing the item
>> from MCV would not increase the non-MCV selectivity too much.
>>
> 
> Oh, I see what you're saying. In theory, each MCV item we remove is 
> not significantly more common than the non-MCV items at that point,
> so removing it shouldn't significantly increase the non-MCV
> selectivity. It's possible the cumulative effect of removing multiple
> items might start to add up, but I think it would necessarily be a
> slow effect, and I think it would keep getting slower and slower as
> more items are removed -- isn't this equivalent to constructing a
> sequence of numbers where each number is a little greater than the
> average of all the preceding numbers, and ends up virtually
> flat-lining?
> 

Yes, I think you're right. Another thing that occurred to me is that
we're pretty much guaranteed to misestimate things at the tail end, and
in my experience under-estimates have far worse consequences.
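
To spell out why the cumulative effect flat-lines (a toy sketch, under the
simplifying assumption that each removed item sits a fixed \varepsilon
above the running mean m_n of the n items handled so far):

    m_{n+1} = \frac{n \, m_n + (m_n + \varepsilon)}{n + 1}
            = m_n + \frac{\varepsilon}{n + 1}
    \quad\Longrightarrow\quad
    m_n \approx m_1 + \varepsilon \ln n

The per-step increment shrinks harmonically, so the drift in the non-MCV
selectivity grows only logarithmically, matching Dean's intuition above.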


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)

2018-03-17 Thread Tomas Vondra
On 03/17/2018 08:41 PM, Tom Lane wrote:
> Andres Freund  writes:
>> On March 17, 2018 12:25:57 PM PDT, Tom Lane  wrote:
>>> If we're not putting a test into the back branches, then we darn
>>> well better be using the same code there as in HEAD, else we
>>> won't know that it actually solves the problem.
> 
>> I was thinking of committing your version everywhere and then
>> revising in master after a bf cycle.
> 
> Meh.  I don't really feel a need to change it at all, but if we do,
> I think it should be "after a release cycle" not just a few days.
> We need to see whether that regression test will expose any problems
> on a wider variety of compilers than exist in the buildfarm.
> 
> Anyway, pushed for now --- Tomas, if you blocked fulmar, it should
> be safe to unblock.
> 

OK, enabled again. It'll take a while to run through the branches.

I guess it might be worth notifying people running affected animals,
because otherwise they may stay stuck for a long time.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [PATCH] Verify Checksums during Basebackups

2018-03-17 Thread Michael Banck
Hi,

On Fri, Mar 09, 2018 at 10:35:33PM +0100, Michael Banck wrote:
> Possibly open questions:
> 
> 1. I have not so far changed the replication protocol to make verifying
> checksums optional. I can go about that next if the consensus is that we
> need such an option (and cannot just check it every time)?

I think most people (including those I had off-list discussions about
this with) were of the opinion that such an option should be there, so I
added an additional option NOVERIFY_CHECKSUMS to the BASE_BACKUP
replication command and also an option -k / --no-verify-checksums to
pg_basebackup to trigger this.
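
For anyone following along, the heart of the verification is conceptually
as simple as the sketch below (not the patch itself, and the helper name
is made up; it assumes "page" is a private copy of one full block and
"blkno" its block number, since pg_checksum_page() briefly scribbles on
the header while computing):

#include "postgres.h"
#include "storage/bufpage.h"
#include "storage/checksum.h"

/*
 * Sketch: does the checksum stored in the page header match the one we
 * recompute?  pg_checksum_page() treats the pd_checksum field as zero
 * while checksumming, so it can be called on a page whose checksum has
 * already been set.
 */
static bool
page_checksum_ok(char *page, BlockNumber blkno)
{
	PageHeader	phdr = (PageHeader) page;

	/* New (all-zero) pages carry no checksum and are fine as-is. */
	if (PageIsNew(page))
		return true;

	return phdr->pd_checksum == pg_checksum_page(page, blkno);
}

The patch proper additionally has to decide which files carry checksums
at all, and how failures are counted and reported.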

Updated patch attached.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 4fd61d7c2d..9c907db262 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2382,7 +2382,7 @@ The commands accepted in replication mode are:
   
 
   
-BASE_BACKUP [ LABEL 'label' ] [ PROGRESS ] [ FAST ] [ WAL ] [ NOWAIT ] [ MAX_RATE rate ] [ TABLESPACE_MAP ]
+BASE_BACKUP [ LABEL 'label' ] [ PROGRESS ] [ FAST ] [ WAL ] [ NOWAIT ] [ MAX_RATE rate ] [ TABLESPACE_MAP ] [ NOVERIFY_CHECKSUMS ]
  BASE_BACKUP
 
 
@@ -2481,6 +2481,17 @@ The commands accepted in replication mode are:
  
 

+
+   
+NOVERIFY_CHECKSUMS
+
+ 
+  By default, checksums are verified during a base backup if they are
+  enabled. Specifying NOVERIFY_CHECKSUMS disables
+  this verification.
+ 
+
+   
   
  
  
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index e8921b1bb4..95045669c9 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -506,6 +506,22 @@ PostgreSQL documentation
   
  
 
+ 
+  -k
+  --no-verify-checksums
+  
+   
+Disables verification of checksums, if they are enabled on the server
+the base backup is taken from. 
+   
+   
+By default, checksums are verified and checksum failures will result in
+a non-zero exit status. However, the base backup will not be removed in
+this case, as if the --no-clean option was used.
+   
+  
+ 
+
  
   -v
   --verbose
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 185f32a5f9..9f735a2c07 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 
+#include "access/xlog.h"
 #include "access/xlog_internal.h"	/* for pg_start/stop_backup */
 #include "catalog/catalog.h"
 #include "catalog/pg_type.h"
@@ -30,6 +31,8 @@
 #include "replication/basebackup.h"
 #include "replication/walsender.h"
 #include "replication/walsender_private.h"
+#include "storage/bufpage.h"
+#include "storage/checksum.h"
 #include "storage/dsm_impl.h"
 #include "storage/fd.h"
 #include "storage/ipc.h"
@@ -97,6 +100,15 @@ static TimeOffset elapsed_min_unit;
 /* The last check of the transfer rate. */
 static TimestampTz throttled_last;
 
+/* The starting XLOG position of the base backup. */
+static XLogRecPtr startptr;
+
+/* Whether a checksum failure occurred. */
+static bool checksum_failure;
+
+/* Do not verify checksums. */
+static bool noverify_checksums = false;
+
 /*
  * The contents of these directories are removed or recreated during server
  * start so they are not included in backups.  The directories themselves are
@@ -189,7 +201,6 @@ base_backup_cleanup(int code, Datum arg)
 static void
 perform_base_backup(basebackup_options *opt)
 {
-	XLogRecPtr	startptr;
 	TimeLineID	starttli;
 	XLogRecPtr	endptr;
 	TimeLineID	endtli;
@@ -205,6 +216,8 @@ perform_base_backup(basebackup_options *opt)
 	labelfile = makeStringInfo();
 	tblspc_map_file = makeStringInfo();
 
+	checksum_failure = false;
+
 	startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint, &starttli,
   labelfile, &tablespaces,
   tblspc_map_file,
@@ -563,6 +576,12 @@ perform_base_backup(basebackup_options *opt)
 		pq_putemptymessage('c');
 	}
 	SendXlogRecPtrResult(endptr, endtli);
+
+	if (checksum_failure)
+		ereport(ERROR,
+			(errcode(ERRCODE_DATA_CORRUPTED),
+			errmsg("checksum mismatch during basebackup")));
+
 }
 
 /*
@@ -592,6 +611,7 @@ parse_basebackup_options(List *options, basebackup_options *opt)
 	bool		o_wal = false;
 	bool		o_maxrate = false;
 	bool		o_tablespace_map = false;
+	bool		o_noverify_checksums = false;
 
 	MemSet(opt, 0, sizeof(*opt));
 	foreach(lopt, options)
@@ -671,6 +691,15 @@ parse_basebackup_options(List *options, basebackup_options *opt)
 			o

Re: pgbench randomness initialization

2018-03-17 Thread Chapman Flack
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   tested, failed
Spec compliant:   not tested
Documentation:tested, failed

This is a simple patch, includes documentation, includes and passes tests, and, 
in my rookie opinion, is ready for committer.

The new status of this patch is: Ready for Committer


Re: Precision loss casting float to numeric

2018-03-17 Thread Chapman Flack
On 03/09/18 12:05, Emre Hasegeli wrote:
> In this case, I cannot see any other option than adding those as
> separate cast functions.  Should we mark this entry as "returned with
> feedback"?
> 
> We can also consider turning the current float to numeric casts to
> explicit as they are causing data loss.  I am not sure how much it
> would impact backwards-compatibility.  The counter argument is the
> numeric to float casts being IMPLICIT.  They are causing data loss for
> sure, but I believe there are different reasons to keep them as
> IMPLICIT.

Thanks for the feedback. I will mark it RWF myself, as the backward-
compatibility issues are kind of paralyzing, and I don't think I'll
have time in this CF to give it enough further thought anyway.

I wonder whether even changing a formerly-implicit cast to explicit
would be too much of a behavior change for existing code that expects
the current behavior?

-Chap



Re: [bug fix] Cascaded standby cannot start after a clean shutdown

2018-03-17 Thread Michael Paquier
On Fri, Mar 16, 2018 at 06:02:25AM +, Tsunakawa, Takayuki wrote:
> Ouch, you're right.  If memory allocation fails, the startup process
> would emit a LOG message and continue to fetch new WAL records.  Then,
> I'm completely happy with your patch.

Thanks for double-checking, Tsunakawa-san.
--
Michael


signature.asc
Description: PGP signature


Re: SSL passphrase prompt external command

2018-03-17 Thread Michael Paquier
On Fri, Mar 16, 2018 at 12:07:59PM -0400, Peter Eisentraut wrote:
> On 3/15/18 12:13, Daniel Gustafsson wrote:
>> * In src/tools/msvc/Mkvcbuild.pm
>> 
>> # if building without OpenSSL
>> if (!$solution->{options}->{openssl})
>> {
>> +   $postgres->RemoveFile('src/backend/libpq/be-secure-common.c');
>> $postgres->RemoveFile('src/backend/libpq/be-secure-openssl.c');
>> }
>> 
>> Is this because run_ssl_passphrase_command() would otherwise generate a 
>> warning
>> due to not being used?
> 
> I have no idea. ;-)  It's the same thing I was told to do for
> fe-secure-common.c a while back.

I don't recall saying exactly that either :)

You need to filter out OpenSSL-only code when MSVC builds without
OpenSSL because it includes all files listed in a path by default.  No
need to make compilation include files which are not necessary.  This
change is correct by the way.
--
Michael


signature.asc
Description: PGP signature


Re: [HACKERS] AdvanceXLInsertBuffer vs. WAL segment compressibility

2018-03-17 Thread Chapman Flack
On 03/16/18 17:14, Daniel Gustafsson wrote:
> The attached patch adds the test, and a necessary extension to
> check_pg_config
> to allow for extracting values from pg_config.h as opposed to just returning
> the number of regex matches. (needed for XLOG_BLCKSZ.)

Thanks for the review. I notice that cfbot has now flagged the patch as
failing, and when I look into it, it appears that cfbot is building with
your test patch, and without the xlog.c patch, and so the test naturally
fails. Does the cfbot require both patches to be attached to the same
email, in order to include them both? I'll try attaching both to this one,
and see what it does.

This is good confirmation that the test is effective. :)

-Chap
From 35684bdaa1d27c3f4b7198541f3c92bb2b4cb2f4 Mon Sep 17 00:00:00 2001
From: Chapman Flack 
Date: Sun, 25 Feb 2018 11:44:47 -0500
Subject: [PATCH] Zero headers of unused pages after WAL switch.

When writing zeroed pages to the remainder of a WAL segment
after a WAL switch, ensure that the headers of those pages are
also zeroed, as their initialized values otherwise reduce the
compressibility of the WAL segment file by general tools.
---
 src/backend/access/transam/xlog.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 47a6c4d..a91ec7b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1556,7 +1556,16 @@ CopyXLogRecordToWAL(int write_len, bool isLogSwitch, XLogRecData *rdata,
 		{
 			/* initialize the next page (if not initialized already) */
 			WALInsertLockUpdateInsertingAt(CurrPos);
-			AdvanceXLInsertBuffer(CurrPos, false);
+			/*
+			 * Fields in the page header preinitialized by AdvanceXLInsertBuffer
+			 * to nonconstant values reduce the compressibility of WAL segments,
+			 * and aren't needed in the freespace following a switch record.
+			 * Re-zero that header area. This is not performance-critical, as
+			 * the more empty pages there are for this loop to touch, the less
+			 * busy the system is.
+			 */
+			currpos = GetXLogBuffer(CurrPos);
+			MemSet(currpos, 0, SizeOfXLogShortPHD);
 			CurrPos += XLOG_BLCKSZ;
 		}
 	}
-- 
2.7.3

From a7cbcde7518e2f95673ec5ddba32b7ea6e0b84bb Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson 
Date: Fri, 16 Mar 2018 21:32:32 +0100
Subject: [PATCH] Add test for ensuring WAL segment is zeroed out

Switch to a new WAL file and ensure the last segment in the now
switched-away-from file is completely zeroed out.

This adds a mode to check_pg_config() where the match from the
regexp can be returned, not just the literal count of matches.
Only one match is returned (the first), but given the structure
of pg_config.h that's likely to be enough.
---
 src/test/perl/TestLib.pm | 18 --
 src/test/recovery/t/002_archiving.pl | 20 +++-
 2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm
index 7d017d49bd..8b9796c4f6 100644
--- a/src/test/perl/TestLib.pm
+++ b/src/test/perl/TestLib.pm
@@ -227,15 +227,29 @@ sub append_to_file
 # that.
 sub check_pg_config
 {
-   my ($regexp) = @_;
+   my ($regexp, $value) = @_;
my ($stdout, $stderr);
+   my $match;
my $result = IPC::Run::run [ 'pg_config', '--includedir' ], '>',
  \$stdout, '2>', \$stderr
  or die "could not execute pg_config";
chomp($stdout);
 
open my $pg_config_h, '<', "$stdout/pg_config.h" or die "$!";
-   my $match = (grep {/^$regexp/} <$pg_config_h>);
+   if (defined $value)
+   {
+   while (<$pg_config_h>)
+   {
+   $_ =~ m/^$regexp/;
+   next unless defined $1;
+   $match = $1;
+   last;
+   }
+   }
+   else
+   {
+   $match = (grep {/^$regexp/} <$pg_config_h>);
+   }
close $pg_config_h;
return $match;
 }
diff --git a/src/test/recovery/t/002_archiving.pl b/src/test/recovery/t/002_archiving.pl
index e1bd3c95cc..c235e98904 100644
--- a/src/test/recovery/t/002_archiving.pl
+++ b/src/test/recovery/t/002_archiving.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 1;
+use Test::More tests => 2;
 use File::Copy;
 
 # Initialize master node, doing archives
@@ -49,3 +49,21 @@ $node_standby->poll_query_until('postgres', $caughtup_query)
 my $result =
   $node_standby->safe_psql('postgres', "SELECT count(*) FROM tab_int");
 is($result, qq(1000), 'check content from archives');
+

[GSoC 2018] Proposal Draft

2018-03-17 Thread Kefan Yang
Hi everyone,

I am Kefan Yang, a third-year Computing Science student from Simon Fraser
University, Canada. I am very interested in the *sorting algorithm
benchmarking and implementation* issue you mentioned on the ideas list of
Google Summer of Code 2018.

I am currently working on my proposal, but I am a little bit worried about
whether I am on the right track. I've encountered some problems with
implementing the benchmark for those sorting algorithms. I briefly
described my current ideas in the proposal draft, but they may change a bit
as I read more research papers. I sincerely hope to get some feedback from
you to improve my proposal, especially for the implementation part.

I've attached a proposal draft to this email. It's a brief one but I guess
it's enough to show my current ideas:)

Sincerely,
Kefan Yang


proposal_draft.docx
Description: MS-Word 2007 document


Re: [GSoC 2018] Proposal Draft

2018-03-17 Thread Peter Geoghegan
On Sat, Mar 17, 2018 at 5:34 PM, Kefan Yang  wrote:
> I am Kefan Yang, a third-year Computing Science student from Simon Fraser
> University, Canada. I am very interested in the sorting algorithm
> benchmarking and implementation issue you mentioned on the ideas list of
> Google Summer of Code 2018.

> I've attached a proposal draft to this email. It's a brief one but I guess
> it's enough to show my current ideas:)

The proposal reads:

"""
Industrial implementation of selected sorting algorithm:
The industrial version is basically an optimization based on the benchmark
implementation. I plan to use optimizations like checking if input
array is already sorted
or applying insertion sort directly for short arrays to see if the
performance can be
improved. I am still looking for other common optimization methods.

"""

But our qsort implementation, which is the Bentley & McIlroy
implementation with a couple of customizations, already does
everything you mention (I refer to the precheck algorithmic
customization, and the way that we check to see which side of a pivot
to use real recursion with to avoid stack overflows). I suggest that
you read the paper [1] -- the code that we use is almost directly
lifted from the paper. The opaque sounding variable names are the
same, and the C code from the paper is structured in exactly the same
way.

I think that this won't be a particularly easy project to get
committed. I suggest that if you go forward with it that you
specifically look into integrating "Dual-Pivot Quicksort" [2] as a
whole cloth replacement for the B&M implementation. It seems like this
has some chance of working out, because it's more or less acknowledged
by Bentley himself to be a kind of successor to his own industrial
quicksort implementation [3] -- it's derived from it. Note that Java 7
uses dual pivot quicksort when sorting integers.
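
For concreteness, the core of the Yaroslavskiy partitioning scheme looks
roughly like this (a standalone toy sketch for int arrays with made-up
helper names, not PostgreSQL code; a real qsort replacement would also
need the comparator machinery, the presorted check, insertion sort for
short ranges, and bounded recursion mentioned above):

#include <stdio.h>

static void
swap_int(int *a, int *b)
{
	int		t = *a; *a = *b; *b = t;
}

/* Partition around two pivots p <= q into <p, [p..q], >q; recurse. */
static void
dual_pivot_qsort(int *a, int lo, int hi)
{
	int		p, q, lt, gt, i;

	if (lo >= hi)
		return;
	if (a[lo] > a[hi])
		swap_int(&a[lo], &a[hi]);
	p = a[lo];					/* left pivot */
	q = a[hi];					/* right pivot, p <= q */

	lt = lo + 1;				/* a[lo+1 .. lt-1]  <  p */
	gt = hi - 1;				/* a[gt+1 .. hi-1]  >  q */
	i = lt;						/* a[lt .. i-1] lies in [p, q] */
	while (i <= gt)
	{
		if (a[i] < p)
			swap_int(&a[i++], &a[lt++]);
		else if (a[i] > q)
		{
			while (a[gt] > q && i < gt)
				gt--;
			swap_int(&a[i], &a[gt--]);
			if (a[i] < p)
				swap_int(&a[i++], &a[lt++]);
			else
				i++;
		}
		else
			i++;
	}
	swap_int(&a[lo], &a[--lt]);	/* move pivots into final place */
	swap_int(&a[hi], &a[++gt]);

	dual_pivot_qsort(a, lo, lt - 1);
	dual_pivot_qsort(a, lt + 1, gt - 1);
	dual_pivot_qsort(a, gt + 1, hi);
}

int
main(void)
{
	int		a[] = {5, 9, 1, 7, 3, 3, 8};
	int		n = sizeof(a) / sizeof(a[0]);
	int		j;

	dual_pivot_qsort(a, 0, n - 1);
	for (j = 0; j < n; j++)
		printf("%d ", a[j]);
	printf("\n");
	return 0;
}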

In general, we've had a lot of success with optimizing sorting in the
past few years by focusing on things like avoiding cache misses in
comparators. There has been much less success with algorithmic
improvements, and no success at all with novel algorithmic
enhancements. PostgreSQL development just isn't a great way to do that
sort of thing.

BTW, I noticed that you go on to say this:

"""
However,
since disk operation is much expensive than in-memory sorting, I am
not sure if we can
observe a significant difference in this way.

"""

I think that you'd be surprised at how often this isn't true these
days, at least when sorting enough data for spilling to disk to be
relevant. The main reason for that is that the cost of writing out
runs increases linearly, and therefore eventually becomes very small
compared to the costs of sorting itself, which increases at a
linearithmic rate.
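
(A sketch in symbols, assuming constant per-byte I/O cost c_{io} and
per-comparison cost c_{cmp}: total external-sort cost is roughly

    c_{io} \cdot n + c_{cmp} \cdot n \log n

so the I/O share, c_{io} / (c_{io} + c_{cmp} \log n), shrinks as n grows.)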

[1] http://cs.fit.edu/~pkc/classes/writing/samples/bentley93engineering.pdf
[2] http://codeblab.com/wp-content/uploads/2009/09/DualPivotQuicksort.pdf
[3] http://mail.openjdk.java.net/pipermail/core-libs-dev/2009-September/002630.html
-- 
Peter Geoghegan



Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)

2018-03-17 Thread Tom Lane
Tomas Vondra  writes:
> I guess it might want to notify people running affected animals, because
> otherwise they may stay stuck for a long time.

Yeah, I sent something out to buildfarm-members already.

regards, tom lane



Re: Precision loss casting float to numeric

2018-03-17 Thread Tom Lane
Chapman Flack  writes:
> I wonder whether even changing a formerly-implicit cast to explicit
> would be too much of a behavior change for existing code that expects
> the current behavior?

We did exactly that during the 8.3 cycle, and got one heck of a lot of
pushback about it, despite the existence of a lot of examples showing
that it would be a good idea.  I won't say we can't do it again, but
it won't be an easy sell.

regards, tom lane



Re: [GSoC 2018] Proposal Draft

2018-03-17 Thread Kefan Yang
Thanks for your quick feedback!

"""
Industrial implementation of selected sorting algorithm:
The industrial version is basically an optimization based on the benchmark
implementation. I plan to use optimizations like checking if input
array is already sorted
or applying insertion sort directly for short arrays to see if the
performance can be
improved. I am still looking for other common optimization methods.

"""
What I am trying to say here is that similar optimizations can be applied
to novel algorithms or other implementations of quicksort.

The paper about "Dual-Pivot Quicksort" is really helpful and it has a Java
implementation already. I can definitely make use of that.

Also, I am wondering what the normal approach is in this project to
testing whether a certain sorting implementation brings a performance
gain. More specifically, you mentioned that little progress has been made
with novel algorithmic enhancements. How was that conclusion reached?

I am still working on my proposal and I will post a new version soon:)

Regards,
Kefan


2018-03-17 18:05 GMT-07:00 Peter Geoghegan :

> On Sat, Mar 17, 2018 at 5:34 PM, Kefan Yang  wrote:
> > I am Kefan Yang, a third-year Computing Science student from Simon Fraser
> > University, Canada. I am very interested in the sorting algorithm
> > benchmarking and implementation issue you mentioned on the ideas list of
> > Google Summer of Code 2018.
>
[]


Re: Precision loss casting float to numeric

2018-03-17 Thread Tom Lane
Emre Hasegeli  writes:
> We can also consider turning the current float to numeric casts to
> explicit as they are causing data loss.  I am not sure how much it
> would impact backwards-compatibility.  The counter argument is the
> numeric to float casts being IMPLICIT.  They are causing data loss for
> sure, but I believe there are different reasons to keep them as
> IMPLICIT.

FWIW, the behavior of these casts is intended to model what the SQL
standard says about the behavior of "exact numeric" vs "approximate
numeric" types.  (In our implementation, the integral types plus numeric
are in the first category, float4 and float8 in the second.)  The spec is
perfectly clear that you can assign an approximate numeric to an exact
numeric, or vice versa, without any explicit cast.  It is also clear that
arithmetic operations combining approximate and exact are allowed without
a cast, and yield approximate results.  These rules lead to our
conclusions that exact -> approximate is an implicit cast (so that the
parser will choose, e.g., float8 multiplication over numeric multiplication
if you write numericvar * float8var) while approximate -> exact is an
assignment cast (so that you can assign float8 to numeric without explicit
casting).  If the decisions had been driven purely by "what risks silent
precision loss", no doubt we'd have done it differently ... but it's hard
to do that and still meet the letter of the spec.

regards, tom lane



Re: [HACKERS] AdvanceXLInsertBuffer vs. WAL segment compressibility

2018-03-17 Thread Tom Lane
Chapman Flack  writes:
> Thanks for the review. I notice that cfbot has now flagged the patch as
> failing, and when I look into it, it appears that cfbot is building with
> your test patch, and without the xlog.c patch, and so the test naturally
> fails. Does the cfbot require both patches to be attached to the same
> email, in order to include them both?

I believe so --- AFAIK it does not know anything about dependencies
between different patches, and will just try to build whatever patch(es)
appear in the latest email on a given thread.  Munro might be able to
provide more detail.

regards, tom lane



Re: [GSoC 2018] Proposal Draft

2018-03-17 Thread Peter Geoghegan
On Sat, Mar 17, 2018 at 7:00 PM, Kefan Yang  wrote:
> What I am trying to say here is that similar optimizations can be applied to
> novel algorithms or other implementations of quicksort.

A novel algorithm is something to avoid here, because novel techniques
tend to only work out for specific datasets and data distributions. In
general, we care about a wide variety of cases, and are very keen to
avoid regressions.

> The paper about "Dual-Pivot Quicksort" is really helpful and it has a Java
> implementation already. I can definitely make use of that.

Cool. Please be careful to pick source material that has a license
that is compatible with the PostgreSQL license.

> Also, I am wondering what the normal approach is in this project to
> testing whether a certain sorting implementation brings a performance gain.

It varies quite a bit. You can search the list archives for some of
this. For example, Tomas Vondra often tests my sorting patches by
subjecting them to a variety of tests with varied distributions and
data types. His results are then collated in a spreadsheet for easy
analysis. Maybe you can replicate that.

The only obvious trend is that everything is tested using real SQL
statements (including CREATE INDEX statements). In the past,
enhancements that were successfully incorporated tended to either
obviously have little or no downside, or have a very significant
upside that made it worth talking about accepting a small regression
in a minority of less representative cases.

> More specifically, you mentioned that little progress has been made with
> novel algorithmic enhancements. How was that conclusion reached?

That's simply been the trend. It isn't particularly likely that
Postgres will be a project that pioneers some kind of algorithmic
enhancement that has broad applicability, that could just as easily
benefit a general purpose library sort routine. If you're interested
in algorithmic enhancements in the sense that I understand the term,
then that's the high bar that would have to be met -- that would
amount to a new insight in an area that has already been researched in
enormous detail by many people, most of whom don't know much about
database systems in particular.

On the other hand, techniques like abbreviated keys work well because
they effectively exploit things like the first normal form, and the
fact that a high start up cost for the sort is already very likely. It
is a technique specifically tailored for a database system, and modern
hardware characteristics.
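
(To make that concrete for anyone new to the technique, here is a
hypothetical standalone sketch of the idea with made-up names, not the
actual varstr_abbrev_convert() code from varlena.c: pack a prefix of each
string into an integer whose unsigned ordering matches memcmp ordering,
so that most comparisons never dereference the full datum.)

#include <stdint.h>
#include <string.h>

/*
 * Pack the first 8 bytes of a NUL-terminated string into a uint64 whose
 * unsigned comparison agrees with memcmp on those bytes.  Assumes a
 * little-endian GCC/Clang target for the byte swap.
 */
static uint64_t
abbrev_key(const char *s)
{
	uint64_t	key = 0;
	size_t		len = strlen(s);

	memcpy(&key, s, len < sizeof(key) ? len : sizeof(key));
	return __builtin_bswap64(key);	/* byte 0 becomes most significant */
}

/*
 * The cheap integer comparison decides most pairs without touching the
 * strings themselves; only abbreviated-key ties fall back to the
 * authoritative (and potentially cache-missing) strcmp.
 */
static int
cmp_abbrev(uint64_t ka, const char *a, uint64_t kb, const char *b)
{
	if (ka != kb)
		return (ka < kb) ? -1 : 1;
	return strcmp(a, b);
}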

-- 
Peter Geoghegan



Re: ON CONFLICT DO UPDATE for partitioned tables

2018-03-17 Thread Alvaro Herrera
Alvaro Herrera wrote:

> I think what I should be doing is the same as the returning stuff: keep
> a tupdesc around, and use a single slot, whose descriptor is changed
> just before the projection.

Yes, this works, though it's ugly.  Not any uglier than what's already
there, though, so I think it's okay.

The only thing that I remain unhappy about in this patch is the whole
adjust_and_expand_partition_tlist() thing.  I fear we may be doing
redundant and/or misplaced work.  I'll look into it next week.

0001 should be pretty much ready to push -- adjustments to ExecInsert
  and ModifyTableState I already mentioned.

0002 is stuff I would like to get rid of completely -- changes to
  planner code so that it better supports functionality we need for
  0003.

0003 is the main patch.  Compared to the previous version, this one
  reuses slots by switching them to different tupdescs as needed.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 9f9d78e02a474402ee37ebcbed8390f4f3470743 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Fri, 16 Mar 2018 14:29:28 -0300
Subject: [PATCH v4 1/3] Simplify ExecInsert API re. ON CONFLICT data

Instead of passing the ON CONFLICT-related members of ModifyTableState
into ExecInsert(), we can have that routine obtain them from the node,
since that is already an argument into the function.

While at it, remove arbiterIndexes from ModifyTableState, since that's
just a copy of the list already in the ModifyTable node, to which the
state node already has access.
---
 src/backend/executor/nodeModifyTable.c | 18 +-
 src/include/nodes/execnodes.h  |  2 --
 2 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 3332ae4bf3..a9a48e914f 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -258,8 +258,6 @@ static TupleTableSlot *
 ExecInsert(ModifyTableState *mtstate,
   TupleTableSlot *slot,
   TupleTableSlot *planSlot,
-  List *arbiterIndexes,
-  OnConflictAction onconflict,
   EState *estate,
   bool canSetTag)
 {
@@ -271,6 +269,7 @@ ExecInsert(ModifyTableState *mtstate,
List   *recheckIndexes = NIL;
TupleTableSlot *result = NULL;
TransitionCaptureState *ar_insert_trig_tcs;
+   OnConflictAction onconflict = mtstate->mt_onconflict;
 
/*
 * get the heap tuple out of the tuple table slot, making sure we have a
@@ -455,6 +454,7 @@ ExecInsert(ModifyTableState *mtstate,
else
{
WCOKind wco_kind;
+   boolcheck_partition_constr;
 
/*
 * We always check the partition constraint, including when the 
tuple
@@ -463,8 +463,7 @@ ExecInsert(ModifyTableState *mtstate,
 * trigger might modify the tuple such that the partition 
constraint
 * is no longer satisfied, so we need to check in that case.
 */
-   boolcheck_partition_constr =
-   (resultRelInfo->ri_PartitionCheck != NIL);
+   check_partition_constr = (resultRelInfo->ri_PartitionCheck != 
NIL);
 
/*
 * Constraints might reference the tableoid column, so 
initialize
@@ -510,6 +509,9 @@ ExecInsert(ModifyTableState *mtstate,
uint32  specToken;
ItemPointerData conflictTid;
boolspecConflict;
+   List   *arbiterIndexes;
+
+   arbiterIndexes = ((ModifyTable *) 
mtstate->ps.plan)->arbiterIndexes;
 
/*
 * Do a non-conclusive check for conflicts first.
@@ -627,7 +629,7 @@ ExecInsert(ModifyTableState *mtstate,
if (resultRelInfo->ri_NumIndices > 0)
recheckIndexes = ExecInsertIndexTuples(slot, 
&(tuple->t_self),

   estate, false, NULL,
-   
   arbiterIndexes);
+   
   NIL);
}
}
 
@@ -1217,8 +1219,8 @@ lreplace:;
Assert(mtstate->rootResultRelInfo != NULL);
estate->es_result_relation_info = 
mtstate->rootResultRelInfo;
 
-   ret_slot = ExecInsert(mtstate, slot, planSlot, NULL,
- 
ONCONFLICT_NONE, estate, canSetTag);
+   ret_slot = ExecInsert(

Recently-introduced segfault in initdb?

2018-03-17 Thread Isaac Morland
I am trying to develop a small proof-of-concept patch for a proposal I
have, but recently I found that initdb started segfaulting after I did a
git pull. I used git bisect and it identified the following commit as the
first one with the problem:

1733460f0205fc6d6bbe4c14911049a918c6e073 is the first bad commit

commit 1733460f0205fc6d6bbe4c14911049a918c6e073
Author: Robert Haas
Date:   Fri Mar 2 13:16:01 2018 -0500

    postgres_fdw: Fourth attempt to stabilize regression tests.

    Commit 1bc0100d270e5bcc980a0629b8726a32a497e788 added this test, and
    commits 882ea509fe7a4711fe25463427a33262b873dfa1,
    958e20e42d6c346ab89f6c72e4262230161d1663,
    4fa396464e5fe238b7994535182f28318c61c78e tried to stabilize it.  It's
    still not stable, so keep trying.

    The latest comment from Tom Lane is that disabling autovacuum seems
    like a good strategy, but we might need to do it on more tables, hence
    this patch.

    Etsuro Fujita

    Discussion: http://postgr.es/m/5a9928f1.2010...@lab.ntt.co.jp

:04 04 97d2b695bc44dd6c50ccc2c73e626ae453507299 c3e23dd08d4f9d0576401d63e375cbb3460e2d33 M contrib

bisect run success

01:01 ijmorlan@scsmac161$

I'm not particularly confident I've done this right, but if anybody else is
having trouble with systems compiled from recent commits, this might be a
starting point to look at.

Incidentally I'm working on Mac OS X 10.11.6 in case that's helpful.

And I just looked at the commit and as far as I can tell it only affects
tests, which shouldn't affect whether initdb segfaults, which makes me
think I've done something wrong.

So is anybody else finding that initdb segfaults with a newly-compiled
system?


Re: Recently-introduced segfault in initdb?

2018-03-17 Thread Isaac Morland
OK, I must have done something wrong with the bisect the first time. Now
I'm getting the following as the problem commit:

fd1a421fe66173fb9b85d3fe150afde8e812cbe4 is the first bad commit

commit fd1a421fe66173fb9b85d3fe150afde8e812cbe4
Author: Peter Eisentraut
Date:   Fri Mar 2 08:57:38 2018 -0500

    Add prokind column, replacing proisagg and proiswindow

    The new column distinguishes normal functions, procedures, aggregates,
    and window functions.  This replaces the existing columns proisagg and
    proiswindow, and replaces the convention that procedures are indicated
    by prorettype == 0.  Also change prorettype to be VOIDOID for procedures.

    Reviewed-by: Tom Lane
    Reviewed-by: Michael Paquier

:04 04 43854d518b5fdb6b36b6cc5d1f625f75f6b1974c 96aefd013c0ccf730e69a2a3611de9ab4f12294d M doc
:04 04 5f0e806094bdeb8e14ddf098ec7c318f574ec548 2916aea3ab2049c0317d5edd788968c167aecfde M src

bisect run success

01:52 ijmorlan@scsmac161$

When it's not working, I get the following output from initdb:

The files belonging to this database system will be owned by user "ijmorlan".
This user must also own the server process.

The database cluster will be initialized with locale "C".
The default text search configuration will be set to "english".

Data page checksums are enabled.

creating directory ./test/pgdata ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... TRAP: FailedAssertion("!(!isNull)", File: "catcache.c", Line: 365)
sh: line 1: 45094 Abort trap: 6   "/usr/local/pgsql/bin/postgres" --single -F -O -j -c search_path=pg_catalog -c exit_on_error=true template1 > /dev/null
child process exited with exit code 134

initdb: removing data directory "./test/pgdata"

I hope this is more helpful.

On 18 March 2018 at 01:08, Isaac Morland  wrote:

> I am trying to develop a small proof-of-concept patch for a proposal I
> have, but recently I found that initdb started segfaulting after I did a
> git pull. I used git bisect and it identified the following commit as the
> first one with the problem:
>
[]


Re: [HACKERS] MERGE SQL Statement for PG11

2018-03-17 Thread Pavan Deolasee
On Mon, Mar 12, 2018 at 5:43 PM, Pavan Deolasee 
wrote:

>
>
> On Sun, Mar 11, 2018 at 11:18 AM, Peter Geoghegan  wrote:
>
>>
>>
>> As you know, there is an ON CONFLICT DO UPDATE + partitioning patch in
>> the works from Alvaro. In your explanation about that approach that
>> you cited, you wondered what the trouble might have been with ON
>> CONFLICT + partitioning, and supposed that the issues were similar
>> there. Are they? Has that turned up much?
>>
>>
> Well, I initially thought that ON CONFLICT DO UPDATE on a partitioned
> table may have the same challenges, but that's probably not the case. For
> INSERT ON CONFLICT it's still just an INSERT path, with some special
> handling for UPDATEs. Currently, for partitioned or inherited tables,
> UPDATEs/DELETEs go via inheritance_planner(), thus expanding inheritance
> for the result relation, whereas INSERTs go via the simple
> grouping_planner().
>
> For MERGE, we do all three DMLs. That doesn't mean we could not
> re-implement MERGE along the lines of INSERTs, but that would most likely
> mean a complete rewrite of the UPDATE/DELETE handling for
> partitioned/inheritance tables. The challenges would just be the same in
> both cases.
>
>
Having thought more about this in the last couple of days, I am actually
inclined to try rewriting the UPDATE handling of MERGE along the lines of
what the ON CONFLICT DO UPDATE patch is doing. This might help us
completely eliminate invoking inheritance_planner() for partitioned
tables, and that will be a huge win for tables with several hundred
partitions. The code might also look much cleaner that way. I am going to
give it a try for the next couple of days and see if it's doable.

Thanks,
Pavan