RE: pl/perl extension fails on Windows

2017-12-10 Thread Christian Ullrich
* Noah Misch wrote:

> On Wed, Nov 29, 2017 at 11:45:35PM -0500, Tom Lane wrote:
>> Oh, OK.  In that case, we need to get some representatives of these
>> more modern builds into the buildfarm while we're at it.
> 
> Yep.  Among machines already in the buildfarm, the one running member
> woodlouse is the best candidate for this.  Its owner could install
> http://strawberryperl.com/download/5.26.1.1/strawberry-perl-5.26.1.1-32bit.msi
> and setup another animal on the same machine that builds 32-bit and enables
> Perl.  Christian, are you interested in doing this?

Ready to go, waiting for animal assignment. For now, I can confirm that it 
works, that is, the buildfarm --test run is successful.

Although I have to admit, I fail to see the need for Windows x86 builds, too. 
Who in their right mind would want them today?

--
Christian



Re: [HACKERS] Uninterruptible slow geo_ops.c

2017-12-10 Thread Emre Hasegeli
[Replying to an old thread...]

> A customer of ours hit some very slow code while running the
> @>(polygon, polygon) operator with some big polygons.  I'm not familiar
> with this stuff but I think the problem is that the algorithm converges
> too slowly to a solution and also has some pretty expensive calls
> somewhere.  (Perhaps there is also a problem that the algorithm *never*
> converges for some inputs ...)

I believe there is a simple flaw in the algorithm that causes it to
loop endlessly.  It should stop looking at the other segments of the
polygon after finding the touching one.  The attached patch fixes the
issue for the query posted to this thread.  I suggest backpatching the
fix.

There are many code quality issues and bugs like this still unresolved
in geo_ops.c.  I am trying to improve it.  Any help is appreciated:

https://commitfest.postgresql.org/16/1160/


0001-polygon-contains-lseg-loop-fix-v00.patch
Description: Binary data


Re: Rethinking MemoryContext creation

2017-12-10 Thread Tom Lane
Peter Geoghegan  writes:
> On Sat, Dec 9, 2017 at 5:53 PM, Tom Lane  wrote:
>> Overall I'm seeing about a 5% improvement in a "pgbench -S" scenario,
>> although that number is a bit shaky since the run-to-run variation
>> is a few percent anyway.

> Is that with "-M prepared", too?

No, I didn't use that.

regards, tom lane



Re: Postgres with pthread

2017-12-10 Thread james

On 06/12/2017 17:26, Andreas Karlsson wrote:
An additional issue is that this could break a lot of extensions, in a 
way that is not apparent at compile time. This means we may 
need to break all extensions to force extension authors to check if 
they are thread safe.


I do not like making life hard for our extension community, but if the 
gains are big enough it might be worth it.


It seems to me that the counter-argument is that extensions that 
naturally support threading will benefit.  For example it may be a lot 
more practical to have CLR or JVM extensions.


Inconsistency in plpgsql's error context reports

2017-12-10 Thread Tom Lane
I noticed that since I put in commit 390d58135, buildfarm members
that use CLOBBER_CACHE_ALWAYS are failing the added test case
with diffs like

*** 5051,5057 
  NOTICE:  y = 2
  ERROR:  record "r" is not assigned yet
  DETAIL:  The tuple structure of a not-yet-assigned record is indeterminate.
! CONTEXT:  PL/pgSQL function inline_code_block line 15 at RAISE
  -- Check handling of conflicts between plpgsql vars and table columns.
  set plpgsql.variable_conflict = error;
  create function conflict_test() returns setof int8_tbl as $$
--- 5051,5058 
  NOTICE:  y = 2
  ERROR:  record "r" is not assigned yet
  DETAIL:  The tuple structure of a not-yet-assigned record is indeterminate.
! CONTEXT:  SQL statement "SELECT r"
! PL/pgSQL function inline_code_block line 15 at RAISE
  -- Check handling of conflicts between plpgsql vars and table columns.
  set plpgsql.variable_conflict = error;
  create function conflict_test() returns setof int8_tbl as $$

The reason for the difference turns out to be that in a CCA build,
we are forcing recreation of the cached plan for the simple expression
"r", so the error gets detected during parse analysis of "SELECT r"
rather than during execution of the expression.  And because this all
mostly relies on SPI, we've gone through SPI_plan_get_cached_plan,
which inserts _SPI_error_callback into the error context stack.
That's where the first CONTEXT line is coming from.

There seem to be two ways we could look at this.  One is that the
new test case just needs to be rejiggered to avoid unstable output
("\set VERBOSITY terse" would be the easiest way).  But there is
also room to argue that it's bad that plpgsql produces error reports
that vary depending on the phase of the moon, which is pretty much
what this would look like in the field --- cache flushes will occur
unpredictably in most application environments.  In that view, where
exec_eval_simple_expr() bypasses SPI but nonetheless "has to do some of
the things SPI_execute_plan would do", one of the things it ought to be
doing is setting up an error context stack entry to duplicate the one
that SPI_execute_plan would push.
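Roughly the shape I have in mind (just a sketch; _SPI_error_callback is
currently static in spi.c, so the names here are illustrative, not a
worked-out patch):

    /* in exec_eval_simple_expr(), around the expression evaluation */
    ErrorContextCallback spierrcontext;

    /* mimic the entry SPI_execute_plan would have pushed */
    spierrcontext.callback = _SPI_error_callback;   /* assumes it gets exported */
    spierrcontext.arg = (void *) expr->query;
    spierrcontext.previous = error_context_stack;
    error_context_stack = &spierrcontext;

    /* ... ExecEvalExpr() etc. ... */

    /* pop the entry again on the way out */
    error_context_stack = spierrcontext.previous;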

I'm of mixed mind about whether this is a good idea.  Adding more overhead
to exec_eval_simple_expr() isn't great, even though it'd be just a few
instructions.  And in most cases the statement-level context line that
you get anyway ought to be sufficient to localize the problem.  But it's
not too hard to imagine that a CONTEXT line that only shows up some of
the time could break somebody's error handling code.  It's also arguable
that it's weird that you get different CONTEXT reports depending on
whether exec_simple_check_plan thinks the expression is simple or not,
eg

regression=# do $$ declare x int := 1;
begin raise notice '%', x/0; 
end $$;
ERROR:  division by zero
CONTEXT:  PL/pgSQL function inline_code_block line 2 at RAISE

regression=# do $$ declare x int := 1;
begin raise notice '%', (select x/0);
end $$;
ERROR:  division by zero
CONTEXT:  SQL statement "SELECT (select x/0)"
PL/pgSQL function inline_code_block line 2 at RAISE

Thoughts?

regards, tom lane



Re: pl/perl extension fails on Windows

2017-12-10 Thread Noah Misch
On Sun, Dec 10, 2017 at 12:36:13PM +, Christian Ullrich wrote:
> * Noah Misch wrote:
> > On Wed, Nov 29, 2017 at 11:45:35PM -0500, Tom Lane wrote:
> >> Oh, OK.  In that case, we need to get some representatives of these
> >> more modern builds into the buildfarm while we're at it.
> > 
> > Yep.  Among machines already in the buildfarm, the one running member
> > woodlouse is the best candidate for this.  Its owner could install
> > http://strawberryperl.com/download/5.26.1.1/strawberry-perl-5.26.1.1-32bit.msi
> > and setup another animal on the same machine that builds 32-bit and enables
> > Perl.  Christian, are you interested in doing this?
> 
> Ready to go, waiting for animal assignment. For now, I can confirm that it 
> works, that is, the buildfarm --test run is successful.

Thanks!

> Although I have to admit, I fail to see the need for Windows x86 builds, too. 
> Who in their right mind would want them today?

I can't see installing a 32-bit Windows postmaster in 2018.  The 32-bit libpq
might be useful.  Download statistics for the binary distributions would be
informative.  On the other hand, removing 32-bit Windows support would
eliminate three veteran buildfarm animals, and maintenance was cheap in the
few years before this thread.  I don't think today is the day to remove
support, but it's coming one of these years.



Re: [HACKERS] Uninterruptible slow geo_ops.c

2017-12-10 Thread Tom Lane
Emre Hasegeli  writes:
> I believe there is a simple flaw in the algorithm that causes it to
> loop endlessly.  It should stop looking at the other segments of the
> polygon after finding the touching one.  The attached patch fixes the
> issue for the query posted to this thread.  I suggest backpatching the
> fix.

I think this fix is wrong, because it assumes that the answer of
touched_lseg_inside_poly is authoritative, which it clearly isn't;
the blind "return true" for noncollinear segments breaks it.
That is, if "a" is an interior point on the polygon's first segment,
but "b" is outside the polygon, then lseg_inside_poly's first
on_ps_internal call will succeed, its second will not, then it
will call touched_lseg_inside_poly.  All four of the latter's
tests will fail and it'll return true.  Your patch would have us
break out of the loop without looking at any subsequent segments,
which is wrong.

It's a bit difficult to demonstrate the error, because lseg_inside_poly
isn't exposed directly, only through poly_contain (maybe we should add
an operator that does expose it?).  But after experimenting awhile
I found a counterexample:

select '(0,0),(10,0),(0,10)'::polygon @> '(6,0),(7,0),(6,6)'::polygon;

This surely should return false, and it does in HEAD, but your patch
makes it return true.

There are other things to be unhappy about.  As written, lseg_inside_poly
and touched_lseg_inside_poly can mutually recurse to a depth equal to
the number of points in the polygon, which opens a stack-overflow hazard.
I'd like to get rid of the recursion; it seems unnecessary.
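(For reference, the usual backstop elsewhere in the backend for this
kind of hazard is a check_stack_depth() call at the top of each
recursive function -- a sketch only, with the signature guessed from
geo_ops.c; getting rid of the recursion would make even this
unnecessary:)

    #include "miscadmin.h"

    static bool
    lseg_inside_poly(Point *a, Point *b, POLYGON *poly, int start)
    {
        /* error out cleanly instead of overflowing the stack */
        check_stack_depth();
        ...
    }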

Also, while it's sadly underdocumented what touched_lseg_inside_poly is
meant to accomplish at all, I think what it's there for is to deal with
cases involving collinear polygon segments.  That is, if we inject some
useless points into the polygon, say

'(0,0),(5,0),(10,0),(0,10)'::polygon

then it should still be true that the line segment '(1,0),(9,0)' is
"inside" this polygon, but we won't find that out if we just consider
the (0,0),(5,0) and (5,0),(10,0) polygon segments independently.
I am not, however, convinced that this coding fixes that problem if
there are multiple polygon segments we need to match up against.
What concerns me is that if the first collinear poly segment we come
to is an interior subset of the segment-under-test, say we're
considering line segment '(1,0),(9,0)' and we find a poly segment
'(3,0),(5,0)', then we need to separately detect whether each of
'(1,0),(3,0)' and '(5,0),(9,0)' are subsets of the rest of the
polygon ... and I don't see where this code can handle that.
I've not managed to build a weaponized test case though.

regards, tom lane



Re: Out of date comment in cached_plan_cost

2017-12-10 Thread David Rowley
On 9 December 2017 at 06:05, Robert Haas  wrote:
> On Thu, Dec 7, 2017 at 3:14 AM, David Rowley
>  wrote:
>> The attached is my attempt at putting this right.
>
> I don't feel entirely right about the way this seems to treat
> inheritance and partitioning as two entirely separate features; that's
> true from a user perspective, more or less, but not internally to the
> code.

Originally I had it typed out in a way that mentioned something about
"unless using partition-wise join with partitioned tables", but I felt
that the partition planning code is in such a state of flux at the
moment that I feared that comment might get outdated again someday in
the near future.

I've attached another patch which just loosens the claim that join
planning situation is never made worse by inheritance children to now
say it does not "generally" make any worse.

> Of course, this also begs the question of whether we ought to be
> changing the formula somehow.

Perhaps, but not for this patch. The comment already mentions that the
code is far from perfect.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


cached_plan_cost_comment_fix_v2.patch
Description: Binary data


Re: ScalarArrayOpExpr and multi-dimensional arrays

2017-12-10 Thread Amit Langote
On 2017/12/08 23:34, Tom Lane wrote:
> Amit Langote  writes:
>> I wonder if ScalarArrayOpExpr is not really meant for multi-dimensional
>> arrays appearing on the right hand side?  Because:
>> # select array[1] = any (array[array[1], array[2]]);
> 
>> ERROR:  operator does not exist: integer[] = integer
> 
> You are falling into the misimpression that a 2-D array is an array of
> 1-D arrays.  It is not, even if the syntax makes it look like that.
> 
> ScalarArrayOpExpr just iterates over the array elements without regard
> to dimensionality; so the LHS must be of the element type.

Yeah, I can now see that.

Although, I wonder if there is any room for improvement here.  Instead of
waiting for make_scalar_array_op() to emit the error as it does today,
would it be better if we errored out earlier, saying "ERROR: ANY/ALL
leftarg must be scalar, not array"?  Attached is a patch for that, if it's
worth going for at all.
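(To illustrate the iteration Tom describes, here is a rough sketch of
the effect -- not the actual executor code, which walks the flattened
element data in place; deconstruct_array() just makes the flattening
explicit, with "arr" standing in for the already-detoasted RHS array:)

    Datum  *elem_values;
    bool   *elem_nulls;
    int     nelems, i;

    /* a 2-D integer[] deconstructs into a flat stream of int4 values */
    deconstruct_array(arr, INT4OID, sizeof(int32), true, 'i',
                      &elem_values, &elem_nulls, &nelems);
    for (i = 0; i < nelems; i++)
    {
        /* each comparison is scalar-LHS = int4, never int4[] = int4[] */
    }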

Thanks,
Amit
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 29f9da796f..c179d297af 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -958,6 +958,12 @@ transformAExprOpAny(ParseState *pstate, A_Expr *a)
 a->location);
 
lexpr = transformExprRecurse(pstate, lexpr);
+   if (type_is_array(exprType(lexpr)))
+   ereport(ERROR,
+   (errcode(ERRCODE_DATATYPE_MISMATCH),
+    errmsg("ANY/ALL leftarg must be scalar, not array"),
+    parser_errposition(pstate, exprLocation(lexpr))));
+
rexpr = transformExprRecurse(pstate, rexpr);
 
return (Node *) make_scalar_array_op(pstate,
@@ -981,6 +987,12 @@ transformAExprOpAll(ParseState *pstate, A_Expr *a)
 a->location);
 
lexpr = transformExprRecurse(pstate, lexpr);
+   if (type_is_array(exprType(lexpr)))
+   ereport(ERROR,
+   (errcode(ERRCODE_DATATYPE_MISMATCH),
+    errmsg("ANY/ALL leftarg must be scalar, not array"),
+    parser_errposition(pstate, exprLocation(lexpr))));
+
rexpr = transformExprRecurse(pstate, rexpr);
 
return (Node *) make_scalar_array_op(pstate,
diff --git a/src/test/regress/expected/arrays.out b/src/test/regress/expected/arrays.out
index c730563f03..c127ffaac0 100644
--- a/src/test/regress/expected/arrays.out
+++ b/src/test/regress/expected/arrays.out
@@ -1295,6 +1295,13 @@ SELECT -1 != ALL(ARRAY(SELECT NULLIF(g.i, 900) FROM generate_series(1,1000) g(i)
  
 (1 row)
 
+-- leftarg must always be scalar
+select '{1}'::int[] = any ('{{1}, {2}}');
+ERROR:  ANY/ALL leftarg must be scalar, not array
+LINE 1: select '{1}'::int[] = any ('{{1}, {2}}');
+   ^
+create table arr_tbl_check (a int[] check (a = any ('{{1}, {2}}')));
+ERROR:  ANY/ALL leftarg must be scalar, not array
 -- test indexes on arrays
 create temp table arr_tbl (f1 int[] unique);
 insert into arr_tbl values ('{1,2,3}');
diff --git a/src/test/regress/sql/arrays.sql b/src/test/regress/sql/arrays.sql
index 25dd4e2c6d..d5ced0f695 100644
--- a/src/test/regress/sql/arrays.sql
+++ b/src/test/regress/sql/arrays.sql
@@ -374,6 +374,9 @@ select 33 = all ('{1,null,3}');
 select 33 = all ('{33,null,33}');
 -- nulls later in the bitmap
SELECT -1 != ALL(ARRAY(SELECT NULLIF(g.i, 900) FROM generate_series(1,1000) g(i)));
+-- leftarg must always be scalar
+select '{1}'::int[] = any ('{{1}, {2}}');
+create table arr_tbl_check (a int[] check (a = any ('{{1}, {2}}')));
 
 -- test indexes on arrays
 create temp table arr_tbl (f1 int[] unique);


Parallel Index Scan vs BTP_DELETED and BTP_HALF_DEAD

2017-12-10 Thread Thomas Munro
Hi hackers,

I heard a report of a 10.1 cluster hanging with several 'BtreePage'
wait_events showing in pg_stat_activity.  The query plan involved
Parallel Index Only Scan, and the table is concurrently updated quite
heavily.  I tried and failed to make a reproducer, but from the clues
available it seemed clear that somehow *all* participants in a
Parallel Index Scan must be waiting for someone else to advance the
scan.  The report came with a back trace[1] that was the same in all 3
backends (leader + 2 workers), which I'll summarise here:

  ConditionVariableSleep
  _bt_parallel_seize
  _bt_readnextpage
  _bt_steppage
  _bt_next
  btgettuple
  index_getnext_tid
  IndexOnlyNext

I think _bt_steppage() called _bt_parallel_seize(), then it called
_bt_readnextpage() which I guess must have encountered a BTP_DELETED
or BTP_HALF_DEAD-marked page so didn't take this early break out of
the loop:

    /* check for deleted page */
    if (!P_IGNORE(opaque))
    {
        PredicateLockPage(rel, blkno, scan->xs_snapshot);
        /* see if there are any matches on this page */
        /* note that this will clear moreRight if we can stop */
        if (_bt_readpage(scan, dir, P_FIRSTDATAKEY(opaque)))
            break;
    }

... and then it called _bt_parallel_seize() itself, in violation of
the rule (by my reading of the code) that you must call
_bt_parallel_release() (via _bt_readpage()) or _bt_parallel_done()
after seizing the scan.  If you call _bt_parallel_seize() again
without doing that first, you'll finish up waiting for yourself
forever.  Does this theory make sense?

[1] http://dpaste.com/05PGJ4E

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Parallel Index Scan vs BTP_DELETED and BTP_HALF_DEAD

2017-12-10 Thread Thomas Munro
On Mon, Dec 11, 2017 at 3:51 PM, Thomas Munro
 wrote:
> I heard a report of a 10.1 cluster hanging with several 'BtreePage'
> wait_events showing in pg_stat_activity.

I forgot to add, for bug reporter credit purposes: thanks to Patrick
Hemmer for the off-list report and backtrace.  He's able to work
around this recurring problem for now by disabling parallelism.

-- 
Thomas Munro
http://www.enterprisedb.com



Added PostgreSQL internals learning materials in Developer FAQ

2017-12-10 Thread Tsunakawa, Takayuki
Hello,

FYI
I collected Web resources to learn PostgreSQL internals in the Developer FAQ 
page.  I did this because I need to provide self-education materials for new 
developers.  Specifically, I modified "How is the source code organized?", and 
added "What information is available to learn PostgreSQL internals?".

Developer FAQ
https://wiki.postgresql.org/wiki/Developer_FAQ#Development_Tools_and_Help


I borrowed most of the information from the following discussion.  I'd 
appreciate it if you could add other useful information.

On How To Shorten the Steep Learning Curve Towards PG Hacking...
https://www.postgresql.org/message-id/flat/29ce9fa2-63d1-001b-74f4-a26e23500acb%40lab.ntt.co.jp#29ce9fa2-63d1-001b-74f4-a26e23500...@lab.ntt.co.jp


Regards
Takayuki Tsunakawa





Re: [HACKERS] parallel.c oblivion of worker-startup failures

2017-12-10 Thread Amit Kapila
On Fri, Dec 8, 2017 at 9:45 PM, Robert Haas  wrote:
> On Fri, Dec 8, 2017 at 4:56 AM, Amit Kapila  wrote:
>> I think the optimization you are suggesting has a merit over what I
>> have done in the patch.  However, one point to note is that we are
>> anyway going to call that function down in the path(
>> WaitForParallelWorkersToExit->WaitForBackgroundWorkerShutdown->GetBackgroundWorkerPid),
>> so we might want to take the advantage of calling it first time as
>> well.  We can actually cache the status of workers that have returned
>> BGWH_STOPPED and use it later so that we don't need to make this
>> function call again for workers which are already stopped.  We can
>> even do what Tom is suggesting to avoid calling it for workers which
>> are known to be launched, but I am really not sure if that function
>> call is costly enough that we need to maintain one extra state to
>> avoid that.
>
> Well, let's do what optimization we can without making it too complicated.
>

Okay, see the attached and let me know if that meets the need.

>> While looking into this code path, I wonder how much we are gaining by
>> having two separate calls (WaitForParallelWorkersToFinish and
>> WaitForParallelWorkersToExit) to check the status of workers after
>> finishing the parallel execution?  They are used together in rescan
>> code path, so apparently there is no benefit calling them separately
>> there.  OTOH, during shutdown of nodes, it will often be the case that
>> they will be called in a short duration after each other except for
>> the case where we need to shut down the node for the early exit like
>> when there is a limit clause or such.
>
> I'm not 100% sure either, but if we're going to do anything about
> that, it seems like a topic for another patch.
>

Makes sense.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


fix_parallel_worker_startup_failures_v1.patch
Description: Binary data


Re: BUG #14941: Vacuum crashes

2017-12-10 Thread Michael Paquier
On Sat, Dec 2, 2017 at 6:16 AM, Bossart, Nathan  wrote:
> On 12/1/17, 3:04 PM, "Robert Haas"  wrote:
>> Seems entirely reasonable to me, provided that we only update the
>> extensible-options syntax:
>>
>> VACUUM [ ( option [, ...] ) ] [ table_and_columns [, ...] ]
>>
>> I don't want to add any more options to the older syntax:
>>
>> VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ table_and_columns [, 
>> ...] ]
>
> Right.  This seems like the right path forward to me, as the
> VACUUM documentation states that the unparenthesized syntax is
> deprecated, and the DISABLE_PAGE_SKIPPING option was not added
> to the old syntax, either.

Yeah, exactly the same discussion happened when what became
DISABLE_PAGE_SKIPPING was discussed for the freeze map.
-- 
Michael



Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2017-12-10 Thread Rushabh Lathia
Thanks Tels for reviewing the patch.

On Fri, Dec 8, 2017 at 2:54 PM, Tels  wrote:

> Hello Rushabh,
>
> On Fri, December 8, 2017 2:28 am, Rushabh Lathia wrote:
> > Thanks for review.
> >
> > On Fri, Dec 8, 2017 at 6:27 AM, Peter Geoghegan  wrote:
> >
> >> On Thu, Dec 7, 2017 at 12:25 AM, Rushabh Lathia
> >>  wrote:
> >> > 0001-Add-parallel-B-tree-index-build-sorting_v14.patch
>
> I've looked only at patch 0002, here are some comments.
>
> > + * leaderasworker indicates whether leader going to participate as
> worker  or
> > + * not.
>
> The grammar is a bit off, and the "or not" seems obvious. IMHO this could
> be:
>
> + * leaderasworker indicates whether the leader is going to participate as
> worker
>
>
Sure.


> The argument leaderasworker is only used once and for one temp. variable
> that is only used once, too. So the temp. variable could maybe go.
>
> And not sure what the verdict was from the const-discussion threads, I did
> not follow it through. If "const" is what should be done generally, then
> the argument could be consted, as to not create more "unconsted" code.
>
> E.g. so:
>
> +plan_create_index_workers(Oid tableOid, Oid indexOid, const bool
> leaderasworker)
>
>
Make sense.


> and later:
>
> -   sort_mem_blocks / (parallel_workers + 1) <
> min_sort_mem_blocks)
> +   sort_mem_blocks / (parallel_workers + (leaderasworker
> ? 1 : 0)) < min_sort_mem_blocks)
>
>
I didn't like taking an extra variable either, but without it the code
looks a bit unreadable - so rather than making the code difficult to
read, I thought of adding a new variable.


> Thank you for working on this patch!
>
>
I will address review comments in the next set of patches.



Regards,
Rushabh Lathia
www.EnterpriseDB.com


Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2017-12-10 Thread Masahiko Sawada
On Sat, Dec 9, 2017 at 2:00 AM, Robert Haas  wrote:
> On Fri, Dec 8, 2017 at 3:20 AM, Masahiko Sawada  wrote:
>>> The first hunk in monitoring.sgml looks unnecessary.
>>
>> You meant the following hunk?
>>
>> diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
>> index 8d461c8..7aa7981 100644
>> --- a/doc/src/sgml/monitoring.sgml
>> +++ b/doc/src/sgml/monitoring.sgml
>> @@ -669,8 +669,8 @@ postgres   27093  0.0  0.0  30096  2752 ?
>> Ss   11:34   0:00 postgres: ser
>>Heavyweight locks, also known as lock manager locks or simply 
>> locks,
>>primarily protect SQL-visible objects such as tables.  However,
>>they are also used to ensure mutual exclusion for certain internal
>> -  operations such as relation extension.
>> wait_event will
>> -  identify the type of lock awaited.
>> +  operations such as waiting for a transaction to finish.
>> +  wait_event will identify the type of lock 
>> awaited.
>>   
>>  
>>  
>>
>> I think that since the extension locks are no longer a part of
>> heavyweight locks we should change the explanation.
>
> Yes, you are right.
>
>>> +RelationExtensionLockWaiterCount(Relation relation)
>>>
>>> Hmm.  This is sort of problematic, because with then new design we
>>> have no guarantee that the return value is actually accurate.  I don't
>>> think that's a functional problem, but the optics aren't great.
>>
>> Yeah, with this patch we could overestimate it and then add extra
>> blocks to the relation. Since the number of extra blocks is capped at
>> 512, I think it would not become a serious problem.
>
> How about renaming it EstimateNumberOfExtensionLockWaiters?

Agreed, fixed.

>
>>> +/* If force releasing, release all locks we're holding */
>>> +if (force)
>>> +held_relextlock.nLocks = 0;
>>> +else
>>> +held_relextlock.nLocks--;
>>> +
>>> +Assert(held_relextlock.nLocks >= 0);
>>> +
>>> +/* Return if we're still holding the lock even after computation */
>>> +if (held_relextlock.nLocks > 0)
>>> +return;
>>>
>>> I thought you were going to have the caller adjust nLocks?
>>
>> Yeah, I was supposed to change so but since we always release either
>> one lock or all relext locks I thought it'd be better to pass a bool
>> rather than an int.
>
> I don't see why you need to pass either one.  The caller can set
> held_relextlock.nLocks either with -- or = 0, and then call
> RelExtLockRelease() only if the resulting value is 0.

Fixed.

>
>>> When I took a break from sitting at the computer, I realized that I
>>> think this has a more serious problem: won't it permanently leak
>>> reference counts if someone hits ^C or an error occurs while the lock
>>> is held?  I think it will -- it probably needs to do cleanup at the
>>> places where we do LWLockReleaseAll() that includes decrementing the
>>> shared refcount if necessary, rather than doing cleanup at the places
>>> we release heavyweight locks.
>>> I might be wrong about the details here -- this is off the top of my head.
>>
>> Good catch. It can leak reference counts if someone hits ^C or an
>> error occurs while waiting. Fixed in the latest patch. But since
>> RelExtLockReleaseAll() is called even when such situations I think we
>> don't need to change the place where releasing the all relext lock. We
>> just moved it from heavyweight locks. Am I missing something?
>
> Hmm, that might be an OK way to handle it.  I don't see a problem off
> the top of my head.  It might be clearer to rename it to
> RelExtLockCleanup() though, since it is not just releasing the lock
> but also any wait count we hold.

Yeah, it seems better. Fixed.

> +/* Must be greater than MAX_BACKENDS - which is 2^23-1, so we're fine. */
> +#define RELEXT_WAIT_COUNT_MASK((uint32) ((1 << 24) - 1))
>
> Let's drop the comment here and instead add a StaticAssertStmt() that
> checks this.

Fixed. I added StaticAssertStmt() to InitRelExtLocks().

>
> I am slightly puzzled, though.  If I read this correctly, bits 0-23
> are used for the waiter count, bit 24 is always 0, bit 25 indicates
> the presence or absence of an exclusive lock, and bits 26+ are always
> 0.  That seems slightly odd.  Shouldn't we either use the highest
> available bit for the locker (bit 31) or the lowest one (bit 24)?  The
> former seems better, in case MAX_BACKENDS changes later.  We could
> make RELEXT_WAIT_COUNT_MASK bigger too, just in case.

I agree with the former. Fixed.
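To make sure we are talking about the same layout, it is roughly this
(a sketch; the exact macro values here are my assumptions, not quoted
from the new patch):

    /* waiter count in the low 24 bits, exclusive lock in the top bit */
    #define RELEXT_WAIT_COUNT_MASK  ((uint32) ((1 << 24) - 1))
    #define RELEXT_LOCK_BIT         ((uint32) 1 << 31)

    /* in InitRelExtLocks(): MAX_BACKENDS is 2^23 - 1, so it must fit */
    StaticAssertStmt(MAX_BACKENDS <= RELEXT_WAIT_COUNT_MASK,
                     "MAX_BACKENDS must fit in RELEXT_WAIT_COUNT_MASK");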

> +/* Make a lock tag */
> +tag.dbid = MyDatabaseId;
> +tag.relid = relid;
>
> What about shared relations?  I bet we need to use 0 in that case.
> Otherwise, if backends in two different databases try to extend the
> same shared relation at the same time, we'll (probably) fail to notice
> that they conflict.
>

You're right. I changed it so that we set tag.dbid to InvalidOid if
the relation is a shared relation.
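In other words (a sketch; field names follow the quoted patch):

    /* key shared relations under InvalidOid so that backends in
     * different databases contend on the same lock */
    tag.dbid = relation->rd_rel->relisshared ? InvalidOid : MyDatabaseId;
    tag.relid = RelationGetRelid(relation);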


> + * To avoid unnecessary recomput

Re: [HACKERS] What does it mean by XLOG_BACKUP_RECORD?

2017-12-10 Thread Masahiko Sawada
On Sat, Dec 9, 2017 at 1:25 AM, Peter Eisentraut
 wrote:
> On 6/29/17 06:09, Masahiko Sawada wrote:
>> Thanks, I agree to use XLOG_BACKUP_END instead.
>>
>>> Worse, the current comment implies that
>>> minRecoveryPoint is incorrectly set if it is true. Bleh.
>>
>> Looking at the condition, we use minRecoveryPoint only when
>> ControlFile->backupEndRequired is *false*. So I guess that it means
>> that minRecoveryPoint is incorrectly set if
>> ControlFile->backupEndReuired is true. Am I missing something?
>
> I agree with you that the logic in the comment is correct.  I've
> committed just the symbol change.
>

Thank you for picking up an old thread and committing it!

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Assertion failure when the non-exclusive pg_stop_backup aborted.

2017-12-10 Thread Masahiko Sawada
On Sat, Dec 9, 2017 at 2:24 AM, Robert Haas  wrote:
> On Fri, Dec 8, 2017 at 4:13 AM, Michael Paquier
>  wrote:
>> On Fri, Dec 8, 2017 at 5:38 PM, Masahiko Sawada  
>> wrote:
>>> After an off-list discussion with Fujii-san, I've updated the comment
>>> explaining why we should disallow interrupts before setting/cleaning up
>>> the session-level lock. Please review it.
>>
>> +   /*
>> +* Set session-level lock. If we allow interrupts before setting
>> +* session-level lock, we could call callbacks with an inconsistent
>> +* state. To avoid calling CHECK_FOR_INTERRUPTS by 
>> LWLockReleaseClearVar
>> +* which is called by WALInsertLockRelease before changing the backup
>> +* state we change it while holding the WAL insert lock.
>> +*/
>> So you are just adding the reference to WALInsertLockRelease.. Instead
>> of writing the function names for LWLocks,

I also added a sentence "If we allow interrupts before cleanup
session-level lock, we could call do_pg_abort_backup with an
inconsistent state" in two places: where we set and where we clean up
the session-level lock.

>> I would just write "To
>> avoid calling CHECK_FOR_INTERRUPTS which can happen when releasing a
>> LWLock" and be done with it. There is no point to list a full function
>> dependency list, which could change in the future with static routines
>> of lwlock.c.

Agreed. Updated the comment.

>
> I think it's actually good to be explicit here.  I looked at this
> patch a bit last week and had great difficulty understanding how the
> CHECK_FOR_INTERRUPTS() could happen.
>

Attached the updated version patch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


fix_do_pg_abort_backup_v11.patch
Description: Binary data


Re: BUG #14941: Vacuum crashes

2017-12-10 Thread Michael Paquier
On Wed, Dec 6, 2017 at 1:52 AM, Bossart, Nathan  wrote:
> 0001_fix_unparenthesized_vacuum_grammar_v1.patch
>
> One VacuumStmt grammar rule allows users to specify the VACOPT_ANALYZE
> option by including an AnalyzeStmt at the end of the command.  This
> appears to have been added as part of the introduction of the ANALYZE
> command (f905d65e).  However, this means that users can call VACUUM
> commands like
>
> VACUUM VERBOSE ANALYZE VERBOSE pg_class;
>
> Besides being disallowed according to the documentation, I think this
> may give users the idea that there is a difference between the VERBOSE
> options for VACUUM versus ANALYZE.  In fact, they are the same option,
> and specifying it twice is redundant.

Hmm. Yeah. Adding a parenthesized grammar to ANALYZE is a good move
for the long term. If we don't do that now, or at least for the
purpose of this patch set, then an AnalyzeStmt node embedded directly
in the grammar of VACUUM is likely to bite us in the future.

> This change fixes the unparenthesized VACUUM syntax to disallow the out-
> of-order VACUUM options as described above.  This is a prerequisite
> change for opening up VACOPT_NOWAIT to users, as adding it to the
> grammar as-is would cause statements like
>
> VACUUM VERBOSE NOWAIT ANALYZE VERBOSE NOWAIT pg_class;
>
> to be allowed.  Since I am also looking to introduce a parenthesized
> syntax for ANALYZE, this patch would prevent statements like

If you add only a parenthesized grammar of ANALYZE, it seems to me
that this would not be an allowed query anyway.

> VACUUM VERBOSE ANALYZE (VERBOSE, NOWAIT) pg_class;
>
> from being accepted as well.

This could also be equivalent to ANALYZE (VERBOSE, NOWAIT). I can see
pros and cons in keeping the complete extension of AnalyzeStmt in the
VACUUM grammar, but as the documentation does not mention VACUUM
VERBOSE ANALYZE VERBOSE as a legit query, I guess that ripping it out
is not going to annoy many users. Still, let's see opinions from other
folks on -hackers, as another approach to the feature you want to add
here would be to implement the grammar for VACUUM but *not* ANALYZE.
I'd like to think that we still want ANALYZE to be supported, and also
that we want it to be extensible as a different thing than VACUUM.

> 0002_add_parenthesized_analyze_syntax_v1.patch
>
> As a prerequisite to giving users access to the VACOPT_NOWAIT option, I
> am introducing a parenthesized syntax for ANALYZE that is similar to
> VACUUM's.  Following VACUUM's example, any new options will be added to
> this syntax, and the old style will become deprecated.
>
> Adding a parenthesized syntax for ANALYZE is not strictly necessary for
> providing the NOWAIT option, as NOWAIT is already a keyword in gram.y.
> However, I thought it would be good to match VACUUM (since the command
> structure is so similar), and this work would be required at some point
> anyway if ANALYZE ever accepts options that we do not want to make
> reserved keywords.

Yep, agreed with the contents of this patch.

> 0003_add_nowait_vacuum_option_v1.patch
>
> This change provides the existing VACOPT_NOWAIT option to users.  This
> option was previously only used by autovacuum in special cases, but it
> seems like a potentially valuable option in other cases as well.  For
> example, perhaps a user wants to run a nightly VACUUM job that skips
> tables that cannot be locked due to autovacuum (or something else)
> already working on it.
>
> I chose to use NOWAIT as the option name because it is already a keyword
> for the LOCK command.

Makes sense to me.

> This patch follows the existing logging precedent added by 11d8d72c and
> ab6eaee8: if a relation is explicitly specified in the command, a log
> statement will be emitted if it is skipped.  If the relation is not
> specified (e.g. "VACUUM FULL;"), no such log statements will be emitted.

+WARNING:  skipping analyze of "test1" --- lock not available
+step analyze_specified: ANALYZE (NOWAIT) test1, test2;
+step commit:
+   COMMIT;
A WARNING looks OK to me in this case. We already do that for the other entries.

Let's see what other folks think first about the ANALYZE grammar in
VACUUM, as well as having ANALYZE use a parenthesized grammar. FWIW, I
like the contents of this patch set, and the thing is non-intrusive. I
think that NOWAIT gains more value now that multiple relations can be
vacuumed or analyzed with a single manual command.
-- 
Michael



Re: [HACKERS] Assertion failure when the non-exclusive pg_stop_backup aborted.

2017-12-10 Thread Michael Paquier
On Mon, Dec 11, 2017 at 2:03 PM, Masahiko Sawada  wrote:
> On Sat, Dec 9, 2017 at 2:24 AM, Robert Haas  wrote:
>> On Fri, Dec 8, 2017 at 4:13 AM, Michael Paquier
>>  wrote:
>>> I would just write "To
>>> avoid calling CHECK_FOR_INTERRUPTS which can happen when releasing a
>>> LWLock" and be done with it. There is no point to list a full function
>>> dependency list, which could change in the future with static routines
>>> of lwlock.c.
>
> Agreed. Updated the comment.

Robert actually liked adding the complete routine list. Let's see what
Fujii-san thinks in the end; there is still some time until the next
round of minor releases. Robert, if Fujii-san does not show up in
time, would you look at this patch? I won't fight if you rework the
comments the way you think is better :)
-- 
Michael



Re: no partition pruning when partitioning using array type

2017-12-10 Thread Amit Langote
On 2017/12/09 3:46, Robert Haas wrote:
> On Fri, Dec 8, 2017 at 5:40 AM, Amit Langote
>  wrote:
>> I noticed that if you partition using a array type column, partition
>> pruning using constraint exclusion fails to work due to a minor problem.
>>
>> Example:
>>
>> create table p (a int[]) partition by list (a);
>> create table p1 partition of p for values in ('{1}');
>> create table p2 partition of p for values in ('{2, 3}', '{4, 5}');
>>
>> explain select a from p where a = '{1}';
>> QUERY PLAN
>> |-
>>  Append  (cost=0.00..54.00 rows=14 width=32)
>>->  Seq Scan on p1  (cost=0.00..27.00 rows=7 width=32)
>>  Filter: (a = '{1}'::integer[])
>>->  Seq Scan on p2  (cost=0.00..27.00 rows=7 width=32)
>>  Filter: (a = '{1}'::integer[])
>>
>> explain select a from p where a = '{2, 3}';
>> QUERY PLAN
>> |-
>>  Append  (cost=0.00..54.00 rows=14 width=32)
>>->  Seq Scan on p1  (cost=0.00..27.00 rows=7 width=32)
>>  Filter: (a = '{2,3}'::integer[])
>>->  Seq Scan on p2  (cost=0.00..27.00 rows=7 width=32)
>>  Filter: (a = '{2,3}'::integer[])
>> (5 rows)
>>
>> In the case of array type partition key, make_partition_op_expr() will
>> have to put a RelabelType node on top of the partition key Var, after
>> having selected an = operator from the array_ops family.  The RelabelType
>> causes operator_predicate_proof() to fail to consider predicate leftop and
>> clause leftop as equal, because only one of them ends up having the
>> RelabelType attached to it.
>>
>> As a simple measure, the attached patch teaches operator_predicate_proof()
>> to strip RelabelType nodes from all the nodes it compares using equal().
>> I also added a relevant test in partition_prune.sql.
> 
> I guess the question is whether that's guaranteed to be safe.  I spent
> a little bit of time thinking about it and I don't see a problem.  The
> function is careful to check that the opclasses and collations of the
> OpExprs are compatible, and it is the behavior of the operator that is
> in question here, not the column type, so your change seems OK to me.
> But I hope somebody else will also study this, because this stuff is
> fairly subtle and I would not like to be the one who breaks it.

Thanks for taking a look at it.

I will try to say a little more on why it seems safe.  A RelabelType
node exists (if any) on top of a given expression node only to denote
that the operator for which the node is an input will interpret its
result as being of the type RelabelType.resulttype, instead of the
node's original type.  No
function is in charge of making, because the mismatching types in question
are known to be binary coercible.  Or more to the point, the operator that
will be used in the proof will give correct answers for the values without
having to do any conversion of values.  IOW, it's okay if we simply drop
the RelabelType, because it doesn't alter in any way the result of the
proof that operator_predicate_proof() performs.
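The mechanical part of the patch amounts to something like this (a
sketch, assuming a local helper in predtest.c; the actual patch may
differ in the details):

    static Node *
    strip_relabel(Node *node)
    {
        /* binary-compatible relabelings don't change the stored values */
        while (node != NULL && IsA(node, RelabelType))
            node = (Node *) ((RelabelType *) node)->arg;
        return node;
    }

    /* ... then compare with equal(strip_relabel(pred_leftop),
     * strip_relabel(clause_leftop)) instead of equal() directly ... */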

That said, I've come to think that in this particular case the
partitioning code that generates the predicate expression should be a bit
smarter about the various types it manipulates, such that RelabelType won't
be added in the first place.  In contrast, make_op(), which generates an
OpExpr from the parser representation of a = '{1}' appearing in the
query's WHERE clause, won't add the RelabelType because the underlying
type machinery that it invokes is able to conclude that that's
unnecessary.  The original patch may still be worth considering as a
solution though.

I hope someone else chimes in as well. :)

Thanks,
Amit




Re: Added PostgreSQL internals learning materials in Developer FAQ

2017-12-10 Thread Amit Kapila
On Mon, Dec 11, 2017 at 8:43 AM, Tsunakawa, Takayuki
 wrote:
> Hello,
>
> FYI
> I collected Web resources to learn PostgreSQL internals in the Developer FAQ 
> page.  I did this because I need to provide self-education materials for new 
> developers.  Specifically, I modified "How is the source code organized?", 
> and added "What information is available to learn PostgreSQL internals?".
>
> Developer FAQ
> https://wiki.postgresql.org/wiki/Developer_FAQ#Development_Tools_and_Help
>
>
> I borrowed most of the information from the following discussion.
>

Thanks, this looks like quite useful information for new people who
want to learn about PostgreSQL, especially its internals.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Parallel Index Scan vs BTP_DELETED and BTP_HALF_DEAD

2017-12-10 Thread Amit Kapila
On Mon, Dec 11, 2017 at 8:21 AM, Thomas Munro
 wrote:
> Hi hackers,
>
>
> ... and then it called _bt_parallel_seize() itself, in violation of
> the rule (by my reading of the code) that you must call
> _bt_parallel_release() (via _bt_readpage()) or _bt_parallel_done()
> after seizing the scan.  If you call _bt_parallel_seize() again
> without doing that first, you'll finish up waiting for yourself
> forever.  Does this theory make sense?
>

Yes, I think if the current page is half-dead or deleted, we need to
set the next page to be scanned and release the parallel scan.  This
has to be done for both forward and backward scans.
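Something along these lines in _bt_readnextpage(), I think (a sketch
of the forward-scan case only; the exact placement is from my reading
of the code, not necessarily what the final patch does):

    if (P_IGNORE(opaque))
    {
        /* half-dead or deleted page: advance the shared scan to the
         * right sibling before looping, instead of seizing the scan
         * again while we still hold it */
        if (scan->parallel_scan != NULL)
            _bt_parallel_release(scan, opaque->btpo_next);
    }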

Thanks for looking into it.  I will see if we can write a test.  In
the meantime, if possible, can you please ask Patrick Hemmer to
verify the attached patch?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


fix_parallel_index_scan_v1.patch
Description: Binary data


Re: SIGPIPE in TAP tests

2017-12-10 Thread Michael Paquier
On Sun, Dec 10, 2017 at 6:02 AM, Noah Misch  wrote:
> Two buildfarm runs[1][2] from the last 90 days have failed in
> src/test/authentication, like this:
>
>   t/001_password.pl ..
>   Failed 3/8 subtests
>   t/002_saslprep.pl .. ok 1815 ms ( 0.00 usr  0.00 sys +  0.89 cusr  0.26 
> csys =  1.15 CPU)
>
>   Test Summary Report
>   ---
>   t/001_password.pl (Wstat: 13 Tests: 5 Failed: 0)
> Non-zero wait status: 13
> Parse errors: Bad plan.  You planned 8 tests but ran 5.
>   Files=2, Tests=17,  3 wallclock secs ( 0.04 usr  0.00 sys +  1.67 cusr  
> 0.50 csys =  2.21 CPU)
>   Result: FAIL
>
> Compared to a good run, the other logs just end suddenly after the expected
> "FATAL:  password authentication failed".  "Wstat: 13" means the Perl process
> died to signal 13 (SIGPIPE).  This test invokes psql in a way that fails
> authentication, and it writes "SELECT 1" to psql's stdin.  The SIGPIPE happens
> if the psql process exits before that write.

Nice investigation. An interesting coincidence is that I looked
yesterday at an off-list report some folks sent me which is
basically what you have here.

+   # Return EPIPE instead of killing the process with SIGPIPE.  An affected
+   # test may still fail, but it's more likely to report useful facts.
+   $SIG{PIPE} = 'IGNORE';
With this portion changed I can indeed see something which is more helpful:
psql: FATAL:  password authentication failed for user "saslpreptest1_role"
ack Broken pipe: write( 11, 'SELECT 1' ) at
/Library/Perl/5.18/IPC/Run/IO.pm line 558.

If SIGPIPE is ignored then test output just stops after generating the
FATAL message. Oops.
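The underlying behavior is easy to reproduce outside Perl, too; here
is a minimal C illustration of the same idea (my own example, not
from the patch):

    #include <errno.h>
    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    int
    main(void)
    {
        int         fds[2];

        signal(SIGPIPE, SIG_IGN);   /* analogous to $SIG{PIPE} = 'IGNORE' */
        pipe(fds);
        close(fds[0]);              /* reader exits, like the failed psql */
        if (write(fds[1], "SELECT 1", 8) < 0 && errno == EPIPE)
            fprintf(stderr, "got EPIPE, and can report it\n");
        return 0;
    }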

> The two src/test/authentication tests then fail, but nothing else fails.
> Let's ignore SIGPIPE in all TAP tests, which leaves some evidence in
> regress_log_001_password:
>
>   ack Broken pipe: write( 13, 'SELECT 1' ) at 
> /home/nm/sw/cpan/lib/perl5/IPC/Run/IO.pm line 549.
>
> To fix the actual failures, we can cease sending "SELECT 1"; it's enough to
> disconnect immediately.  Patch attached.

Perhaps you could use an empty string instead? I feel a bit uneasy
about passing an undefined object to IPC::Run::run.
-- 
Michael



Re: [HACKERS] postgres_fdw bug in 9.6

2017-12-10 Thread Etsuro Fujita

(2017/12/09 5:53), Robert Haas wrote:

On Sun, Dec 3, 2017 at 2:56 PM, Tom Lane  wrote:

Not sure where that leaves us for the patch at hand.  It feels to me
like it's a temporary band-aid for code that we want to get rid of
in the not-too-long run.  As such, it'd be okay if it were smaller,
but it seems bigger and more invasive than I could wish for a band-aid.


Well, the bug report is a year old today, so we should try to do
something.  The patch needs a rebase, but reading through it, I'm not
convinced that it's particularly risky.  I mean, it's going to break
FDWs that are calling GetExistingLocalJoinPath(), but there are
probably very few third-party FDWs that do that.  Any that do are
precisely the ones that need this fix, so I think there's a good
chance they'll forgive of us for requiring them to make a code change.
I think we can contain the risk of anything else getting broken to an
acceptable level because there's not really very much other code
changing.  The changes to joinpath.c need to be carefully audited to
make sure that they are not changing the semantics, but that seems
like it shouldn't take more than an hour of code-reading to check.
The new fields in JoinPathExtraData can be moved to the end of the
struct to minimize ABI breakage.  I don't see what else there is that
could break apart from the EPQ rechecks themselves, and if that
weren't broken already, this patch wouldn't exist.

Now, if you're still super-concerned about this breaking something, we
could commit it only to master, where it will have 9 months to bake
before it gets released.  I think that's overly conservative, but I
think it's still better than waiting for the rewrite you'd like to see
happen.  We don't know when or if anyone is going to undertake that,
and if we wait, we may easily release a v11 that's got the same defect
as v9.6 and now v10.  And I don't see that we lose much by committing
this now even if that rewrite does happen in time for v11.  Ripping
out CreateLocalJoinPath() won't be any harder than ripping out
GetExistingLocalJoinPath().


Agreed.  Attached is a rebased version which moves the new fields in 
JoinPathExtraData to the end of that struct.


Best regards,
Etsuro Fujita
*** a/contrib/postgres_fdw/expected/postgres_fdw.out
--- b/contrib/postgres_fdw/expected/postgres_fdw.out
***
*** 1701,1722  SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t
 Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
 Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
 Remote SQL: SELECT r1."C 1", r1.c3, CASE WHEN (r1.*)::text IS NOT NULL THEN ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8) END, r2."C 1", CASE WHEN (r2.*)::text IS NOT NULL THEN ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) END FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r1."C 1" = r2."C 1" ORDER BY r1.c3 ASC NULLS LAST, r1."C 1" ASC NULLS LAST FOR UPDATE OF r1
!->  Merge Join
   Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
!  Merge Cond: (t1.c1 = t2.c1)
!  ->  Sort
 Output: t1.c1, t1.c3, t1.*
!Sort Key: t1.c1
!->  Foreign Scan on public.ft1 t1
!  Output: t1.c1, t1.c3, t1.*
!  Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR UPDATE
!  ->  Sort
 Output: t2.c1, t2.*
!Sort Key: t2.c1
!->  Foreign Scan on public.ft2 t2
!  Output: t2.c1, t2.*
!  Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
! (23 rows)
  
  SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR UPDATE OF t1;
   c1  | c1  
--- 1701,1716 
 Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
 Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
 Remote SQL: SELECT r1."C 1", r1.c3, CASE WHEN (r1.*)::text IS NOT NULL THEN ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8) END, r2."C 1", CASE WHEN (r2.*)::text IS NOT NULL THEN ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) END FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r1."C 1" = r2."C 1" ORDER BY r1.c3 ASC NULLS LAST, r1."C 1" ASC NULLS LAST FOR UPDATE OF r1
!->  Nested Loop
   Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
!  Join Filter: (t1.c1 = t2.c1)
!  ->  Foreign Scan on public.ft1 t1
 Output: t1.c1, t1.c3, t1.*
!Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR UPDATE
!  ->  For

Re: SIGPIPE in TAP tests

2017-12-10 Thread Noah Misch
On Mon, Dec 11, 2017 at 04:19:52PM +0900, Michael Paquier wrote:
> On Sun, Dec 10, 2017 at 6:02 AM, Noah Misch  wrote:

> If SIGPIPE is ignored then test output just stops after generating the
> FATAL message. Oops.

You mean "If SIGPIPE is not ignored ...", right?

> > To fix the actual failures, we can cease sending "SELECT 1"; it's enough to
> > disconnect immediately.  Patch attached.
> 
> Perhaps you could use an empty string instead? I feel a bit uneasy
> about passing an undefined object to IPC::Run::run.

IPC::Run documents the equivalence of undef and '' in this context; search for
"close a child processes stdin" in
http://search.cpan.org/~rbs/IPC-Run-0.78/lib/IPC/Run.pm.  Thus, I expect both
spellings to work reliably, and I find "undef" slightly more evocative.