date:20180519

FindDefinedSymbol() is subtly broken

2018-05-19 Thread John Naylor

Long story short: It expects an array of paths, but will die if it
can't find the symbol in the first path. Arguably it's a bug, but
since the original use case for multiple paths is long gone anyway, it
might never occur in practice. The attached patch changes it to expect
a single path.

Also attached: a few minor build script cleanups I've noticed along the way:
-more accurate error message
-s/print sprintf/printf/
-a recent perltidy change made a couple multi-line strings look odd,
so use heredocs (which other scripts use in this case anyway)
-make generated file footers match project style
-remove a duplicate comment

-John Naylor
From be3cbb1499e026355697b1aff890a3049cdef4ac Mon Sep 17 00:00:00 2001
From: John Naylor 
Date: Sat, 19 May 2018 13:14:58 +0700
Subject: [PATCH] Update FindDefinedSymbol() to match current practice.

Once upon a time, headers passed to build scripts could live in
either $builddir or $sourcedir, so the Makefiles needed to pass
both directories to the scripts. FindDefinedSymbol() was intended
to deal with muliple directories, but it's actually broken: It dies
if it can't find the symbol in the first given path. This went
unnoticed for so long, because the catalog scripts output distprep
targets, so only need $sourcedir.

Fix so that we expect only one include path.
---
 src/backend/catalog/Catalog.pm   | 36 
 src/backend/utils/Gen_fmgrtab.pl |  8 
 src/include/catalog/unused_oids  |  2 +-
 3 files changed, 21 insertions(+), 25 deletions(-)

diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
index 07bd5f3..b88b36b 100644
--- a/src/backend/catalog/Catalog.pm
+++ b/src/backend/catalog/Catalog.pm
@@ -365,34 +365,30 @@ sub RenameTempFile
 }
 
 # Find a symbol defined in a particular header file and extract the value.
-#
-# The include path has to be passed as a reference to an array.
 sub FindDefinedSymbol
 {
 	my ($catalog_header, $include_path, $symbol) = @_;
+	my $value;
 
-	for my $path (@$include_path)
+	# Make sure include path ends in a slash.
+	if (substr($include_path, -1) ne '/')
 	{
-
-		# Make sure include path ends in a slash.
-		if (substr($path, -1) ne '/')
-		{
-			$path .= '/';
-		}
-		my $file = $path . $catalog_header;
-		next if !-f $file;
-		open(my $find_defined_symbol, '<', $file) || die "$file: $!";
-		while (<$find_defined_symbol>)
+		$include_path .= '/';
+	}
+	my $file = $include_path . $catalog_header;
+	next if !-f $file;
+	open(my $find_defined_symbol, '<', $file) || die "$file: $!";
+	while (<$find_defined_symbol>)
+	{
+		if (/^#define\s+\Q$symbol\E\s+(\S+)/)
 		{
-			if (/^#define\s+\Q$symbol\E\s+(\S+)/)
-			{
-return $1;
-			}
+			$value = $1;
+			last;
 		}
-		close $find_defined_symbol;
-		die "$file: no definition found for $symbol\n";
 	}
-	die "$catalog_header: not found in any include directory\n";
+	close $find_defined_symbol;
+	return $value if defined $value;
+	die "$file: no definition found for $symbol\n";
 }
 
 # Similar to FindDefinedSymbol, but looks in the bootstrap metadata.
diff --git a/src/backend/utils/Gen_fmgrtab.pl b/src/backend/utils/Gen_fmgrtab.pl
index 6c9f1a7..465bcd5 100644
--- a/src/backend/utils/Gen_fmgrtab.pl
+++ b/src/backend/utils/Gen_fmgrtab.pl
@@ -22,7 +22,7 @@ use warnings;
 # Collect arguments
 my @input_files;
 my $output_path = '';
-my @include_path;
+my $include_path;
 
 while (@ARGV)
 {
@@ -37,7 +37,7 @@ while (@ARGV)
 	}
 	elsif ($arg =~ /^-I/)
 	{
-		push @include_path, length($arg) > 2 ? substr($arg, 2) : shift @ARGV;
+		$include_path = length($arg) > 2 ? substr($arg, 2) : shift @ARGV;
 	}
 	else
 	{
@@ -53,7 +53,7 @@ if ($output_path ne '' && substr($output_path, -1) ne '/')
 
 # Sanity check arguments.
 die "No input files.\n" if !@input_files;
-die "No include path; you must specify -I at least once.\n" if !@include_path;
+die "No include path; you must specify -I.\n" if !$include_path;
 
 # Read all the input files into internal data structures.
 # Note: We pass data file names as arguments and then look for matching
@@ -80,7 +80,7 @@ foreach my $datfile (@input_files)
 
 # Fetch some values for later.
 my $FirstBootstrapObjectId =
-  Catalog::FindDefinedSymbol('access/transam.h', \@include_path,
+  Catalog::FindDefinedSymbol('access/transam.h', $include_path,
 	'FirstBootstrapObjectId');
 my $INTERNALlanguageId =
   Catalog::FindDefinedSymbolFromData($catalog_data{pg_language},
diff --git a/src/include/catalog/unused_oids b/src/include/catalog/unused_oids
index 2780de0..c5fc378 100755
--- a/src/include/catalog/unused_oids
+++ b/src/include/catalog/unused_oids
@@ -34,7 +34,7 @@ my $oids = Catalog::FindAllOidsFromHeaders(@input_files);
 
 # Also push FirstBootstrapObjectId to serve as a terminator for the last gap.
 my $FirstBootstrapObjectId =
-  Catalog::FindDefinedSymbol('access/transam.h', [".."],
+  Catalog::FindDefinedSymbol('access/transam.h', '..',
 	'FirstBootstrapObjectId');
 push @{$oid

generating bootstrap entries for array types

2018-05-19 Thread John Naylor

I wrote [1]:

> On 4/26/18, Tom Lane  wrote:
>> if I counted correctly.  (Array entries should be ignored for this
>> purpose; maybe we'll autogenerate them someday.)
>
> Hmm, that wouldn't be too hard. Add a new metadata field called
> 'array_type_oid', then if it finds such an OID, teach genbki.pl to
> construct a tuple for the corresponding array type by consulting
> something like
>
> chartypcategoryBKI_ARRAY(A);
> chartypstorage  BKI_ARRAY(x);
> ...etc
>
> in the header file, plus copying typalign from the element type. I'll
> whip this up sometime and add it to the next commitfest.

This turned out to be slightly more complicated than that, but still
pretty straightforward. The script is for information only, it doesn't
need to be run.

-typalign for arrays is 'i' unless the element type is 'd', in which
case it's 'd'.
-If future array-like types are created that break the model and so
can't be generated from the element type, they have an escape hatch of
having their own full entry. Currently this includes  _record, since
it's a pseudotype, and of course the special types oidvector and
int2vector.
-I've kept the current convention of duplicating typdelim in the array
type, although I'm not sure it's used anywhere.

If you sort the before and after postgres.bki, it should result in a clean diff.

Will add to next commitfest.

--
[1] 
https://www.postgresql.org/message-id/CAJVSVGW-D7OobzU%3DdybVT2JqZAx-4X1yvBJdavBmqQL05Q6CLw%40mail.gmail.com

-John Naylor
From 8c1b345b54bddd560df0e15b534c960b532cdade Mon Sep 17 00:00:00 2001
From: John Naylor 
Date: Sat, 19 May 2018 15:10:51 +0700
Subject: [PATCH v1] Generate bootstrap entries for array types

Add a new metadata field called 'array_type_oid', which instructs
genbki.pl to generate an array type from the given element type.
This allows most array type entries to be deleted from pg_type.dat.
The lone exception is _record, since it's a pseudotype.
---
 src/backend/catalog/Catalog.pm   |   6 +
 src/backend/catalog/genbki.pl|  54 +++-
 src/include/catalog/genbki.h |   2 +
 src/include/catalog/pg_type.dat  | 471 ++-
 src/include/catalog/pg_type.h|  24 +-
 src/include/catalog/reformat_dat_file.pl |   2 +-
 6 files changed, 153 insertions(+), 406 deletions(-)

diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
index c8dc956..8d40c62 100644
--- a/src/backend/catalog/Catalog.pm
+++ b/src/backend/catalog/Catalog.pm
@@ -191,6 +191,10 @@ sub ParseHeader
 	{
 		$column{default} = $1;
 	}
+	elsif ($attopt =~ /BKI_ARRAY_TYPE\(['"]?([^'"]+)['"]?\)/)
+	{
+		$column{array} = $1;
+	}
 	elsif ($attopt =~ /BKI_LOOKUP\((\w+)\)/)
 	{
 		$column{lookup} = $1;
@@ -451,6 +455,8 @@ sub FindAllOidsFromHeaders
 			foreach my $row (@$catdata)
 			{
 push @oids, $row->{oid} if defined $row->{oid};
+push @oids, $row->{array_type_oid}
+	if defined $row->{array_type_oid};
 			}
 		}
 
diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index fb61db0..939af5f 100644
--- a/src/backend/catalog/genbki.pl
+++ b/src/backend/catalog/genbki.pl
@@ -98,6 +98,8 @@ foreach my $header (@input_files)
 	if (-e $datfile)
 	{
 		my $data = Catalog::ParseData($datfile, $schema, 0);
+		gen_array_types($schema, $data)
+			if $catname eq 'pg_type';
 		$catalog_data{$catname} = $data;
 
 		# Check for duplicated OIDs while we're at it.
@@ -396,7 +398,8 @@ EOM
 			  if $key eq "oid"
 			  || $key eq "oid_symbol"
 			  || $key eq "descr"
-			  || $key eq "line_number";
+			  || $key eq "line_number"
+			  || $key eq "array_type_oid";
 			die sprintf "unrecognized field name \"%s\" in %s.dat line %s\n",
 			  $key, $catname, $bki_values{line_number}
 			  if (!exists($attnames{$key}));
@@ -571,6 +574,53 @@ exit 0;
  Subroutines 
 
 
+# If the type specifies an array type OID, generate an entry for the array
+# type using values from the element type, plus some values that are needed
+# for all array types.
+sub gen_array_types
+{
+	my $pgtype_schema = shift;
+	my $types = shift;
+	my @array_types;
+
+	foreach my $elem_type (@$types)
+	{
+		next if !exists $elem_type->{array_type_oid};
+		my %array_type;
+
+		# Specific values derived from the element type.
+		$array_type{oid} = $elem_type->{array_type_oid};
+		$array_type{typname} = '_' . $elem_type->{typname};
+		$array_type{typelem} = $elem_type->{typname};
+
+		# Arrays require INT alignment, unless the element type requires
+		# DOUBLE alignment.
+		$array_type{typalign} = $elem_type->{typalign} eq 'd' ? 'd' : 'i';
+
+		# Fill in the rest of the values.
+		foreach my $column (@$pgtype_schema)
+		{
+			my $attname = $column->{name};
+
+			# Skip if we already set it above.
+			next if defined $array_type{$attname};
+
+			# If we have a value needed for all arrays, use it, otherwise
+			# copy the value from the elemen

Re: pg_control read error message

2018-05-19 Thread Michael Paquier

On Fri, May 18, 2018 at 10:22:29AM -0400, Tom Lane wrote:
> Only comment I have is that I think there's similar shortcuts in a lot
> of places :-(

Yeah.  A quick lookup is showing me one in xlog.c (XLOG_BLCKSZ) and one
in pg_rewind.  (Spotted roughly 392 places to look at in all the core
code).  Let's discuss that on a separate thread.
--
Michael


signature.asc
Description: PGP signature

Re: SCRAM with channel binding downgrade attack

2018-05-19 Thread Michael Paquier

On Fri, May 18, 2018 at 09:30:22AM -0400, Bruce Momjian wrote:
> Good work, but I think the existance of both scram_channel_binding and
> channel_binding_mode in libpq is confusing.  I am thinking we should
> have one channel binding parameter that can take values "prefer",
> "tls-unique", "tls-server-end-point", and "require".  I don't see the
> point to having "none" and "allow" that sslmode supports.   "tls-unique"
> and "tls-server-end-point" would _require_ those channel binding modes; 
> the setting would never be ignored, e.g. if no SSL.

I can see the point you are coming at.  Do you think that we should
worry about users willing to use specifically tls-server-end-point
(which is something actually in the backend protocol only for JDBC) and
manage a cluster of both v10 and v11 servers?  In this case we assume
that "prefer" means always using tls-unique as channel binding, but
there is no way to say "I would prefer channel binding if available,
only with end-point".  It would not be that hard to let the application
layer on top of libpq handle that complexity with channel binding
handling, though it makes the life of libpq consumers a bit harder.

Hence, I would actually eliminate "require", as that would be actually
the same as "tls-unique", and the possibility to use an empty value from
the list of options available but add "none" as that's actually the same
as the current empty value.  This gives as list:
- tls-unique
- tls-server-end-point
- none
- prefer, which has the meaning of preferring tls-unique
- And prefer-end-point?  But thinking why end-point has been added it
would not be worth it, and for tests only the option requiring end-point
is enough.

The previous patch has actually problems with servers using "trust",
"password" and any protocol which can send directly AUTH_REQ_OK as those
could still enforce a downgrade to not use channel binding, so we
actually need to make sure that AUTH_REQ_SASL_FIN has been received when 
channel binding is required when looking at a AUTH_REQ_OK message.
--
Michael


signature.asc
Description: PGP signature

Re: Flexible permissions for REFRESH MATERIALIZED VIEW

2018-05-19 Thread Robert Haas

On Fri, May 18, 2018 at 8:13 PM, Stephen Frost  wrote:
> I'm not a big fan of it- what happens when we introduce something else
> which would seem like it'd fall under 'maintain' but provides some
> capability that maybe it wouldn't be good for users who could only run
> the above commands to have?  I'm tempted to suggest that, really, we
> might even be thinking about splitting up things further than the above
> proposal- what about VACUUM vs. VACUUM FULL?  Or REFRESH MATVIEW vs.
> REFRESH MATVIEW CONCURRENTLY?  Mistakes between those routinly cause
> problems due to the heavy lock taken in some cases- as an administrator,
> I'd be a lot more comfortable giving a user or some process the ability
> to run a VACUUM vs. VACUUM FULL.

That is a fair point, but if we want to do things like that then it's
really not a good idea to limit ourselves to a fixed number of bits,
even if it's 2x or 4x more than what we have today.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Flexible permissions for REFRESH MATERIALIZED VIEW

2018-05-19 Thread Greg Stark

On 19 May 2018 at 01:13, Stephen Frost  wrote:
>
> I'm not entirely sure about the varlena suggestion, seems like that
> would change a great deal more code and be slower, though perhaps not
> enough to matter; it's not like our aclitem arrays are exactly optimized
> for speed today.

I don't actually understand the reason y'all are talking about
varlena. The aclitem type is already not passbyvalue so there's no
particular limit on the size and specifically no reason it can't be
larger than 64 bytes. As Tom mentioned it's only used in catalogs so
there's no on-disk version compatibility issue either. It can be
defined to have a bitmask exactly large enough to hold all the acl
privileges needed for the specific version and still not be a varlena.

The only downsides are:

1. At some point it gets large enough that adding rarely used
privileges is imposing a large storage and cache efficiency cost
that's amortized over all the acl lookups even when it's not used. I
doubt we're anywhere close to that currently but if we started having
hundreds of privileges...

2. The current macros just do a bitshift to look up bits and they
would get a bit more complex. But it's not like it isn't arithmetic we
haven't tackled repeatedly in other places that do bitmasks such as
varbit, numeric, bitmapset, and probably more.

Fwiw I don't understand why the current AclMode uses a single uint32
to pass around two 16-bit bitmasks and uses bitshifting to extract the
two. Why not just make it a struct with two uint16s in it instead?
That would mean we would have a factor of four available before the
macros even get the slight added complication.

-- 
greg

printf format selection vs. reality

2018-05-19 Thread Tom Lane

Our configure script goes to considerable lengths to try to identify
what printf width modifier to use for int64 values.  In particular
see PGAC_FUNC_SNPRINTF_LONG_LONG_INT_MODIFIER, which claims

# MinGW uses '%I64d', though gcc throws a warning with -Wall,
# while '%lld' doesn't generate a warning, but doesn't work.

However, if we decide that we ought to use our own snprintf replacement,
we throw that info away and set INT64_FORMAT to "%lld" which we know
is what that code uses.

This was all right when that code was first written, when we had only
a very small number of uses of INT64_FORMAT and they all were in
snprintf() calls.  It's been a long time since that was true: pgbench,
in particular, has been passing INT64_FORMAT to the native printf
with increasing enthusiasm.  In reality, we do not anymore work with
situations where our snprintf has a different idea about this format
modifier than libc does.

What's more, as far as I can find, we do not have any buildfarm members
that set INT64_MODIFIER to anything other than "l" or "ll", which no
doubt explains why we've not noticed a problem.  The comment quoted
above doesn't seem to apply to any current buildfarm members.

I think we should abandon the pretense that we can work with libc
printfs that accept anything but "l"/"ll", and rip out the excess
complexity in configure, just setting INT64_MODIFIER to "l" or "ll"
depending on whether "int64" is "long" or "long long".

Comments?

regards, tom lane

Re: FindDefinedSymbol() is subtly broken

2018-05-19 Thread Tom Lane

John Naylor  writes:
> Long story short: It expects an array of paths, but will die if it
> can't find the symbol in the first path. Arguably it's a bug, but
> since the original use case for multiple paths is long gone anyway, it
> might never occur in practice. The attached patch changes it to expect
> a single path.

> Also attached: a few minor build script cleanups I've noticed along the way:
> -more accurate error message
> -s/print sprintf/printf/
> -a recent perltidy change made a couple multi-line strings look odd,
> so use heredocs (which other scripts use in this case anyway)
> -make generated file footers match project style
> -remove a duplicate comment

This all seemed like minor cleanup from v11 changes, so I went
ahead and pushed it rather than waiting for v12.

regards, tom lane

printf("%lf",...) isn't actually portable

2018-05-19 Thread Tom Lane

I noticed while poking at the recent ecpg unpleasantness that some
of my older critters were warning about use of %lf conversions in
printf.  Looking into it, I find that current POSIX says the "l"
is a no-op, while SUSv2 says it's undefined.  I think this usage
got into our code as a result of people making false analogies
between scanf and printf conversions.

I think we should just switch these to plain %f, as per attached.

regards, tom lane

diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c
index 558a8c4..90acf6a 100644
--- a/contrib/pageinspect/btreefuncs.c
+++ b/contrib/pageinspect/btreefuncs.c
@@ -563,7 +563,7 @@ bt_metap(PG_FUNCTION_ARGS)
 	if (metad->btm_version == BTREE_VERSION)
 	{
 		values[j++] = psprintf("%u", metad->btm_oldest_btpo_xact);
-		values[j++] = psprintf("%lf", metad->btm_last_cleanup_num_heap_tuples);
+		values[j++] = psprintf("%f", metad->btm_last_cleanup_num_heap_tuples);
 	}
 	else
 	{
diff --git a/doc/src/sgml/ecpg.sgml b/doc/src/sgml/ecpg.sgml
index 98b6840..f065477 100644
--- a/doc/src/sgml/ecpg.sgml
+++ b/doc/src/sgml/ecpg.sgml
@@ -1678,7 +1678,7 @@ while (1)
 
  Here is an example using the data type complex from
  the example in .  The external string
- representation of that type is (%lf,%lf),
+ representation of that type is (%f,%f),
  which is defined in the
  functions complex_in()
  and complex_out() functions
diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c
index 1590d67..5c44571 100644
--- a/src/backend/access/rmgrdesc/nbtdesc.c
+++ b/src/backend/access/rmgrdesc/nbtdesc.c
@@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record)
 			{
 xl_btree_metadata *xlrec = (xl_btree_metadata *) rec;
 
-appendStringInfo(buf, "oldest_btpo_xact %u; last_cleanup_num_heap_tuples: %lf",
+appendStringInfo(buf, "oldest_btpo_xact %u; last_cleanup_num_heap_tuples: %f",
  xlrec->oldest_btpo_xact,
  xlrec->last_cleanup_num_heap_tuples);
 break;
diff --git a/src/interfaces/ecpg/test/compat_informix/sqlda.pgc b/src/interfaces/ecpg/test/compat_informix/sqlda.pgc
index 423ce41..87e0110 100644
--- a/src/interfaces/ecpg/test/compat_informix/sqlda.pgc
+++ b/src/interfaces/ecpg/test/compat_informix/sqlda.pgc
@@ -37,7 +37,7 @@ dump_sqlda(sqlda_t *sqlda)
 			printf("name sqlda descriptor: '%s' value %d\n", sqlda->sqlvar[i].sqlname, *(int *)sqlda->sqlvar[i].sqldata);
 			break;
 		case SQLFLOAT:
-			printf("name sqlda descriptor: '%s' value %lf\n", sqlda->sqlvar[i].sqlname, *(double *)sqlda->sqlvar[i].sqldata);
+			printf("name sqlda descriptor: '%s' value %f\n", sqlda->sqlvar[i].sqlname, *(double *)sqlda->sqlvar[i].sqldata);
 			break;
 		case SQLDECIMAL:
 			{
diff --git a/src/interfaces/ecpg/test/expected/compat_informix-sqlda.c b/src/interfaces/ecpg/test/expected/compat_informix-sqlda.c
index fa2e569..ad3188d 100644
--- a/src/interfaces/ecpg/test/expected/compat_informix-sqlda.c
+++ b/src/interfaces/ecpg/test/expected/compat_informix-sqlda.c
@@ -142,7 +142,7 @@ dump_sqlda(sqlda_t *sqlda)
 			printf("name sqlda descriptor: '%s' value %d\n", sqlda->sqlvar[i].sqlname, *(int *)sqlda->sqlvar[i].sqldata);
 			break;
 		case SQLFLOAT:
-			printf("name sqlda descriptor: '%s' value %lf\n", sqlda->sqlvar[i].sqlname, *(double *)sqlda->sqlvar[i].sqldata);
+			printf("name sqlda descriptor: '%s' value %f\n", sqlda->sqlvar[i].sqlname, *(double *)sqlda->sqlvar[i].sqldata);
 			break;
 		case SQLDECIMAL:
 			{
diff --git a/src/interfaces/ecpg/test/expected/preproc-outofscope.c b/src/interfaces/ecpg/test/expected/preproc-outofscope.c
index f4676a0..ef4dada 100644
--- a/src/interfaces/ecpg/test/expected/preproc-outofscope.c
+++ b/src/interfaces/ecpg/test/expected/preproc-outofscope.c
@@ -337,7 +337,7 @@ if (sqlca.sqlcode < 0) exit (1);}
 		get_record1();
 		if (sqlca.sqlcode == ECPG_NOT_FOUND)
 			break;
-		printf("id=%d%s t='%s'%s d1=%lf%s d2=%lf%s c = '%s'%s\n",
+		printf("id=%d%s t='%s'%s d1=%f%s d2=%f%s c = '%s'%s\n",
 			myvar->id, mynullvar->id ? " (NULL)" : "",
 			myvar->t, mynullvar->t ? " (NULL)" : "",
 			myvar->d1, mynullvar->d1 ? " (NULL)" : "",
diff --git a/src/interfaces/ecpg/test/expected/sql-sqlda.c b/src/interfaces/ecpg/test/expected/sql-sqlda.c
index 81d26b3..090aaf1 100644
--- a/src/interfaces/ecpg/test/expected/sql-sqlda.c
+++ b/src/interfaces/ecpg/test/expected/sql-sqlda.c
@@ -158,7 +158,7 @@ dump_sqlda(sqlda_t *sqlda)
 			break;
 #endif
 		case ECPGt_double:
-			printf("name sqlda descriptor: '%s' value %lf\n", sqlda->sqlvar[i].sqlname.data, *(double *)sqlda->sqlvar[i].sqldata);
+			printf("name sqlda descriptor: '%s' value %f\n", sqlda->sqlvar[i].sqlname.data, *(double *)sqlda->sqlvar[i].sqldata);
 			break;
 		case ECPGt_numeric:
 			{
diff --git a/src/interfaces/ecpg/test/preproc/outofscope.pgc b/src/interfaces/ecpg/test/preproc/outofscope.pgc
index aae5325..b03743c 100644
--- a/src/interfaces/ecpg/test/preproc/ou

Fix for FETCH FIRST syntax problems

2018-05-19 Thread Andrew Gierth

Per bug #15200, our support for sql2008 FETCH FIRST syntax is incomplete
to the extent that it should be regarded as a bug; the spec quite
clearly allows parameters/host variables in addition to constants, but
we don't.

Attached is a draft patch for fixing that, which additionally fixes the
ugly wart that OFFSET  ROW/ROWS and FETCH FIRST [] ROW/ROWS ONLY
had very different productions for ; both now accept c_expr there.

Shift/reduce conflict is avoided by taking advantage of the fact that
ONLY is a fully reserved word.

I've checked that this change would not make it any harder to add
(post-2008 features) WITH TIES or PERCENT in the event that someone
wants to do that.

I think a case can be made that this should be backpatched; thoughts?

(While I can't find any visible change for existing working queries, one
change that does occur is that FETCH FIRST -1 ROWS ONLY now returns a
different error message; but that was already inconsistent with the
error from OFFSET -1 ROWS.)

-- 
Andrew (irc:RhodiumToad)

diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index b5d3d3a071..330adb8c37 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -1399,10 +1399,11 @@ OFFSET start
 OFFSET start { ROW | ROWS }
 FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY
 
-In this syntax, to write anything except a simple integer constant for
+In this syntax, to write anything except a simple integer constant,
+parameter, or variable name for
 start or count, you must write parentheses
-around it.
+class="parameter">count, you should write parentheses
+around it (this is a PostgreSQL extension).
 If count is
 omitted in a FETCH clause, it defaults to 1.
 ROW
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index babb62dae1..e441c555a4 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -451,7 +451,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 
 %type 	fetch_args limit_clause select_limit_value
 offset_clause select_offset_value
-select_offset_value2 opt_select_fetch_first_value
+select_fetch_first_value
 %type 	row_or_rows first_or_next
 
 %type 	OptSeqOptList SeqOptList OptParenthesizedSeqOptList
@@ -11570,15 +11570,23 @@ limit_clause:
 			 parser_errposition(@1)));
 }
 			/* SQL:2008 syntax */
-			| FETCH first_or_next opt_select_fetch_first_value row_or_rows ONLY
+			/* to avoid shift/reduce conflicts, handle the optional value with
+			 * a separate production rather than an opt_ expression.  The fact
+			 * that ONLY is fully reserved means that this way, we defer any
+			 * decision about what rule reduces ROW or ROWS to the point where
+			 * we can see the ONLY token in the lookahead slot.
+			 */
+			| FETCH first_or_next select_fetch_first_value row_or_rows ONLY
 { $$ = $3; }
+			| FETCH first_or_next row_or_rows ONLY
+{ $$ = makeIntConst(1, -1); }
 		;
 
 offset_clause:
 			OFFSET select_offset_value
 { $$ = $2; }
 			/* SQL:2008 syntax */
-			| OFFSET select_offset_value2 row_or_rows
+			| OFFSET select_fetch_first_value row_or_rows
 { $$ = $2; }
 		;
 
@@ -11597,21 +11605,16 @@ select_offset_value:
 
 /*
  * Allowing full expressions without parentheses causes various parsing
- * problems with the trailing ROW/ROWS key words.  SQL only calls for
- * constants, so we allow the rest only with parentheses.  If omitted,
- * default to 1.
- */
-opt_select_fetch_first_value:
-			SignedIconst		{ $$ = makeIntConst($1, @1); }
-			| '(' a_expr ')'	{ $$ = $2; }
-			| /*EMPTY*/			{ $$ = makeIntConst(1, -1); }
-		;
-
-/*
- * Again, the trailing ROW/ROWS in this case prevent the full expression
- * syntax.  c_expr is the best we can do.
+ * problems with the trailing ROW/ROWS key words.  SQL spec only calls for
+ * , which is either a literal or a parameter (but
+ * an  could be an identifier, bringing up conflicts
+ * with ROW/ROWS). We solve this by leveraging the presense of ONLY (see above)
+ * to determine whether the expression is missing rather than trying to make it
+ * optional in this rule.
+ *
+ * c_expr covers all the spec-required cases (and more).
  */
-select_offset_value2:
+select_fetch_first_value:
 			c_expr	{ $$ = $1; }
 		;

Re: Fix for FETCH FIRST syntax problems

2018-05-19 Thread Tom Lane

Andrew Gierth  writes:
> Attached is a draft patch for fixing that, which additionally fixes the
> ugly wart that OFFSET  ROW/ROWS and FETCH FIRST [] ROW/ROWS ONLY
> had very different productions for ; both now accept c_expr there.

LGTM, except please s/presense/presence/ in grammar comment.
Also, why'd you back off "must write" to "should write" in the docs?
This doesn't make the parens any more optional than before.

> I think a case can be made that this should be backpatched; thoughts?

Meh, -0.5.  This is not really a bug, because it's operating as designed.
You've found a reasonably painless way to improve the grammar, which is
great, but it seems more like a feature addition than a bug fix.

I'd be fine with sneaking this into v11, though.

regards, tom lane

Re: Fix for FETCH FIRST syntax problems

2018-05-19 Thread Vik Fearing

On 20/05/18 00:57, Tom Lane wrote:
> Andrew Gierth  writes:
>> Attached is a draft patch for fixing that, which additionally fixes the
>> ugly wart that OFFSET  ROW/ROWS and FETCH FIRST [] ROW/ROWS ONLY
>> had very different productions for ; both now accept c_expr there.
> 
> LGTM, except please s/presense/presence/ in grammar comment.
> Also, why'd you back off "must write" to "should write" in the docs?
> This doesn't make the parens any more optional than before.

It certainly does.  It allows this now:

PREPARE foo AS TABLE bar FETCH FIRST $1 ROWS ONLY;

>> I think a case can be made that this should be backpatched; thoughts?
> 
> Meh, -0.5.  This is not really a bug, because it's operating as designed.
> You've found a reasonably painless way to improve the grammar, which is
> great, but it seems more like a feature addition than a bug fix.
> 
> I'd be fine with sneaking this into v11, though.
I'm +1 for backpatching it.  It may be operating as designed by PeterE
ten years ago, but it's not operating as designed by the SQL standard.
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Re: Fix for FETCH FIRST syntax problems

2018-05-19 Thread Vik Fearing

On 20/05/18 01:27, Vik Fearing wrote:
>> Also, why'd you back off "must write" to "should write" in the docs?
>> This doesn't make the parens any more optional than before.
>> It certainly does.  It allows this now:
> 
> PREPARE foo AS TABLE bar FETCH FIRST $1 ROWS ONLY;

Please disregard this comment.

My +1 for backpatching still stands.
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Re: Fix for FETCH FIRST syntax problems

2018-05-19 Thread Tom Lane

Vik Fearing  writes:
> On 20/05/18 00:57, Tom Lane wrote:
>> Also, why'd you back off "must write" to "should write" in the docs?
>> This doesn't make the parens any more optional than before.

> It certainly does.  It allows this now:
> PREPARE foo AS TABLE bar FETCH FIRST $1 ROWS ONLY;

No, it makes the parens omittable in the cases specified in the previous
sentence.  The sentence I'm complaining about is describing other cases,
in which they're still required.

> I'm +1 for backpatching it.  It may be operating as designed by PeterE
> ten years ago, but it's not operating as designed by the SQL standard.

By that argument, *anyplace* where we're missing a SQL-spec feature
is a back-patchable bug.  I don't buy it.

It may be that this fix is simple and safe enough that the risk/reward
tradeoff favors back-patching, but I think you have to argue it as a
favorable tradeoff rather than just saying "this isn't per standard".
Consider: if Andrew had completely rewritten gram.y to get the same
visible effect, would you think that was back-patchable?

regards, tom lane

Re: Fix for FETCH FIRST syntax problems

2018-05-19 Thread Michael Paquier

On Sat, May 19, 2018 at 07:41:10PM -0400, Tom Lane wrote:
> Vik Fearing  writes:
>> I'm +1 for backpatching it.  It may be operating as designed by PeterE
>> ten years ago, but it's not operating as designed by the SQL standard.
> 
> By that argument, *anyplace* where we're missing a SQL-spec feature
> is a back-patchable bug.  I don't buy it.

+1.
--
Michael


signature.asc
Description: PGP signature

Fix some error handling for read() and errno

2018-05-19 Thread Michael Paquier

Hi all,

This is basically a new thread after what has been discussed for
pg_controldata with its error handling for read():
https://www.postgresql.org/message-id/CABUevEx8ZRV5Ut_FvP2etXiQppx3xVzm7oOaV3AcdHxX81Yt8Q%40mail.gmail.com

While reviewing the core code, I have noticed similar weird error
handling for read().  At the same time, some of those places may use an
incorrect errno, as an error is invoked using an errno which may be
overwritten by another system call.  I found a funny one in slru.c,
for which I have added a note in the patch.  I don't think that this is
worth addressing with more facility, thoughts are welcome.

Attached is a patch addressing the issues I found.

Thanks,
--
Michael
diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index 87942b4cca..d487347cc6 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -683,6 +683,11 @@ SlruPhysicalReadPage(SlruCtl ctl, int pageno, int slotno)
 	pgstat_report_wait_start(WAIT_EVENT_SLRU_READ);
 	if (read(fd, shared->page_buffer[slotno], BLCKSZ) != BLCKSZ)
 	{
+		/*
+		 * XXX: Note that this may actually report sucess if the number
+		 * of bytes read is positive, but lacking data so that errno is
+		 * not set.
+		 */
 		pgstat_report_wait_end();
 		slru_errcause = SLRU_READ_FAILED;
 		slru_errno = errno;
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 65194db70e..52f634e706 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1219,6 +1219,7 @@ ReadTwoPhaseFile(TransactionId xid, bool give_warnings)
 	uint32		crc_offset;
 	pg_crc32c	calc_crc,
 file_crc;
+	int			r;
 
 	TwoPhaseFilePath(path, xid);
 
@@ -1272,15 +1273,28 @@ ReadTwoPhaseFile(TransactionId xid, bool give_warnings)
 	buf = (char *) palloc(stat.st_size);
 
 	pgstat_report_wait_start(WAIT_EVENT_TWOPHASE_FILE_READ);
-	if (read(fd, buf, stat.st_size) != stat.st_size)
+	r = read(fd, buf, stat.st_size);
+	if (r != stat.st_size)
 	{
+		int		save_errno = errno;
+
 		pgstat_report_wait_end();
 		CloseTransientFile(fd);
 		if (give_warnings)
-			ereport(WARNING,
-	(errcode_for_file_access(),
-	 errmsg("could not read two-phase state file \"%s\": %m",
-			path)));
+		{
+			if (r < 0)
+			{
+errno = save_errno;
+ereport(WARNING,
+		(errcode_for_file_access(),
+		 errmsg("could not read two-phase state file \"%s\": %m",
+path)));
+			}
+			else
+ereport(WARNING,
+		(errmsg("could not read two-phase state file \"%s\": %d read, expected %zu",
+path, r, stat.st_size)));
+		}
 		pfree(buf);
 		return NULL;
 	}
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index adbd6a2126..72fd800646 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3395,21 +3395,24 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno,
 
 		if (nread > 0)
 		{
+			int		r;
+
 			if (nread > sizeof(buffer))
 nread = sizeof(buffer);
 			errno = 0;
 			pgstat_report_wait_start(WAIT_EVENT_WAL_COPY_READ);
-			if (read(srcfd, buffer, nread) != nread)
+			r = read(srcfd, buffer, nread);
+			if (r != nread)
 			{
-if (errno != 0)
+if (r < 0)
 	ereport(ERROR,
 			(errcode_for_file_access(),
 			 errmsg("could not read file \"%s\": %m",
 	path)));
 else
 	ereport(ERROR,
-			(errmsg("not enough data in file \"%s\"",
-	path)));
+			(errmsg("not enough data in file \"%s\": read %d, expected %d",
+	path, r, nread)));
 			}
 			pgstat_report_wait_end();
 		}
@@ -11594,6 +11597,7 @@ XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr, int reqLen,
 	int			emode = private->emode;
 	uint32		targetPageOff;
 	XLogSegNo	targetSegNo PG_USED_FOR_ASSERTS_ONLY;
+	int			r;
 
 	XLByteToSeg(targetPagePtr, targetSegNo, wal_segment_size);
 	targetPageOff = XLogSegmentOffset(targetPagePtr, wal_segment_size);
@@ -11675,8 +11679,10 @@ retry:
 	if (lseek(readFile, (off_t) readOff, SEEK_SET) < 0)
 	{
 		char		fname[MAXFNAMELEN];
+		int			save_errno = errno;
 
 		XLogFileName(fname, curFileTLI, readSegNo, wal_segment_size);
+		errno = save_errno;
 		ereport(emode_for_corrupt_record(emode, targetPagePtr + reqLen),
 (errcode_for_file_access(),
  errmsg("could not seek in log segment %s to offset %u: %m",
@@ -11685,16 +11691,29 @@ retry:
 	}
 
 	pgstat_report_wait_start(WAIT_EVENT_WAL_READ);
-	if (read(readFile, readBuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
+	r = read(readFile, readBuf, XLOG_BLCKSZ);
+	if (r != XLOG_BLCKSZ)
 	{
 		char		fname[MAXFNAMELEN];
+		int			save_errno = errno;
 
 		pgstat_report_wait_end();
 		XLogFileName(fname, curFileTLI, readSegNo, wal_segment_size);
-		ereport(emode_for_corrupt_record(emode, targetPagePtr + reqLen),
-(errcode_for_file_access(),
- errmsg("could not read from log segment %s, offset %u: %m",
-		fname, readOff)));
+
+		if (r < 0

Re: Fix for FETCH FIRST syntax problems

2018-05-19 Thread Andrew Gierth

> "Tom" == Tom Lane  writes:

 Tom> Also, why'd you back off "must write" to "should write" in the docs?
 Tom> This doesn't make the parens any more optional than before.

Currently, the requirement for parens is inconsistent - FETCH FIRST
wants them for absolutely anything that's not a literal constant, but
OFFSET x ROWS allows any c_expr (which covers a whole lot of ground in
addition to the spec's requirements). The docs don't distinguish these
two and just say "must write" parens even though some cases clearly work
without.

(There's also the slight wart that OFFSET -1 ROWS is a syntax error,
whereas the spec defines it as valid syntax but a semantic error. Do we
care?)

With the patch, c_exprs would be accepted without parens, so saying
"must write" parens in the docs clearly isn't entirely accurate. I'm
open to better phrasing.

I found some more warts: OFFSET +1 ROWS isn't accepted (but should be),
and FETCH FIRST 100 ROWS ONLY fails on 32bit and Windows builds.
The patch as posted fixes the second of those but makes FETCH FIRST +1
fail instead; I'm working on that.

-- 
Andrew (irc:RhodiumToad)

Re: Fix for FETCH FIRST syntax problems

2018-05-19 Thread Andrew Gierth

> "Tom" == Tom Lane  writes:

 >> I'm +1 for backpatching it. It may be operating as designed by
 >> PeterE ten years ago, but it's not operating as designed by the SQL
 >> standard.

 Tom> By that argument, *anyplace* where we're missing a SQL-spec
 Tom> feature is a back-patchable bug. I don't buy it.

But this is a feature we already claimed to actually support (it's
listed in sql_features with a bunch of unqualified YES entries), but in
fact doesn't work properly.

-- 
Andrew (irc:RhodiumToad)

Re: [GSoC] Question about returning bytea array

2018-05-19 Thread Charles Cui

construct_md_array works for me, thanks for inputs!

2018-05-17 13:42 GMT-07:00 Andrew Gierth :

> > "Charles" == Charles Cui  writes:
>
>  Charles> I have the requirements to return a bytea array for some
>  Charles> functions in pg_thrift plugin.
>
> If you mean you want the return value to be of type bytea[], i.e. an SQL
> array of bytea values, then you need to be using construct_array to
> construct the result.
>
> --
> Andrew (irc:RhodiumToad)
>

Re: Fix for FETCH FIRST syntax problems

2018-05-19 Thread Andrew Gierth

Updated patch.

This one explicitly accepts +/- ICONST/FCONST in addition to c_expr for
both offset and limit, removing the inconsistencies mentioned
previously.

I changed the wording of the docs part a bit; does that look better? or
worse?

Old behavior:
select 1 offset +1 rows;  -- ERROR:  syntax error at or near "rows"
select 1 fetch first +1 rows only;  -- works
select 1 offset -1 rows;  -- ERROR:  syntax error at or near "rows"
select 1 fetch first -1 rows only;  -- ERROR:  LIMIT must not be negative

New behavior:
select 1 offset +1 rows;  -- works
select 1 fetch first +1 rows only;  -- works
select 1 offset -1 rows;  -- ERROR:  OFFSET must not be negative
select 1 fetch first -1 rows only;  -- ERROR:  LIMIT must not be negative

-- 
Andrew (irc:RhodiumToad)

diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index b5d3d3a071..3d59b0c3e5 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -1399,10 +1399,12 @@ OFFSET start
 OFFSET start { ROW | ROWS }
 FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY
 
-In this syntax, to write anything except a simple integer constant for
-start or count, you must write parentheses
-around it.
+In this syntax, the start
+or count value is required by
+the standard to be a literal constant, a parameter, or a variable name;
+as a PostgreSQL extension, other expressions
+are allowed, but will generally need to be enclosed in parentheses to avoid
+ambiguity.
 If count is
 omitted in a FETCH clause, it defaults to 1.
 ROW
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index babb62dae1..2ef3bdecc7 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -451,7 +451,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 
 %type 	fetch_args limit_clause select_limit_value
 offset_clause select_offset_value
-select_offset_value2 opt_select_fetch_first_value
+select_fetch_first_value I_or_F_const
 %type 	row_or_rows first_or_next
 
 %type 	OptSeqOptList SeqOptList OptParenthesizedSeqOptList
@@ -11570,15 +11570,23 @@ limit_clause:
 			 parser_errposition(@1)));
 }
 			/* SQL:2008 syntax */
-			| FETCH first_or_next opt_select_fetch_first_value row_or_rows ONLY
+			/* to avoid shift/reduce conflicts, handle the optional value with
+			 * a separate production rather than an opt_ expression.  The fact
+			 * that ONLY is fully reserved means that this way, we defer any
+			 * decision about what rule reduces ROW or ROWS to the point where
+			 * we can see the ONLY token in the lookahead slot.
+			 */
+			| FETCH first_or_next select_fetch_first_value row_or_rows ONLY
 { $$ = $3; }
+			| FETCH first_or_next row_or_rows ONLY
+{ $$ = makeIntConst(1, -1); }
 		;
 
 offset_clause:
 			OFFSET select_offset_value
 { $$ = $2; }
 			/* SQL:2008 syntax */
-			| OFFSET select_offset_value2 row_or_rows
+			| OFFSET select_fetch_first_value row_or_rows
 { $$ = $2; }
 		;
 
@@ -11597,22 +11605,28 @@ select_offset_value:
 
 /*
  * Allowing full expressions without parentheses causes various parsing
- * problems with the trailing ROW/ROWS key words.  SQL only calls for
- * constants, so we allow the rest only with parentheses.  If omitted,
- * default to 1.
+ * problems with the trailing ROW/ROWS key words.  SQL spec only calls for
+ * , which is either a literal or a parameter (but
+ * an  could be an identifier, bringing up conflicts
+ * with ROW/ROWS). We solve this by leveraging the presence of ONLY (see above)
+ * to determine whether the expression is missing rather than trying to make it
+ * optional in this rule.
+ *
+ * c_expr covers almost all the spec-required cases (and more), but it doesn't
+ * cover signed numeric literals, which are allowed by the spec. So we include
+ * those here explicitly.
  */
-opt_select_fetch_first_value:
-			SignedIconst		{ $$ = makeIntConst($1, @1); }
-			| '(' a_expr ')'	{ $$ = $2; }
-			| /*EMPTY*/			{ $$ = makeIntConst(1, -1); }
+select_fetch_first_value:
+			c_expr	{ $$ = $1; }
+			| '+' I_or_F_const
+{ $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "+", NULL, $2, @1); }
+			| '-' I_or_F_const
+{ $$ = doNegate($2, @1); }
 		;
 
-/*
- * Again, the trailing ROW/ROWS in this case prevent the full expression
- * syntax.  c_expr is the best we can do.
- */
-select_offset_value2:
-			c_expr	{ $$ = $1; }
+I_or_F_const:
+			Iconst	{ $$ = makeIntConst($1,@1); }
+			| FCONST{ $$ = makeFloatConst($1,@1); }
 		;
 
 /* noise words */

FindDefinedSymbol() is subtly broken

generating bootstrap entries for array types

Re: pg_control read error message

Re: SCRAM with channel binding downgrade attack

Re: Flexible permissions for REFRESH MATERIALIZED VIEW

Re: Flexible permissions for REFRESH MATERIALIZED VIEW

printf format selection vs. reality

Re: FindDefinedSymbol() is subtly broken

printf("%lf",...) isn't actually portable

Fix for FETCH FIRST syntax problems

Re: Fix for FETCH FIRST syntax problems

Re: Fix for FETCH FIRST syntax problems

Re: Fix for FETCH FIRST syntax problems

Re: Fix for FETCH FIRST syntax problems

Re: Fix for FETCH FIRST syntax problems

Fix some error handling for read() and errno

Re: Fix for FETCH FIRST syntax problems

Re: Fix for FETCH FIRST syntax problems

Re: [GSoC] Question about returning bytea array

Re: Fix for FETCH FIRST syntax problems

20 matches

Site Navigation

Mail list logo

Footer information