FindDefinedSymbol() is subtly broken
Long story short: It expects an array of paths, but will die if it can't find the symbol in the first path. Arguably it's a bug, but since the original use case for multiple paths is long gone anyway, it might never occur in practice. The attached patch changes it to expect a single path. Also attached: a few minor build script cleanups I've noticed along the way: -more accurate error message -s/print sprintf/printf/ -a recent perltidy change made a couple multi-line strings look odd, so use heredocs (which other scripts use in this case anyway) -make generated file footers match project style -remove a duplicate comment -John Naylor From be3cbb1499e026355697b1aff890a3049cdef4ac Mon Sep 17 00:00:00 2001 From: John Naylor Date: Sat, 19 May 2018 13:14:58 +0700 Subject: [PATCH] Update FindDefinedSymbol() to match current practice. Once upon a time, headers passed to build scripts could live in either $builddir or $sourcedir, so the Makefiles needed to pass both directories to the scripts. FindDefinedSymbol() was intended to deal with muliple directories, but it's actually broken: It dies if it can't find the symbol in the first given path. This went unnoticed for so long, because the catalog scripts output distprep targets, so only need $sourcedir. Fix so that we expect only one include path. --- src/backend/catalog/Catalog.pm | 36 src/backend/utils/Gen_fmgrtab.pl | 8 src/include/catalog/unused_oids | 2 +- 3 files changed, 21 insertions(+), 25 deletions(-) diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm index 07bd5f3..b88b36b 100644 --- a/src/backend/catalog/Catalog.pm +++ b/src/backend/catalog/Catalog.pm @@ -365,34 +365,30 @@ sub RenameTempFile } # Find a symbol defined in a particular header file and extract the value. -# -# The include path has to be passed as a reference to an array. sub FindDefinedSymbol { my ($catalog_header, $include_path, $symbol) = @_; + my $value; - for my $path (@$include_path) + # Make sure include path ends in a slash. + if (substr($include_path, -1) ne '/') { - - # Make sure include path ends in a slash. - if (substr($path, -1) ne '/') - { - $path .= '/'; - } - my $file = $path . $catalog_header; - next if !-f $file; - open(my $find_defined_symbol, '<', $file) || die "$file: $!"; - while (<$find_defined_symbol>) + $include_path .= '/'; + } + my $file = $include_path . $catalog_header; + next if !-f $file; + open(my $find_defined_symbol, '<', $file) || die "$file: $!"; + while (<$find_defined_symbol>) + { + if (/^#define\s+\Q$symbol\E\s+(\S+)/) { - if (/^#define\s+\Q$symbol\E\s+(\S+)/) - { -return $1; - } + $value = $1; + last; } - close $find_defined_symbol; - die "$file: no definition found for $symbol\n"; } - die "$catalog_header: not found in any include directory\n"; + close $find_defined_symbol; + return $value if defined $value; + die "$file: no definition found for $symbol\n"; } # Similar to FindDefinedSymbol, but looks in the bootstrap metadata. diff --git a/src/backend/utils/Gen_fmgrtab.pl b/src/backend/utils/Gen_fmgrtab.pl index 6c9f1a7..465bcd5 100644 --- a/src/backend/utils/Gen_fmgrtab.pl +++ b/src/backend/utils/Gen_fmgrtab.pl @@ -22,7 +22,7 @@ use warnings; # Collect arguments my @input_files; my $output_path = ''; -my @include_path; +my $include_path; while (@ARGV) { @@ -37,7 +37,7 @@ while (@ARGV) } elsif ($arg =~ /^-I/) { - push @include_path, length($arg) > 2 ? substr($arg, 2) : shift @ARGV; + $include_path = length($arg) > 2 ? substr($arg, 2) : shift @ARGV; } else { @@ -53,7 +53,7 @@ if ($output_path ne '' && substr($output_path, -1) ne '/') # Sanity check arguments. die "No input files.\n" if !@input_files; -die "No include path; you must specify -I at least once.\n" if !@include_path; +die "No include path; you must specify -I.\n" if !$include_path; # Read all the input files into internal data structures. # Note: We pass data file names as arguments and then look for matching @@ -80,7 +80,7 @@ foreach my $datfile (@input_files) # Fetch some values for later. my $FirstBootstrapObjectId = - Catalog::FindDefinedSymbol('access/transam.h', \@include_path, + Catalog::FindDefinedSymbol('access/transam.h', $include_path, 'FirstBootstrapObjectId'); my $INTERNALlanguageId = Catalog::FindDefinedSymbolFromData($catalog_data{pg_language}, diff --git a/src/include/catalog/unused_oids b/src/include/catalog/unused_oids index 2780de0..c5fc378 100755 --- a/src/include/catalog/unused_oids +++ b/src/include/catalog/unused_oids @@ -34,7 +34,7 @@ my $oids = Catalog::FindAllOidsFromHeaders(@input_files); # Also push FirstBootstrapObjectId to serve as a terminator for the last gap. my $FirstBootstrapObjectId = - Catalog::FindDefinedSymbol('access/transam.h', [".."], + Catalog::FindDefinedSymbol('access/transam.h', '..', 'FirstBootstrapObjectId'); push @{$oid
generating bootstrap entries for array types
I wrote [1]: > On 4/26/18, Tom Lane wrote: >> if I counted correctly. (Array entries should be ignored for this >> purpose; maybe we'll autogenerate them someday.) > > Hmm, that wouldn't be too hard. Add a new metadata field called > 'array_type_oid', then if it finds such an OID, teach genbki.pl to > construct a tuple for the corresponding array type by consulting > something like > > chartypcategoryBKI_ARRAY(A); > chartypstorage BKI_ARRAY(x); > ...etc > > in the header file, plus copying typalign from the element type. I'll > whip this up sometime and add it to the next commitfest. This turned out to be slightly more complicated than that, but still pretty straightforward. The script is for information only, it doesn't need to be run. -typalign for arrays is 'i' unless the element type is 'd', in which case it's 'd'. -If future array-like types are created that break the model and so can't be generated from the element type, they have an escape hatch of having their own full entry. Currently this includes _record, since it's a pseudotype, and of course the special types oidvector and int2vector. -I've kept the current convention of duplicating typdelim in the array type, although I'm not sure it's used anywhere. If you sort the before and after postgres.bki, it should result in a clean diff. Will add to next commitfest. -- [1] https://www.postgresql.org/message-id/CAJVSVGW-D7OobzU%3DdybVT2JqZAx-4X1yvBJdavBmqQL05Q6CLw%40mail.gmail.com -John Naylor From 8c1b345b54bddd560df0e15b534c960b532cdade Mon Sep 17 00:00:00 2001 From: John Naylor Date: Sat, 19 May 2018 15:10:51 +0700 Subject: [PATCH v1] Generate bootstrap entries for array types Add a new metadata field called 'array_type_oid', which instructs genbki.pl to generate an array type from the given element type. This allows most array type entries to be deleted from pg_type.dat. The lone exception is _record, since it's a pseudotype. --- src/backend/catalog/Catalog.pm | 6 + src/backend/catalog/genbki.pl| 54 +++- src/include/catalog/genbki.h | 2 + src/include/catalog/pg_type.dat | 471 ++- src/include/catalog/pg_type.h| 24 +- src/include/catalog/reformat_dat_file.pl | 2 +- 6 files changed, 153 insertions(+), 406 deletions(-) diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm index c8dc956..8d40c62 100644 --- a/src/backend/catalog/Catalog.pm +++ b/src/backend/catalog/Catalog.pm @@ -191,6 +191,10 @@ sub ParseHeader { $column{default} = $1; } + elsif ($attopt =~ /BKI_ARRAY_TYPE\(['"]?([^'"]+)['"]?\)/) + { + $column{array} = $1; + } elsif ($attopt =~ /BKI_LOOKUP\((\w+)\)/) { $column{lookup} = $1; @@ -451,6 +455,8 @@ sub FindAllOidsFromHeaders foreach my $row (@$catdata) { push @oids, $row->{oid} if defined $row->{oid}; +push @oids, $row->{array_type_oid} + if defined $row->{array_type_oid}; } } diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl index fb61db0..939af5f 100644 --- a/src/backend/catalog/genbki.pl +++ b/src/backend/catalog/genbki.pl @@ -98,6 +98,8 @@ foreach my $header (@input_files) if (-e $datfile) { my $data = Catalog::ParseData($datfile, $schema, 0); + gen_array_types($schema, $data) + if $catname eq 'pg_type'; $catalog_data{$catname} = $data; # Check for duplicated OIDs while we're at it. @@ -396,7 +398,8 @@ EOM if $key eq "oid" || $key eq "oid_symbol" || $key eq "descr" - || $key eq "line_number"; + || $key eq "line_number" + || $key eq "array_type_oid"; die sprintf "unrecognized field name \"%s\" in %s.dat line %s\n", $key, $catname, $bki_values{line_number} if (!exists($attnames{$key})); @@ -571,6 +574,53 @@ exit 0; Subroutines +# If the type specifies an array type OID, generate an entry for the array +# type using values from the element type, plus some values that are needed +# for all array types. +sub gen_array_types +{ + my $pgtype_schema = shift; + my $types = shift; + my @array_types; + + foreach my $elem_type (@$types) + { + next if !exists $elem_type->{array_type_oid}; + my %array_type; + + # Specific values derived from the element type. + $array_type{oid} = $elem_type->{array_type_oid}; + $array_type{typname} = '_' . $elem_type->{typname}; + $array_type{typelem} = $elem_type->{typname}; + + # Arrays require INT alignment, unless the element type requires + # DOUBLE alignment. + $array_type{typalign} = $elem_type->{typalign} eq 'd' ? 'd' : 'i'; + + # Fill in the rest of the values. + foreach my $column (@$pgtype_schema) + { + my $attname = $column->{name}; + + # Skip if we already set it above. + next if defined $array_type{$attname}; + + # If we have a value needed for all arrays, use it, otherwise + # copy the value from the elemen
Re: pg_control read error message
On Fri, May 18, 2018 at 10:22:29AM -0400, Tom Lane wrote: > Only comment I have is that I think there's similar shortcuts in a lot > of places :-( Yeah. A quick lookup is showing me one in xlog.c (XLOG_BLCKSZ) and one in pg_rewind. (Spotted roughly 392 places to look at in all the core code). Let's discuss that on a separate thread. -- Michael signature.asc Description: PGP signature
Re: SCRAM with channel binding downgrade attack
On Fri, May 18, 2018 at 09:30:22AM -0400, Bruce Momjian wrote: > Good work, but I think the existance of both scram_channel_binding and > channel_binding_mode in libpq is confusing. I am thinking we should > have one channel binding parameter that can take values "prefer", > "tls-unique", "tls-server-end-point", and "require". I don't see the > point to having "none" and "allow" that sslmode supports. "tls-unique" > and "tls-server-end-point" would _require_ those channel binding modes; > the setting would never be ignored, e.g. if no SSL. I can see the point you are coming at. Do you think that we should worry about users willing to use specifically tls-server-end-point (which is something actually in the backend protocol only for JDBC) and manage a cluster of both v10 and v11 servers? In this case we assume that "prefer" means always using tls-unique as channel binding, but there is no way to say "I would prefer channel binding if available, only with end-point". It would not be that hard to let the application layer on top of libpq handle that complexity with channel binding handling, though it makes the life of libpq consumers a bit harder. Hence, I would actually eliminate "require", as that would be actually the same as "tls-unique", and the possibility to use an empty value from the list of options available but add "none" as that's actually the same as the current empty value. This gives as list: - tls-unique - tls-server-end-point - none - prefer, which has the meaning of preferring tls-unique - And prefer-end-point? But thinking why end-point has been added it would not be worth it, and for tests only the option requiring end-point is enough. The previous patch has actually problems with servers using "trust", "password" and any protocol which can send directly AUTH_REQ_OK as those could still enforce a downgrade to not use channel binding, so we actually need to make sure that AUTH_REQ_SASL_FIN has been received when channel binding is required when looking at a AUTH_REQ_OK message. -- Michael signature.asc Description: PGP signature
Re: Flexible permissions for REFRESH MATERIALIZED VIEW
On Fri, May 18, 2018 at 8:13 PM, Stephen Frost wrote: > I'm not a big fan of it- what happens when we introduce something else > which would seem like it'd fall under 'maintain' but provides some > capability that maybe it wouldn't be good for users who could only run > the above commands to have? I'm tempted to suggest that, really, we > might even be thinking about splitting up things further than the above > proposal- what about VACUUM vs. VACUUM FULL? Or REFRESH MATVIEW vs. > REFRESH MATVIEW CONCURRENTLY? Mistakes between those routinly cause > problems due to the heavy lock taken in some cases- as an administrator, > I'd be a lot more comfortable giving a user or some process the ability > to run a VACUUM vs. VACUUM FULL. That is a fair point, but if we want to do things like that then it's really not a good idea to limit ourselves to a fixed number of bits, even if it's 2x or 4x more than what we have today. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Flexible permissions for REFRESH MATERIALIZED VIEW
On 19 May 2018 at 01:13, Stephen Frost wrote: > > I'm not entirely sure about the varlena suggestion, seems like that > would change a great deal more code and be slower, though perhaps not > enough to matter; it's not like our aclitem arrays are exactly optimized > for speed today. I don't actually understand the reason y'all are talking about varlena. The aclitem type is already not passbyvalue so there's no particular limit on the size and specifically no reason it can't be larger than 64 bytes. As Tom mentioned it's only used in catalogs so there's no on-disk version compatibility issue either. It can be defined to have a bitmask exactly large enough to hold all the acl privileges needed for the specific version and still not be a varlena. The only downsides are: 1. At some point it gets large enough that adding rarely used privileges is imposing a large storage and cache efficiency cost that's amortized over all the acl lookups even when it's not used. I doubt we're anywhere close to that currently but if we started having hundreds of privileges... 2. The current macros just do a bitshift to look up bits and they would get a bit more complex. But it's not like it isn't arithmetic we haven't tackled repeatedly in other places that do bitmasks such as varbit, numeric, bitmapset, and probably more. Fwiw I don't understand why the current AclMode uses a single uint32 to pass around two 16-bit bitmasks and uses bitshifting to extract the two. Why not just make it a struct with two uint16s in it instead? That would mean we would have a factor of four available before the macros even get the slight added complication. -- greg
printf format selection vs. reality
Our configure script goes to considerable lengths to try to identify what printf width modifier to use for int64 values. In particular see PGAC_FUNC_SNPRINTF_LONG_LONG_INT_MODIFIER, which claims # MinGW uses '%I64d', though gcc throws a warning with -Wall, # while '%lld' doesn't generate a warning, but doesn't work. However, if we decide that we ought to use our own snprintf replacement, we throw that info away and set INT64_FORMAT to "%lld" which we know is what that code uses. This was all right when that code was first written, when we had only a very small number of uses of INT64_FORMAT and they all were in snprintf() calls. It's been a long time since that was true: pgbench, in particular, has been passing INT64_FORMAT to the native printf with increasing enthusiasm. In reality, we do not anymore work with situations where our snprintf has a different idea about this format modifier than libc does. What's more, as far as I can find, we do not have any buildfarm members that set INT64_MODIFIER to anything other than "l" or "ll", which no doubt explains why we've not noticed a problem. The comment quoted above doesn't seem to apply to any current buildfarm members. I think we should abandon the pretense that we can work with libc printfs that accept anything but "l"/"ll", and rip out the excess complexity in configure, just setting INT64_MODIFIER to "l" or "ll" depending on whether "int64" is "long" or "long long". Comments? regards, tom lane
Re: FindDefinedSymbol() is subtly broken
John Naylor writes: > Long story short: It expects an array of paths, but will die if it > can't find the symbol in the first path. Arguably it's a bug, but > since the original use case for multiple paths is long gone anyway, it > might never occur in practice. The attached patch changes it to expect > a single path. > Also attached: a few minor build script cleanups I've noticed along the way: > -more accurate error message > -s/print sprintf/printf/ > -a recent perltidy change made a couple multi-line strings look odd, > so use heredocs (which other scripts use in this case anyway) > -make generated file footers match project style > -remove a duplicate comment This all seemed like minor cleanup from v11 changes, so I went ahead and pushed it rather than waiting for v12. regards, tom lane
printf("%lf",...) isn't actually portable
I noticed while poking at the recent ecpg unpleasantness that some of my older critters were warning about use of %lf conversions in printf. Looking into it, I find that current POSIX says the "l" is a no-op, while SUSv2 says it's undefined. I think this usage got into our code as a result of people making false analogies between scanf and printf conversions. I think we should just switch these to plain %f, as per attached. regards, tom lane diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c index 558a8c4..90acf6a 100644 --- a/contrib/pageinspect/btreefuncs.c +++ b/contrib/pageinspect/btreefuncs.c @@ -563,7 +563,7 @@ bt_metap(PG_FUNCTION_ARGS) if (metad->btm_version == BTREE_VERSION) { values[j++] = psprintf("%u", metad->btm_oldest_btpo_xact); - values[j++] = psprintf("%lf", metad->btm_last_cleanup_num_heap_tuples); + values[j++] = psprintf("%f", metad->btm_last_cleanup_num_heap_tuples); } else { diff --git a/doc/src/sgml/ecpg.sgml b/doc/src/sgml/ecpg.sgml index 98b6840..f065477 100644 --- a/doc/src/sgml/ecpg.sgml +++ b/doc/src/sgml/ecpg.sgml @@ -1678,7 +1678,7 @@ while (1) Here is an example using the data type complex from the example in . The external string - representation of that type is (%lf,%lf), + representation of that type is (%f,%f), which is defined in the functions complex_in() and complex_out() functions diff --git a/src/backend/access/rmgrdesc/nbtdesc.c b/src/backend/access/rmgrdesc/nbtdesc.c index 1590d67..5c44571 100644 --- a/src/backend/access/rmgrdesc/nbtdesc.c +++ b/src/backend/access/rmgrdesc/nbtdesc.c @@ -100,7 +100,7 @@ btree_desc(StringInfo buf, XLogReaderState *record) { xl_btree_metadata *xlrec = (xl_btree_metadata *) rec; -appendStringInfo(buf, "oldest_btpo_xact %u; last_cleanup_num_heap_tuples: %lf", +appendStringInfo(buf, "oldest_btpo_xact %u; last_cleanup_num_heap_tuples: %f", xlrec->oldest_btpo_xact, xlrec->last_cleanup_num_heap_tuples); break; diff --git a/src/interfaces/ecpg/test/compat_informix/sqlda.pgc b/src/interfaces/ecpg/test/compat_informix/sqlda.pgc index 423ce41..87e0110 100644 --- a/src/interfaces/ecpg/test/compat_informix/sqlda.pgc +++ b/src/interfaces/ecpg/test/compat_informix/sqlda.pgc @@ -37,7 +37,7 @@ dump_sqlda(sqlda_t *sqlda) printf("name sqlda descriptor: '%s' value %d\n", sqlda->sqlvar[i].sqlname, *(int *)sqlda->sqlvar[i].sqldata); break; case SQLFLOAT: - printf("name sqlda descriptor: '%s' value %lf\n", sqlda->sqlvar[i].sqlname, *(double *)sqlda->sqlvar[i].sqldata); + printf("name sqlda descriptor: '%s' value %f\n", sqlda->sqlvar[i].sqlname, *(double *)sqlda->sqlvar[i].sqldata); break; case SQLDECIMAL: { diff --git a/src/interfaces/ecpg/test/expected/compat_informix-sqlda.c b/src/interfaces/ecpg/test/expected/compat_informix-sqlda.c index fa2e569..ad3188d 100644 --- a/src/interfaces/ecpg/test/expected/compat_informix-sqlda.c +++ b/src/interfaces/ecpg/test/expected/compat_informix-sqlda.c @@ -142,7 +142,7 @@ dump_sqlda(sqlda_t *sqlda) printf("name sqlda descriptor: '%s' value %d\n", sqlda->sqlvar[i].sqlname, *(int *)sqlda->sqlvar[i].sqldata); break; case SQLFLOAT: - printf("name sqlda descriptor: '%s' value %lf\n", sqlda->sqlvar[i].sqlname, *(double *)sqlda->sqlvar[i].sqldata); + printf("name sqlda descriptor: '%s' value %f\n", sqlda->sqlvar[i].sqlname, *(double *)sqlda->sqlvar[i].sqldata); break; case SQLDECIMAL: { diff --git a/src/interfaces/ecpg/test/expected/preproc-outofscope.c b/src/interfaces/ecpg/test/expected/preproc-outofscope.c index f4676a0..ef4dada 100644 --- a/src/interfaces/ecpg/test/expected/preproc-outofscope.c +++ b/src/interfaces/ecpg/test/expected/preproc-outofscope.c @@ -337,7 +337,7 @@ if (sqlca.sqlcode < 0) exit (1);} get_record1(); if (sqlca.sqlcode == ECPG_NOT_FOUND) break; - printf("id=%d%s t='%s'%s d1=%lf%s d2=%lf%s c = '%s'%s\n", + printf("id=%d%s t='%s'%s d1=%f%s d2=%f%s c = '%s'%s\n", myvar->id, mynullvar->id ? " (NULL)" : "", myvar->t, mynullvar->t ? " (NULL)" : "", myvar->d1, mynullvar->d1 ? " (NULL)" : "", diff --git a/src/interfaces/ecpg/test/expected/sql-sqlda.c b/src/interfaces/ecpg/test/expected/sql-sqlda.c index 81d26b3..090aaf1 100644 --- a/src/interfaces/ecpg/test/expected/sql-sqlda.c +++ b/src/interfaces/ecpg/test/expected/sql-sqlda.c @@ -158,7 +158,7 @@ dump_sqlda(sqlda_t *sqlda) break; #endif case ECPGt_double: - printf("name sqlda descriptor: '%s' value %lf\n", sqlda->sqlvar[i].sqlname.data, *(double *)sqlda->sqlvar[i].sqldata); + printf("name sqlda descriptor: '%s' value %f\n", sqlda->sqlvar[i].sqlname.data, *(double *)sqlda->sqlvar[i].sqldata); break; case ECPGt_numeric: { diff --git a/src/interfaces/ecpg/test/preproc/outofscope.pgc b/src/interfaces/ecpg/test/preproc/outofscope.pgc index aae5325..b03743c 100644 --- a/src/interfaces/ecpg/test/preproc/ou
Fix for FETCH FIRST syntax problems
Per bug #15200, our support for sql2008 FETCH FIRST syntax is incomplete to the extent that it should be regarded as a bug; the spec quite clearly allows parameters/host variables in addition to constants, but we don't. Attached is a draft patch for fixing that, which additionally fixes the ugly wart that OFFSET ROW/ROWS and FETCH FIRST [] ROW/ROWS ONLY had very different productions for ; both now accept c_expr there. Shift/reduce conflict is avoided by taking advantage of the fact that ONLY is a fully reserved word. I've checked that this change would not make it any harder to add (post-2008 features) WITH TIES or PERCENT in the event that someone wants to do that. I think a case can be made that this should be backpatched; thoughts? (While I can't find any visible change for existing working queries, one change that does occur is that FETCH FIRST -1 ROWS ONLY now returns a different error message; but that was already inconsistent with the error from OFFSET -1 ROWS.) -- Andrew (irc:RhodiumToad) diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml index b5d3d3a071..330adb8c37 100644 --- a/doc/src/sgml/ref/select.sgml +++ b/doc/src/sgml/ref/select.sgml @@ -1399,10 +1399,11 @@ OFFSET start OFFSET start { ROW | ROWS } FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY -In this syntax, to write anything except a simple integer constant for +In this syntax, to write anything except a simple integer constant, +parameter, or variable name for start or count, you must write parentheses -around it. +class="parameter">count, you should write parentheses +around it (this is a PostgreSQL extension). If count is omitted in a FETCH clause, it defaults to 1. ROW diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index babb62dae1..e441c555a4 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -451,7 +451,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); %type fetch_args limit_clause select_limit_value offset_clause select_offset_value -select_offset_value2 opt_select_fetch_first_value +select_fetch_first_value %type row_or_rows first_or_next %type OptSeqOptList SeqOptList OptParenthesizedSeqOptList @@ -11570,15 +11570,23 @@ limit_clause: parser_errposition(@1))); } /* SQL:2008 syntax */ - | FETCH first_or_next opt_select_fetch_first_value row_or_rows ONLY + /* to avoid shift/reduce conflicts, handle the optional value with + * a separate production rather than an opt_ expression. The fact + * that ONLY is fully reserved means that this way, we defer any + * decision about what rule reduces ROW or ROWS to the point where + * we can see the ONLY token in the lookahead slot. + */ + | FETCH first_or_next select_fetch_first_value row_or_rows ONLY { $$ = $3; } + | FETCH first_or_next row_or_rows ONLY +{ $$ = makeIntConst(1, -1); } ; offset_clause: OFFSET select_offset_value { $$ = $2; } /* SQL:2008 syntax */ - | OFFSET select_offset_value2 row_or_rows + | OFFSET select_fetch_first_value row_or_rows { $$ = $2; } ; @@ -11597,21 +11605,16 @@ select_offset_value: /* * Allowing full expressions without parentheses causes various parsing - * problems with the trailing ROW/ROWS key words. SQL only calls for - * constants, so we allow the rest only with parentheses. If omitted, - * default to 1. - */ -opt_select_fetch_first_value: - SignedIconst { $$ = makeIntConst($1, @1); } - | '(' a_expr ')' { $$ = $2; } - | /*EMPTY*/ { $$ = makeIntConst(1, -1); } - ; - -/* - * Again, the trailing ROW/ROWS in this case prevent the full expression - * syntax. c_expr is the best we can do. + * problems with the trailing ROW/ROWS key words. SQL spec only calls for + * , which is either a literal or a parameter (but + * an could be an identifier, bringing up conflicts + * with ROW/ROWS). We solve this by leveraging the presense of ONLY (see above) + * to determine whether the expression is missing rather than trying to make it + * optional in this rule. + * + * c_expr covers all the spec-required cases (and more). */ -select_offset_value2: +select_fetch_first_value: c_expr { $$ = $1; } ;
Re: Fix for FETCH FIRST syntax problems
Andrew Gierth writes: > Attached is a draft patch for fixing that, which additionally fixes the > ugly wart that OFFSET ROW/ROWS and FETCH FIRST [] ROW/ROWS ONLY > had very different productions for ; both now accept c_expr there. LGTM, except please s/presense/presence/ in grammar comment. Also, why'd you back off "must write" to "should write" in the docs? This doesn't make the parens any more optional than before. > I think a case can be made that this should be backpatched; thoughts? Meh, -0.5. This is not really a bug, because it's operating as designed. You've found a reasonably painless way to improve the grammar, which is great, but it seems more like a feature addition than a bug fix. I'd be fine with sneaking this into v11, though. regards, tom lane
Re: Fix for FETCH FIRST syntax problems
On 20/05/18 00:57, Tom Lane wrote: > Andrew Gierth writes: >> Attached is a draft patch for fixing that, which additionally fixes the >> ugly wart that OFFSET ROW/ROWS and FETCH FIRST [] ROW/ROWS ONLY >> had very different productions for ; both now accept c_expr there. > > LGTM, except please s/presense/presence/ in grammar comment. > Also, why'd you back off "must write" to "should write" in the docs? > This doesn't make the parens any more optional than before. It certainly does. It allows this now: PREPARE foo AS TABLE bar FETCH FIRST $1 ROWS ONLY; >> I think a case can be made that this should be backpatched; thoughts? > > Meh, -0.5. This is not really a bug, because it's operating as designed. > You've found a reasonably painless way to improve the grammar, which is > great, but it seems more like a feature addition than a bug fix. > > I'd be fine with sneaking this into v11, though. I'm +1 for backpatching it. It may be operating as designed by PeterE ten years ago, but it's not operating as designed by the SQL standard. -- Vik Fearing +33 6 46 75 15 36 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: Fix for FETCH FIRST syntax problems
On 20/05/18 01:27, Vik Fearing wrote: >> Also, why'd you back off "must write" to "should write" in the docs? >> This doesn't make the parens any more optional than before. >> It certainly does. It allows this now: > > PREPARE foo AS TABLE bar FETCH FIRST $1 ROWS ONLY; Please disregard this comment. My +1 for backpatching still stands. -- Vik Fearing +33 6 46 75 15 36 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: Fix for FETCH FIRST syntax problems
Vik Fearing writes: > On 20/05/18 00:57, Tom Lane wrote: >> Also, why'd you back off "must write" to "should write" in the docs? >> This doesn't make the parens any more optional than before. > It certainly does. It allows this now: > PREPARE foo AS TABLE bar FETCH FIRST $1 ROWS ONLY; No, it makes the parens omittable in the cases specified in the previous sentence. The sentence I'm complaining about is describing other cases, in which they're still required. > I'm +1 for backpatching it. It may be operating as designed by PeterE > ten years ago, but it's not operating as designed by the SQL standard. By that argument, *anyplace* where we're missing a SQL-spec feature is a back-patchable bug. I don't buy it. It may be that this fix is simple and safe enough that the risk/reward tradeoff favors back-patching, but I think you have to argue it as a favorable tradeoff rather than just saying "this isn't per standard". Consider: if Andrew had completely rewritten gram.y to get the same visible effect, would you think that was back-patchable? regards, tom lane
Re: Fix for FETCH FIRST syntax problems
On Sat, May 19, 2018 at 07:41:10PM -0400, Tom Lane wrote: > Vik Fearing writes: >> I'm +1 for backpatching it. It may be operating as designed by PeterE >> ten years ago, but it's not operating as designed by the SQL standard. > > By that argument, *anyplace* where we're missing a SQL-spec feature > is a back-patchable bug. I don't buy it. +1. -- Michael signature.asc Description: PGP signature
Fix some error handling for read() and errno
Hi all, This is basically a new thread after what has been discussed for pg_controldata with its error handling for read(): https://www.postgresql.org/message-id/CABUevEx8ZRV5Ut_FvP2etXiQppx3xVzm7oOaV3AcdHxX81Yt8Q%40mail.gmail.com While reviewing the core code, I have noticed similar weird error handling for read(). At the same time, some of those places may use an incorrect errno, as an error is invoked using an errno which may be overwritten by another system call. I found a funny one in slru.c, for which I have added a note in the patch. I don't think that this is worth addressing with more facility, thoughts are welcome. Attached is a patch addressing the issues I found. Thanks, -- Michael diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c index 87942b4cca..d487347cc6 100644 --- a/src/backend/access/transam/slru.c +++ b/src/backend/access/transam/slru.c @@ -683,6 +683,11 @@ SlruPhysicalReadPage(SlruCtl ctl, int pageno, int slotno) pgstat_report_wait_start(WAIT_EVENT_SLRU_READ); if (read(fd, shared->page_buffer[slotno], BLCKSZ) != BLCKSZ) { + /* + * XXX: Note that this may actually report sucess if the number + * of bytes read is positive, but lacking data so that errno is + * not set. + */ pgstat_report_wait_end(); slru_errcause = SLRU_READ_FAILED; slru_errno = errno; diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c index 65194db70e..52f634e706 100644 --- a/src/backend/access/transam/twophase.c +++ b/src/backend/access/transam/twophase.c @@ -1219,6 +1219,7 @@ ReadTwoPhaseFile(TransactionId xid, bool give_warnings) uint32 crc_offset; pg_crc32c calc_crc, file_crc; + int r; TwoPhaseFilePath(path, xid); @@ -1272,15 +1273,28 @@ ReadTwoPhaseFile(TransactionId xid, bool give_warnings) buf = (char *) palloc(stat.st_size); pgstat_report_wait_start(WAIT_EVENT_TWOPHASE_FILE_READ); - if (read(fd, buf, stat.st_size) != stat.st_size) + r = read(fd, buf, stat.st_size); + if (r != stat.st_size) { + int save_errno = errno; + pgstat_report_wait_end(); CloseTransientFile(fd); if (give_warnings) - ereport(WARNING, - (errcode_for_file_access(), - errmsg("could not read two-phase state file \"%s\": %m", - path))); + { + if (r < 0) + { +errno = save_errno; +ereport(WARNING, + (errcode_for_file_access(), + errmsg("could not read two-phase state file \"%s\": %m", +path))); + } + else +ereport(WARNING, + (errmsg("could not read two-phase state file \"%s\": %d read, expected %zu", +path, r, stat.st_size))); + } pfree(buf); return NULL; } diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index adbd6a2126..72fd800646 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -3395,21 +3395,24 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno, if (nread > 0) { + int r; + if (nread > sizeof(buffer)) nread = sizeof(buffer); errno = 0; pgstat_report_wait_start(WAIT_EVENT_WAL_COPY_READ); - if (read(srcfd, buffer, nread) != nread) + r = read(srcfd, buffer, nread); + if (r != nread) { -if (errno != 0) +if (r < 0) ereport(ERROR, (errcode_for_file_access(), errmsg("could not read file \"%s\": %m", path))); else ereport(ERROR, - (errmsg("not enough data in file \"%s\"", - path))); + (errmsg("not enough data in file \"%s\": read %d, expected %d", + path, r, nread))); } pgstat_report_wait_end(); } @@ -11594,6 +11597,7 @@ XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr, int reqLen, int emode = private->emode; uint32 targetPageOff; XLogSegNo targetSegNo PG_USED_FOR_ASSERTS_ONLY; + int r; XLByteToSeg(targetPagePtr, targetSegNo, wal_segment_size); targetPageOff = XLogSegmentOffset(targetPagePtr, wal_segment_size); @@ -11675,8 +11679,10 @@ retry: if (lseek(readFile, (off_t) readOff, SEEK_SET) < 0) { char fname[MAXFNAMELEN]; + int save_errno = errno; XLogFileName(fname, curFileTLI, readSegNo, wal_segment_size); + errno = save_errno; ereport(emode_for_corrupt_record(emode, targetPagePtr + reqLen), (errcode_for_file_access(), errmsg("could not seek in log segment %s to offset %u: %m", @@ -11685,16 +11691,29 @@ retry: } pgstat_report_wait_start(WAIT_EVENT_WAL_READ); - if (read(readFile, readBuf, XLOG_BLCKSZ) != XLOG_BLCKSZ) + r = read(readFile, readBuf, XLOG_BLCKSZ); + if (r != XLOG_BLCKSZ) { char fname[MAXFNAMELEN]; + int save_errno = errno; pgstat_report_wait_end(); XLogFileName(fname, curFileTLI, readSegNo, wal_segment_size); - ereport(emode_for_corrupt_record(emode, targetPagePtr + reqLen), -(errcode_for_file_access(), - errmsg("could not read from log segment %s, offset %u: %m", - fname, readOff))); + + if (r < 0
Re: Fix for FETCH FIRST syntax problems
> "Tom" == Tom Lane writes: Tom> Also, why'd you back off "must write" to "should write" in the docs? Tom> This doesn't make the parens any more optional than before. Currently, the requirement for parens is inconsistent - FETCH FIRST wants them for absolutely anything that's not a literal constant, but OFFSET x ROWS allows any c_expr (which covers a whole lot of ground in addition to the spec's requirements). The docs don't distinguish these two and just say "must write" parens even though some cases clearly work without. (There's also the slight wart that OFFSET -1 ROWS is a syntax error, whereas the spec defines it as valid syntax but a semantic error. Do we care?) With the patch, c_exprs would be accepted without parens, so saying "must write" parens in the docs clearly isn't entirely accurate. I'm open to better phrasing. I found some more warts: OFFSET +1 ROWS isn't accepted (but should be), and FETCH FIRST 100 ROWS ONLY fails on 32bit and Windows builds. The patch as posted fixes the second of those but makes FETCH FIRST +1 fail instead; I'm working on that. -- Andrew (irc:RhodiumToad)
Re: Fix for FETCH FIRST syntax problems
> "Tom" == Tom Lane writes: >> I'm +1 for backpatching it. It may be operating as designed by >> PeterE ten years ago, but it's not operating as designed by the SQL >> standard. Tom> By that argument, *anyplace* where we're missing a SQL-spec Tom> feature is a back-patchable bug. I don't buy it. But this is a feature we already claimed to actually support (it's listed in sql_features with a bunch of unqualified YES entries), but in fact doesn't work properly. -- Andrew (irc:RhodiumToad)
Re: [GSoC] Question about returning bytea array
construct_md_array works for me, thanks for inputs! 2018-05-17 13:42 GMT-07:00 Andrew Gierth : > > "Charles" == Charles Cui writes: > > Charles> I have the requirements to return a bytea array for some > Charles> functions in pg_thrift plugin. > > If you mean you want the return value to be of type bytea[], i.e. an SQL > array of bytea values, then you need to be using construct_array to > construct the result. > > -- > Andrew (irc:RhodiumToad) >
Re: Fix for FETCH FIRST syntax problems
Updated patch. This one explicitly accepts +/- ICONST/FCONST in addition to c_expr for both offset and limit, removing the inconsistencies mentioned previously. I changed the wording of the docs part a bit; does that look better? or worse? Old behavior: select 1 offset +1 rows; -- ERROR: syntax error at or near "rows" select 1 fetch first +1 rows only; -- works select 1 offset -1 rows; -- ERROR: syntax error at or near "rows" select 1 fetch first -1 rows only; -- ERROR: LIMIT must not be negative New behavior: select 1 offset +1 rows; -- works select 1 fetch first +1 rows only; -- works select 1 offset -1 rows; -- ERROR: OFFSET must not be negative select 1 fetch first -1 rows only; -- ERROR: LIMIT must not be negative -- Andrew (irc:RhodiumToad) diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml index b5d3d3a071..3d59b0c3e5 100644 --- a/doc/src/sgml/ref/select.sgml +++ b/doc/src/sgml/ref/select.sgml @@ -1399,10 +1399,12 @@ OFFSET start OFFSET start { ROW | ROWS } FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY -In this syntax, to write anything except a simple integer constant for -start or count, you must write parentheses -around it. +In this syntax, the start +or count value is required by +the standard to be a literal constant, a parameter, or a variable name; +as a PostgreSQL extension, other expressions +are allowed, but will generally need to be enclosed in parentheses to avoid +ambiguity. If count is omitted in a FETCH clause, it defaults to 1. ROW diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index babb62dae1..2ef3bdecc7 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -451,7 +451,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); %type fetch_args limit_clause select_limit_value offset_clause select_offset_value -select_offset_value2 opt_select_fetch_first_value +select_fetch_first_value I_or_F_const %type row_or_rows first_or_next %type OptSeqOptList SeqOptList OptParenthesizedSeqOptList @@ -11570,15 +11570,23 @@ limit_clause: parser_errposition(@1))); } /* SQL:2008 syntax */ - | FETCH first_or_next opt_select_fetch_first_value row_or_rows ONLY + /* to avoid shift/reduce conflicts, handle the optional value with + * a separate production rather than an opt_ expression. The fact + * that ONLY is fully reserved means that this way, we defer any + * decision about what rule reduces ROW or ROWS to the point where + * we can see the ONLY token in the lookahead slot. + */ + | FETCH first_or_next select_fetch_first_value row_or_rows ONLY { $$ = $3; } + | FETCH first_or_next row_or_rows ONLY +{ $$ = makeIntConst(1, -1); } ; offset_clause: OFFSET select_offset_value { $$ = $2; } /* SQL:2008 syntax */ - | OFFSET select_offset_value2 row_or_rows + | OFFSET select_fetch_first_value row_or_rows { $$ = $2; } ; @@ -11597,22 +11605,28 @@ select_offset_value: /* * Allowing full expressions without parentheses causes various parsing - * problems with the trailing ROW/ROWS key words. SQL only calls for - * constants, so we allow the rest only with parentheses. If omitted, - * default to 1. + * problems with the trailing ROW/ROWS key words. SQL spec only calls for + * , which is either a literal or a parameter (but + * an could be an identifier, bringing up conflicts + * with ROW/ROWS). We solve this by leveraging the presence of ONLY (see above) + * to determine whether the expression is missing rather than trying to make it + * optional in this rule. + * + * c_expr covers almost all the spec-required cases (and more), but it doesn't + * cover signed numeric literals, which are allowed by the spec. So we include + * those here explicitly. */ -opt_select_fetch_first_value: - SignedIconst { $$ = makeIntConst($1, @1); } - | '(' a_expr ')' { $$ = $2; } - | /*EMPTY*/ { $$ = makeIntConst(1, -1); } +select_fetch_first_value: + c_expr { $$ = $1; } + | '+' I_or_F_const +{ $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "+", NULL, $2, @1); } + | '-' I_or_F_const +{ $$ = doNegate($2, @1); } ; -/* - * Again, the trailing ROW/ROWS in this case prevent the full expression - * syntax. c_expr is the best we can do. - */ -select_offset_value2: - c_expr { $$ = $1; } +I_or_F_const: + Iconst { $$ = makeIntConst($1,@1); } + | FCONST{ $$ = makeFloatConst($1,@1); } ; /* noise words */