percentile value check can be slow
I tried to enter invalid percentile fractions, and was astonished that it seems to be checked only after much work is done?

psql (11devel)
Type "help" for help.

jotpe=# \timing
Timing is on.
jotpe=# select percentile_cont(array[0,0.25,0.5,1,1,null,2]) within group (order by txk) from klima_tag;
ERROR:  percentile value 2 is not between 0 and 1
Time: 19155,565 ms (00:19,156)
jotpe=# select count(*) from klima_tag;
  count
----------
 13950214
(1 row)
Time: 933,847 ms

Best regards
Johannes
Re: [JDBC] [HACKERS] Channel binding support for SCRAM-SHA-256
On Sat, Nov 18, 2017 at 4:31 AM, Peter Eisentraut wrote:
> I made some significant changes to the logic.

Thanks!

> The selection of the channel binding flag (n/y/p) in the client seemed
> wrong. Your code treated 'y' as an error, but I think that is a
> legitimate case, for example a PG11 client connecting to a PG10 server
> over SSL. The client supports channel binding in that case and
> (correctly) thinks the server does not, so we use the 'y' flag and
> proceed normally without channel binding.
>
> The selection of the mechanism in the client was similarly incorrect, I
> think. The code would not tolerate a situation where an SSL connection
> is in use but the server does not advertise the -PLUS mechanism, which
> again could be from a PG10 server.

Hm, OK. I have likely been confused by the fact that eSws is a valid
b64-encoded cbind-input on v10 servers. And the spec has no direct
mention of the matter, only of biws.

> The creation of the channel binding data didn't match the spec, because
> the gs2-header (p=type,,) was not included in the data put through
> base64. This was done incorrectly on both server and client, so the
> protocol still worked. (However, in the non-channel-binding case we
> hardcode "biws", which is exactly the base64-encoding of the gs2-header.
> So that was inconsistent.)

Meh-to-self, you are right. Still it seems to me that your version is
forgetting something. Please see below.

> I think we also need to backpatch a bug fix into PG10 so that the server
> can accept base64("y,,") as channel binding data. Otherwise, the above
> scenario of a PG11 client connecting to a PG10 server over SSL will
> currently fail because the server will not accept the channel binding data.

Yes, without the additional comparison, the failure is weird for the
user. Here is what I have with an unpatched v10 server:

psql: FATAL:  unexpected SCRAM channel-binding attribute in client-final-message

This is going to need a one-liner in auth-scram.c's
read_client_final_message().

> Please check my patch and think through these changes. I'm happy to
> commit the patch as is if there are no additional insights.

Here are some comments.

+        * The client requires channel biding. Channel binding type
s/biding/binding/.

    if (!state->ssl_in_use)
+       /*
+        * Without SSL, we don't support channel binding.
+        *
Better to add brackets if adding a comment.

+        * Read value provided by client; only tls-unique is supported
+        * for now.  XXX Not sure whether it would be safe to print
+        * the name of an unsupported binding type in the error
+        * message.  Pranksters could print arbitrary strings into the
+        * log that way.
That's why I didn't show those strings in the error messages of the
previous versions. Those are useless as well, except for debugging the
feature and the protocol.

+   cbind_header_len = 4 + strlen(state->channel_binding_type); /* p=type,, */
+   cbind_input_len = cbind_header_len + cbind_data_len;
+   cbind_input = malloc(cbind_input_len);
+   if (!cbind_input)
+       goto oom_error;
+   snprintf(cbind_input, cbind_input_len, "p=%s", state->channel_binding_type);
+   memcpy(cbind_input + cbind_header_len, cbind_data, cbind_data_len);
By looking at RFC 5802, a base64 encoding of cbind-input is used:
    cbind-input = gs2-header [ cbind-data ]
    gs2-header  = gs2-cbind-flag "," [ authzid ] ","
However you are missing two commands after p=%s, no?
--
Michael
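To make that concrete, a corrected construction would look roughly like the following. This is an untested sketch, not the committed patch; it just illustrates putting the full gs2-header ("p=<type>,,") in front of the raw channel-binding data before the whole thing is base64-encoded:

    /* sketch only: gs2-header is "p=<type>,," and must precede cbind_data */
    cbind_header_len = strlen("p=") + strlen(state->channel_binding_type) + strlen(",,");
    cbind_input_len = cbind_header_len + cbind_data_len;
    cbind_input = malloc(cbind_input_len);
    if (!cbind_input)
        goto oom_error;
    /* note the two trailing commas in the format string */
    snprintf(cbind_input, cbind_input_len, "p=%s,,", state->channel_binding_type);
    memcpy(cbind_input + cbind_header_len, cbind_data, cbind_data_len);
    /* ...then base64-encode cbind_input_len bytes of cbind_input */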
Re: [HACKERS] More stats about skipped vacuums
On Thu, Nov 16, 2017 at 7:34 PM, Kyotaro HORIGUCHI wrote:
> At Wed, 15 Nov 2017 16:13:01 +0900, Michael Paquier wrote in
>
>>     pg_stat_get_mod_since_analyze(C.oid) AS n_mod_since_analyze,
>> +   pg_stat_get_vacuum_necessity(C.oid) AS vacuum_required,
>>     pg_stat_get_last_vacuum_time(C.oid) as last_vacuum,
>>     pg_stat_get_last_autovacuum_time(C.oid) as last_autovacuum,
>>     pg_stat_get_last_analyze_time(C.oid) as last_analyze,
>>     pg_stat_get_last_autoanalyze_time(C.oid) as last_autoanalyze,
>>     pg_stat_get_vacuum_count(C.oid) AS vacuum_count,
>> Please use spaces instead of tabs. Indentation is not consistent.
>
> Done. Thank you for pointing that out. (whitespace-mode showed me some
> similar inconsistencies at other places in the file...)

Yes, I am aware of those which get introduced here and there. Let's not
make things worse..

>> +   case PGSTAT_VACUUM_CANCELED:
>> +       phase = tabentry->vacuum_last_phase;
>> +       /* number of elements of phasestr above */
>> +       if (phase >= 0 && phase <= 7)
>> +           result = psprintf("%s while %s",
>> +                             status == PGSTAT_VACUUM_CANCELED ?
>> +                             "canceled" : "error",
>> +                             phasestr[phase]);
>> Such complication is not necessary. The phase parameter is updated by
>> individual calls of pgstat_progress_update_param(), so the information
>> shown here overlaps with the existing information in the "phase" field.
>
> The "phase" is pg_stat_progress_vacuum's? If the "complexity" means
> phasestr[phase], the "phase" cannot overlap with last_vacuum_status,
> since pg_stat_progress_vacuum's entry has already gone when someone
> looks into pg_stat_all_tables and sees a failed vacuum status. Could
> you give a bit more specific comment?

I mean that if you tend to report this information, you should just use
a separate column for it. Having a single column report two pieces of
information, which are here the type of error and potentially the moment
when it appeared, is harder to parse.

>> However, progress reports are here to allow users to make decisions
>> based on the activity of how things are working. This patch proposes
>> to add multiple new fields:
>> - oldest Xmin.
>> - number of index scans.
>> - number of pages truncated.
>> - number of pages that should have been truncated, but are not truncated.
>> Among all this information, as Sawada-san has already mentioned
>> upthread, the more index scans the fewer dead tuples you can store at
>> once, so autovacuum_work_mem ought to be increased. This is useful for
>> tuning and should be documented properly if reported to give
>> indications about vacuum behavior. The rest though, could indicate how
>> aggressively autovacuum is able to remove tail blocks and do its work.
>> But what really matters for users to decide if autovacuum should be
>> more aggressive is tracking the number of dead tuples, something which
>> is already evaluated.
>
> Hmm. I tend to agree. Such numbers are better shown as an average of
> the last n vacuums, or a maximum. I decided to show
> last_vacuum_index_scans only, and I think that someone can record it
> continuously elsewhere if they want.

As a user, what would you make of those numbers? How would they help in
tuning autovacuum for a relation? We need to clear up those questions
before thinking about whether there are cases where those are useful.

>> Tracking the number of failed vacuum attempts is also something
>> helpful to understand how much of the job is able to complete. As there
>> is already tracking of vacuum jobs that have completed, it could be
>> possible, instead of logging activity when a vacuum job has failed, to
>> track the number of *begun* jobs on a relation. Then it is possible to
>> guess how many have failed by taking the difference with those that
>> completed properly. Having counters per failure type could also be a
>> possibility.
>
> Maybe pg_stat_all_tables is not the place to hold so many kinds of
> vacuum-specific information. pg_stat_vacuum_all_tables or something
> like that?

What do you have in mind? pg_stat_all_tables already includes counters
about the number of vacuum and analyze runs completed. I guess that the
number of failures, and the types of failures, ought to be similar
counters at the same level.

>> For this commit fest, I would suggest a patch that simply adds
>> tracking for the number of index scans done, with documentation to
>> give recommendations about parameter tuning. I am switching the patch
>> to "waiting on author".
>
> Ok, the patch has been split into the following four parts. (Not split
> by function, but by the kind of information to add.) The first one is
> that.
>
> 0001. Adds pg_stat_all_tables.last_vacuum_index_scans. Documentation is added.
>
> 0002. Adds pg_stat_all_tables.vacuum_required. And primitive documentation.
>
> 0003. Adds pg_stat_all_tables
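For reference, the proposed counters would be read like any other pg_stat_all_tables column. This is hypothetical in the sense that vacuum_required and last_vacuum_index_scans only exist with the 0001/0002 patches applied:

    SELECT relname, vacuum_required, last_vacuum_index_scans, vacuum_count
      FROM pg_stat_all_tables
     WHERE schemaname = 'public';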
Re: Speed up the removal of WAL files
On Sat, Nov 18, 2017 at 2:57 AM, Fujii Masao wrote:
> On Fri, Nov 17, 2017 at 5:20 PM, Tsunakawa, Takayuki wrote:
>> From: Kyotaro HORIGUCHI [mailto:horiguchi.kyot...@lab.ntt.co.jp]
>>> The original code recycles some of the to-be-removed files, but the patch
>>> removes all the victims. This may impact on performance.
>>
>> Yes, I noticed it after submitting the patch and was wondering what to do.
>> Thinking simply, I think it would be just enough to replace
>> durable_unlink/durable_rename in RemoveXLogFile() with unlink/rename, and
>> sync the pg_wal directory once in RemoveNonParentXlogFiles() and
>> RemoveOldXlogFiles(). This will benefit the failover time when fast
>> promotion is not performed. What do you think?

Note that durable_rename() also flushes the origin file to make sure
that it does not show up again after a crash.

> It seems not a good idea to just replace durable_rename() with rename()
> in RemoveOldXlogFiles()->RemoveXlogFiles()->InstallXLogFileSegment(),
> because that change seems to be able to cause the following problem.

If archiving is enabled, RemoveOldXlogFiles() would create a .ready
flag for all segments that have reappeared. Oops, it archived again.
--
Michael
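For illustration, the "unlink everything, then sync the directory once" idea being discussed would look roughly like this. This is a hypothetical sketch, not a patch; it only assumes the existing XLOGDIR macro and fsync_fname():

    /* sketch: drop per-file durability, sync pg_wal once at the end */
    foreach(cell, filenames)
    {
        char    path[MAXPGPATH];

        snprintf(path, MAXPGPATH, XLOGDIR "/%s", (char *) lfirst(cell));
        if (unlink(path) < 0)
            ereport(WARNING,
                    (errcode_for_file_access(),
                     errmsg("could not remove file \"%s\": %m", path)));
    }
    /* one directory fsync instead of one durable_unlink() per segment */
    fsync_fname(XLOGDIR, true);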
Re: [HACKERS] [POC] Faster processing at Gather node
On Fri, Nov 10, 2017 at 8:39 PM, Robert Haas wrote:
> On Fri, Nov 10, 2017 at 5:44 AM, Amit Kapila wrote:
>> I am seeing the assertion failure as below on executing the above
>> mentioned Create statement:
>>
>> TRAP: FailedAssertion("!(!(tup->t_data->t_infomask & 0x0008))", File:
>> "heapam.c", Line: 2634)
>> server closed the connection unexpectedly
>> This probably means the server terminated abnormally
>
> OK, I see it now. Not sure why I couldn't reproduce this before.
>
> I think the problem is not actually with the code that I just wrote.
> What I'm seeing is that the slot descriptor's tdhasoid value is false
> for both the funnel slot and the result slot; therefore, we conclude
> that no projection is needed to remove the OIDs. That seems to make
> sense: if the funnel slot doesn't have OIDs and the result slot
> doesn't have OIDs either, then we don't need to remove them.
> Unfortunately, even though the funnel slot descriptor is marked
> tdhasoid = false, the tuples being stored there actually do have
> OIDs. And that is because they are coming from the underlying
> sequential scan, which *also* has OIDs despite the fact that tdhasoid
> for its slot is false.
>
> This had me really confused until I realized that there are two
> processes involved. The problem is that we don't pass eflags down to
> the child process -- so in the user backend, everybody agrees that
> there shouldn't be OIDs anywhere, because EXEC_FLAG_WITHOUT_OIDS is
> set. In the parallel worker, however, it's not set, so the worker
> feels free to do whatever comes naturally, and in this test case that
> happens to be returning tuples with OIDs. Patch for this attached.
>
> I also noticed that the code that initializes the funnel slot is using
> its own PlanState rather than the outer plan's PlanState to call
> ExecContextForcesOids. I think that's formally incorrect, because the
> goal is to end up with a slot that is the same as the outer plan's
> slot. It doesn't matter because ExecContextForcesOids doesn't care
> which PlanState it gets passed, but the comments in
> ExecContextForcesOids imply that somebody might, so perhaps it's
> best to clean that up. Patch for this attached, too.

- if (!ExecContextForcesOids(&gatherstate->ps, &hasoid))
+ if (!ExecContextForcesOids(outerPlanState(gatherstate), &hasoid))
      hasoid = false;

Don't we need a similar change in nodeGatherMerge.c (in function
ExecInitGatherMerge)?

> And here are the other patches again, too.

The 0001* patch doesn't apply; please find the attached rebased version
which I have used to verify the patch. Now, along with 0001* and 0002*,
0003-skip-gather-project-v2 looks good to me. I think we can proceed
with the commit of the 0001*~0003* patches unless somebody else has any
comments.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

0001-pass-eflags-to-worker-v2.patch
Description: Binary data
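Presumably the analogous change in ExecInitGatherMerge would be along these lines (untested sketch; the gm_state variable name is assumed, not verified against the patch):

    /* hypothetical equivalent fix in nodeGatherMerge.c */
    - if (!ExecContextForcesOids(&gm_state->ps, &hasoid))
    + if (!ExecContextForcesOids(outerPlanState(gm_state), &hasoid))
          hasoid = false;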
to_typemod(type_name) information function
Hi,

I need to test a user-given column type name against one in the database for equality. To this end, I have to do some kind of normalization (e.g. 'timestamptz(2)' to 'timestamp(2) with time zone').

Comparing the name alone is possible with to_regtype(type_name) or ::regtype. However, this ignores the 'typemod'. I want to suggest a to_typemod(type_name) function which, combined with to_regtype, would allow decomposing (and reconstructing) a type name completely.

Best,
Sophie
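To illustrate the proposal: to_typemod does not exist yet, so the second call and its result below are what I would expect from the suggested function, not current behavior:

    -- hypothetical: decompose a user-supplied type name
    SELECT to_regtype('timestamptz(2)'),   -- timestamp with time zone
           to_typemod('timestamptz(2)');   -- 2 (proposed)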
Re: percentile value check can be slow
jotpe writes:
> I tried to enter invalid percentile fractions, and was astonished that
> it seems to be checked only after much work is done?

IIRC, only the aggregate's final-function is concerned with direct
arguments, so it's not all that astonishing.

			regards, tom lane
Re: to_typemod(type_name) information function
Sophie Herold writes:
> I need to test a user-given column type name against one in the database
> for equality. To this end, I have to do some kind of normalization (e.g.
> 'timestamptz(2)' to 'timestamp(2) with time zone').

Perhaps format_type(oid, integer) would help you.

			regards, tom lane
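For example, assuming the typmod is already known (here, 2):

    SELECT format_type('timestamptz'::regtype, 2);
    -- timestamp(2) with time zone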
unsubscribe pgsql-hack...@postgresql.org
Didier ROS
DSP/CSP IT-DMA/Solutions Groupe EDF/Expertise Applicative - Expertise SGBD
32 Avenue Pablo Picasso, 92000 NANTERRE
Re: [JDBC] [HACKERS] Channel binding support for SCRAM-SHA-256
On 11/18/17 06:32, Michael Paquier wrote:
> Here are some comments.
>
> +        * The client requires channel biding. Channel binding type
> s/biding/binding/.

fixed

> if (!state->ssl_in_use)
> +       /*
> +        * Without SSL, we don't support channel binding.
> +        *
> Better to add brackets if adding a comment.

done

> +        * Read value provided by client; only tls-unique is supported
> +        * for now.  XXX Not sure whether it would be safe to print
> +        * the name of an unsupported binding type in the error
> +        * message.  Pranksters could print arbitrary strings into the
> +        * log that way.
> That's why I didn't show those strings in the error messages of the
> previous versions. Those are useless as well, except for debugging the
> feature and the protocol.

Right. I left the comment in there as a note to future hackers who want
to improve error messages (often myself).

> +   cbind_header_len = 4 + strlen(state->channel_binding_type); /* p=type,, */
> +   cbind_input_len = cbind_header_len + cbind_data_len;
> +   cbind_input = malloc(cbind_input_len);
> +   if (!cbind_input)
> +       goto oom_error;
> +   snprintf(cbind_input, cbind_input_len, "p=%s", state->channel_binding_type);
> +   memcpy(cbind_input + cbind_header_len, cbind_data, cbind_data_len);
> By looking at RFC 5802, a base64 encoding of cbind-input is used:
>     cbind-input = gs2-header [ cbind-data ]
>     gs2-header  = gs2-cbind-flag "," [ authzid ] ","
> However you are missing two commands after p=%s, no?

fixed

I have committed the patch with the above fixes.

I'll be off for a week, so perhaps by that time you could make a rebased
version of the rest? I'm not sure how much more time I'll have, so maybe
it will end up being moved to the next CF.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Consistently catch errors from Python _New() functions
On 11/17/17 12:16, Tom Lane wrote:
> I'm confused by the places that change PLy_elog calls to pass NULL:
>
> -       PLy_elog(ERROR, "could not create globals");
> +       PLy_elog(ERROR, NULL);
>
> How is that an improvement? Otherwise it looks reasonable.

If we pass NULL, then the current Python exception becomes the primary
error, so you'd end up with an "out of memory" error of some kind with a
stack trace, which seems useful.

The previous coding style invented a bespoke error message for each of
these rare out-of-memory scenarios, which seems wasteful. We don't
create "out of memory while creating some internal list you've never
heard of" errors elsewhere either.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: to_typemod(type_name) information function
On 18/11/17 16:50, Tom Lane wrote:
> Sophie Herold writes:
>> I need to test a user-given column type name against one in the database
>> for equality. To this end, I have to do some kind of normalization (e.g.
>> 'timestamptz(2)' to 'timestamp(2) with time zone').
>
> Perhaps format_type(oid, integer) would help you.
>
> 			regards, tom lane

I am not sure how. I am exactly looking for that second integer
argument. The only workaround I can think of is to create a table with a
column of that type, ask the pg_catalog for the typemod afterwards, and
roll back the creation. But that doesn't sound like a proper solution to
me.

Best,
Sophie
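For completeness, the workaround I mean looks roughly like this (the table and column names are just placeholders):

    BEGIN;
    CREATE TEMP TABLE _typemod_probe (c timestamptz(2));
    -- atttypmod is the value I am actually after
    SELECT a.atttypid::regtype, a.atttypmod
      FROM pg_attribute a
     WHERE a.attrelid = '_typemod_probe'::regclass
       AND a.attname = 'c';
    ROLLBACK;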
Re: [HACKERS] Consistently catch errors from Python _New() functions
Peter Eisentraut writes:
> On 11/17/17 12:16, Tom Lane wrote:
>> I'm confused by the places that change PLy_elog calls to pass NULL:
>>
>> -       PLy_elog(ERROR, "could not create globals");
>> +       PLy_elog(ERROR, NULL);
>>
>> How is that an improvement? Otherwise it looks reasonable.

> If we pass NULL, then the current Python exception becomes the primary
> error, so you'd end up with an "out of memory" error of some kind with a
> stack trace, which seems useful.

Oh, I see. Objection withdrawn.

			regards, tom lane
Re: [HACKERS] Add Roman numeral conversion to to_number
Oliver Ford writes:
> Attached is v2 of src, tests and docs. Doc patch is unchanged from v1.

I took a quick look at this patch. The roman_to_int function itself seems
fairly reasonable, except that I don't like anything about the error
reporting. ERRCODE_INVALID_PARAMETER_VALUE doesn't really seem like the
right thing for bogus data; probably ERRCODE_INVALID_TEXT_REPRESENTATION
is the closest thing we've got. More generally, reporting a character
position in a string you're not showing to the user is neither helpful
nor per project style. It'd be much better to just report the whole
input string. I'm tempted to suggest that all the error reports could be

    errmsg("incorrect Roman numeral: \"%s\"", numerals)

I'm not sure it's worth breaking it down more finely than that, and even
if it is, several of these messages aren't too intelligible as-is.

An angle that might be worth considering is whether we really want to
reject non-canonical Roman input. As is, the function rejects "IIII",
insisting that you must spell it "IV", but is that helpful or just
pedantic? Likewise I'm not that excited about expending code and a
separate translatable message to insist that people have to write "X"
rather than "VV".

However, what I really don't like is the way that you plugged
roman_to_int into the existing to_number infrastructure, which is to say
that you didn't. Putting in an "if roman then ... else ..." test is not
the way to make this work. As an example, that fails to handle literal
text in the format. If this works:

regression=# select to_char(123, 'foo FMRN');
  to_char
------------
 foo CXXIII
(1 row)

then this should, but it fails:

regression=# select to_number('foo CXXIII', 'foo FMRN');
ERROR:  invalid character " "

I think you need to be calling roman_to_int from the NUM_RN/NUM_rn
switch cases in NUM_processor, not trying to bypass NUM_processor
entirely.

Another issue is that on the input side, we really need to reject formats
like 'RN 999', since it's very unclear how we'd combine the input values.
There already is some logic that rejects 'RN ', but it needs extension.
On the other hand, I do not see any really good reason to reject
'RN 999' for output; it would result in duplicative output, but if the
user asked for it why not do it? (I see the existing code fails to handle
that case for some reason.)

So some refactoring of the existing roman-numeral code seems needed
anyway.

			regards, tom lane
Re: [HACKERS] Consistently catch errors from Python _New() functions
On 11/18/17 12:05, Tom Lane wrote: > Peter Eisentraut writes: >> On 11/17/17 12:16, Tom Lane wrote: >>> I'm confused by the places that change PLy_elog calls to pass NULL: >>> >>> - PLy_elog(ERROR, "could not create globals"); >>> + PLy_elog(ERROR, NULL); >>> >>> How is that an improvement? Otherwise it looks reasonable. > >> If we pass NULL, then the current Python exception becomes the primary >> error, so you'd end up with an "out of memory" error of some kind with a >> stack trace, which seems useful. > > Oh, I see. Objection withdrawn. Committed, thanks. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
spgist rangetypes compiler warning (gcc 7.2.0)
Hi,

while compiling on gcc 7.2.0 (on ARM), I got this warning:

rangetypes_spgist.c: In function 'spg_range_quad_inner_consistent':
rangetypes_spgist.c:559:29: warning: comparison between pointer and zero character constant [-Wpointer-compare]
   if (in->traversalValue != (Datum) 0)
                             ^~
rangetypes_spgist.c:559:10: note: did you mean to dereference the pointer?
   if (in->traversalValue != (Datum) 0)
          ^

I believe we should simply treat the traversalValue as a pointer, and
change the condition to

    if (in->traversalValue)

Patch attached.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

>From 3b0528e9b56b9ff2dec4c3670d78ad9018224065 Mon Sep 17 00:00:00 2001
From: Tomas Vondra
Date: Sat, 18 Nov 2017 20:25:02 +0100
Subject: [PATCH] Fix compiler warning when comparing traversalValue

On gcc 7.2.0, comparing pointer to (Datum)0 produces a warning.
Treat it as a simple pointer to fix that.
---
 src/backend/utils/adt/rangetypes_spgist.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/rangetypes_spgist.c b/src/backend/utils/adt/rangetypes_spgist.c
index d934105..27fbf66 100644
--- a/src/backend/utils/adt/rangetypes_spgist.c
+++ b/src/backend/utils/adt/rangetypes_spgist.c
@@ -556,7 +556,7 @@ spg_range_quad_inner_consistent(PG_FUNCTION_ARGS)
 	 * for lower or upper bounds to be adjacent. Deserialize
 	 * previous centroid range if present for checking this.
 	 */
-	if (in->traversalValue != (Datum) 0)
+	if (in->traversalValue)
 	{
 		prevCentroid = DatumGetRangeTypeP(in->traversalValue);
 		range_deserialize(typcache, prevCentroid,
--
2.9.5
Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Hi, Attached is an updated version of the patch, adopting the psql describe changes introduced by 471d55859c11b. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 0001-multivariate-MCV-lists.patch.gz Description: application/gzip 0002-multivariate-histograms.patch.gz Description: application/gzip
Re: WIP: BRIN multi-range indexes
Hi, Apparently there was some minor breakage due to duplicate OIDs, so here is the patch series updated to current master. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 0001-Pass-all-keys-to-BRIN-consistent-function-at-once.patch.gz Description: application/gzip 0002-BRIN-bloom-indexes.patch.gz Description: application/gzip 0003-BRIN-multi-range-minmax-indexes.patch.gz Description: application/gzip 0004-Move-IS-NOT-NULL-checks-to-bringetbitmap.patch.gz Description: application/gzip
Re: percentile value check can be slow
On Sat, Nov 18, 2017 at 10:44:36AM -0500, Tom Lane wrote:
> jotpe writes:
> > I tried to enter invalid percentile fractions, and was astonished
> > that it seems to be checked only after much work is done?
>
> IIRC, only the aggregate's final-function is concerned with direct
> arguments, so it's not all that astonishing.

It may not be surprising from the point of view of a systems programmer,
but it's pretty surprising that this check is deferred until many seconds
in, when the system has all the information it needs in order to
establish this before execution begins.

I'm not sure I see an easy way to do this check early, but it's worth
trying on grounds of POLA violation. I have a couple of ideas on how to
do this, one less invasive but hinky, the other a lot more invasive but
better overall.

Ugly Hack With Ugly Performance Consequences:
    Inject a subtransaction at the start of execution that supplies an
    empty input to the final function with the supplied direct arguments.

Bigger Lift:
    Require a separate recognizer function for direct arguments and fire
    it during post-parse analysis. Perhaps this could be called
    recognizer, along with the corresponding mrecognizer. It's not clear
    that it's sane to have separate ones, but I thought I'd bring it up
    for completeness.

Way Bigger Lift, As Far As I Can Tell, But More Fun For Users:
    Allow optional CHECK constraints in CREATE AGGREGATE for direct
    arguments.

Best,
David.
--
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] Repetitive code in RI triggers
Ildar Musin writes: > [ ri_triggers_v2.patch ] Pushed with two minor improvements. I noticed that ri_setdefault could just go directly to ri_restrict rather than call the two separate triggers that would end up there anyway; that lets its argument be "TriggerData *trigdata" for more consistency with the other cases. Also, this patch made it very obvious that we were caching identical queries under hash keys RI_PLAN_RESTRICT_DEL_CHECKREF and RI_PLAN_RESTRICT_UPD_CHECKREF, so we might as well just use one hash entry for both cases, saving a few lines of code as well as a lot of cycles. Likewise in the other two functions. regards, tom lane
Re: spgist rangetypes compiler warning (gcc 7.2.0)
Tomas Vondra writes:
> while compiling on gcc 7.2.0 (on ARM), I got this warning:
>
> rangetypes_spgist.c: In function 'spg_range_quad_inner_consistent':
> rangetypes_spgist.c:559:29: warning: comparison between pointer and
> zero character constant [-Wpointer-compare]
>    if (in->traversalValue != (Datum) 0)
>                              ^~

Huh. I wonder why 7.2.1 on Fedora isn't producing that warning.

> I believe we should simply treat the traversalValue as a pointer, and
> change the condition to
>    if (in->traversalValue)

Agreed, especially since it's done like that in spgscan.c and
geo_spgist.c.

			regards, tom lane
Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means
Hi,

On 11/02/2017 08:15 PM, Tom Lane wrote:
> Haribabu Kommi writes:
>> The changes are fine and now it reports proper live tuples in both
>> pg_class and stats. The other issue of continuous vacuum operation
>> leading to decrease of number of live tuples is not related to this
>> patch and that can be handled separately.
>
> I did not like this patch too much, because it did nothing to fix
> the underlying problem of confusion between "live tuples" and "total
> tuples"; in fact, it made that worse, because now the comment on
> LVRelStats.new_rel_tuples is a lie. And there's at least one usage
> of that field value where I think we do want total tuples not live
> tuples: where we pass it to index AM cleanup functions. Indexes
> don't really care whether heap entries are live or not, only that
> they're there. Plus the VACUUM VERBOSE log message that uses the
> field is supposed to be reporting total tuples not live tuples.
>
> I hacked up the patch trying to make that better, finding along the
> way that contrib/pgstattuple shared in the confusion about what
> it was trying to count. Results attached.

Thanks for looking into this. I agree your patch is more consistent and
generally cleaner.

> However, I'm not sure we're there yet, because there remains a fairly
> nasty discrepancy even once we've gotten everyone onto the same page
> about reltuples counting just live tuples: VACUUM and ANALYZE have
> different definitions of what's "live". In particular they do not treat
> INSERT_IN_PROGRESS and DELETE_IN_PROGRESS tuples the same. Should we
> try to do something about that? If so, what? It looks like ANALYZE's
> behavior is pretty tightly constrained, judging by the comments in
> acquire_sample_rows.

Yeah :-(

For the record (and for people following this thread who are too lazy to
open analyze.c and search for the comments), the problem is that
acquire_sample_rows does not count HEAPTUPLE_INSERT_IN_PROGRESS as live
(and HEAPTUPLE_DELETE_IN_PROGRESS as dead), as it assumes the transaction
will send the stats at the end. Which is a correct assumption, but it
means that when there is a long-running transaction that inserted/deleted
many rows, the reltuples value will oscillate during VACUUM / ANALYZE
runs.

ISTM we need to unify those definitions, probably so that VACUUM adopts
what acquire_sample_rows does. I mean, if ANALYZE assumes that the stats
will be updated at the end of the transaction, why shouldn't VACUUM do
the same thing?

The one reason I can think of is that VACUUM is generally expected to run
longer than ANALYZE, so the assumption that large transactions complete
later is not quite as reliable. Or is there some other reason why VACUUM
can't treat the in-progress tuples the same way?

> Another problem is that it looks like CREATE INDEX will set reltuples
> to the total number of heap entries it chose to index, because that
> is what IndexBuildHeapScan counts. Maybe we should adjust that?

You mean by only counting live tuples in IndexBuildHeapRangeScan,
following whatever definition we end up using in VACUUM/ANALYZE? Seems
like a good idea I guess, although CREATE INDEX is not as frequent as the
other commands.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
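To see the kind of discrepancy being discussed, one can compare the two counters for a table after running VACUUM and ANALYZE while a large transaction is still open (illustrative only; the table name is a placeholder):

    SELECT c.reltuples, s.n_live_tup, s.n_dead_tup
      FROM pg_class c
      JOIN pg_stat_user_tables s ON s.relid = c.oid
     WHERE c.relname = 'some_table';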
Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means
Tomas Vondra writes: > On 11/02/2017 08:15 PM, Tom Lane wrote: >> However, I'm not sure we're there yet, because there remains a fairly >> nasty discrepancy even once we've gotten everyone onto the same page >> about reltuples counting just live tuples: VACUUM and ANALYZE have >> different definitions of what's "live". In particular they do not treat >> INSERT_IN_PROGRESS and DELETE_IN_PROGRESS tuples the same. Should we >> try to do something about that? If so, what? It looks like ANALYZE's >> behavior is pretty tightly constrained, judging by the comments in >> acquire_sample_rows. > ISTM we need to unify those definitions, probably so that VACUUM adopts > what acquire_sample_rows does. I mean, if ANALYZE assumes that the stats > will be updated at the end of transaction, why shouldn't VACUUM do the > same thing? That was the way I was leaning. I haven't thought very hard about the implications, but as long as the change in VACUUM's behavior extends only to the live-tuple count it reports, it seems like adjusting it couldn't break anything too badly. >> Another problem is that it looks like CREATE INDEX will set reltuples >> to the total number of heap entries it chose to index, because that >> is what IndexBuildHeapScan counts. Maybe we should adjust that? > You mean by only counting live tuples in IndexBuildHeapRangeScan, > following whatever definition we end up using in VACUUM/ANALYZE? Right. One issue is that, as I mentioned, the index AMs probably want to think about total-tuples-indexed not live-tuples; so for their purposes, what IndexBuildHeapScan currently counts is the right thing. We need to look and see if any AMs are actually using that value rather than just silently passing it back. If they are, we might need to go to the trouble of computing/returning two values. regards, tom lane
Re: percentile value check can be slow
Hi,

On 11/18/2017 10:30 PM, David Fetter wrote:
> On Sat, Nov 18, 2017 at 10:44:36AM -0500, Tom Lane wrote:
>> jotpe writes:
>>> I tried to enter invalid percentile fractions, and was astonished
>>> that it seems to be checked only after much work is done?
>>
>> IIRC, only the aggregate's final-function is concerned with direct
>> arguments, so it's not all that astonishing.
>
> It may not be surprising from the point of view of a systems
> programmer, but it's pretty surprising that this check is deferred
> until many seconds in, when the system has all the information it
> needs in order to establish this before execution begins.
>
> I'm not sure I see an easy way to do this check early, but it's worth
> trying on grounds of POLA violation. I have a couple of ideas on how
> to do this, one less invasive but hinky, the other a lot more invasive
> but better overall.
>
> Ugly Hack With Ugly Performance Consequences:
>     Inject a subtransaction at the start of execution that supplies an
>     empty input to the final function with the supplied direct
>     arguments.

I'm pretty sure you realize this is quite unlikely to get accepted.

> Bigger Lift:
>     Require a separate recognizer function for direct arguments and
>     fire it during post-parse analysis. Perhaps this could be called
>     recognizer, along with the corresponding mrecognizer. It's not
>     clear that it's sane to have separate ones, but I thought I'd
>     bring it up for completeness.

Is 'recognizer' an established definition I should know? Is it the same
as 'validator' or is it something new/different?

> Way Bigger Lift, As Far As I Can Tell, But More Fun For Users:
>     Allow optional CHECK constraints in CREATE AGGREGATE for direct
>     arguments.

How will any of the approaches deal with something like

    select percentile_cont((select array_agg(v) from p))
           within group (order by a) from t;

In this case the values are unknown after parse analysis, so I guess it
does not really address that.

FWIW while I won't stand in the way of improving this, I wonder if this
is really worth the additional complexity. If you get errors like this
with a static list of values, you will fix the list and you're done. If
the list is dynamic (computed in the query itself), you'll still get the
error much later during query execution.

So if you're getting many failures like this for the "delayed error
reporting" to be an issue, perhaps there's something wrong in your stack
and you should address that instead?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: spgist rangetypes compiler warning (gcc 7.2.0)
On 11/18/2017 10:40 PM, Tom Lane wrote:
> Tomas Vondra writes:
>> while compiling on gcc 7.2.0 (on ARM), I got this warning:
>>
>> rangetypes_spgist.c: In function 'spg_range_quad_inner_consistent':
>> rangetypes_spgist.c:559:29: warning: comparison between pointer and
>> zero character constant [-Wpointer-compare]
>>    if (in->traversalValue != (Datum) 0)
>>                              ^~
>
> Huh. I wonder why 7.2.1 on Fedora isn't producing that warning.

Not sure. Maybe Ubuntu uses different flags (on ARM)? This is what I get
from 'gcc -v' on the machine:

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabihf/7/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 7.2.0-8ubuntu3' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=arm-linux-gnueabihf- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libitm --disable-libquadmath --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --enable-multilib --disable-sjlj-exceptions --with-arch=armv7-a --with-fpu=vfpv3-d16 --with-float=hard --with-mode=thumb --disable-werror --enable-multilib --enable-checking=release --build=arm-linux-gnueabihf --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf
Thread model: posix
gcc version 7.2.0 (Ubuntu/Linaro 7.2.0-8ubuntu3)

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Bug in to_timestamp().
Artur Zakirov writes:
> [ 0001-to-timestamp-format-checking-v7.patch ]

This patch needs a rebase over the formatting.c fixes that have gone in
over the last couple of days.

Looking at the rejects, I notice that in your changes to parse_format(),
you seem to be making it rely even more heavily on remembering state
about the previous input. I recommend against that --- it was broken
before, and it's a pretty fragile approach. Backslashes are not that
hard to deal with in-line.

The larger picture though is that I'm not real sure that this patch is
going in the direction we want. All of these cases work in both our code
and Oracle:

select to_timestamp('97/Feb/16', 'YY:Mon:DD')
select to_timestamp('97/Feb/16', 'YY Mon DD')
select to_timestamp('97 Feb 16', 'YY/Mon/DD')

(Well, Oracle thinks these mean 2097 where we think 1997, but the point
is you don't get an error.) I see from your regression test additions
that you want to make some of these throw an error, and I think that any
such proposal is dead in the water. Nobody is going to consider it an
improvement if it both breaks working PG apps and disagrees with Oracle.

			regards, tom lane
Re: [HACKERS] ginInsertCleanup called from vacuum could still miss tuples to be deleted
On Thu, Nov 16, 2017 at 2:16 PM, Jeff Janes wrote: > e2c79e14 was to fix a pre-existing bug, but probably e9568083 made that bug > easier to hit than it was before. (Which is not to say that e9568083 can't > contain bugs of its own, of course) I get that. I think that the same thing may have happened again, in fact. A pending list recycling interlock bug may have merely been unmasked by your commit e9568083; I'm speculating that your commit made it more likely to hit in practice, just as with the earlier bug you mentioned. As you know, there is a significant history of VACUUM page recycling bugs in GIN. I wouldn't be surprised if we found one more in the pending list logic. Consider commit ac4ab97e, a bugfix from late 2013 for posting list page deletion, where the current posting list locking protocol was more or less established. The commit message says: """ ... If a page is deleted, and reused for something else, just as a search is following a rightlink to it from its left sibling, the search would continue scanning whatever the new contents of the page are. That could lead to incorrect query results, or even something more curious if the page is reused for a different kind of a page. To fix, modify the search algorithm to lock the next page before releasing the previous one, and refrain from deleting pages from the leftmost branch of the [posting] tree. ... """ I believe that this commit had GIN avoid deletion of the leftmost branch of posting trees because that makes it safe to get to the posting tree root from a raw root block number (a raw block number from the B-tree index constructed over key values). The buffer lock pairing/crabbing this commit adds to posting lists (similar to the pairing/crabbing that pending lists had from day one, when first added in 2011) can prevent a concurrent deletion once you reach the posting tree root. But, as I said, you still need to reliably get to the root in the first place, which the "don't delete leftmost" rule ensures (if you can never delete it, you can never recycle it, and no reader gets confused). It's not clear why it's safe to recycle the pending list pages at all (though deletion alone might well be okay, because readers can notice a page is deleted if it isn't yet recycled). Why is it okay that we can jump to the first block from the meta page in the face of concurrent recycling? Looking at scanPendingInsert(), it does appear that readers do buffer lock crabbing/coupling for the meta page and the first pending page. So that's an encouraging sign. However, ginInsertCleanup() *also* uses shared buffer locks for both meta page and list head page (never exclusive buffer locks) in exactly the same way when first establishing what block is at the head of the list to be zapped. It's acting as if it is a reader, but it's actually a writer. I don't follow why this is correct. It looks like scanPendingInsert() can decide that it knows where the head of the pending list is *after* ginInsertCleanup() has already decided that that same head of the list block needs to possibly be deleted/recycled. Have I missed something? We really ought to make the asserts on gin page type into "can't happen" elog errors, as a defensive measure, in both scanPendingInsert() and ginInsertCleanup(). The 2013 bug fix actually did this for the posting tree, but didn't touch similar pending list code. -- Peter Geoghegan
Re: [JDBC] [HACKERS] Channel binding support for SCRAM-SHA-256
On Sun, Nov 19, 2017 at 12:56 AM, Peter Eisentraut wrote:
> On 11/18/17 06:32, Michael Paquier wrote:
>> +   cbind_header_len = 4 + strlen(state->channel_binding_type); /* p=type,, */
>> +   cbind_input_len = cbind_header_len + cbind_data_len;
>> +   cbind_input = malloc(cbind_input_len);
>> +   if (!cbind_input)
>> +       goto oom_error;
>> +   snprintf(cbind_input, cbind_input_len, "p=%s", state->channel_binding_type);
>> +   memcpy(cbind_input + cbind_header_len, cbind_data, cbind_data_len);
>> By looking at RFC 5802, a base64 encoding of cbind-input is used:
>>     cbind-input = gs2-header [ cbind-data ]
>>     gs2-header  = gs2-cbind-flag "," [ authzid ] ","
>> However you are missing two commands after p=%s, no?
>
> fixed

s/commands/commas/. You caught my words correctly.

> I have committed the patch with the above fixes.

Thanks, Peter!

> I'll be off for a week, so perhaps by that time you could make a rebased
> version of the rest? I'm not sure how much more time I'll have, so
> maybe it will end up being moved to the next CF.

OK, let's see then. That's not an issue for me if this gets bumped.
--
Michael
Re: Logical Replication and triggers
On 15 November 2017 at 21:12, Thomas Rosenstein
<thomas.rosenst...@creamfinance.com> wrote:

> I would like somebody to consider Petr Jelineks patch for worker.c from
> here (https://www.postgresql.org/message-id/619c557d-93e6-1833-1692-b010b176ff77%402ndquadrant.com)
>
> I was facing the same issue with 10.1 and BEFORE INSERT OR UPDATE
> triggers.

Please:

- Apply it to current HEAD
- Test its functionality
- Report back on the patch thread
- Update the commitfest app with your results and sign on as a reviewer
- If you're able, read over the patch and make any comments you can

"Somebody" needs to be you, if you want this functionality.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] new function for tsquery creartion
Hi,

On 09/13/2017 10:57 AM, Victor Drobny wrote:
> On 2017-09-09 06:03, Thomas Munro wrote:
>> Please send a rebased version of the patch for people to review and
>> test as that one has bit-rotted.
> Hello,
> Thank you for the interest. In the attachment you can find a rebased
> version (based on commit 69835bc8988812c960f4ed5aeee86b62ac73602a)

I did a quick review of the patch today. The patch unfortunately no
longer applies, so I had to use an older commit from September. Please
rebase to current master. I've only looked at the diff at this point,
and will do more testing once the rebase happens.

Some comments:

1) This seems to mix multiple improvements into one patch. There's the
support for the alternative query syntax, and then there are the new
operators (AROUND and <n,m>). I propose to split the patch into two or
more parts, each addressing one of those bits. I guess there will be two
or three parts - first adding the syntax, second adding <n,m> and third
adding AROUND(n). Seems reasonable?

2) I don't think we should mention Google in the docs explicitly. Not
that I'm somehow anti-google, but this syntax was certainly not invented
by Google - I vividly remember using something like that on Altavista
(yeah, I'm old). And it's used by pretty much every other web search
engine out there ...

3) In the SGML docs, please use <literal> tags instead of just quoting
the values. So it should be <literal>|</literal> instead of '|' etc.
Just like in the parts describing plainto_tsquery, for example.

4) Also, I recommend adding a brief explanation of what the examples do.
Right now there's just a bunch of queryto_tsquery calls, and the reader
is expected to understand the output. I suggest adding a sentence or
two, explaining what's happening (just like for the plainto_tsquery
examples).

5) I'm not sure about negative values in the <n,m> operator. I don't
find it particularly confusing - once you understand that (a <n,m> b)
means "there are 'k' words between 'a' and 'b' (n <= k <= m)", then
negative values seem like a fairly straightforward extension. But I
guess the main question is "Is there really a demand for the new
operator, or have we just invented it because we can?"

6) There seem to be some new constants defined, with names not following
the naming convention. I mean this

#define WAITOPERAND          1
#define WAITOPERATOR         2
#define WAITFIRSTOPERAND     3
#define WAITSINGLEOPERAND    4
#define INSIDEQUOTES         5     <-- the new one

and

#define TSPO_L_ONLY     0x01
#define TSPO_R_ONLY     0x02
#define TSPO_BOTH       0x04
#define TS_NOT_EXAC     0x08      <-- the new one

Perhaps that's OK, but it seems a bit inconsistent.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] pg_basebackup --progress output for batch execution
On 11/10/2017 02:32 PM, Martín Marqués wrote:
> Hi,
>
> Thanks for having a look at this patch.
>
> 2017-11-09 20:55 GMT-03:00 Jeff Janes:
>> On Fri, Sep 29, 2017 at 4:00 PM, Martin Marques wrote:
>>>
>>> Hi,
>>>
>>> Some time ago I had to work on a system where I was cloning a standby
>>> using pg_basebackup, that didn't have screen or tmux. For that reason I
>>> redirected the output to a file and ran it with nohup.
>>>
>>> I normally (always actually ;) ) run pg_basebackup with --progress and
>>> --verbose so I can follow how much has been done. When done on a tty you
>>> get a nice progress bar with the percentage that has been cloned.
>>>
>>> The problem came with the execution and redirection of the output, as
>>> the --progress option will write a *very* long line!
>>>
>>> Back then I thought of writing a patch (actually someone suggested I do
>>> so) to add a --batch-mode option which would change the behavior
>>> pg_basebackup has when printing the output messages.
>>
>> While separate lines in the output file is better than one very long line,
>> it still doesn't seem so useful. If you aren't watching it in real time,
>> then you really need to have a timestamp on each line so that you can
>> interpret the output. The lines are about one second apart, but I don't
>> know how robust that timing is under all conditions.
>
> I kind of disagree with your view here.
>
> If the cloning process takes many hours to complete (in my case, it
> was around 12 hours IIRC) you might want to peek at the log every now
> and then with tail.
>
> I do agree on adding a timestamp prefix to each line, as it's not
> clear from the code how often progress_report() is called.
>
>> I think I agree with Arthur that I'd rather have the decision made by
>> inspecting whether output is going to a tty, rather than by adding another
>> command line option. But maybe that is not detected robustly enough across
>> all platforms and circumstances?
>
> In this case, Arthur's idea is good, but would make the patch less
> generic/flexible for the end user.

I'm not quite sure about that. We're not getting the flexibility for
free, but for additional complexity (an additional command-line option).
And what valuable capability would we (or the users) lose, really, by
relying on the isatty call? So what use case is possible with
--batch-mode but not with isatty (e.g. when combined with tee)?

> That's why I tried to reproduce what top does when executed with -b
> (Batch mode operation). There, it's the end user who decides how the
> output is formatted (well, saying it decides on formatting is a bit of
> an overstatement, but you get it ;) )

In 'top' a batch mode also disables input, and runs for a certain number
of iterations (as determined by the '-n' option). That's not the case in
pg_basebackup, where it only adds the extra newline.

> An example where using isatty() might fail is if you run
> pg_basebackup from a tty but redirect the output to a file. I
> believe that in that case isatty() will return true, but it's very
> likely that the user might want batch mode output.

IMHO that should work correctly, as already explained by Arthur.

> But maybe we should also add Arthur's idea anyway (when not in batch
> mode), as it seems pretty lame to output progress in one line if you
> are not in a tty.

I'd say to just use isatty, unless we can come up with a compelling use
case that is not satisfied by it.

And if we end up using --batch-mode, perhaps we should only allow it
when --progress is used. It's the only thing it affects, and it makes no
difference when used without it.

Furthermore, I propose to reword this:

    Runs pg_basebackup in batch mode. This is useful if the output is to
    be pipped so the other end of the pipe reads each line. Using this
    option with --progress will result in printing each progress output
    with a newline at the end, instead of a carrige return.

like this:

    Runs pg_basebackup in batch mode, in which --progress terminates the
    lines with a regular newline instead of carriage return. This is
    useful if the output is redirected to a file or a pipe, instead of a
    plain console.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
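For illustration, the isatty-based approach amounts to something like the following. This is a minimal sketch, not the actual pg_basebackup code, and the function name and message format are made up:

    #include <stdio.h>
    #include <unistd.h>

    /* sketch: choose the --progress line terminator based on whether
     * stderr is a terminal; overwrite the line in place on a tty,
     * otherwise emit one complete line per report so logs stay readable */
    static void
    report_progress(long done_kb, long total_kb)
    {
        const char *eol = isatty(fileno(stderr)) ? "\r" : "\n";
        int         pct = (total_kb > 0) ? (int) ((done_kb * 100) / total_kb) : 0;

        fprintf(stderr, "%ld/%ld kB (%d%%)%s", done_kb, total_kb, pct, eol);
    }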
Re: percentile value check can be slow
On Sat, Nov 18, 2017 at 11:05:47PM +0100, Tomas Vondra wrote:
> Hi,
>
> On 11/18/2017 10:30 PM, David Fetter wrote:
> > On Sat, Nov 18, 2017 at 10:44:36AM -0500, Tom Lane wrote:
> >> jotpe writes:
> >>> I tried to enter invalid percentile fractions, and was astonished
> >>> that it seems to be checked only after much work is done?
> >>
> >> IIRC, only the aggregate's final-function is concerned with direct
> >> arguments, so it's not all that astonishing.
> >
> > It may not be surprising from the point of view of a systems
> > programmer, but it's pretty surprising that this check is deferred
> > until many seconds in, when the system has all the information it
> > needs in order to establish this before execution begins.
> >
> > I'm not sure I see an easy way to do this check early, but it's worth
> > trying on grounds of POLA violation. I have a couple of ideas on how
> > to do this, one less invasive but hinky, the other a lot more invasive
> > but better overall.
> >
> > Ugly Hack With Ugly Performance Consequences:
> >     Inject a subtransaction at the start of execution that supplies an
> >     empty input to the final function with the supplied direct
> >     arguments.
>
> I'm pretty sure you realize this is quite unlikely to get accepted.

I should hope not, but I mentioned it because I'd like to get it on the
record as rejected.

> > Bigger Lift:
> >     Require a separate recognizer function for direct arguments and
> >     fire it during post-parse analysis. Perhaps this could be called
> >     recognizer, along with the corresponding mrecognizer. It's not
> >     clear that it's sane to have separate ones, but I thought I'd
> >     bring it up for completeness.
>
> Is 'recognizer' an established definition I should know? Is it the same
> as 'validator' or is it something new/different?

I borrowed it from http://langsec.org/

I'm not entirely sure what you mean by a validator, but a recognizer is
something that gives a quick and sure read as to whether the input is
well-formed. In general, it's along the lines of a tokenizer, a parser,
and something that does very light post-parse analysis for correctness
of form.

For the case that started the thread, a recognizer would check something
along the lines of

    CHECK('[0,1]' @> ALL(input_array))

> > Way Bigger Lift, As Far As I Can Tell, But More Fun For Users:
> >     Allow optional CHECK constraints in CREATE AGGREGATE for direct
> >     arguments.
>
> How will any of the approaches deal with something like
>
>     select percentile_cont((select array_agg(v) from p))
>            within group (order by a) from t;
>
> In this case the values are unknown after parse analysis, so I guess it
> does not really address that.

It doesn't. Does it make sense to do a one-shot execution for cases like
that? It might well be worth it to do the aggregate once in advance as a
throw-away if the query execution time is already going to take awhile.

Of course, you can break that one by making p a JOIN to yet another
thing...

> FWIW while I won't stand in the way of improving this, I wonder if this
> is really worth the additional complexity. If you get errors like this
> with a static list of values, you will fix the list and you're done. If
> the list is dynamic (computed in the query itself), you'll still get the
> error much later during query execution.
>
> So if you're getting many failures like this for the "delayed error
> reporting" to be an issue, perhaps there's something wrong in your stack
> and you should address that instead?

I'd like to think that getting something to fail quickly and cheaply when
it can will give our end users a better experience. Here, "cheaply"
refers to their computing resources and time. Clearly, not having this
happen in this case bothered Johannes enough to wade in here.

Best,
David.
--
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
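In SQL terms, the kind of up-front check a recognizer would perform is roughly the following. This is illustrative only; percentile_cont obviously does not run anything like it today:

    -- does every requested fraction fall inside [0,1]?
    SELECT numrange(0, 1, '[]') @> ALL (ARRAY[0, 0.25, 0.5, 1, 1, 2]::numeric[]);
    -- returns false, so the query could be rejected before reading any rows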