Re: SQL:2023 JSON simplified accessor support
On 2024-09-27 Fr 5:49 AM, David E. Wheeler wrote:
> On Sep 26, 2024, at 16:45, Alexandra Wang wrote:
>> I didn’t run pgindent earlier, so here’s the updated version with the correct indentation. Hope this helps!
>
> Oh, nice! I don’t suppose the standard also has defined an operator equivalent to ->>, though, has it? I tend to want the text output far more often than a JSON scalar.

That would defeat being able to chain these.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
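For illustration, the chaining being discussed looks like this with the existing operators (the table and column names here are made up): -> keeps returning jsonb, so it can be chained, while ->> returns text and therefore only works as the last step.

    SELECT (doc -> 'customer' -> 'address') ->> 'city'
    FROM orders;  -- hypothetical table "orders" with a jsonb column "doc"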
Re: Opinion poll: Sending an automated email to a thread when it gets added to the commitfest
On Thu, Sep 26, 2024 at 2:57 PM Jelte Fennema-Nio wrote: > On Thu, 26 Sept 2024 at 08:06, Robert Haas wrote: > > Focusing on the first patch seems odd to me, though > > Indeed the first few patches will often be small, and the big patch > will appear later. When I split patches up, those small patches should > usually be reviewable without looking at the big patch in detail, and > hopefully they shouldn't be too contentious: e.g. a comment > improvement or some small refactor. But often those patches don't seem > to be reviewed significantly quicker or merged significantly earlier > than the big patch. That makes it seem to me that even though they > should be relatively low-risk to commit and low-effort to review, > reviewers are scared away by the sheer number of patches in the > patchset, or by the size of the final patch. That's why I thought it > could be useful to specifically show the size of the first patch in > addition to the total patchset size, so that reviewers can easily spot > some small hopefully easy to review patch at the start of a patchset. Fair enough! Personally what I'd want to know is how large the biggest patch is, but I see your point, too. -- Robert Haas EDB: http://www.enterprisedb.com
Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)
Re: Tom Lane > So the first part of that is great, but if your script file is > large you probably won't be happy about having the whole thing > repeated in the "QUERY" field. So this needs some work on > user-friendliness. Does this really have to be addressed? It would be way better than it is now, and errors during extension creation are rare and mostly for developers only, so it doesn't have to be pretty. > I'm inclined to think that maybe we'd be best off keeping the server > end of it straightforward, and trying to teach psql to abbreviate the > QUERY field in a useful way. IIRC you get this same type of problem > with very large SQL-language functions and suchlike. I'd treat this as a separate patch, if it's considered to be a good idea. Christoph
Re: Conflict Detection and Resolution
On Fri, Sep 27, 2024 at 10:44 AM shveta malik wrote:
> > > Thanks for the review.
> > > Here is the v14 patch-set fixing review comments in [1] and [2].
> >
> > Thanks for the patches. I am reviewing patch001, it is WIP, but please
> > find initial set of comments:
> >

Please find the next set of comments.

16) In pg_dump.h, there is a lot of duplication of structures from conflict.h; we can avoid that by making the below changes:
--In SubscriptionInfo(), we can have a list of ConflictTypeResolver structures and fill the elements of the list in getSubscriptions() simply from the output of pg_subscription_conflict.
--Then in dumpSubscription() we can traverse the list to verify whether the resolver is the default one; if so, skip the dump. We can create a new function to return whether the resolver is the default or not.
--We can get rid of enum ConflictType, enum ConflictResolver, ConflictResolverNames, ConflictTypeDefaultResolvers from pg_dump.h.

17) In describe.c, we can have an 'order by' in the query so that the order does not change every time we update a resolver. Please see this: for sub1, \dRs was showing the below output for Conflict Resolvers:

insert_exists = error, update_origin_differs = apply_remote, update_exists = error, update_missing = skip, delete_origin_differs = apply_remote, delete_missing = skip

Once I update a resolver, the order gets changed:

postgres=# ALTER SUBSCRIPTION sub1 CONFLICT RESOLVER (insert_exists='apply_remote');
ALTER SUBSCRIPTION

\dRs: update_origin_differs = apply_remote, update_exists = error, update_missing = skip, delete_origin_differs = apply_remote, delete_missing = skip, insert_exists = apply_remote

18) Similarly, after making change 16, it would be good to maintain the order for pg_dump too, and thus have an order-by in pg_dump's query as well.

thanks
Shveta
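To make comment 17 concrete, here is a rough sketch of the kind of ordered query describe.c could run; pg_subscription_conflict comes from this patch set, and the column names used below are placeholders rather than the patch's actual definitions.

    -- Hypothetical column names; the point is only the ORDER BY, so that
    -- \dRs output stays stable regardless of the order in which resolvers are updated.
    SELECT conftype AS conflict_type, confres AS conflict_resolver
    FROM pg_subscription_conflict
    WHERE confsubid = (SELECT oid FROM pg_subscription WHERE subname = 'sub1')
    ORDER BY conftype;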
Re: Using per-transaction memory contexts for storing decoded tuples
On Mon, 23 Sept 2024 at 09:59, Amit Kapila wrote:
>
> On Sun, Sep 22, 2024 at 11:27 AM David Rowley wrote:
> >
> > On Fri, 20 Sept 2024 at 17:46, Amit Kapila wrote:
> > >
> > > On Fri, Sep 20, 2024 at 5:13 AM David Rowley wrote:
> > > > In general, it's a bit annoying to have to code around this
> > > > GenerationContext fragmentation issue.
> > >
> > > Right, and I am also slightly afraid that this may not cause some
> > > regression in other cases where defrag wouldn't help.
> >
> > Yeah, that's certainly a possibility. I was hoping that
> > MemoryContextMemAllocated() being much larger than logical_work_mem
> > could only happen when there is fragmentation, but certainly, you
> > could be wasting effort trying to defrag transactions where the
> > changes all arrive in WAL consecutively and there is no
> > defragmentation. It might be some other large transaction that's
> > causing the context's allocations to be fragmented. I don't have any
> > good ideas on how to avoid wasting effort on non-problematic
> > transactions. Maybe there's something that could be done if we knew
> > the LSN of the first and last change and the gap between the LSNs was
> > much larger than the WAL space used for this transaction. That would
> > likely require tracking way more stuff than we do now, however.
> >
>
> With more information tracking, we could avoid some non-problematic
> transactions but still, it would be difficult to predict that we
> didn't harm many cases because to make the memory non-contiguous, we
> only need a few interleaving small transactions. We can try to think
> of ideas for implementing defragmentation in our code if we first can
> prove that smaller block sizes cause problems.
>
> > With the smaller blocks idea, I'm a bit concerned that using smaller
> > blocks could cause regressions on systems that are better at releasing
> > memory back to the OS after free() as no doubt malloc() would often be
> > slower on those systems. There have been some complaints recently
> > about glibc being a bit too happy to keep hold of memory after free()
> > and I wondered if that was the reason why the small block test does
> > not cause much of a performance regression. I wonder how the small
> > block test would look on Mac, FreeBSD or Windows. I think it would be
> > risky to assume that all is well with reducing the block size after
> > testing on a single platform.
> >
>
> Good point. We need extensive testing on different platforms, as you
> suggest, to verify if smaller block sizes caused any regressions.

I did similar tests on Windows. rb_mem_block_size was changed from 8kB to 8MB. Below table shows the result (average of 5 runs) and Standard Deviation (of 5 runs) for each block-size.

===================================================
block-size | Average time (ms) | Standard Deviation (ms)
---------------------------------------------------
8kB        | 12580.879         | 144.6923467
16kB       | 12442.7256        | 94.02799006
32kB       | 12370.7292        | 97.7958552
64kB       | 11877.4888        | 222.2419142
128kB      | 11828.8568        | 129.732941
256kB      | 11801.086         | 20.60030913
512kB      | 12361.4172        | 65.27390105
1MB        | 12343.3732        | 80.84427202
2MB        | 12357.675         | 79.40017604
4MB        | 12395.8364        | 76.78273689
8MB        | 11712.8862        | 50.74323039
===================================================

From the results, I think there is a small regression for small block size. I ran the tests in git bash. I have also attached the test script.
Thanks and Regards,
Shlok Kyal

#!/bin/bash

if [ -z "$1" ]
then
    size="8kB"
else
    size=$1
fi

echo 'Clean up'
echo $size
./pg_ctl stop -D data
rm -rf data* logfile

echo 'Set up'
./initdb -D data -U postgres

cat << EOF >> data/postgresql.conf
wal_level = logical
autovacuum = false
checkpoint_timeout = 1h
shared_buffers = '10GB'
work_mem = '1GB'
logical_decoding_work_mem = '2097151 kB'
max_wal_size = 20GB
min_wal_size = 10GB
rb_mem_block_size = $size
EOF

./pg_ctl -D data start -w -l logfile

( echo -E "SELECT * FROM pg_create_logical_replication_slot('test', 'test_decoding');"
  echo -E "CREATE TABLE foo (id int);"
  echo -E "INSERT INTO foo VALUES (generate_series(1, 1000));"
) | ./psql -U postgres

for i in `seq 1 5`
do
  ( echo -E "\timing"
    echo -E "SELECT count(*) FROM pg_logical_slot_peek_changes('test', NULL, NULL);"
  ) | ./psql -U postgres
done
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)
Christoph Berg writes: > Re: Tom Lane >> (It might be worth some effort to trim away comments appearing >> just before a command, but I didn't tackle that here.) > The "error when psql" comments do look confusing, but I guess in other > places the comment just before the query adds valuable context, so I'd > say leaving the comments in is ok. It looks like if we did want to suppress that, the right fix is to make gram.y track statement start locations more honestly, as in 0002 attached (0001 is the same as before). This'd add a few cycles per grammar nonterminal reduction, which is kind of annoying but probably is negligible in the grand scheme of things. Still, I'd not propose it just for this. But if memory serves, we've had previous complaints about pg_stat_statements failing to strip leading comments from queries, and this'd fix that. I think it likely also improves error cursor positioning for cases involving empty productions --- I'm a tad surprised that no other regression cases changed. regards, tom lane diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c index fab59ad5f6..0fae1332d2 100644 --- a/src/backend/commands/extension.c +++ b/src/backend/commands/extension.c @@ -54,6 +54,7 @@ #include "funcapi.h" #include "mb/pg_wchar.h" #include "miscadmin.h" +#include "nodes/queryjumble.h" #include "storage/fd.h" #include "tcop/utility.h" #include "utils/acl.h" @@ -107,6 +108,17 @@ typedef struct ExtensionVersionInfo struct ExtensionVersionInfo *previous; /* current best predecessor */ } ExtensionVersionInfo; +/* + * Information for script_error_callback() + */ +typedef struct +{ + const char *sql; /* entire script file contents */ + const char *filename; /* script file pathname */ + ParseLoc stmt_location; /* current stmt start loc, or -1 if unknown */ + ParseLoc stmt_len; /* length in bytes; 0 means "rest of string" */ +} script_error_callback_arg; + /* Local functions */ static List *find_update_path(List *evi_list, ExtensionVersionInfo *evi_start, @@ -670,9 +682,60 @@ read_extension_script_file(const ExtensionControlFile *control, return dest_str; } +/* + * error context callback for failures in script-file execution + */ +static void +script_error_callback(void *arg) +{ + script_error_callback_arg *callback_arg = (script_error_callback_arg *) arg; + int syntaxerrposition; + const char *lastslash; + + /* If it's a syntax error, convert to internal syntax error report */ + syntaxerrposition = geterrposition(); + if (syntaxerrposition > 0) + { + /* + * We must report the whole string because otherwise details such as + * psql's line number report would be wrong. + */ + errposition(0); + internalerrposition(syntaxerrposition); + internalerrquery(callback_arg->sql); + } + else if (callback_arg->stmt_location >= 0) + { + /* + * Since no syntax cursor will be shown, it's okay and helpful to trim + * the reported query string to just the current statement. + */ + const char *query = callback_arg->sql; + int location = callback_arg->stmt_location; + int len = callback_arg->stmt_len; + + query = CleanQuerytext(query, &location, &len); + internalerrquery(pnstrdup(query, len)); + } + + /* + * Trim the reported file name to remove the path. We know that + * get_extension_script_filename() inserted a '/', regardless of whether + * we're on Windows. 
+ */ + lastslash = strrchr(callback_arg->filename, '/'); + if (lastslash) + lastslash++; + else + lastslash = callback_arg->filename; /* shouldn't happen, but cope */ + errcontext("extension script file \"%s\"", lastslash); +} + /* * Execute given SQL string. * + * The filename the string came from is also provided, for error reporting. + * * Note: it's tempting to just use SPI to execute the string, but that does * not work very well. The really serious problem is that SPI will parse, * analyze, and plan the whole string before executing any of it; of course @@ -682,12 +745,27 @@ read_extension_script_file(const ExtensionControlFile *control, * could be very long. */ static void -execute_sql_string(const char *sql) +execute_sql_string(const char *sql, const char *filename) { + script_error_callback_arg callback_arg; + ErrorContextCallback scripterrcontext; List *raw_parsetree_list; DestReceiver *dest; ListCell *lc1; + /* + * Setup error traceback support for ereport(). + */ + callback_arg.sql = sql; + callback_arg.filename = filename; + callback_arg.stmt_location = -1; + callback_arg.stmt_len = -1; + + scripterrcontext.callback = script_error_callback; + scripterrcontext.arg = (void *) &callback_arg; + scripterrcontext.previous = error_context_stack; + error_context_stack = &scripterrcontext; + /* * Parse the SQL string into a list of raw parse trees. */ @@ -709,6 +787,10 @@ execute_sql_string(const char *sql) List *stmt_list; ListCell *lc2; + /* Report location of this query for error c
Re: [PoC] Federated Authn/z with OAUTHBEARER
Jacob Champion wrote:
> Peter asked me if there were plans to provide a "standard" validator
> module, say as part of contrib. The tricky thing is that Bearer
> validation is issuer-specific, and many providers give you an opaque
> token that you're not supposed to introspect at all.
>
> We could use token introspection (RFC 7662) for online verification,
> but last I looked at it, no one had actually implemented those
> endpoints. For offline verification, I think the best we could do
> would be to provide a generic JWT Profile (RFC 9068) validator, but
> again I don't know if anyone is actually providing those token formats
> in practice. I'm inclined to push that out into the future.

Have you considered sending the token for validation to the server, like this

curl -X GET "https://www.googleapis.com/oauth2/v3/userinfo" -H "Authorization: Bearer $TOKEN"

and getting the userid (e.g. email address) from the response, as described in [1]? ISTM that this is what pgadmin4 does - in particular, see the get_user_profile() function in web/pgadmin/authenticate/oauth2.py.

[1] https://www.oauth.com/oauth2-servers/signing-in-with-google/verifying-the-user-info/

--
Antonin Houska
Web: https://www.cybertec-postgresql.com
Re: MAINTAIN privilege -- what do we need to un-revert it?
On Fri, Sep 27, 2024 at 01:27:38PM -0700, Jeff Davis wrote: > Looks good to me. Thanks, committed. -- nathan
Re: Vacuum statistics
On Fri, Sep 27, 2024 at 2:16 PM Masahiko Sawada wrote: > > Hi, > > On Thu, Sep 5, 2024 at 2:01 PM Alena Rybakina > wrote: > > > > Hi! Thank you for your review! > > > > On 05.09.2024 15:47, jian he wrote: > > > > On Thu, Sep 5, 2024 at 1:23 AM Alena Rybakina > > wrote: > > > > Hi, all! > > > > I have attached the new version of the code and the diff files > > (minor-vacuum.no-cbot). > > Thank you for updating the patches. I've reviewed the 0001 patch and > have two comments. I took a very brief look at this and was wondering if it was worth having a way to make the per-table vacuum statistics opt-in (like a table storage parameter) in order to decrease the shared memory footprint of storing the stats. - Melanie
Re: BitmapHeapScan streaming read user and prelim refactoring
On Mon, Aug 26, 2024 at 10:49 AM Tom Lane wrote: > > Robert Haas writes: > > On Wed, Jun 19, 2024 at 2:21 PM Melanie Plageman > > wrote: > >> If we want to make it possible to use no tools and only manually grep > >> for struct members, that means we can never reuse struct member names. > >> Across a project of our size, that seems like a very serious > >> restriction. Adding prefixes in struct members makes it harder to read > >> code -- both because it makes the names longer and because people are > >> more prone to abbreviate the meaningful parts of the struct member > >> name to make the whole name shorter. > > > I don't think we should go so far as to never reuse a structure member > > name. But I also do use 'git grep' a lot to find stuff, and I don't > > appreciate it when somebody names a key piece of machinery 'x' or 'n' > > or something, especially when references to that thing could > > reasonably occur almost anywhere in the source code. So if somebody is > > creating a struct whose names are fairly generic and reasonably short, > > I like the idea of using a prefix for those names. If the structure > > members are things like that_thing_i_stored_behind_the_fridge (which > > is long) or cytokine (which is non-generic) then they're greppable > > anyway and it doesn't really matter. But surely changing something > > like rs_flags to just flags is just making everyone's life harder: > > I'm with Robert here: I care quite a lot about the greppability of > field names. I'm not arguing for prefixes everywhere, but I don't > think we should strip out prefixes we've already created, especially > if the result will be to have extremely generic field names. Okay, got it -- folks like the prefixes. I'm picking this patch set back up again after a long pause and I will restore all prefixes. What does the rs_* in the HeapScanDescData stand for, though? - Melanie
Re: Vacuum statistics
Hi,

On Fri, 2024-09-27 at 11:15 -0700, Masahiko Sawada wrote:
> I'm concerned that a pg_stat_vacuum_tables view has some duplicated
> statistics that we already collect in different ways. For instance,
> total_blks_{read,hit,dirtied,written} are already tracked at
> system-level by pg_stat_io,

pg_stat_vacuum_tables.total_blks_{read,hit,dirtied,written} tracks the blocks used by vacuum while vacuuming this particular table, whereas pg_stat_io tracks blocks used by vacuum at the cluster level.

> and per-relation block I/O statistics can
> be collected using pg_stat_statements.

This is impossible. pg_stat_statements tracks block statistics at the statement level. One statement can touch many tables and many indexes, and all database blocks used are counted by the pg_stat_statements counters for that statement. Autovacuum statistics won't be accounted for by pg_stat_statements at all. And pg_stat_statements won't hold statement statistics forever: under pressure from new statements, entries can be evicted and their statistics lost.

All of the above is addressed by the relation-level vacuum statistics held in the Cumulative Statistics System proposed by this patch.

--
regards, Andrei Zubkov
Postgres Professional
Re: pg_ctl/miscinit: print "MyStartTime" as a long long instead of long to avoid 2038 problem.
On Fri, Sep 27, 2024 at 02:48:01PM +, Max Johnson wrote: > I think that it would be a good idea to include these fixes in the next > minor release. After working for a couple months on getting our embedded > systems 2038 compliant, it has become very apparent that 2038 will be a > substantial ordeal. Maximizing the number of systems that include this > fix would make things a little easier when that time comes around. Alright. I was able to back-patch it to v12 without too much trouble, fortunately. I'll commit that soon unless anyone else has feedback. -- nathan
Re: SQL:2023 JSON simplified accessor support
On Sep 27, 2024, at 12:07, Andrew Dunstan wrote: > That would defeat being able to chain these. Not if it’s a different operator. But I’m fine to just keep using ->> at the end of a chain. D
Re: BitmapHeapScan streaming read user and prelim refactoring
On Wed, Jun 19, 2024 at 2:13 PM Melanie Plageman wrote: > > On Wed, Jun 19, 2024 at 12:38 PM Tomas Vondra > wrote: > > > > + * XXX I don't understand why we should have this special node if we > > > + * don't have special nodes for other scan types. > > > > > > In this case, up until the final commit (using the read stream > > > interface), there are six fields required by bitmap heap scan that are > > > not needed by any other user of HeapScanDescData. There are also > > > several members of HeapScanDescData that are not needed by bitmap heap > > > scans and all of the setup in initscan() for those fields is not > > > required for bitmap heap scans. > > > > > > Also, because the BitmapHeapScanDesc needs information like the > > > ParallelBitmapHeapState and prefetch_maximum (for use in prefetching), > > > the scan_begin() callback would have to take those as parameters. I > > > thought adding so much bitmap table scan-specific information to the > > > generic table scan callbacks was a bad idea. > > > > > > Once we add the read stream API code, the number of fields required > > > for bitmap heap scan that are in the scan descriptor goes down to > > > three. So, perhaps we could justify having that many bitmap heap > > > scan-specific fields in the HeapScanDescData. > > > > > > Though, I actually think we could start moving toward having > > > specialized scan descriptors if the requirements for that scan type > > > are appreciably different. I can't think of new code that would be > > > added to the HeapScanDescData that would have to be duplicated over to > > > specialized scan descriptors. > > > > > > With the final read stream state, I can see the argument for bloating > > > the HeapScanDescData with three extra members and avoiding making new > > > scan descriptors. But, for the intermediate patches which have all of > > > the bitmap prefetch members pushed down into the HeapScanDescData, I > > > think it is really not okay. Six members only for bitmap heap scans > > > and two bitmap-specific members to begin_scan() seems bad. > > > > > > What I thought we plan to do is commit the refactoring patches > > > sometime after the branch for 18 is cut and leave the final read > > > stream patch uncommitted so we can do performance testing on it. If > > > you think it is okay to have the six member bloated HeapScanDescData > > > in master until we push the read stream code, I am okay with removing > > > the BitmapTableScanDesc and BitmapHeapScanDesc. > > > > > > > I admit I don't have a very good idea what the ideal / desired state > > look like. My comment is motivated solely by the feeling that it seems > > strange to have one struct serving most scan types, and then a special > > struct for one particular scan type ... > > I see what you are saying. We could make BitmapTableScanDesc inherit > from TableScanDescData which would be similar to what we do with other > things like the executor scan nodes themselves. We would waste space > and LOC with initializing the unneeded members, but it might seem less > weird. > > Whether we want the specialized scan descriptors at all is probably > the bigger question, though. > > The top level BitmapTableScanDesc is motivated by wanting fewer bitmap > table scan specific members passed to scan_begin(). And the > BitmapHeapScanDesc is motivated by this plus wanting to avoid bloating > the HeapScanDescData. 
> > If you look at at HEAD~1 (with my patches applied) and think you would > be okay with > 1) the contents of the BitmapHeapScanDesc being in the HeapScanDescData and > 2) the extra bitmap table scan-specific parameters in scan_begin_bm() > being passed to scan_begin() > > then I will remove the specialized scan descriptors. > > The final state (with the read stream) will still have three bitmap > heap scan-specific members in the HeapScanDescData. > > Would it help if I do a version like this so you can see what it is like? I revisited this issue (how to keep from bloating the Heap and Table scan descriptors and adding many parameters to the scan_begin() table AM callback) and am trying to find a less noisy way to address it than my previous proposal. I've attached a prototype of what I think might work applied on top of master instead of on top of my patchset. For the top-level TableScanDescData, I suggest we use a union with the members of each scan type in it in anonymous structs (see 0001). This will avoid too much bloat because there are other scan types (like TID Range scans) whose members we can move into the union. It isn't great, but it avoids a new top-level scan descriptor and changes to the generic scanstate node. We will still have to pass the parameters needed to set up the parallel bitmap iterators to scan_begin() in the intermediate patches, but if we think that we can actually get the streaming read version of bitmapheapscan in in the same release, then I think it should be okay because the final version of these table AM callbacks do not nee
Re: Conflict Detection and Resolution
Here are some review comments for v14-0001. This is a WIP, but here are my comments for all the SGML parts. (There will be some overlap here with comments already posted by Shveta) == 1. file modes after applying the patch mode change 100644 => 100755 doc/src/sgml/ref/alter_subscription.sgml mode change 100644 => 100755 doc/src/sgml/ref/create_subscription.sgml What's going on here? Why are those SGMLs changed to executable? == Commit message 2. nit - a missing period in the first sentence nit - typo /reseting/resetting/ == doc/src/sgml/logical-replication.sgml 3. - Conflicts + Conflicts and conflict resolution nit - change the capitalisation to "and Conflict Resolution" to match other titles. ~~~ 4. + Additional logging is triggered in various conflict scenarios, each identified as a + conflict type, and the conflict statistics are collected (displayed in the + pg_stat_subscription_stats view). + Users have the option to configure a conflict_resolver for each + conflict_type when creating a subscription. + For more information on the supported conflict_types detected and + conflict_resolvers, refer to section + CONFLICT RESOLVERS. + nit - "Additional logging is triggered" sounds strange. I reworded this in the nits attachment. Please see if you approve. nit - The "conflict_type" and "conflict_resolver" you are referring to here are syntax elements of the CREATE SUBSCRIPTION, so here I think they should just be called (without the underscores) "conflict type" and "conflict resolver". nit - IMO this would be better split into multiple paragraphs. nit - There is no such section called "CONFLICT RESOLVERS". I reworded this link text. == doc/src/sgml/monitoring.sgml 5. The changes here all render with the link including the type "(enum)" displayed, which I thought it unnecessary/strange. For example: See insert_exists (enum) for details about this conflict. IIUC there is no problem here, but maybe the other end of the link needed to define xreflabels. I have made the necessary modifications in the create_subscription.sgml. == doc/src/sgml/ref/alter_subscription.sgml 6. +ALTER SUBSCRIPTION name CONFLICT RESOLVER ( conflict_type [= conflict_resolver] [, ...] ) This syntax seems wrong to me. Currently, it says: ALTER SUBSCRIPTION name CONFLICT RESOLVER ( conflict_type [= conflict_resolver] [, ...] ) But, shouldn't that say: ALTER SUBSCRIPTION name CONFLICT RESOLVER ( conflict_type = conflict_resolver [, ...] ) ~~~ 7. +ALTER SUBSCRIPTION name RESET CONFLICT RESOLVER FOR (conflict_type) I can see that this matches the implementation, but I was wondering why don't you permit resetting multiple conflict_types at the same time. e.g. what if I want to reset some but not ALL? ~~~ nit - there are some minor whitespace indent problems in the SGML ~~~ 8. + +CONFLICT RESOLVER ( conflict_type [= conflict_resolver] [, ... ] ) + + + This clause alters either the default conflict resolvers or those set by . + Refer to section CONFLICT RESOLVERS + for the details on supported conflict_types and conflict_resolvers. + + + + + + conflict_type + + +The conflict type being reset to its default resolver setting. +For details on conflict types and their default resolvers, refer to section CONFLICT RESOLVERS + + + + This section seems problematic: e.g the syntax seems wrong same as before. ~ There are other nits. (I've given a rough fix in the nits attachment. Please see it and make it better). nit - why do you care if it is "either the default conflict resolvers or those set...". 
Why not just say "current resolver" nit - it does not mention 'conflict_resolver' type in the normal way nit - there is no actual section called "CONFLICT RESOLVERS" nit - the part that says "The conflict type being reset to its default resolver setting." is bogus for this form of the ALTER statement. ~~~ 9. There is no description for the "RESET CONFLICT RESOLVER ALL" ~~~ 10. There is no description for the "RESET CONFLICT RESOLVER FOR (conflict_type)" == doc/src/sgml/ref/create_subscription.sgml 11. General - Order + +CONFLICT RESOLVER ( conflict_type = + This optional clause specifies options for conflict resolvers for different conflict_types. + nit - IMO we don't need the words "options for" here. ~~~ 16. + + The conflict_type and their default behaviour are listed below. nit - sounded strange to me. reworded it slightly. ~~~ 17. + nit - Here, and for all other conflict types, add "xreflabel". See my review comment #5 for the reason why. ~~~ 18. + + The conflict_resolver and their behaviour + are listed below. Users can use any of the following resolvers for automatic conflict + resolution. + nit - reworded this too, to be like the previous review comment. ~~~ 19. General - readability. 19a. IMO the information about what are the def
Changing the default random_page_cost value
tl;dr let's assume SSDs are popular and HDDs are the exception and flip our default

As I write this email, it's the year 2024. I think it is time we lower our "default" setting of random_page_cost (as set in postgresql.conf.sample and the docs). Even a decade ago, the current default of 4 was considered fairly conservative and often lowered. The git log shows that this value was last touched in 2006, during the age of spinning metal. We are now in a new era, the age of SSDs, and thus we should lower this default value to reflect the fact that the vast majority of people using Postgres these days are doing so on solid state drives.

We tend to stay ultra-conservative in all of our settings, but we also need to recognize when there has been a major shift in the underlying hardware - and in the calculations that our defaults are based on. Granted, there are other factors involved, and yes, perhaps we should tweak some of the similar settings as well, but random_page_cost is the one setting most out of sync with today's hardware realities.

So I'll be brave and throw a number out there: 1.2. And change our docs to say something like "if you are using an older hard disk drive technology, you may want to try raising rpc", replacing our fairly-hidden note about SSDs buried in the last sentence - of the fourth paragraph - of the rpc docs.

Real data about performance on today's SSDs is welcome, and/or some way to generate a more accurate default.

Cheers,
Greg
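For reference, the value can already be changed per cluster or per tablespace without waiting for a new default; a minimal sketch (the tablespace name is made up):

    ALTER SYSTEM SET random_page_cost = 1.2;
    SELECT pg_reload_conf();
    -- or only for storage that is still spinning metal:
    ALTER TABLESPACE slow_hdd SET (random_page_cost = 4);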
Re: [PoC] Federated Authn/z with OAUTHBEARER
On Fri, Sep 27, 2024 at 10:58 AM Antonin Houska wrote: > Have you considered sending the token for validation to the server, like this > > curl -X GET "https://www.googleapis.com/oauth2/v3/userinfo"; -H > "Authorization: Bearer $TOKEN" In short, no, but I'm glad you asked. I think it's going to be a common request, and I need to get better at explaining why it's not safe, so we can document it clearly. Or else someone can point out that I'm misunderstanding, which honestly would make all this much easier and less complicated. I would love to be able to do it that way. We cannot, for the same reason libpq must send the server an access token instead of an ID token. The /userinfo endpoint tells you who the end user is, but it doesn't tell you whether the Bearer is actually allowed to access the database. That difference is critical: it's entirely possible for an end user to be authorized to access the database, *and yet* the Bearer token may not actually carry that authorization on their behalf. (In fact, the user may have actively refused to give the Bearer that permission.) That's why people are so pedantic about saying that OAuth is an authorization framework and not an authentication framework. To illustrate, think about all the third-party web services out there that ask you to Sign In with Google. They ask Google for permission to access your personal ID, and Google asks you if you're okay with that, and you either allow or deny it. Now imagine that I ran one of those services, and I decided to become evil. I could take my legitimately acquired Bearer token -- which should only give me permission to query your Google ID -- and send it to a Postgres database you're authorized to access. The server is supposed to introspect it, say, "hey, this token doesn't give the bearer access to the database at all," and shut everything down. For extra credit, the server could notice that the client ID tied to the access token isn't even one that it recognizes! But if all the server does is ask Google, "what's the email address associated with this token's end user?", then it's about to make some very bad decisions. The email address it gets back doesn't belong to Jacob the Evil Bearer; it belongs to you. Now, the token introspection endpoint I mentioned upthread should give us the required information (scopes, etc.). But Google doesn't implement that one. In fact they don't seem to have implemented custom scopes at all in the years since I started work on this feature, which makes me think that people are probably not going to be able to safely log into Postgres using Google tokens. Hopefully there's some feature buried somewhere that I haven't seen. Let me know if that makes sense. (And again: I'd love to be proven wrong. It would improve the reach of the feature considerably if I am.) Thanks, --Jacob
Re: Using per-transaction memory contexts for storing decoded tuples
On Fri, Sep 27, 2024 at 12:39 AM Shlok Kyal wrote: > > On Mon, 23 Sept 2024 at 09:59, Amit Kapila wrote: > > > > On Sun, Sep 22, 2024 at 11:27 AM David Rowley wrote: > > > > > > On Fri, 20 Sept 2024 at 17:46, Amit Kapila > > > wrote: > > > > > > > > On Fri, Sep 20, 2024 at 5:13 AM David Rowley > > > > wrote: > > > > > In general, it's a bit annoying to have to code around this > > > > > GenerationContext fragmentation issue. > > > > > > > > Right, and I am also slightly afraid that this may not cause some > > > > regression in other cases where defrag wouldn't help. > > > > > > Yeah, that's certainly a possibility. I was hoping that > > > MemoryContextMemAllocated() being much larger than logical_work_mem > > > could only happen when there is fragmentation, but certainly, you > > > could be wasting effort trying to defrag transactions where the > > > changes all arrive in WAL consecutively and there is no > > > defragmentation. It might be some other large transaction that's > > > causing the context's allocations to be fragmented. I don't have any > > > good ideas on how to avoid wasting effort on non-problematic > > > transactions. Maybe there's something that could be done if we knew > > > the LSN of the first and last change and the gap between the LSNs was > > > much larger than the WAL space used for this transaction. That would > > > likely require tracking way more stuff than we do now, however. > > > > > > > With more information tracking, we could avoid some non-problematic > > transactions but still, it would be difficult to predict that we > > didn't harm many cases because to make the memory non-contiguous, we > > only need a few interleaving small transactions. We can try to think > > of ideas for implementing defragmentation in our code if we first can > > prove that smaller block sizes cause problems. > > > > > With the smaller blocks idea, I'm a bit concerned that using smaller > > > blocks could cause regressions on systems that are better at releasing > > > memory back to the OS after free() as no doubt malloc() would often be > > > slower on those systems. There have been some complaints recently > > > about glibc being a bit too happy to keep hold of memory after free() > > > and I wondered if that was the reason why the small block test does > > > not cause much of a performance regression. I wonder how the small > > > block test would look on Mac, FreeBSD or Windows. I think it would be > > > risky to assume that all is well with reducing the block size after > > > testing on a single platform. > > > > > > > Good point. We need extensive testing on different platforms, as you > > suggest, to verify if smaller block sizes caused any regressions. > > I did similar tests on Windows. rb_mem_block_size was changed from 8kB > to 8MB. Below table shows the result (average of 5 runs) and Standard > Deviation (of 5 runs) for each block-size. > > === > block-size |Average time (ms) |Standard Deviation (ms) > - > 8kb|12580.879 ms |144.6923467 > 16kb |12442.7256 ms |94.02799006 > 32kb |12370.7292 ms |97.7958552 > 64kb |11877.4888 ms |222.2419142 > 128kb|11828.8568 ms |129.732941 > 256kb|11801.086 ms |20.60030913 > 512kb|12361.4172 ms |65.27390105 > 1MB |12343.3732 ms |80.84427202 > 2MB |12357.675 ms |79.40017604 > 4MB |12395.8364 ms |76.78273689 > 8MB |11712.8862 ms |50.74323039 > == > > From the results, I think there is a small regression for small block size. > > I ran the tests in git bash. I have also attached the test script. Thank you for testing on Windows! 
I've run the same benchmark on Mac (Sonoma 14.7, M1 Pro):

8kB:   4852.198 ms
16kB:  4822.733 ms
32kB:  4776.776 ms
64kB:  4851.433 ms
128kB: 4804.821 ms
256kB: 4781.778 ms
512kB: 4776.486 ms
1MB:   4783.456 ms
2MB:   4770.671 ms
4MB:   4785.800 ms
8MB:   4747.447 ms

I can see there is a small regression for small block sizes.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Re: Vacuum statistics
Hi, On Thu, Sep 5, 2024 at 2:01 PM Alena Rybakina wrote: > > Hi! Thank you for your review! > > On 05.09.2024 15:47, jian he wrote: > > On Thu, Sep 5, 2024 at 1:23 AM Alena Rybakina > wrote: > > Hi, all! > > I have attached the new version of the code and the diff files > (minor-vacuum.no-cbot). Thank you for updating the patches. I've reviewed the 0001 patch and have two comments. I think we can split the 0001 patch into two parts: adding pg_stat_vacuum_tables system views that shows the vacuum statistics that we are currently collecting such as scanned_pages and removed_pages, and another one is to add new statistics to collect such as vacrel->set_all_visible_pages and visibility map updates. I'm concerned that a pg_stat_vacuum_tables view has some duplicated statistics that we already collect in different ways. For instance, total_blks_{read,hit,dirtied,written} are already tracked at system-level by pg_stat_io, and per-relation block I/O statistics can be collected using pg_stat_statements. Having duplicated statistics consumes more memory for pgstat and could confuse users if these statistics are not consistent. I think it would be better to avoid collecting duplicated statistics in different places. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
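As an illustration of the overlap mentioned above, system-wide vacuum I/O is already visible in pg_stat_io (PostgreSQL 16 and later), for example:

    SELECT backend_type, reads, writes, hits
    FROM pg_stat_io
    WHERE context = 'vacuum';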
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)
Re: Tom Lane > Perhaps. I spent a little more effort on this and added code to > report errors that don't come with an error location. On those, > we don't have any constraints about what to report in the QUERY > field, so I made it trim the string to just the current query > within the script, which makes things quite a bit better. You > can see the results in the test_extensions regression test changes. That looks very good me, thanks! > (It might be worth some effort to trim away comments appearing > just before a command, but I didn't tackle that here.) The "error when psql" comments do look confusing, but I guess in other places the comment just before the query adds valuable context, so I'd say leaving the comments in is ok. Christoph
Re: Thoughts about NUM_BUFFER_PARTITIONS
On Mon, Aug 05, 2024 at 12:32:20AM +0500, Andrey M. Borodin wrote:
> I've found some dead code: BufMappingPartitionLockByIndex() is unused,
> and unused for a long time. See patch 1.

I don't see a reason to also get rid of BufTableHashPartition(), but otherwise this looks reasonable to me. It would probably be a good idea to first check whether there are any external callers we can find.

> I've prototyped similar GUC for anyone willing to do such experiments.
> See patch 2, 4. Probably, I'll do some experiments too, on customer's
> clusters and workloads :)

Like Tomas, I'm not too wild about making this a GUC. And as Heikki pointed out upthread, a good first step would be to benchmark different NUM_BUFFER_PARTITIONS settings on modern hardware. I suspect the current setting is much lower than optimal (e.g., I doubled it and saw a TPS boost for read-only pgbench on an i5-13500T), but I don't think we fully understand the performance characteristics with different settings yet. If we find that the ideal setting depends heavily on workload/hardware, then perhaps we can consider adding a GUC, but I don't think we are there yet.

>> Anyway, this value is inherently a trade off. If it wasn't, we'd set it
>> to something super high from the start. But having more partitions of
>> the lock table has a cost too, because some parts need to acquire all
>> the partition locks (and that's O(N) where N = number of partitions).
>
> I've found none such cases, actually. Or, perhaps, I was not looking good
> enough. pg_buffercache iterates over buffers and releases locks. See
> patch 3 to fix comments.

Yeah, I think 0003 is reasonable, too. pg_buffercache stopped acquiring all the partition locks in v10 (see commit 6e65454), which is also the commit that removed all remaining uses of BufMappingPartitionLockByIndex(). In fact, I think BufMappingPartitionLockByIndex() was introduced just for this pg_buffercache use-case (see commit ea9df81).

--
nathan
Re: MAINTAIN privilege -- what do we need to un-revert it?
On Fri, Sep 27, 2024 at 09:22:48AM -0700, Jeff Davis wrote:
> I suggest that we add the wording to the
> query portion of the doc, near "security-
> restricted operation".

How does this look?

diff --git a/doc/src/sgml/ref/create_materialized_view.sgml b/doc/src/sgml/ref/create_materialized_view.sgml
index 0d2fea2b97..62d897931c 100644
--- a/doc/src/sgml/ref/create_materialized_view.sgml
+++ b/doc/src/sgml/ref/create_materialized_view.sgml
@@ -143,7 +143,9 @@ CREATE MATERIALIZED VIEW [ IF NOT EXISTS ] table_name
       A SELECT, TABLE, or VALUES command.  This query will run within a
       security-restricted operation; in particular, calls to functions that
-      themselves create temporary tables will fail.
+      themselves create temporary tables will fail.  Also, while the query is
+      running, the search_path is temporarily changed to
+      pg_catalog, pg_temp.

--
nathan
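One practical consequence of the documented behaviour, as a sketch (the function name is made up): functions called while the query runs can no longer rely on the calling session's search_path, so they may need their own setting, e.g.:

    ALTER FUNCTION public.normalize_name(text) SET search_path = public;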
Re: MAINTAIN privilege -- what do we need to un-revert it?
On Fri, 2024-09-27 at 15:04 -0500, Nathan Bossart wrote: > On Fri, Sep 27, 2024 at 09:22:48AM -0700, Jeff Davis wrote: > > I suggest that we add the wording to the > > query portion of the doc, near > > "security- > > restricted operation". > > How does this look? Looks good to me. Regards, Jeff Davis
Re: Vacuum statistics
On Fri, Sep 27, 2024 at 12:19 PM Melanie Plageman wrote: > > On Fri, Sep 27, 2024 at 2:16 PM Masahiko Sawada wrote: > > > > Hi, > > > > On Thu, Sep 5, 2024 at 2:01 PM Alena Rybakina > > wrote: > > > > > > Hi! Thank you for your review! > > > > > > On 05.09.2024 15:47, jian he wrote: > > > > > > On Thu, Sep 5, 2024 at 1:23 AM Alena Rybakina > > > wrote: > > > > > > Hi, all! > > > > > > I have attached the new version of the code and the diff files > > > (minor-vacuum.no-cbot). > > > > Thank you for updating the patches. I've reviewed the 0001 patch and > > have two comments. > > I took a very brief look at this and was wondering if it was worth > having a way to make the per-table vacuum statistics opt-in (like a > table storage parameter) in order to decrease the shared memory > footprint of storing the stats. I'm not sure how users can select tables that enable vacuum statistics as I think they basically want to have statistics for all tables, but I see your point. Since the size of PgStat_TableCounts approximately tripled by this patch (112 bytes to 320 bytes), it might be worth considering ways to reduce the number of entries or reducing the size of vacuum statistics. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Row pattern recognition
On Wed, Sep 18, 2024 at 10:00 PM Tatsuo Ishii wrote: > > Attached are the v22 patches. Just rebased. Thanks! With some bigger partitions, I hit an `ERROR: wrong pos: 1024`. A test that reproduces it is attached. While playing with the feature, I've been trying to identify runs of matched rows by eye. But it's pretty difficult -- the best I can do is manually count rows using a `COUNT(*) OVER ...`. So I'd like to suggest that MEASURES be part of the eventual v1 feature, if there's no other way to determine whether a row was skipped by a previous match. (That was less obvious to me before the fix in v17.) -- I've been working on an implementation [1] of SQL/RPR's "parenthesized language" and preferment order. (These are defined in SQL/Foundation 2023, section 9.41.) The tool gives you a way to figure out, for a given pattern, what matches are supposed to be attempted and in what order: $ ./src/test/modules/rpr/rpr_prefer "a b? a" ( ( a ( b ) ) a ) ( ( a ( ) ) a ) Many simple patterns result in an infinite set of possible matches. So if you use an unbounded quantifiers, you have to also use --max-rows to limit the size of the hypothetical window frame: $ ./src/test/modules/rpr/rpr_prefer --max-rows 2 "^ PERMUTE(a*, b+)? $" ( ( ^ ( ( ( ( ( ( a ) ( b ) ) ) - ) ) ) ) $ ) ( ( ^ ( ( ( ( ( ( ) ( b b ) ) ) - ) ) ) ) $ ) ( ( ^ ( ( ( ( ( ( ) ( b ) ) ) - ) ) ) ) $ ) ( ( ^ ( ( ( - ( ( ( b b ) ( ) ) ) ) ) ) ) $ ) ( ( ^ ( ( ( - ( ( ( b ) ( a ) ) ) ) ) ) ) $ ) ( ( ^ ( ( ( - ( ( ( b ) ( ) ) ) ) ) ) ) $ ) ( ( ^ ( ) ) $ ) I've found this useful to check my personal understanding of the spec and the match behavior, but it could also potentially be used to generate test cases, or to help users debug their own patterns. For example, a pattern that has a bunch of duplicate sequences in its PL is probably not very well optimized: $ ./src/test/modules/rpr/rpr_prefer --max-rows 4 "a+ a+" ( ( a a a ) ( a ) ) ( ( a a ) ( a a ) ) ( ( a a ) ( a ) ) ( ( a ) ( a a a ) ) ( ( a ) ( a a ) ) ( ( a ) ( a ) ) And patterns with catastrophic backtracking behavior tend to show a "sawtooth" pattern in the output, with a huge number of potential matches being generated relative to the number of rows in the frame. My implementation is really messy -- it leaks memory like a sieve, and I cannibalized the parser from ECPG, which just ended up as an exercise in teaching myself flex/bison. But if there's interest in having this kind of tool in the tree, I can work on making it reviewable. Either way, I should be able to use it to double-check more complicated test cases. A while back [2], you were wondering whether our Bison implementation would be able to parse the PATTERN grammar directly. I think this tool proves that the answer is "yes", but PERMUTE in particular causes a shift/reduce conflict. To fix it, I applied the same precedence workaround that we use for CUBE and ROLLUP. Thanks again! --Jacob [1] https://github.com/jchampio/postgres/tree/dev/rpr [2] https://www.postgresql.org/message-id/20230721.151648.412762379013769790.t-ishii%40sranhm.sra.co.jp commit 819c1abba21c9b59cceedd55962c3fdcb982aad5 Author: Jacob Champion Date: Thu Sep 26 14:12:38 2024 -0700 RPR: test larger window partitions The second test case fails with ERROR: wrong pos: 1024 diff --git a/src/test/regress/expected/rpr.out b/src/test/regress/expected/rpr.out index 0789e09435..dcfd67f143 100644 --- a/src/test/regress/expected/rpr.out +++ b/src/test/regress/expected/rpr.out @@ -834,3 +834,40 @@ ERROR: SEEK is not supported LINE 8: SEEK ^ HINT: Use INITIAL. 
+-- Smoke test for larger partitions. +WITH s AS ( + SELECT v, count(*) OVER w AS c + FROM (SELECT generate_series(1, 5000) v) + WINDOW w AS ( + ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING + AFTER MATCH SKIP PAST LAST ROW + INITIAL + PATTERN ( r+ ) + DEFINE r AS TRUE + ) +) +-- Should be exactly one long match across all rows. +SELECT * FROM s WHERE c > 0; + v | c +---+-- + 1 | 5000 +(1 row) + +WITH s AS ( + SELECT v, count(*) OVER w AS c + FROM (SELECT generate_series(1, 5000) v) + WINDOW w AS ( + ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING + AFTER MATCH SKIP PAST LAST ROW + INITIAL + PATTERN ( r ) + DEFINE r AS TRUE + ) +) +-- Every row should be its own match. +SELECT count(*) FROM s WHERE c > 0; + count +--- + 5000 +(1 row) + diff --git a/src/test/regress/sql/rpr.sql b/src/test/regress/sql/rpr.sql index 302e2b86a5..6ff61e6d60 100644 --- a/src/test/regress/sql/rpr.sql +++ b/src/test/regress/sql/rpr.sql @@ -403,3 +403,32 @@ SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER UP AS price > PREV(price), DOWN AS price < PREV(price) ); + +-- Smoke test for larger partitions. +WITH s AS ( + SELECT v, count(*) OVER w AS c + FROM (SELECT generate_series(1, 5000) v) + WINDOW w AS ( + ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWIN
Re: PATCH: jsonpath string methods: lower, upper, initcap, l/r/btrim, replace, split_part
On Sep 26, 2024, at 13:59, Florents Tselai wrote: > Speaking of extensible: the jsonpath standard does mention function > extensions [1] , > so it looks like we're covered by the standard, and the mutability aspect is > an implementation detail. No? That’s not the standard used for Postgres jsonpath. Postgres follows the SQL/JSON standard in the SQL standard, which is not publicly available, but a few people on the list have copies they’ve purchased and so could provide some context. In a previous post I wondered if the SQL standard had some facility for function extensions, but I suspect not. Maybe in the next iteration? > And having said that, the whole jsonb/jsonpath parser/executor infrastructure > is extremely powerful > and kinda under-utilized if we use it "only" for jsonpath. > Tbh, I can see it supporting more specific DSLs and even offering hooks for > extensions. > And I know for certain I'm not the only one thinking about this. > See [2] for example where they've lifted, shifted and renamed the > jsonb/jsonpath infra to build a separate language for graphs I’m all for extensibility, though jsonpath does need to continue to comply with the SQL standard. Do you have some idea of the sorts of hooks that would allow extension authors to use some of that underlying capability? Best, David
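For context, the methods proposed in this thread would be used like the existing jsonpath item methods; a sketch of the proposed (not yet committed) syntax:

    SELECT jsonb_path_query('"Hello World"', '$.lower()');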
general purpose array_sort
Hi hackers,

Per David's suggestion, this patch implements general purpose array sort. We can do the following with this patch:

SELECT array_sort('{1.1,3.3,5.5,2.2,4.4,6.6}'::float8[], 'asc');
SELECT array_sort('{abc DEF 123abc,ábc sßs ßss DÉF,DŽxxDŽ džxxDž Džxxdž,ȺȺȺ,ⱥⱥⱥ,ⱥȺ}'::text[]);
SELECT array_sort('{abc DEF 123abc,ábc sßs ßss DÉF,DŽxxDŽ džxxDž Džxxdž,ȺȺȺ,ⱥⱥⱥ,ⱥȺ}'::text[], 'asc', 'pg_c_utf8');

--
Regards
Junwang Zhao

v1-0001-general-purpose-array_sort.patch
Description: Binary data
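For comparison, the usual workaround today, which array_sort() is meant to replace, is the unnest/aggregate dance:

    SELECT array_agg(x ORDER BY x)
    FROM unnest('{1.1,3.3,5.5,2.2,4.4,6.6}'::float8[]) AS t(x);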
Re: PATCH: jsonpath string methods: lower, upper, initcap, l/r/btrim, replace, split_part
> On 27 Sep 2024, at 12:45 PM, David E. Wheeler wrote: > > On Sep 26, 2024, at 13:59, Florents Tselai wrote: > >> Speaking of extensible: the jsonpath standard does mention function >> extensions [1] , >> so it looks like we're covered by the standard, and the mutability aspect is >> an implementation detail. No? > > That’s not the standard used for Postgres jsonpath. Postgres follows the > SQL/JSON standard in the SQL standard, which is not publicly available, but a > few people on the list have copies they’ve purchased and so could provide > some context. > > In a previous post I wondered if the SQL standard had some facility for > function extensions, but I suspect not. Maybe in the next iteration? > >> And having said that, the whole jsonb/jsonpath parser/executor >> infrastructure is extremely powerful >> and kinda under-utilized if we use it "only" for jsonpath. >> Tbh, I can see it supporting more specific DSLs and even offering hooks for >> extensions. >> And I know for certain I'm not the only one thinking about this. >> See [2] for example where they've lifted, shifted and renamed the >> jsonb/jsonpath infra to build a separate language for graphs > > I’m all for extensibility, though jsonpath does need to continue to comply > with the SQL standard. Do you have some idea of the sorts of hooks that would > allow extension authors to use some of that underlying capability? Re-tracing what I had to do 1. Define a new JsonPathItemType jpiMyExtType and map it to a JsonPathKeyword 2. Add a new JsonPathKeyword and make the lexer and parser aware of that, 3. Tell the main executor executeItemOptUnwrapTarget what to do when the new type is matched. I think 1, 2 are the trickiest because they require hooks to jsonpath_scan.l and parser jsonpath_gram.y 3. is the meat of a potential hook, which would be something like extern JsonPathExecResult executeOnMyJsonpathItem(JsonPathExecContext *cxt, JsonbValue *jb, JsonValueList *found); This should be called by the main executor executeItemOptUnwrapTarget when it encounters case jpiMyExtType It looks like quite an endeavor, to be honest.
Re: Truncate logs by max_log_size
> On 27 Sep 2024, at 03:30, Euler Taveira wrote:
>
> Let's say you arbitrarily provide max_log_size = 100

Consider max_log_size = 10MB. The perspective might look very different. It’s not about WHERE anymore. It's a guard against heavy abuse.

The feature looks very important to me.

Best regards, Andrey Borodin.
Re: Extension security improvement: Add support for extensions with an owned schema
Hi,

I've spent a bit of time looking at this patch. It seems there's a clear consensus that having "owned schemas" for extensions would be good for security. To me it also seems like a convenient way to organize stuff. It was possible to create extensions in a separate schema before, ofc, but that's up to the DBA. With this, the extension author can enforce that.

One thing that's not quite clear to me is what's the correct way for existing extensions to switch to an "owned schema". Let's say you have an extension. How do you transition to this? Can you just add it to the control file and then some magic happens?

A couple minor comments:

1) doc/src/sgml/extend.sgml

   An extension is owned_schema if it requires a new dedicated schema for
   its objects. Such a requirement can make security concerns related to
   search_path injection much easier to reason about. The default is
   false, i.e., the extension can be installed into an existing schema.

Doesn't "extension is owned_schema" sound a bit weird? I'd probably say "extension may own a schema" or something like that.

Also, "requires a new dedicated schema" is a bit ambiguous. It's not clear if it means the schema is expected to exist, or if it creates the schema itself. And perhaps it should clarify what "much easier to reason about" means. That's pretty vague, and as a random extension author I wouldn't know about the risks to consider. Maybe there's a section about this stuff that we could reference?

2) doc/src/sgml/ref/create_extension.sgml

   relocated. The named schema must already exist if the extension's
   control file does not specify owned_schema.

Seems a bit unclear; I'd say having "owned_schema = false" in the control file still qualifies as "specifies owned_schema". So might be better to say it needs to be set to true?

Also, perhaps "dedicated_schema" would be better than "owned_schema"? I mean, the point is not that it's "owned" by the extension, but that there's nothing else in it. But that's nitpicking.

3) src/backend/commands/extension.c

I'm not sure why AlterExtensionNamespace moves the privilege check. Why should it not check the privilege for owned schema too?

4) src/bin/pg_dump/pg_dump.c

checkExtensionMembership has typo "owned_schem".

Shouldn't binary_upgrade_extension_member still set ext=NULL in the for loop, the way the original code does that?

The long if conditions might need some indentation, I guess. pgindent leaves them like this, but 100 columns seems a bit too much. I'd do a line break after each condition, I guess.

regards

--
Tomas Vondra
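For readers unfamiliar with the mechanics under discussion, this is roughly the user-facing arrangement today (schema and extension names made up); the patch's owned_schema flag would let the control file require the dedicated schema instead of leaving it up to the DBA:

    CREATE SCHEMA myext_schema;
    CREATE EXTENSION myext SCHEMA myext_schema;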
Re: Add has_large_object_privilege function
Michael Paquier writes:
> - New functions introduced during a development cycle should use OIDs in
> the range 8000-9999. See 98eab30b93d5, consisting of running
> ./unused_oids to get a random range.

There have been several fixups of this kind recently. Should we add a note about this to the comment at the top of all of the pg_*.dat files that have explicit OID assignments? People might be more likely to notice that than the section over in bki.sgml.

- ilmari
Re: pg_verifybackup: TAR format backup verification
On Fri, Sep 27, 2024 at 2:07 AM Amul Sul wrote: > Thank you, Robert. The code changes look much better now. Cool. > A few minor comments: > > +each tablespace, named after the tablespace's OID. If the backup > +is compressed, the relevant compression extension is added to the > +end of each file name. > > I am a bit unsure about the last line, especially the use of the word > "added." I feel like it's implying that we're adding something, which > isn't true. If you add .gz to the end of 16904.tar, you get 16904.tar.gz. This seems like correct English to me. > Typo: futher OK, thanks. > The addition of simple_ptr_list_destroy will be part of a separate > commit, correct? To me, it doesn't seem worth splitting that out into a separate commit. -- Robert Haas EDB: http://www.enterprisedb.com
Re: On disable_cost
On Fri, 27 Sept 2024 at 20:42, Laurenz Albe wrote:
> 2. The "disabled nodes" are not only shown at the nodes that were
> actually disabled, but also at every node above them.

I'm not a fan either, and I'd like to see this output improved. It seems easy enough to implement some logic to detect when a given node is itself disabled just by checking whether its disabled_nodes count is higher than the sum of the disabled_nodes fields of the node's children. If there are no children (a scan node) and disabled_nodes > 0, then it must be disabled. There's even a nice fast path where we don't need to check the children at all if disabled_nodes == 0 (see the sketch after this message).

Here's a POC-grade patch of how I'd rather see it looking. I opted to have a boolean field as I didn't see any need for an integer count. I also changed things around so we always display the boolean property in non-text EXPLAIN; normally we don't mind being more verbose there.

I also fixed a bug in make_sort() where disabled_nodes isn't being set properly. I'll do an independent patch for that if this goes nowhere.

David

poc_improve_disabled_nodes_explain_output.patch
Description: Binary data
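To make the detection idea concrete, here is a minimal sketch of that check, assuming the disabled_nodes counter lives on each Plan node as in the committed patch; the helper name and placement are illustrative only.

/*
 * Illustrative sketch of the detection logic described above; the helper
 * name is invented, and a complete version would also walk node types
 * that keep their children in lists (Append, MergeAppend, ...).
 */
#include "postgres.h"
#include "nodes/plannodes.h"

static bool
plan_node_itself_disabled(Plan *plan)
{
    int         child_disabled = 0;

    /* Fast path: nothing at or below this node is disabled. */
    if (plan->disabled_nodes == 0)
        return false;

    if (plan->lefttree != NULL)
        child_disabled += plan->lefttree->disabled_nodes;
    if (plan->righttree != NULL)
        child_disabled += plan->righttree->disabled_nodes;

    /*
     * A scan node has no children, so any nonzero count means it was
     * disabled itself; otherwise a count exceeding the children's sum
     * means the difference was added at this node.
     */
    return plan->disabled_nodes > child_disabled;
}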
Re: Changing the default random_page_cost value
Greg Sabino Mullane writes: > So I'll be brave and throw a number out there: 1.2. And change our > docs to say wordage like "if you are using an older hard disk drive > technology, you may want to try raising rpc" to replace our > fairly-hidden note about SSDs buried in the last sentence - of the > fourth paragraph - of the rpc docs. It might also be worth mentioning cloudy block storage (e.g. AWS' EBS), which is typically backed by SSDs, but has extra network latency. - ilmari
Re: SQL:2023 JSON simplified accessor support
On Sep 26, 2024, at 16:45, Alexandra Wang wrote: > I didn’t run pgindent earlier, so here’s the updated version with the > correct indentation. Hope this helps! Oh, nice! I don’t suppose the standard also has defined an operator equivalent to ->>, though, has it? I tend to want the text output far more often than a JSON scalar. Best, David
IndexAccessMethod API & Index Only Scans
Hi all! Congrats on releasing v17! I'm adding support for Index Only Scans to a custom IAM impl and I've got a little dilemma. My IAM implementation is essentially a composite index that might have up to 32 columns and while it can return any column in the index definition it's quite expensive to do so. It doesn't have an already formed index tuple sitting right there like the built-in btree index does. Returning 1 or 2 columns is usually a win over the regular Index Scan version, but more than that and it's at least a wash if not a total loss. Since not all Index Only Scans need *all* the columns, was there ever any thought around providing the required attnos as a field on IndexScanDescData? That information seems to be there in the nodeindexonlyscan.c machinery. As a compromise, I've settled on my `amcanreturn` function saying it only knows how to return attno==1, which is sufficient for some query patterns, but I'd like to be able to have my index directly feed tuples into aggregate plans and such too. It's just too expensive for me to always return all the columns when generally maybe only 1 or 2 are needed (there doesn't seem to be a way to weave that into the cost estimations, but that wouldn't matter if I knew which specific columns I need to fetch out of my index). I'm pretty familiar with the IAM and the internals around it, but maybe I've missed something -- can I get at this information some other way? Thanks for your time and consideration! eric
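For reference, the compromise described above comes down to an amcanreturn callback along these lines; the function name is made up, but the callback signature matches the index AM API.

#include "postgres.h"
#include "access/amapi.h"
#include "utils/rel.h"

/*
 * amcanreturn callback: report which index columns this AM can hand back
 * for an index-only scan.  Only advertising the first key column is the
 * workaround described above.
 */
static bool
my_amcanreturn(Relation indexRelation, int attno)
{
    /* Only the first indexed column is cheap enough to reconstruct. */
    return attno == 1;
}

/* Wired up in the AM handler, e.g.: amroutine->amcanreturn = my_amcanreturn; */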
Re: Pgoutput not capturing the generated columns
On Tue, Sep 24, 2024 at 5:16 AM Peter Smith wrote: > > Hi. Here are my review comments for v32-0001 > > You wrote: "I have addressed all the comments in the v32-0001 Patch.", > however, I found multiple old review comments not addressed. Please > give a reason if a comment is deliberately left out, otherwise, I will > assume they are omitted by accident and so keep repeating them. > > There were also still some unanswered questions from previous reviews, > so I have reminded you about those again here. > > == > Commit message > > 1. > This commit enables support for the 'publish_generated_columns' option > in logical replication, allowing the transmission of generated column > information and data alongside regular table changes. The option > 'publish_generated_columns' is a PUBLICATION parameter. > > ~ > > That PUBLICATION info in the 2nd sentence would be easier to say in > the 1st sentence. > SUGGESTION: > This commit supports the transmission of generated column information > and data alongside regular table changes. This behaviour is controlled > by a new PUBLICATION parameter ('publish_generated_columns'). > > ~~~ > > 2. > When 'publish_generated_columns' is false, generated columns are not > replicated, even when present in a PUBLICATION col-list. > > Hm. This contradicts the behaviour that Amit wanted, (e.g. > "column-list takes precedence"). So I am not sure if this patch is > already catering for the behaviour suggested by Amit or if that is yet > to come in v33. For now, I am assuming that 32* has not caught up with > the latest behaviour requirements, but that might be a wrong > assumption; perhaps it is only this commit message that is bogus. > > ~~~ > > 3. General. > > On the same subject, there is lots of code, like: > > if (att->attgenerated && !pub->pubgencols) > continue; > > I suspect that might not be quite what you want for the "column-list > takes precedence" behaviour, but I am not going to identify all those > during this review. It needs lots of combinations of column list tests > to verify it. > > == > doc/src/sgml/ddl.sgml > > 4ab. > nit - Huh?? Not changed the linkend as told in a previous review [1-#3a] > nit - Huh?? Not changed to call this a "parameter" instead of an > "option" as told in a previous review [1-#3b] > > == > doc/src/sgml/protocol.sgml > > 5. > - > - Next, the following message part appears for each column included in > - the publication (except generated columns): > - > - > > nit -- Huh?? I don't think you can just remove this whole paragraph. > But, probably you can just remove the "except generated columns" part. > I posted this same comment [4 #11] 20 patch versions back. > > == > doc/src/sgml/ref/create_publication.sgml > > 6abc. > nit - Huh?? Not changed the parameter ID as told in a previous review [1-#6] > nit - Huh?? Not removed paragraph "This option is only available..." > as told in a previous review. See [1-#7] > nit - Huh?? Not removed paragraph "This parameter can only be set" as > told in a previous review. See [1-#8] > > == > src/backend/catalog/pg_publication.c > > 7. > if (TupleDescAttr(tupdesc, attnum - 1)->attgenerated) > - ereport(ERROR, > + ereport(WARNING, > errcode(ERRCODE_INVALID_COLUMN_REFERENCE), > - errmsg("cannot use generated column \"%s\" in publication column list", > + errmsg("specified generated column \"%s\" in publication column list > for publication with publish_generated_columns as false", > colname)); > > I did not understand how this WARNING can know > "publish_generated_columns as false"? 
Should the code be checking the > function parameter 'pubgencols'? > > The errmsg also seemed a bit verbose. How about: > "specified generated column \"%s\" in publication column list when > publish_generated_columns = false" > > == > src/backend/replication/logical/proto.c > > 8. > logicalrep_write_tuple: > logicalrep_write_attrs: > > Reminder. I think I have multiple questions about this code from > previous reviews that may be still unanswered. See [2 #4]. Maybe when > you implement Amit's "column list takes precedence" behaviour then > this code is fine as-is (because the replication message might include > gencols or not-gecols regardless of the 'publish_generated_columns' > value). But I don't think that is the current implementation, so > something did not quite seem right. I am not sure. If you say it is > fine then I will believe it, but the question [2 #4] remains > unanswered. > > == > src/backend/replication/pgoutput/pgoutput.c > > 9. > send_relation_and_attrs: > > Reminder: Here is another question that was answered from [2 #5]. I > did not really trust it for the current implementation, but for the > "column list takes precedence" behaviour probably it will be ok. > > ~~~ > > 10. > +/* > + * Prepare new column list bitmap. This includes all the columns of the > table. > + */ > +static Bitmapset * > +prepare_all_columns_bms(PGOutputData *data, RelationSyncEntry *en
Re: Pgoutput not capturing the generated columns
On Tue, Sep 24, 2024 at 7:08 AM Peter Smith wrote: > > Hi. Here are my v32-0002 review comments: > > == > src/backend/replication/logical/tablesync.c > > 1. fetch_remote_table_info > > /* > - * Get column lists for each relation. > + * Get column lists for each relation, and check if any of the > + * publications have the 'publish_generated_columns' parameter enabled. > > I am not 100% sure about this logic anymore. Maybe it is OK, but it > requires careful testing because with Amit's "column lists take > precedence" it is now possible for the publication to say > 'publish_generated_columns=false', but the publication can still > publish gencols *anyway* if they were specified in a column list. > > ~~~ > This comment is still open. Will fix this in the next version of patches. > 2. > /* > * Fetch info about column lists for the relation (from all the > * publications). > */ > + StringInfo pub_names = makeStringInfo(); > + > + get_publications_str(MySubscription->publications, pub_names, true); > resetStringInfo(&cmd); > appendStringInfo(&cmd, > ~ > > nit - The comment here seems misplaced. > > ~~~ > > 3. > + if (server_version >= 12) > + { > + has_pub_with_pubgencols = server_version >= 18 && > has_pub_with_pubgencols; > + > + if (!has_pub_with_pubgencols) > + appendStringInfo(&cmd, " AND a.attgenerated = ''"); > + } > > My previous review comment about this [1 #10] was: > Can the 'gencols_allowed' var be removed, and the condition just be > replaced with if (!has_pub_with_pubgencols)? It seems equivalent > unless I am mistaken. > > nit - So the current v32 code is not what I was expecting. What I > meant was 'has_pub_with_pubgencols' can only be true if server_version > >= 18, so I thought there was no reason to check it again. For > reference, I've changed it to like I meant in the nitpicks attachment. > Please see if that works the same. > > == > [1] my review of v31-0002. > https://www.postgresql.org/message-id/CAHut%2BPusbhvPrL1uN1TKY%3DFd4zu3h63eDebZvsF%3Duy%2BLBKTwgA%40mail.gmail.com > I have addressed all the comments in the v34-0002 Patch. Please refer to the updated v34-0002 Patch here in [1]. See [1] for the changes added. [1] https://www.postgresql.org/message-id/CAHv8RjJkUdYCdK_bL3yvEV%3DzKrA2dsnZYa1VMT2H5v0%2BqbaGbA%40mail.gmail.com Thanks and Regards, Shubham Khanna.
Re: pg_ctl/miscinit: print "MyStartTime" as a long long instead of long to avoid 2038 problem.
I think that it would be a good idea to include these fixes in the next minor release. After working for a couple months on getting our embedded systems 2038 compliant, it has become very apparent that 2038 will be a substantial ordeal. Maximizing the number of systems that include this fix would make things a little easier when that time comes around. Thanks, Max From: Nathan Bossart Sent: Thursday, September 26, 2024 9:38 PM To: Max Johnson Cc: pgsql-hack...@postgresql.org ; t...@sss.pgh.pa.us Subject: Re: pg_ctl/miscinit: print "MyStartTime" as a long long instead of long to avoid 2038 problem. On Wed, Sep 25, 2024 at 08:04:59PM +, Max Johnson wrote: > I think your patch looks good, no objections. I am happy to have contributed. Great. I've attached what I have staged for commit. My first instinct was to not bother back-patching this since all currently-supported versions will have been out of support for over 8 years by the time this becomes a practical issue. However, I wonder if it makes sense to back-patch for the kinds of 32-bit embedded systems you cited upthread. I can imagine that such systems might need to work for a very long time without any software updates, in which case it'd probably be a good idea to make this fix available in the next minor release. What do you think? -- nathan
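For anyone skimming the thread, the fix under discussion is essentially about how the start timestamp is formatted; here is a minimal sketch of the pattern, with the function and buffer names invented for illustration.

#include <stdio.h>
#include <stdint.h>

/*
 * MyStartTime in the backend is a 64-bit pg_time_t.  Casting it to plain
 * "long" before printing truncates the value on platforms where long is
 * 32 bits, which is the post-2038 hazard discussed above; casting to
 * "long long" and printing with %lld keeps the full value.
 */
static void
format_start_time(int64_t start_time, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "%lld", (long long) start_time);
}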
Re: Pgoutput not capturing the generated columns
On Mon, Sep 23, 2024 at 6:19 PM vignesh C wrote: > > On Fri, 20 Sept 2024 at 17:15, Shubham Khanna > wrote: > > > > On Wed, Sep 11, 2024 at 8:55 AM Peter Smith wrote: > > > > I have fixed all the comments. The attached patches contain the desired > > changes. > > Also the merging of 0001 and 0002 can be done once there are no > > comments on the patch to help in reviewing. > > Few comments: > 1) This commit message seems wrong, currently irrespective of > publish_generated_columns, the column specified in column list take > preceedene: > When 'publish_generated_columns' is false, generated columns are not > replicated, even when present in a PUBLICATION col-list. > > 2) Since we have added pubgencols to pg_pubication.h we can specify > "Bump catversion" in the commit message. > > 3) In create publication column list/publish_generated_columns > documentation we should mention that if generated column is mentioned > in column list, generated columns mentioned in column list will be > replication irrespective of publish_generated_columns option. > > 4) This warning should be mentioned only if publish_generated_columns is > false: > if (TupleDescAttr(tupdesc, attnum - 1)->attgenerated) > - ereport(ERROR, > + ereport(WARNING, > > errcode(ERRCODE_INVALID_COLUMN_REFERENCE), > - errmsg("cannot use generated > column \"%s\" in publication column list", > + errmsg("specified generated > column \"%s\" in publication column list for publication with > publish_generated_columns as false", >colname)); > > 5) These tests are not required for this feature: > + 'ALTER PUBLICATION pub5 ADD TABLE test_table WHERE (col1 > 0);' => { > + create_order => 51, > + create_sql => > + 'ALTER PUBLICATION pub5 ADD TABLE > dump_test.test_table WHERE (col1 > 0);', > + regexp => qr/^ > + \QALTER PUBLICATION pub5 ADD TABLE ONLY > dump_test.test_table WHERE ((col1 > 0));\E > + /xm, > + like => { %full_runs, section_post_data => 1, }, > + unlike => { > + exclude_dump_test_schema => 1, > + exclude_test_table => 1, > + }, > + }, > + > + 'ALTER PUBLICATION pub5 ADD TABLE test_second_table WHERE > (col2 = \'test\');' > + => { > + create_order => 52, > + create_sql => > + 'ALTER PUBLICATION pub5 ADD TABLE > dump_test.test_second_table WHERE (col2 = \'test\');', > + regexp => qr/^ > + \QALTER PUBLICATION pub5 ADD TABLE ONLY > dump_test.test_second_table WHERE ((col2 = 'test'::text));\E > + /xm, > + like => { %full_runs, section_post_data => 1, }, > + unlike => { exclude_dump_test_schema => 1, }, > + }, > I have addressed all the comments in the v34-0001 Patch. Please refer to the updated v34-0001 Patch here in [1]. See [1] for the changes added. [1] https://www.postgresql.org/message-id/CAHv8RjJkUdYCdK_bL3yvEV%3DzKrA2dsnZYa1VMT2H5v0%2BqbaGbA%40mail.gmail.com Thanks and Regards, Shubham Khanna.
Re: Changing the default random_page_cost value
On Fri, Sep 27, 2024 at 8:07 AM Greg Sabino Mullane wrote:
> tl;dr let's assume SSDs are popular and HDDs are the exception and flip
> our default
>
> Granted, there are other factors involved, and yes, perhaps we should
> tweak some of the similar settings as well, but random_page_cost is the one
> setting most out of sync with today's hardware realities. So I'll be brave
> and throw a number out there: 1.2. And change our docs to say wordage like
> "if you are using an older hard disk drive technology, you may want to try
> raising rpc" to replace our fairly-hidden note about SSDs buried in the
> last sentence - of the fourth paragraph - of the rpc docs.

+1

I suggest a slightly nicer comment in the default conf file, like "For spinning hard drives, raise this to at least 3 and test".

Roberto
Re: Changing the default random_page_cost value
On Fri, 2024-09-27 at 10:07 -0400, Greg Sabino Mullane wrote: > So I'll be brave and throw a number out there: 1.2. +1 Laurenz Albe
Re: MAINTAIN privilege -- what do we need to un-revert it?
On Fri, Sep 27, 2024 at 12:42:34PM +0900, Yugo NAGATA wrote: > I agree with you. I overlooked WITH NO DATA. > I attached a updated patch. Thanks. Unless someone objects, I plan to commit this shortly. -- nathan
Re: micro-optimize nbtcompare.c routines
I've marked this one as Withdrawn. Apologies for the noise. -- nathan
Re: long-standing data loss bug in initial sync of logical replication
Hi Kuroda-san, Thanks for reviewing the patch. > > Solution: > > 1. When we alter a publication using commands like ‘ALTER PUBLICATION > > pub_name DROP TABLE table_name’, first all tables in the publications > > are invalidated using the function ‘rel_sync_cache_relation_cb’. Then > > again ‘rel_sync_cache_publication_cb’ function is called which > > invalidates all the tables. > > On my environment, rel_sync_cache_publication_cb() was called first and > invalidate > all the entries, then rel_sync_cache_relation_cb() was called and the > specified > entry is invalidated - hence second is NO-OP. > You are correct. I made a silly mistake while writing the write-up. rel_sync_cache_publication_cb() is called first and invalidate all the entries, then rel_sync_cache_relation_cb() is called and the specified entry is invalidated > > This happens because of the following > > callback registered: > > CacheRegisterSyscacheCallback(PUBLICATIONRELMAP, > > rel_sync_cache_publication_cb, (Datum) 0); > > But even in this case, I could understand that you want to remove the > rel_sync_cache_publication_cb() callback. Yes, I think rel_sync_cache_publication_cb() callback can be removed, as it is invalidating all the other tables as well (which are not in this publication). > > 2. When we add/drop a schema to/from a publication using command like > > ‘ALTER PUBLICATION pub_name ADD TABLES in SCHEMA schema_name’, first > > all tables in that schema are invalidated using > > ‘rel_sync_cache_relation_cb’ and then again > > ‘rel_sync_cache_publication_cb’ function is called which invalidates > > all the tables. > > Even in this case, rel_sync_cache_publication_cb() was called first and then > rel_sync_cache_relation_cb(). > Yes, your observation is correct. rel_sync_cache_publication_cb() is called first and then rel_sync_cache_relation_cb(). > > > > 3. When we alter a namespace using command like ‘ALTER SCHEMA > > schema_name RENAME to new_schema_name’ all the table in cache are > > invalidated as ‘rel_sync_cache_publication_cb’ is called due to the > > following registered callback: > > CacheRegisterSyscacheCallback(NAMESPACEOID, > > rel_sync_cache_publication_cb, (Datum) 0); > > > > So, we added a new callback function ‘rel_sync_cache_namespacerel_cb’ > > will be called instead of function ‘rel_sync_cache_publication_cb’ , > > which invalidates only the cache of the tables which are part of that > > particular namespace. For the new function the ‘namespace id’ is added > > in the Invalidation message. > > Hmm, I feel this fix is too much. Unlike ALTER PUBLICATION statements, I think > ALTER SCHEMA is rarely executed at the production stage. However, this > approach > requires adding a new cache callback system, which affects the entire postgres > system; this is not very beneficial compared to the outcome. It should be > discussed > on another thread to involve more people, and then we can add the improvement > after being accepted. > Yes, I also agree with you. I have removed the changes in the updated patch. > > Performance Comparison: > > I have run the same tests as shared in [1] and observed a significant > > decrease in the degradation with the new changes. With selective > > invalidation degradation is around ~5%. This results are an average of > > 3 runs. > > IIUC, the executed workload did not contain ALTER SCHEMA command, so > third improvement did not contribute this improvement. I have removed the changes corresponding to the third improvement. 
I have addressed the comment for 0002 patch and attached the patches. Also, I have moved the tests in the 0002 to 0001 patch. Thanks and Regards, Shlok Kyal v10-0002-Selective-Invalidation-of-Cache.patch Description: Binary data v10-0001-Distribute-invalidatons-if-change-in-catalog-tab.patch Description: Binary data
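For readers following along, the selective invalidation being discussed amounts to invalidating a single RelationSyncCache entry rather than the whole cache. The sketch below is a simplified version of that idea; RelationSyncCache and RelationSyncEntry are pgoutput.c internals, and the InvalidOid "invalidate everything" path is omitted.

/*
 * Simplified sketch: mark only the RelationSyncCache entry for the given
 * relation as invalid instead of sweeping the whole cache.  The real
 * rel_sync_cache_relation_cb() in pgoutput.c does more than this.
 */
static void
rel_sync_cache_relation_cb(Datum arg, Oid relid)
{
    RelationSyncEntry *entry;

    if (RelationSyncCache == NULL)
        return;

    /* Look up just this relation's entry; do nothing if it isn't cached. */
    entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
                                              &relid, HASH_FIND, NULL);
    if (entry != NULL)
        entry->replicate_valid = false;
}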
Re: On disable_cost
On Wed, 2024-08-21 at 10:29 -0400, Robert Haas wrote:
> I went ahead and committed these patches. I know there's some debate
> over whether we want to show the # of disabled nodes and if so whether
> it should be controlled by COSTS, and I suspect I haven't completely
> allayed David's concerns about the initial_cost_XXX functions although
> I think that I did the right thing. But, I don't have the impression
> that anyone is desperately opposed to the basic concept, so I think it
> makes sense to put these into the tree and see what happens. We have
> quite a bit of time left in this release cycle to uncover bugs, hear
> from users or other developers, etc. about what problems there may be
> with this. If we end up deciding to reverse course or need to fix a
> bunch of stuff, so be it, but let's see what the feedback is.

I am somewhat unhappy about the "Disabled Nodes" in EXPLAIN.

First, the commit message confused me: it claims that the information is displayed with EXPLAIN ANALYZE, but it's shown with every EXPLAIN. But that's not important. My complaints are:

1. The "disabled nodes" are always displayed. I'd be happier if they were only shown for COSTS ON, but I think it would be best if they were only shown with VERBOSE ON. After all, the messages are pretty verbose...

2. The "disabled nodes" are not only shown at the nodes that were actually disabled, but also at every node above them.

This would be fine:

  Sort
    -> Nested Loop Join
         -> Hash Join
              -> Index Scan
                   Disabled Nodes: 1
              -> Hash
                   -> Index Scan
                        Disabled Nodes: 1
         -> Index Scan
              Disabled Nodes: 1

This is annoying:

  Sort
    Disabled Nodes: 3
    -> Nested Loop Join
         Disabled Nodes: 3
         -> Hash Join
              Disabled Nodes: 2
              -> Index Scan
                   Disabled Nodes: 1
              -> Hash
                   -> Index Scan
                        Disabled Nodes: 1
         -> Index Scan
              Disabled Nodes: 1

I have no idea how #2 could be implemented, but it would be nice to have.

Please, please, can we show the "disabled nodes" only with VERBOSE?

Yours,
Laurenz Albe
Re: MAINTAIN privilege -- what do we need to un-revert it?
On Fri, 2024-09-27 at 10:34 -0500, Nathan Bossart wrote: > On Fri, Sep 27, 2024 at 12:42:34PM +0900, Yugo NAGATA wrote: > > I agree with you. I overlooked WITH NO DATA. > > I attached a updated patch. > > Thanks. Unless someone objects, I plan to commit this shortly. The command is run effectively in two parts: the CREATE part and the REFRESH part. The former just uses the session search path, while the latter uses the safe search path. I suggest that we add the wording to the query portion of the doc, near "security- restricted operation". Regards, Jeff Davis
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)
Christoph Berg writes: > Re: Tom Lane >> So the first part of that is great, but if your script file is >> large you probably won't be happy about having the whole thing >> repeated in the "QUERY" field. So this needs some work on >> user-friendliness. > Does this really have to be addressed? It would be way better than it > is now, and errors during extension creation are rare and mostly for > developers only, so it doesn't have to be pretty. Perhaps. I spent a little more effort on this and added code to report errors that don't come with an error location. On those, we don't have any constraints about what to report in the QUERY field, so I made it trim the string to just the current query within the script, which makes things quite a bit better. You can see the results in the test_extensions regression test changes. (It might be worth some effort to trim away comments appearing just before a command, but I didn't tackle that here.) regards, tom lane diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c index fab59ad5f6..0fae1332d2 100644 --- a/src/backend/commands/extension.c +++ b/src/backend/commands/extension.c @@ -54,6 +54,7 @@ #include "funcapi.h" #include "mb/pg_wchar.h" #include "miscadmin.h" +#include "nodes/queryjumble.h" #include "storage/fd.h" #include "tcop/utility.h" #include "utils/acl.h" @@ -107,6 +108,17 @@ typedef struct ExtensionVersionInfo struct ExtensionVersionInfo *previous; /* current best predecessor */ } ExtensionVersionInfo; +/* + * Information for script_error_callback() + */ +typedef struct +{ + const char *sql; /* entire script file contents */ + const char *filename; /* script file pathname */ + ParseLoc stmt_location; /* current stmt start loc, or -1 if unknown */ + ParseLoc stmt_len; /* length in bytes; 0 means "rest of string" */ +} script_error_callback_arg; + /* Local functions */ static List *find_update_path(List *evi_list, ExtensionVersionInfo *evi_start, @@ -670,9 +682,60 @@ read_extension_script_file(const ExtensionControlFile *control, return dest_str; } +/* + * error context callback for failures in script-file execution + */ +static void +script_error_callback(void *arg) +{ + script_error_callback_arg *callback_arg = (script_error_callback_arg *) arg; + int syntaxerrposition; + const char *lastslash; + + /* If it's a syntax error, convert to internal syntax error report */ + syntaxerrposition = geterrposition(); + if (syntaxerrposition > 0) + { + /* + * We must report the whole string because otherwise details such as + * psql's line number report would be wrong. + */ + errposition(0); + internalerrposition(syntaxerrposition); + internalerrquery(callback_arg->sql); + } + else if (callback_arg->stmt_location >= 0) + { + /* + * Since no syntax cursor will be shown, it's okay and helpful to trim + * the reported query string to just the current statement. + */ + const char *query = callback_arg->sql; + int location = callback_arg->stmt_location; + int len = callback_arg->stmt_len; + + query = CleanQuerytext(query, &location, &len); + internalerrquery(pnstrdup(query, len)); + } + + /* + * Trim the reported file name to remove the path. We know that + * get_extension_script_filename() inserted a '/', regardless of whether + * we're on Windows. + */ + lastslash = strrchr(callback_arg->filename, '/'); + if (lastslash) + lastslash++; + else + lastslash = callback_arg->filename; /* shouldn't happen, but cope */ + errcontext("extension script file \"%s\"", lastslash); +} + /* * Execute given SQL string. 
* + * The filename the string came from is also provided, for error reporting. + * * Note: it's tempting to just use SPI to execute the string, but that does * not work very well. The really serious problem is that SPI will parse, * analyze, and plan the whole string before executing any of it; of course @@ -682,12 +745,27 @@ read_extension_script_file(const ExtensionControlFile *control, * could be very long. */ static void -execute_sql_string(const char *sql) +execute_sql_string(const char *sql, const char *filename) { + script_error_callback_arg callback_arg; + ErrorContextCallback scripterrcontext; List *raw_parsetree_list; DestReceiver *dest; ListCell *lc1; + /* + * Setup error traceback support for ereport(). + */ + callback_arg.sql = sql; + callback_arg.filename = filename; + callback_arg.stmt_location = -1; + callback_arg.stmt_len = -1; + + scripterrcontext.callback = script_error_callback; + scripterrcontext.arg = (void *) &callback_arg; + scripterrcontext.previous = error_context_stack; + error_context_stack = &scripterrcontext; + /* * Parse the SQL string into a list of raw parse trees. */ @@ -709,6 +787,10 @@ execute_sql_string(const char *sql) List *stmt_list; ListCell *lc2; + /* Report location of this query for error context callback */ + callback_arg.stmt_location = parsetre
Re: Make COPY format extendable: Extract COPY TO format implementations
Hi, On Sun, Aug 4, 2024 at 3:20 PM Sutou Kouhei wrote: > > Hi, > > I re-ran the benchmark(*) with the v19 patch set and the > following CPUs: > > 1. AMD Ryzen 9 3900X 12-Core Processor > 2. Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz > > (*) > * Use tables that have {5,10,15,20,25,30} integer columns > * Use tables that have {1,2,3,4,5,6,7,8,9,10}M rows > * Use '/dev/null' for COPY TO > * Use blackhole_am for COPY FROM > > See the attached graphs for details. > > Notes: > * X-axis is the number of columns > * Y-axis is the number of M rows > * Z-axis is the elapsed time percent (smaller is faster, > e.g. 99% is a bit faster than the HEAD and 101% is a bit > slower than the HEAD) > * Z-ranges aren't same (The Ryzen case uses about 79%-121% > but the Intel case uses about 91%-111%) > * Red means the patch is slower than HEAD > * Blue means the patch is faster than HEAD > * The upper row shows FROM results > * The lower row shows TO results > > Here are summaries based on the results: > > For FROM: > * With Ryzen: It shows that negative performance impact > * With Intel: It shows that negative performance impact with > 1-5M rows and positive performance impact with 6M-10M rows > For TO: > * With Ryzen: It shows that positive performance impact > * With Intel: It shows that positive performance impact > > Here are insights based on the results: > > * 0001 (that introduces Copy{From,To}Routine} and adds some > "if () {...}" for them but the existing formats still > doesn't use them) has a bit negative performance impact > * 0002 (that migrates the existing codes to > Copy{From,To}Routine} based implementations) has positive > performance impact > * For FROM: Negative impact by 0001 and positive impact by > 0002 almost balanced > * We should use both of 0001 and 0002 than only 0001 > * With Ryzon: It's a bit slower than HEAD. So we may not > want to reject this propose for FROM > * With Intel: > * With 1-5M rows: It's a bit slower than HEAD > * With 6-10M rows: It's a bit faster than HEAD > * For TO: Positive impact by 0002 is larger than negative > impact by 0002 > * We should use both of 0001 and 0002 than only 0001 > * 0003 (that makes Copy{From,To}Routine Node) has a bit > negative performance impact > * But I don't know why. This doesn't change per row > related codes. Increasing Copy{From,To}Routine size > (NodeTag is added) may be related. > * 0004 (that moves Copy{From,To}StateData to copyapi.h) > doesn't have impact > * It makes sense because this doesn't change any > implementations. > * 0005 (that add "void *opaque" to Copy{From,To}StateData) > has a bit negative impact for FROM and a bit positive > impact for TO > * But I don't know why. This doesn't change per row > related codes. Increasing Copy{From,To}StateData size > ("void *opaque" is added) may be related. I was surprised that the 0005 patch made COPY FROM slower (with fewer rows) and COPY TO faster overall in spite of just adding one struct field and some functions. I'm interested in why the performance trends of COPY FROM are different between fewer than 6M rows and more than 6M rows. > > How to proceed this proposal? > > * Do we need more numbers to judge this proposal? > * If so, could someone help us? > * There is no negative performance impact for TO with both > of Ryzen and Intel based on my results. Can we merge only > the TO part? > * Can we defer the FROM part? Should we proceed this > proposal with both of the FROM and TO part? > * Could someone provide a hint why the FROM part is more > slower with Ryzen? 
>

Separating the patches into two parts (one for COPY TO and another for COPY FROM) could be a good idea. It would help with reviews and with investigating the performance regression in the COPY FROM cases. And I think we can commit them separately.

Also, could you please rebase the patches, as they conflict with the current HEAD? I'll run some benchmarks on my environment as well.

Regards,

-- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
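As background for readers who have not opened the patches: the Copy{From,To}Routine idea is essentially a per-format callback table that the COPY code dispatches through instead of hard-coded format branches. The sketch below only illustrates that shape; the struct and callback names are invented here and need not match the actual patch set.

/*
 * Schematic only -- not the actual definitions from the patches.
 * The point is the shape: one function-pointer table per COPY format.
 */
#include "postgres.h"
#include "access/tupdesc.h"
#include "executor/tuptable.h"

typedef struct CopyToStateData *CopyToState;    /* opaque per-COPY state */

typedef struct CopyToRoutine
{
    void        (*start_fn) (CopyToState cstate, TupleDesc tupDesc);
    void        (*one_row_fn) (CopyToState cstate, TupleTableSlot *slot);
    void        (*end_fn) (CopyToState cstate);
} CopyToRoutine;

/*
 * With such a table, the per-row code becomes a single indirect call
 * instead of "if (binary) ... else if (csv) ..." branching, which is
 * roughly the change whose cost the benchmarks above are measuring.
 */
static inline void
copy_send_one_row(const CopyToRoutine *routine, CopyToState cstate,
                  TupleTableSlot *slot)
{
    routine->one_row_fn(cstate, slot);
}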
Re: Add new COPY option REJECT_LIMIT
+/*
+ * Extract REJECT_LIMIT value from a DefElem.
+ */
+static int64
+defGetCopyRejectLimitOptions(DefElem *def)
+{
+	int64		reject_limit;
+
+	if (def->arg == NULL)
+		ereport(ERROR,
+				(errcode(ERRCODE_SYNTAX_ERROR),
+				 errmsg("REJECT_LIMIT requires a positive integer")));
+
+	if (nodeTag(def->arg) == T_Integer)
+	{
+		reject_limit = defGetInt64(def);
+		if (reject_limit <= 0)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("REJECT_LIMIT (%lld) must be greater than zero",
+							(long long) reject_limit)));
+	}
+	else
+	{
+		char	   *sval = defGetString(def);
+
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("REJECT_LIMIT (%s) must be a positive integer",
+						sval)));
+	}
+
+	return reject_limit;
+}

Won't there be an integer overflow issue here? Can you try the following?

static int64
defGetCopyRejectLimitOptions(DefElem *def)
{
	int64		reject_limit;

	reject_limit = defGetInt64(def);
	if (reject_limit <= 0)
		ereport(ERROR,
				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
				 errmsg("REJECT_LIMIT (%lld) must be greater than zero",
						(long long) reject_limit)));

	return reject_limit;
}

Also, the syntax synopsis says

    REJECT_LIMIT integer

I think you want REJECT_LIMIT to be a bigint, so here it should be

    REJECT_LIMIT bigint

?