Re: help for pg_wal issue
At Mon, 27 Jun 2022 12:28:18 +0200, Laurenz Albe wrote in > You forgot to tell us how exactly you are performing that backup. Yeah, but Google told me that Veritas may complain with that message. Anywhat the tools is, it seems like that the tool could not continue after it found that a file had been deleted before copying it. That means the server is live. If you take a copy of files from a live server, the backup tool should allow some files are removed or added while copying. If you are taking a base backup (aka online backup), you need to take several subsidiary steps. https://www.postgresql.org/docs/devel/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP If you still see the point, feel free to ask further here but with showing exactly how you are peforming that backup:p regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: help for pg_wal issue
At Tue, 28 Jun 2022 16:28:31 +0900 (JST), Kyotaro Horiguchi wrote in > At Mon, 27 Jun 2022 12:28:18 +0200, Laurenz Albe > wrote in > > You forgot to tell us how exactly you are performing that backup. > > Yeah, but Google told me that Veritas may complain with that > message. Anywhat the tools is, it seems like that the tool could not > continue after it found that a file had been deleted before copying > it. That means the server is live. If you take a copy of files from a > live server, the backup tool should allow some files are removed or > added while copying. > > If you are taking a base backup (aka online backup), you need to take > several subsidiary steps. > > https://www.postgresql.org/docs/devel/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP > - If you still see the point, feel free to ask further here but with Mmm.. My fingers are fat and eyes are slipperly. + If you don't still see the point, feel free to ask further here but with > showing exactly how you are peforming that backup:p > > regards. > > -- > Kyotaro Horiguchi > NTT Open Source Software Center -- Kyotaro Horiguchi NTT Open Source Software Center
GIN index operator ?(jsonb,text) not working?
Hi, dbfiddle: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=060af497bbb75ecddad9fd2744f8022b --- create table test101 ( doc_id bigserial, document jsonb); insert into test101(document) values ('{"user_removed" :false}') returning *; insert into test101(document) select '{"user_removed" :false}'::jsonb from generate_series(1,1); insert into test101(document) select ('{"user_remove" :false, "test":' || i ||'}'::text)::jsonb from generate_series(1,400) i; CREATE INDEX test101_gin_user_removed_na ON test101 USING GIN (document jsonb_ops) where (document ? 'user_removed') is false; CREATE INDEX test101_gin_user_removed_na_b ON test101(document) where (document ? 'user_removed') is false; Since 400 is very few percent compared to 1. but the following query will not use any of the indexes. explain (analyze) select * from test101 where document ? 'user_removed' is false ; -- I recommend David Deutsch's <> Jian
Re: Libpq question related to allocated resources
On 6/27/2022 23:22, Tom Lane wrote: Karl Denninger writes: But -- I still have a /lot /of memory out on the heap according to jemalloc stats that is not being deallocated, and what's worse is that if I rig the code to call PQfinish and then PQconnect once again I get /even more /imbalanced allocate/free counts (and the memory use in said buckets to go with them.) Hmmm ... I'm not aware of any memory leaks in libpq, but that doesn't mean there are none. Of course, if you're forgetting to PQclear() some PGresults, that's not libpq's fault ;-). Well, yes, which is why I wrapped those calls to make very sure that's not the case (internal reference count in the code when in "debugging mode", basically) along with all the uses of escape/unescape (e.g. bytea fields.) All come back clean on each "round" through which makes it quite puzzling. I'll do more digging. I've got wrappers around all memory allocation in my development libraries that for internal allocations make quite sure that they're both properly paired and sentinels are on the "bookends" so if the code does smash something it catches it, and there's nothing being flagged; the arena, as my code sees it from what it allocated and the calls it made to libpq, are empty when it comes back as it should be. Obviously there's leakage somewhere but at this point I'm quite certain its not in my code itself (its certainly possible FastCGI's lib has a problem somewhere since it has to construct the environment from the web server's CGI call for each call to the application, each of those is distinct, and if something goes wrong there it will leak like crazy since each of those constructs is unique and then must be properly released when that call is complete.) The obvious question, given the warnings in the FastCGI library: Does libpq /modify /the process environment? No. At least, I see no setenv() calls in it, and I think that it'd be pretty unfriendly for a library to do that to its host application. In this case it would be fatal if that was to happen since the environment is synthetic and different on each call; if any part of the environment gets modified then the release by the caller will either leak or, possibly, result in a SEGV. A quick grep implies that indeed it may in backend/libpq/auth.c at least, backend/libpq is unrelated to interfaces/libpq. (I've seen hints that they arose from a common code base, but if so, that was a few decades and a lot of rewrites ago.) regards, tom lane Gotcha. It wasn't clear that this was or wasn't implicated and I'm digging for potential sources, thus the question. Thanks. -- Karl Denninger k...@denninger.net /The Market Ticker/ /[S/MIME encrypted email preferred]/ smime.p7s Description: S/MIME Cryptographic Signature
Re: Unique index prohibits partial aggregates
On Wed, 29 Jun 2022 at 00:45, Bos, Fred wrote: > Finally, is there a way to force postgres to do the partial hash aggregate, > either by changing a setting or by influencing the expected amount of output > groups for each query? You could do something like: ALTER TABLE bhload_nohyp_noin ALTER COLUMN t SET (n_distinct = 200); ANALYZE bhload_nohyp_noin; Please be aware that this may have detrimental effects if you do any joins or group bys directly on this column. Otherwise, providing you don't have a unique index on that column, then it should trick the planner into thinking there will be fewer groups than it currently thinks there will be, which will likely result in the parallel plan that you desire. David
User's responsibility when using a chain of "immutable" functions?
I’ve copied my self-contained testcase at the end. I create three functions, marking each of them "immutable". "f1()" simply returns the manifest constant 'dog'. So it seems perfectly honest to mark it as I did. "f2()" simply returns "f1()"—so, here too, it seems that I'm being honest. But I do see that I'm using human reasoning, and that Postgres cannot check that I'm right. In the same way, and with the same reasoning for my marking, "f3()" returns "f2()". Then I do this: select rpad('at start', 30) as history, f1(), f2(), f3(); Then I drop, and then re-create "f(1)", now returning 'cat', and do this: select rpad('after drop, re-create f1()', 30) as history, f1(), f2(), f3(); Finally, I create-and-replace "f3()", using the identical source text, and do this: select rpad('after create-and-replace f3()', 30) as history, f1(), f2(), f3(); Here's what I see when I run my .sql script: history | f1 | f2 | f3 +-+-+- at start | dog | dog | dog after drop, re-create f1() | cat | cat | dog after create-and-replace f3() | cat | cat | cat I understand that the possible session-duration caching that I allow with "immutable" doesn't guarantee that I'll get any caching. But I had expected a cascade purge on anything that was cashed on delete of "f1()". Something seems odd to me: if I do my "select f1(), f2(), f3()" after dropping "f1()" (i.e. before re-creating it) then I get an ordinary error saying that "f1()" doesn't exist. So it seems that Postgres does understand the dynamic dependency chain—even when the result from "f3()" is cached. If I then recreate "f1()" to return 'cat', I get no error—but, same as in my straight-through test, "f3()" continues to return its cached (and now "wrong") result. Should I simply understand that when I have such a dynamic dependency chain of "immutable" functions, and should I drop and re-create the function at the start of the chain, then all bets are off until I drop and re-create every function along the rest of the chain? -- testcase.sql \t off drop function if exists f3() cascade; drop function if exists f2() cascade; drop function if exists f1() cascade; create function f1() returns text immutable language plpgsql as $body$ begin return 'dog'; end; $body$; create function f2() returns text immutable language plpgsql as $body$ begin return f1(); end; $body$; create function f3() returns text immutable language plpgsql as $body$ begin return f2(); end; $body$; select rpad('at start', 30) as history, f1(), f2(), f3(); \t on drop function f1() cascade; create function f1() returns text immutable language plpgsql as $body$ begin return 'cat'; end; $body$; select rpad('after drop, re-create f1()', 30) as history, f1(), f2(), f3(); create or replace function f3() returns text immutable language plpgsql as $body$ declare t1 constant text := f2(); begin return t1; end; $body$; select rpad('after create-and-replace f3()', 30) as history, f1(), f2(), f3(); \t off
Re: User's responsibility when using a chain of "immutable" functions?
> On Jun 28, 2022, at 18:41, Bryn Llewellyn wrote: > Should I simply understand that when I have such a dynamic dependency chain > of "immutable" functions, and should I drop and re-create the function at the > start of the chain, then all bets are off until I drop and re-create every > function along the rest of the chain? Yes. You don't have to drop and recreate the functions, though. DISCARD PLANS handles it as well: xof=# create function f1() returns text as $$ begin return 'cat'; end $$ language plpgsql immutable; CREATE FUNCTION xof=# create function f2() returns text as $$ begin return f1(); end $$ language plpgsql immutable; CREATE FUNCTION xof=# create function f3() returns text as $$ begin return f2(); end $$ language plpgsql immutable; CREATE FUNCTION xof=# select f1(), f2(), f3(); f1 | f2 | f3 -+-+- cat | cat | cat (1 row) xof=# drop function f1(); DROP FUNCTION xof=# create function f1() returns text as $$ begin return 'dog'; end $$ language plpgsql immutable; CREATE FUNCTION xof=# select f1(), f2(), f3(); f1 | f2 | f3 -+-+- dog | dog | cat (1 row) xof=# discard plans; DISCARD PLANS xof=# select f1(), f2(), f3(); f1 | f2 | f3 -+-+- dog | dog | dog (1 row) xof=# The contract on an immutable function is that it returns the same return value for particular input values regardless of database or system state: that is, it's a pure function. Changing the definition in such a way breaks the contract, so I don't think PostgreSQL needs to do heroics to accommodate that situation. (For example, changing the definition of an immutable function that's used in an expression index could corrupt the index.) If one's fixing a bug, then rolling out the change in a controlled way is a reasonable requirement.
Re: User's responsibility when using a chain of "immutable" functions?
On Tue, Jun 28, 2022 at 7:03 PM Christophe Pettus wrote: > > > > On Jun 28, 2022, at 18:41, Bryn Llewellyn wrote: > > Should I simply understand that when I have such a dynamic dependency > chain of "immutable" functions, and should I drop and re-create the > function at the start of the chain, then all bets are off until I drop and > re-create every function along the rest of the chain? > > Yes. > > You don't have to drop and recreate the functions, though. DISCARD PLANS > handles it as well: > > Specifically: > xof=# select f1(), f2(), f3(); > f1 | f2 | f3 > -+-+- > cat | cat | cat > (1 row) > The pl/pgsql plan cache now contains the following: SELECT f1() => 'cat' SELECT f2() => 'cat' > > xof=# drop function f1(); > DROP FUNCTION > Now the cache only contains: SELECT f2() => 'cat' The f1 plan has been invalidated due to the drop/replace action on the f1 function > xof=# create function f1() returns text as $$ begin return 'dog'; end $$ > language plpgsql immutable; > CREATE FUNCTION > xof=# select f1(), f2(), f3(); > f1 | f2 | f3 > -+-+- > dog | dog | cat > (1 row) > > And so f3()'s invocation of "SELECT f2()" yields 'cat' from the cache since that one hasn't been invalidated. While f2() replans its f1() invocation and thus returns 'dog' The fundamental limitation here is that there really is no attempt being made to deal with inter-functional dependencies. Their bodies are blackboxes (...wonders how this resolves in the new SQL Standard Function Bodies implementation...) and no explicit dependency information is recorded either. So we don't know that the saved plan for f2() depends on a specific version of f1() and thus if f1() is changed plans involving f2() should be invalidated along with plans involving f1(). Nor is there sufficient recognized benefit to doing so. David J.
Re: User's responsibility when using a chain of "immutable" functions?
> david.g.johns...@gmail.com wrote: > >> x...@thebuild.com wrote: >> >>> b...@yugabyte.com wrote: >>> >>> Should I simply understand that when I have such a dynamic dependency chain >>> of "immutable" functions, and should I drop and re-create the function at >>> the start of the chain, then all bets are off until I drop and re-create >>> every function along the rest of the chain? >> >> >> Yes. >> >> You don't have to drop and recreate the functions, though. DISCARD PLANS >> handles it as well: > > Specifically: > >> select f1(), f2(), f3(); >> f1 | f2 | f3 >> -+-+- >> cat | cat | cat > > > The pl/pgsql plan cache now contains the following: > > SELECT f1() => 'cat' > SELECT f2() => 'cat' > >> drop function f1(); > > Now the cache only contains: > > SELECT f2() => 'cat' > > The f1 plan has been invalidated due to the drop/replace action on the f1 > function > >> create function f1() returns text as $$ begin return 'dog'; end $$ language >> plpgsql immutable; >> select f1(), f2(), f3(); >> f1 | f2 | f3 >> -+-+- >> dog | dog | cat > > And so f3()'s invocation of "SELECT f2()" yields 'cat' from the cache since > that one hasn't been invalidated. While f2() replans its f1() invocation and > thus returns 'dog' > > The fundamental limitation here is that there really is no attempt being made > to deal with inter-functional dependencies. Their bodies are blackboxes > (...wonders how this resolves in the new SQL Standard Function Bodies > implementation...) and no explicit dependency information is recorded either. > So we don't know that the saved plan for f2() depends on a specific version > of f1() and thus if f1() is changed plans involving f2() should be > invalidated along with plans involving f1(). Nor is there sufficient > recognized benefit to doing so. DISCARD PLANS is unsafe in a multi-user concurrent scenario. The doc says explicitly that its scope is just the single session. And it's easy to show the danger by using my testcase manually, step by appropriate step, with two concurrent sessions. However, you said (indirectly) that the session-duration caching is a red herring—and that the real danger comes with an expression-based index that involves a PL/pgSQL function. I agree. PG's lack of dependency tracking shows up with just a "worker" function f1() and a "jacket" function f2() when you base the index on f2(). You can happily drop and recreate f1() with a new implementation while the index lives on. (For the reasons that we've mentioned, the "2BP01: cannot drop function... because other objects depend on it" error doesn't occur.) I've concluded that the only practical practice for "immutable" is to reserve its use for functions that don't mention even a single user-created artifact. Moreover, this "hermetic" property of a to-be-immutable function can be established only by human analysis of the function's source code.
Re: User's responsibility when using a chain of "immutable" functions?
st 29. 6. 2022 v 6:28 odesílatel Bryn Llewellyn napsal: > *david.g.johns...@gmail.com wrote:* > > > *x...@thebuild.com wrote:* > > b...@yugabyte.com wrote: > > Should I simply understand that when I have such a dynamic dependency > chain of "immutable" functions, and should I drop and re-create the > function at the start of the chain, then all bets are off until I drop and > re-create every function along the rest of the chain? > > > Yes. > > You don't have to drop and recreate the functions, though. DISCARD PLANS > handles it as well: > > > Specifically: > > > > > > *select f1(), f2(), f3(); f1 | f2 | f3 -+-+- cat | cat | > cat* > > The pl/pgsql plan cache now contains the following: > > SELECT f1() => 'cat' > SELECT f2() => 'cat' > > > *drop function f1();* > Now the cache only contains: > > SELECT f2() => 'cat' > > The f1 plan has been invalidated due to the drop/replace action on the f1 > function > > > *create function f1() returns text as $$ begin return 'dog'; end $$ > language plpgsql immutable;* > > > > *select f1(), f2(), f3(); f1 | f2 | f3 -+-+- dog | dog | > cat* > > > And so f3()'s invocation of "SELECT f2()" yields 'cat' from the cache > since that one hasn't been invalidated. While f2() replans its f1() > invocation and thus returns 'dog' > > The fundamental limitation here is that there really is no attempt being > made to deal with inter-functional dependencies. Their bodies are > blackboxes (...wonders how this resolves in the new SQL Standard > Function Bodies implementation...) and no explicit dependency information > is recorded either. So we don't know that the saved plan for f2() depends > on a specific version of f1() and thus if f1() is changed plans involving > f2() should be invalidated along with plans involving f1(). Nor is there > sufficient recognized benefit to doing so. > > > DISCARD PLANS is unsafe in a multi-user concurrent scenario. The doc says > explicitly that its scope is just the single session. And it's easy to show > the danger by using my testcase manually, step by appropriate step, with > two concurrent sessions. > > However, you said (indirectly) that the session-duration caching is a red > herring—and that the real danger comes with an expression-based index that > involves a PL/pgSQL function. I agree. > > PG's lack of dependency tracking shows up with just a "worker" function > f1() and a "jacket" function f2() when you base the index on f2(). You can > happily drop and recreate f1() with a new implementation while the index > lives on. (For the reasons that we've mentioned, the "2BP01: cannot drop > function... because other objects depend on it" error doesn't occur.) > > *I've concluded that the only practical practice for "immutable" is to > reserve its use for functions that don't mention even a single user-created > artifact*. > > Moreover, this "hermetic" property of a to-be-immutable function can be > established only by human analysis of the function's source code. > Our immutable functions are more tolerant than they should be - for real immutable functions we should disallow SQL inside functions (and everything that is not immutable (plpgsql_check raises warning in this case)), but it is allowed. On the second hand, it allows some very dirty tricks with the planner. Regards Pavel
Re: User's responsibility when using a chain of "immutable" functions?
On Tuesday, June 28, 2022, Bryn Llewellyn wrote: > > DISCARD PLANS is unsafe in a multi-user concurrent scenario. The doc says > explicitly that its scope is just the single session. And it's easy to show > the danger by using my testcase manually, step by appropriate step, with > two concurrent sessions. > Impractical maybe, but hardly unsafe (well, relative to not doing anything). > > *I've concluded that the only practical practice for "immutable" is to > reserve its use for functions that don't mention even a single user-created > artifact*. > That seems like too strong a position to take for me. Go ahead and build immutable utility functions. Just don’t be stupid break your promise. But even then, there are ways to fix things in case of bugs. David J.
Re: User's responsibility when using a chain of "immutable" functions?
Pavel Stehule writes: > st 29. 6. 2022 v 6:28 odesílatel Bryn Llewellyn napsal: >> Moreover, this "hermetic" property of a to-be-immutable function can be >> established only by human analysis of the function's source code. > Our immutable functions are more tolerant than they should be - for real > immutable functions we should disallow SQL inside functions (and everything > that is not immutable (plpgsql_check raises warning in this case)), but it > is allowed. It's generally believed that Turing's proof of the undecidability of the halting problem [1] implies that it's impossible to mechanically prove or refute function properties like immutability. Admittedly, Turing was concerned with the most general case --- that is, he showed that *there exist* cases for which no algorithm can give the right answer, not that any specific practical case can't be proven. Still, that result has discouraged most people from spending much time on mechanically checking such things. If you declare a function immutable, Postgres will believe you; the consequences if you lied are on your own head. regards, tom lane [1] https://en.wikipedia.org/wiki/Halting_problem
Re: User's responsibility when using a chain of "immutable" functions?
st 29. 6. 2022 v 7:46 odesílatel Tom Lane napsal: > Pavel Stehule writes: > > st 29. 6. 2022 v 6:28 odesílatel Bryn Llewellyn > napsal: > >> Moreover, this "hermetic" property of a to-be-immutable function can be > >> established only by human analysis of the function's source code. > > > Our immutable functions are more tolerant than they should be - for real > > immutable functions we should disallow SQL inside functions (and > everything > > that is not immutable (plpgsql_check raises warning in this case)), but > it > > is allowed. > > It's generally believed that Turing's proof of the undecidability of > the halting problem [1] implies that it's impossible to mechanically > prove or refute function properties like immutability. Admittedly, > Turing was concerned with the most general case --- that is, he showed > that *there exist* cases for which no algorithm can give the right > answer, not that any specific practical case can't be proven. > Still, that result has discouraged most people from spending much > time on mechanically checking such things. If you declare a function > immutable, Postgres will believe you; the consequences if you lied > are on your own head. > We cannot ensure that the function is immutable, but we can detect that the function is not very probably immutable (in execution time). calling volatile function from immutable function using SELECT from tables inside immutable function This is a clear violation of some "protocol". I know why it is, and I don't propose change, because it can break thousands of applications. And for some specific cases the strong restrictivity can be safe but not practical. > regards, tom lane > > [1] https://en.wikipedia.org/wiki/Halting_problem >
Re: User's responsibility when using a chain of "immutable" functions?
Pavel Stehule writes: > st 29. 6. 2022 v 7:46 odesílatel Tom Lane napsal: >> ... that result has discouraged most people from spending much >> time on mechanically checking such things. If you declare a function >> immutable, Postgres will believe you; the consequences if you lied >> are on your own head. > We cannot ensure that the function is immutable, but we can detect that the > function is not very probably immutable (in execution time). Sure, there are a lot of easy cases where we could say "that's obviously not immutable". But is it worth spending engineering effort and runtime on that? I suspect the cases that people might actually mess up are less obvious, so that we might accomplish little more than offering a false sense of security. regards, tom lane
Re: User's responsibility when using a chain of "immutable" functions?
On Tue, 2022-06-28 at 19:02 -0700, Christophe Pettus wrote: > > On Jun 28, 2022, at 18:41, Bryn Llewellyn wrote: > > Should I simply understand that when I have such a dynamic dependency > > chain of "immutable" functions, and should I drop and re-create the > > function at the start of the chain, then all bets are off until I drop > > and re-create every function along the rest of the chain? > > Yes. That is not enough in the general case. You are not allowed to redefine an IMMUTABLE function in a way that changes its behavior: CREATE FUNCTION const(integer) RETURNS integer LANGUAGE plpgsql IMMUTABLE AS 'BEGIN RETURN $1; END;'; CREATE TABLE t (x integer); INSERT INTO t VALUES (1); CREATE INDEX ON t (const(x)); SET enable_seqscan = off; SELECT * FROM t WHERE const(x) = 1; -- returns a correct result x ═══ 1 (1 row) CREATE OR REPLACE FUNCTION const(integer) RETURNS integer LANGUAGE plpgsql IMMUTABLE AS 'BEGIN RETURN $1 + 1; END;'; SELECT * FROM t WHERE const(x) = 1; -- returns a bad result x ═══ 1 (1 row) Of course, you are allowed to cheat if you know what you are doing. But any problem you encounter that way is your own problem entirely. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com