Re: search_path for PL/pgSQL functions partially cached?
On Fri, 3 Jan 2025 08:34:57 -0700 "David G. Johnston" wrote: > On Friday, January 3, 2025, Jan Behrens wrote: > > > > I would like to know if the above example is correct. It seems overall > > bulky, but I haven't found a better way, assuming that it can be > > unknown where a particular extension has been installed to. In > > particular I feel a bit insecure about where I have to fully qualify, > > and where not. See the comments in the code above. > > > Short answer, you cannot looking at a definition and know the answer - > whether the code is going to be executed in a sanitized search_path is what > matters. I don't understand. Do you mean my last example is wrong / insecure? If so, why? > Anything that would be executed during pg_restore has to be made > safe. Therefore, code that is only ever executed by applications directly > can use swarch_path. Why should the function be executed during pg_restore? > > I’d probably modify the function signature to take search_path as a second > optional argument and then invoke a set search_path within the function. > At worse the caller can place current_setting(search_path) as the value of > that argument though being explicit would be recommended. > > David J. I could do that, but I would like to understand if that is really necessary as it makes the interface more complicated, and I would like to avoid unnecessary complexity in my interface. Is it really impossible to have functions without SET search_path in the definition of a PL/pgSQL function if I fully-qualify all types in the DECLARE section and if all other non-qualified identifiers occur after set_config('search_path', ...)? Kind regards, Jan Behrens
Re: search_path for PL/pgSQL functions partially cached?
On Fri, 3 Jan 2025 10:16:15 -0700 "David G. Johnston" wrote: > It is at risk because it depends on the session search_path. That is all. > Whether that risk turns into a failure to execute depends on how/when it is > executed. I'm not that comfortable talking about security risks in this > context though the current design goal is to mitigate such security issues > by setting things up so the function execution fails rather than is > executed insecurely. This is presently mainly done by setting the > search_path to just effectively pg_catalog before executing the query, > breaking any code depending on other schemas existing in the search_path. I'm not sure if there is a misunderstanding. In my last example (e-mail dated Fri, 3 Jan 2025 13:53:32 +0100), the user who has control over the contents of the "query_p" argument is an application programmer, not a real end-user. The function is also *not* marked as SECURITY DEFINER, so it always runs with the privileges of the caller. I don't see any specific security risk here, except that I'm unsure if the function is written properly with regard to qualification of the used types after PL/pgSQL's BEGIN. As I learned, I must fully-qualify types *before* the BEGIN, i.e. in the DECLARE section. But does this also hold for types after the BEGIN when I previously ensure that the search_path is correctly set (set within the function's body)? > > > > Anything that would be executed during pg_restore has to be made > > > safe. Therefore, code that is only ever executed by applications > > directly > > > can use swarch_path. > > > > Why should the function be executed during pg_restore? > > > If the function is used in building an index, or a materialized view, are > the common cases. Trigger functions too. > > Note, this is talking about evaluating functions generally, not the one > provided here specifically. I don't think my function would be evaluated during a pg_restore then. > > > I could do that, but I would like to understand if that is really > > necessary as it makes the interface more complicated, and I would like > > to avoid unnecessary complexity in my interface. > > > > Is it really impossible to have functions without SET search_path in > > the definition of a PL/pgSQL function if I fully-qualify all types in > > the DECLARE section and if all other non-qualified identifiers occur > > after set_config('search_path', ...)? > > > If you add a set_config to the body of the function then you indeed avoid > the problem. It is basically equivalent to adding a SET clause to the > create function command. In this case even when the function is executed > in a sanitized search_path environment (such as the one established by > pg_restore) you are not relying on it. That non-reliance is all that > really matters. > > David J. But if I use "SET search_path FROM CURRENT", then the called function won't know the search_path that is set at the caller's side (which is what I need to make my interface nice to use). I would prefer my current solution, but I would like to be sure that my example (the one in my e-mail dated Fri, 3 Jan 2025 13:53:32 +0100) is correct. I still am not sure about that. Kind Regards, Jan Behrens
Re: search_path for PL/pgSQL functions partially cached?
On Fri, 3 Jan 2025 13:56:02 -0800 Adrian Klaver wrote: > At this point I am lost as to what the overall goal of this is. > > Can you provide a 1 ft view if what it is you are trying to achieve? Sure! I would like to create a component (e.g. a PostgreSQL extension) that provides a function which processes some complex data, without making any requirements regarding where the data is stored. To pass this data to the function, I could use arrays of composite types, but that seems to be very bulky. Another option would be to use cursors, but that didn't turn out to work very smooth either. Instead, I plan to expect the function to receive a query string that will get the data that is being processed by the function. That query string should be allowed to refer to tables in the search_path at the caller's side. Therefore, I cannot use the "SET search_path FROM CURRENT" in my "CREATE FUNCTION" statement, because it would overwrite the current search_path on each call of the function. Thus my idea is to do this (simplified): CREATE FUNCTION "some_function" ("query_p" TEXT) RETURNS "some_type" -- I cannot use SET search_path FROM CURRENT here, because "query_p" -- shall refer to tables in the search_path of the caller. LANGUAGE plpgsql AS $$ DECLARE "old_search_path" TEXT; -- I have to fully qualify types in the DECLARE section. "some_variable" "some_schema"."some_type"; BEGIN SELECT current_setting('search_path') INTO "old_search_path"; PERFORM set_config( 'search_path', 'some_schema, pg_temp, ' || "old_search_path", TRUE ); -- Do I have to fully qualify types and operators from -- "myschema" here? Or is it safe to not fully qualify them? END; $$; That is my overall idea. My problem is that I'm confused about WHEN EXACTLY I have to qualify tables/types, etc. It is very hard to understand from reading (just) the documentation. Kind Regards, Jan Behrens
Re: search_path for PL/pgSQL functions partially cached?
On Sat, 4 Jan 2025 00:22:03 +0100 Jan Behrens wrote: > > -- Do I have to fully qualify types and operators from > -- "myschema" here? Or is it safe to not fully qualify them? > This was meant to read: ...from "some_schema" here.
Re: search_path for PL/pgSQL functions partially cached?
On Fri, 3 Jan 2025 at 18:22, Jan Behrens wrote: > Instead, I plan to expect the function to receive a query string that > will get the data that is being processed by the function. > > That query string should be allowed to refer to tables in the > search_path at the caller's side. > > Therefore, I cannot use the "SET search_path FROM CURRENT" in my > "CREATE FUNCTION" statement, because it would overwrite the current > search_path on each call of the function. > I wonder if it would help if EXECUTE took an optional search_path to use while executing the query.
Re: search_path for PL/pgSQL functions partially cached?
On 1/3/25 13:33, Jan Behrens wrote: On Fri, 3 Jan 2025 10:16:15 -0700 "David G. Johnston" wrote: But if I use "SET search_path FROM CURRENT", then the called function won't know the search_path that is set at the caller's side (which is what I need to make my interface nice to use). At this point I am lost as to what the overall goal of this is. Can you provide a 1 ft view if what it is you are trying to achieve? I would prefer my current solution, but I would like to be sure that my example (the one in my e-mail dated Fri, 3 Jan 2025 13:53:32 +0100) is correct. I still am not sure about that. Kind Regards, Jan Behrens -- Adrian Klaver adrian.kla...@aklaver.com
Re: search_path for PL/pgSQL functions partially cached?
On Fri, Jan 3, 2025 at 2:33 PM Jan Behrens wrote: > I would prefer my current solution, but I would like to be sure that my > example (the one in my e-mail dated Fri, 3 Jan 2025 13:53:32 +0100) is > correct. I still am not sure about that. > > If it does what you want and it is only ever executed by application code over a client connection you should be fine. Your client connection will always have whatever search_path you arrange to have in place and the application developer will know whether their environment is correct or not when they test it, and have the ability to change their environment as needed. David J.
Re: search_path for PL/pgSQL functions partially cached?
On Sat, 28 Dec 2024 00:40:09 +0100 Jan Behrens wrote: > > Add qualification or attach a “set search_path” clause to “create > > function”. Code stored in the server should not rely on the session > > search_path. > > > > David J. I have been trying to adjust some of my code, and I still have cases where I have to rely on the session's search_path. I'll provide an example below. > > [...] > > My question is: Am I safe if I use fully-qualified types in the DECLARE > section only? Or do I need to provide full qualification also in the > code below (after SET search_path TO 'myschema')? > > And bonus question: Is it documented somewhere? > > [...] > > Kind Regards > Jan Behrens The following code is taken from a project I'm currently working on: -- Let's assume we don't know the name of the schema in which the -- "pgratio" extension with the RATIONAL data type is installed. CREATE SCHEMA "qwertyuiop"; CREATE EXTENSION "pgratio" WITH SCHEMA "qwertyuiop"; -- This installs schema "myschema" with some dynamic function: BEGIN; CREATE SCHEMA "myschema"; SET LOCAL search_path TO "myschema"; -- Append schema of "pgratio" extension, which provides the RATIONAL -- data type, to search_path: SELECT set_config( 'search_path', current_setting('search_path') || ', ' || quote_ident(nspname), TRUE ) FROM pg_namespace, pg_extension WHERE pg_namespace.oid = extnamespace AND extname = 'pgratio'; CREATE DOMAIN "rational_wrapper" AS RATIONAL; CREATE FUNCTION "some_function" ("query_p" TEXT) RETURNS RATIONAL -- I cannot use SET search_path FROM CURRENT here, because "query_p" -- shall refer to tables in the search_path of the caller. LANGUAGE plpgsql AS $$ DECLARE "old_search_path" TEXT; -- I have to fully qualify the following type. -- Moreover, I can't use RATIONAL as I don't know its schema. "result" "myschema"."rational_wrapper"; BEGIN SELECT current_setting('search_path') INTO "old_search_path"; PERFORM set_config( 'search_path', 'myschema, ' || quote_ident(nspname) || ', pg_temp, ' || "old_search_path", TRUE ) FROM pg_namespace, pg_extension WHERE pg_namespace.oid = extnamespace AND extname = 'pgratio'; -- Is it safe to not fully qualify type RATIONAL below? -- And, if yes, where in the documentation is this explained? CREATE TEMP TABLE "mytemptable" ("val" RATIONAL); EXECUTE 'INSERT INTO "mytemptable" ' 'SELECT "query"."a" * "query"."b" ' 'FROM (' || "query_p" || ') AS "query"'; -- Do some things here. SELECT sum("val") INTO "result" FROM "mytemptable"; PERFORM set_config('search_path', "old_search_path", TRUE); RETURN "result"; END; $$; COMMIT; CREATE TABLE "tbl" ("foo" INT8, "bar" INT8); INSERT INTO "tbl" VALUES (5, 7), (1, 10); SELECT "myschema"."some_function"( 'SELECT "foo" AS "a", "bar" AS "b" FROM "tbl"' ); \c SELECT "myschema"."some_function"( 'SELECT "foo" AS "a", "bar" AS "b" FROM "tbl"' ); The code for the pgratio extension that provides the RATIONAL data type is found here: https://www.public-software-group.org/pgratio Running that code on my machine correctly gives: some_function --- 45 (1 row) You are now connected to database "jbe" as user "jbe". some_function --- 45 (1 row) Because extensions can only be installed in one schema, it may be a bad idea to have a component requiring an extension to be installed in a particular schema (because if different components have different expectations on the schema name, e.g. some might expect "pgratio" to be installed in "public" and others might expect it in "pgratio" or some other schema such as "qwertyuiop", this would lead to an unresolvable conflict). I would like to know if the above example is correct. It seems overall bulky, but I haven't found a better way, assuming that it can be unknown where a particular extension has been installed to. In particular I feel a bit insecure about where I have to fully qualify, and where not. See the comments in the code above. Note that I want the function to accept a query that makes sense in the caller's search_path. Thus using "SET search_path FROM CURRENT" is not an option for me, I believe. Regards, Jan Behrens
Re: search_path for PL/pgSQL functions partially cached?
On Fri, Jan 3, 2025 at 9:48 AM Jan Behrens wrote: > On Fri, 3 Jan 2025 08:34:57 -0700 > "David G. Johnston" wrote: > > > On Friday, January 3, 2025, Jan Behrens wrote: > > > > > > I would like to know if the above example is correct. It seems overall > > > bulky, but I haven't found a better way, assuming that it can be > > > unknown where a particular extension has been installed to. In > > > particular I feel a bit insecure about where I have to fully qualify, > > > and where not. See the comments in the code above. > > > > > > Short answer, you cannot looking at a definition and know the answer - > > whether the code is going to be executed in a sanitized search_path is > what > > matters. > > I don't understand. Do you mean my last example is wrong / insecure? > If so, why? > It is at risk because it depends on the session search_path. That is all. Whether that risk turns into a failure to execute depends on how/when it is executed. I'm not that comfortable talking about security risks in this context though the current design goal is to mitigate such security issues by setting things up so the function execution fails rather than is executed insecurely. This is presently mainly done by setting the search_path to just effectively pg_catalog before executing the query, breaking any code depending on other schemas existing in the search_path. > > Anything that would be executed during pg_restore has to be made > > safe. Therefore, code that is only ever executed by applications > directly > > can use swarch_path. > > Why should the function be executed during pg_restore? If the function is used in building an index, or a materialized view, are the common cases. Trigger functions too. Note, this is talking about evaluating functions generally, not the one provided here specifically. > I could do that, but I would like to understand if that is really > necessary as it makes the interface more complicated, and I would like > to avoid unnecessary complexity in my interface. > > Is it really impossible to have functions without SET search_path in > the definition of a PL/pgSQL function if I fully-qualify all types in > the DECLARE section and if all other non-qualified identifiers occur > after set_config('search_path', ...)? > > If you add a set_config to the body of the function then you indeed avoid the problem. It is basically equivalent to adding a SET clause to the create function command. In this case even when the function is executed in a sanitized search_path environment (such as the one established by pg_restore) you are not relying on it. That non-reliance is all that really matters. David J.
Re: search_path for PL/pgSQL functions partially cached?
On Friday, January 3, 2025, Jan Behrens wrote: > > I would like to know if the above example is correct. It seems overall > bulky, but I haven't found a better way, assuming that it can be > unknown where a particular extension has been installed to. In > particular I feel a bit insecure about where I have to fully qualify, > and where not. See the comments in the code above. Short answer, you cannot looking at a definition and know the answer - whether the code is going to be executed in a sanitized search_path is what matters. Anything that would be executed during pg_restore has to be made safe. Therefore, code that is only ever executed by applications directly can use swarch_path. I’d probably modify the function signature to take search_path as a second optional argument and then invoke a set search_path within the function. At worse the caller can place current_setting(search_path) as the value of that argument though being explicit would be recommended. David J.