Re: search_path for PL/pgSQL functions partially cached?

2025-01-03 Thread Jan Behrens
On Fri, 3 Jan 2025 08:34:57 -0700
"David G. Johnston"  wrote:

> On Friday, January 3, 2025, Jan Behrens  wrote:
> >
> > I would like to know if the above example is correct. It seems overall
> > bulky, but I haven't found a better way, assuming that it can be
> > unknown where a particular extension has been installed to. In
> > particular I feel a bit insecure about where I have to fully qualify,
> > and where not. See the comments in the code above.
> 
> 
> Short answer, you cannot looking at a definition and know the answer -
> whether the code is going to be executed in a sanitized search_path is what
> matters.

I don't understand. Do you mean my last example is wrong / insecure?
If so, why?

> Anything that would be executed during pg_restore has to be made
> safe.  Therefore, code that is only ever executed by applications directly
> can use swarch_path.

Why should the function be executed during pg_restore?

> 
> I’d probably modify the function signature to take search_path as a second
> optional argument and then invoke a set search_path within the function.
> At worse the caller can place current_setting(search_path) as the value of
> that argument though being explicit would be recommended.
> 
> David J.

I could do that, but I would like to understand if that is really
necessary as it makes the interface more complicated, and I would like
to avoid unnecessary complexity in my interface.

Is it really impossible to have functions without SET search_path in
the definition of a PL/pgSQL function if I fully-qualify all types in
the DECLARE section and if all other non-qualified identifiers occur
after set_config('search_path', ...)?

Kind regards,
Jan Behrens




Re: search_path for PL/pgSQL functions partially cached?

2025-01-03 Thread Jan Behrens
On Fri, 3 Jan 2025 10:16:15 -0700
"David G. Johnston"  wrote:

> It is at risk because it depends on the session search_path.  That is all.
> Whether that risk turns into a failure to execute depends on how/when it is
> executed.  I'm not that comfortable talking about security risks in this
> context though the current design goal is to mitigate such security issues
> by setting things up so the function execution fails rather than is
> executed insecurely.  This is presently mainly done by setting the
> search_path to just effectively pg_catalog before executing the query,
> breaking any code depending on other schemas existing in the search_path.

I'm not sure if there is a misunderstanding. In my last example (e-mail
dated Fri, 3 Jan 2025 13:53:32 +0100), the user who has control over
the contents of the "query_p" argument is an application programmer,
not a real end-user. The function is also *not* marked as SECURITY
DEFINER, so it always runs with the privileges of the caller. I don't
see any specific security risk here, except that I'm unsure if the
function is written properly with regard to qualification of the used
types after PL/pgSQL's BEGIN. As I learned, I must fully-qualify types
*before* the BEGIN, i.e. in the DECLARE section. But does this also
hold for types after the BEGIN when I previously ensure that the
search_path is correctly set (set within the function's body)?

> 
> > > Anything that would be executed during pg_restore has to be made
> > > safe.  Therefore, code that is only ever executed by applications
> > directly
> > > can use swarch_path.
> >
> > Why should the function be executed during pg_restore?
> 
> 
> If the function is used in building an index, or a materialized view, are
> the common cases.  Trigger functions too.
> 
> Note, this is talking about evaluating functions generally, not the one
> provided here specifically.

I don't think my function would be evaluated during a pg_restore then.

> 
> > I could do that, but I would like to understand if that is really
> > necessary as it makes the interface more complicated, and I would like
> > to avoid unnecessary complexity in my interface.
> >
> > Is it really impossible to have functions without SET search_path in
> > the definition of a PL/pgSQL function if I fully-qualify all types in
> > the DECLARE section and if all other non-qualified identifiers occur
> > after set_config('search_path', ...)?
> >
> If you add a set_config to the body of the function then you indeed avoid
> the problem.  It is basically equivalent to adding a SET clause to the
> create function command.  In this case even when the function is executed
> in a sanitized search_path environment (such as the one established by
> pg_restore) you are not relying on it.  That non-reliance is all that
> really matters.
> 
> David J.

But if I use "SET search_path FROM CURRENT", then the called function
won't know the search_path that is set at the caller's side (which is
what I need to make my interface nice to use).

I would prefer my current solution, but I would like to be sure that my
example (the one in my e-mail dated Fri, 3 Jan 2025 13:53:32 +0100) is
correct. I still am not sure about that.

Kind Regards,
Jan Behrens




Re: search_path for PL/pgSQL functions partially cached?

2025-01-03 Thread Jan Behrens
On Fri, 3 Jan 2025 13:56:02 -0800
Adrian Klaver  wrote:

> At this point I am lost as to what the overall goal of this is.
> 
> Can you provide a 1 ft view if what it is you are trying to achieve?

Sure! I would like to create a component (e.g. a PostgreSQL extension)
that provides a function which processes some complex data, without
making any requirements regarding where the data is stored. To pass
this data to the function, I could use arrays of composite types, but
that seems to be very bulky. Another option would be to use cursors,
but that didn't turn out to work very smooth either.

Instead, I plan to expect the function to receive a query string that
will get the data that is being processed by the function.

That query string should be allowed to refer to tables in the
search_path at the caller's side.

Therefore, I cannot use the "SET search_path FROM CURRENT" in my
"CREATE FUNCTION" statement, because it would overwrite the current
search_path on each call of the function.

Thus my idea is to do this (simplified):

CREATE FUNCTION "some_function" ("query_p" TEXT) RETURNS "some_type"

  
  -- I cannot use SET search_path FROM CURRENT here, because "query_p"
  -- shall refer to tables in the search_path of the caller.
  

  LANGUAGE plpgsql AS $$
DECLARE
  "old_search_path" TEXT;

  
  -- I have to fully qualify types in the DECLARE section.
  

  "some_variable" "some_schema"."some_type";
BEGIN
  SELECT current_setting('search_path') INTO "old_search_path";
  PERFORM set_config(
'search_path',
'some_schema, pg_temp, ' || "old_search_path",
TRUE
  );

  
  -- Do I have to fully qualify types and operators from
  -- "myschema" here? Or is it safe to not fully qualify them?
  
END;
  $$;

That is my overall idea.

My problem is that I'm confused about WHEN EXACTLY I have to qualify
tables/types, etc. It is very hard to understand from reading (just) the
documentation.

Kind Regards,
Jan Behrens




Re: search_path for PL/pgSQL functions partially cached?

2025-01-03 Thread Jan Behrens
On Sat, 4 Jan 2025 00:22:03 +0100
Jan Behrens  wrote:

>   
>   -- Do I have to fully qualify types and operators from
>   -- "myschema" here? Or is it safe to not fully qualify them?
>   

This was meant to read:

...from "some_schema" here.




Re: search_path for PL/pgSQL functions partially cached?

2025-01-03 Thread Isaac Morland
On Fri, 3 Jan 2025 at 18:22, Jan Behrens  wrote:


> Instead, I plan to expect the function to receive a query string that
> will get the data that is being processed by the function.
>
> That query string should be allowed to refer to tables in the
> search_path at the caller's side.
>
> Therefore, I cannot use the "SET search_path FROM CURRENT" in my
> "CREATE FUNCTION" statement, because it would overwrite the current
> search_path on each call of the function.
>

 I wonder if it would help if EXECUTE took an optional search_path to use
while executing the query.


Re: search_path for PL/pgSQL functions partially cached?

2025-01-03 Thread Adrian Klaver

On 1/3/25 13:33, Jan Behrens wrote:

On Fri, 3 Jan 2025 10:16:15 -0700
"David G. Johnston"  wrote:





But if I use "SET search_path FROM CURRENT", then the called function
won't know the search_path that is set at the caller's side (which is
what I need to make my interface nice to use).


At this point I am lost as to what the overall goal of this is.

Can you provide a 1 ft view if what it is you are trying to achieve?




I would prefer my current solution, but I would like to be sure that my
example (the one in my e-mail dated Fri, 3 Jan 2025 13:53:32 +0100) is
correct. I still am not sure about that.

Kind Regards,
Jan Behrens




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: search_path for PL/pgSQL functions partially cached?

2025-01-03 Thread David G. Johnston
On Fri, Jan 3, 2025 at 2:33 PM Jan Behrens  wrote:

> I would prefer my current solution, but I would like to be sure that my
> example (the one in my e-mail dated Fri, 3 Jan 2025 13:53:32 +0100) is
> correct. I still am not sure about that.
>
>
If it does what you want and it is only ever executed by application code
over a client connection you should be fine.  Your client connection will
always have whatever search_path you arrange to have in place and the
application developer will know whether their environment is correct or not
when they test it, and have the ability to change their environment as
needed.

David J.


Re: search_path for PL/pgSQL functions partially cached?

2025-01-03 Thread Jan Behrens
On Sat, 28 Dec 2024 00:40:09 +0100
Jan Behrens  wrote:

> > Add qualification or attach a “set search_path” clause to “create
> > function”.  Code stored in the server should not rely on the session
> > search_path.
> > 
> > David J.

I have been trying to adjust some of my code, and I still have cases
where I have to rely on the session's search_path. I'll provide an
example below.

> 
> [...]
> 
> My question is: Am I safe if I use fully-qualified types in the DECLARE
> section only? Or do I need to provide full qualification also in the
> code below (after SET search_path TO 'myschema')?
> 
> And bonus question: Is it documented somewhere?
>
> [...]
> 
> Kind Regards
> Jan Behrens

The following code is taken from a project I'm currently working on:



-- Let's assume we don't know the name of the schema in which the
-- "pgratio" extension with the RATIONAL data type is installed.
CREATE SCHEMA "qwertyuiop";
CREATE EXTENSION "pgratio" WITH SCHEMA "qwertyuiop";

-- This installs schema "myschema" with some dynamic function:
BEGIN;

CREATE SCHEMA "myschema";
SET LOCAL search_path TO "myschema";

-- Append schema of "pgratio" extension, which provides the RATIONAL
-- data type, to search_path:
SELECT set_config(
  'search_path',
  current_setting('search_path') || ', ' || quote_ident(nspname),
  TRUE
) FROM pg_namespace, pg_extension
  WHERE pg_namespace.oid = extnamespace AND extname = 'pgratio';

CREATE DOMAIN "rational_wrapper" AS RATIONAL;

CREATE FUNCTION "some_function" ("query_p" TEXT) RETURNS RATIONAL

  
  -- I cannot use SET search_path FROM CURRENT here, because "query_p"
  -- shall refer to tables in the search_path of the caller.
  

  LANGUAGE plpgsql AS $$
DECLARE
  "old_search_path" TEXT;

  
  -- I have to fully qualify the following type.
  -- Moreover, I can't use RATIONAL as I don't know its schema.
  

  "result" "myschema"."rational_wrapper";
BEGIN
  SELECT current_setting('search_path') INTO "old_search_path";
  PERFORM set_config(
'search_path',
'myschema, ' || quote_ident(nspname) || ', pg_temp, ' ||
"old_search_path",
TRUE
  ) FROM pg_namespace, pg_extension
WHERE pg_namespace.oid = extnamespace AND extname = 'pgratio';

  
  -- Is it safe to not fully qualify type RATIONAL below?
  -- And, if yes, where in the documentation is this explained?
  

  CREATE TEMP TABLE "mytemptable" ("val" RATIONAL);
  EXECUTE 'INSERT INTO "mytemptable" '
'SELECT "query"."a" * "query"."b" '
'FROM (' || "query_p" || ') AS "query"';
  -- Do some things here.
  SELECT sum("val") INTO "result" FROM "mytemptable";
  PERFORM set_config('search_path', "old_search_path", TRUE);
  RETURN "result";
END;
  $$;

COMMIT;

CREATE TABLE "tbl" ("foo" INT8, "bar" INT8);
INSERT INTO "tbl" VALUES (5, 7), (1, 10);

SELECT "myschema"."some_function"(
  'SELECT "foo" AS "a", "bar" AS "b" FROM "tbl"'
);
\c
SELECT "myschema"."some_function"(
  'SELECT "foo" AS "a", "bar" AS "b" FROM "tbl"'
);



The code for the pgratio extension that provides the RATIONAL data type
is found here: https://www.public-software-group.org/pgratio

Running that code on my machine correctly gives:

some_function 
---
 45
(1 row)

You are now connected to database "jbe" as user "jbe".
 some_function 
---
 45
(1 row)

Because extensions can only be installed in one schema, it may be a bad
idea to have a component requiring an extension to be installed in a
particular schema (because if different components have different
expectations on the schema name, e.g. some might expect "pgratio" to be
installed in "public" and others might expect it in "pgratio" or some
other schema such as "qwertyuiop", this would lead to an unresolvable
conflict).

I would like to know if the above example is correct. It seems overall
bulky, but I haven't found a better way, assuming that it can be
unknown where a particular extension has been installed to. In
particular I feel a bit insecure about where I have to fully qualify,
and where not. See the comments in the code above.

Note that I want the function to accept a query that makes sense in the
caller's search_path. Thus using "SET search_path FROM CURRENT" is not
an option for me, I believe.

Regards,
Jan Behrens




Re: search_path for PL/pgSQL functions partially cached?

2025-01-03 Thread David G. Johnston
On Fri, Jan 3, 2025 at 9:48 AM Jan Behrens  wrote:

> On Fri, 3 Jan 2025 08:34:57 -0700
> "David G. Johnston"  wrote:
>
> > On Friday, January 3, 2025, Jan Behrens  wrote:
> > >
> > > I would like to know if the above example is correct. It seems overall
> > > bulky, but I haven't found a better way, assuming that it can be
> > > unknown where a particular extension has been installed to. In
> > > particular I feel a bit insecure about where I have to fully qualify,
> > > and where not. See the comments in the code above.
> >
> >
> > Short answer, you cannot looking at a definition and know the answer -
> > whether the code is going to be executed in a sanitized search_path is
> what
> > matters.
>
> I don't understand. Do you mean my last example is wrong / insecure?
> If so, why?
>

It is at risk because it depends on the session search_path.  That is all.
Whether that risk turns into a failure to execute depends on how/when it is
executed.  I'm not that comfortable talking about security risks in this
context though the current design goal is to mitigate such security issues
by setting things up so the function execution fails rather than is
executed insecurely.  This is presently mainly done by setting the
search_path to just effectively pg_catalog before executing the query,
breaking any code depending on other schemas existing in the search_path.


> > Anything that would be executed during pg_restore has to be made
> > safe.  Therefore, code that is only ever executed by applications
> directly
> > can use swarch_path.
>
> Why should the function be executed during pg_restore?


If the function is used in building an index, or a materialized view, are
the common cases.  Trigger functions too.

Note, this is talking about evaluating functions generally, not the one
provided here specifically.




> I could do that, but I would like to understand if that is really
> necessary as it makes the interface more complicated, and I would like
> to avoid unnecessary complexity in my interface.
>
> Is it really impossible to have functions without SET search_path in
> the definition of a PL/pgSQL function if I fully-qualify all types in
> the DECLARE section and if all other non-qualified identifiers occur
> after set_config('search_path', ...)?
>
> If you add a set_config to the body of the function then you indeed avoid
the problem.  It is basically equivalent to adding a SET clause to the
create function command.  In this case even when the function is executed
in a sanitized search_path environment (such as the one established by
pg_restore) you are not relying on it.  That non-reliance is all that
really matters.

David J.


Re: search_path for PL/pgSQL functions partially cached?

2025-01-03 Thread David G. Johnston
On Friday, January 3, 2025, Jan Behrens  wrote:
>
> I would like to know if the above example is correct. It seems overall
> bulky, but I haven't found a better way, assuming that it can be
> unknown where a particular extension has been installed to. In
> particular I feel a bit insecure about where I have to fully qualify,
> and where not. See the comments in the code above.


Short answer, you cannot looking at a definition and know the answer -
whether the code is going to be executed in a sanitized search_path is what
matters.  Anything that would be executed during pg_restore has to be made
safe.  Therefore, code that is only ever executed by applications directly
can use swarch_path.

I’d probably modify the function signature to take search_path as a second
optional argument and then invoke a set search_path within the function.
At worse the caller can place current_setting(search_path) as the value of
that argument though being explicit would be recommended.

David J.