Are PostgreSQL functions that return sets or tables evaluated lazily or eagerly?

2020-01-03 Thread Gerald Britton
Cross post from Stack Overflow:
https://stackoverflow.com/questions/59554144/are-postgresql-functions-that-return-sets-or-tables-evaluated-lazily-or-eagerly


I'm learning to write functions in PostgreSQL. When I got to the
documentation chapter on cursors, I came across this interesting comment:

A more interesting usage is to return a reference to a cursor that a
function has created, allowing the caller to read the rows. This provides
an efficient way to return large row sets from functions.

Near the top of this page: 42.7. Cursors
<https://www.postgresql.org/docs/current/plpgsql-cursors.html>

That made me wonder where, specifically, this would be more efficient than
a plain old function call.

I made up a little test function and call:

CREATE FUNCTION foo() RETURNS SETOF customers
LANGUAGE SQL AS $$
SELECT c.* FROM customers c
CROSS JOIN customers x
CROSS JOIN customers y;$$;
SELECT * FROM foo() LIMIT 1;

The customers table I'm working with has 20,000 rows so with the cross
joins that should be 8e+12 rows (which would take a while to fully
evaluate!). The select statement at the end appears to confirm that the
function is reading all rows (I had to cancel it after several seconds --
way more than to just return the first row)

That leads me to ask:

If (and under what circumstances) PostgreSQL evaluates functions lazily
(returning rows as requested by the caller) or eagerly (evaluation all rows
before returning the first one)?

-- 
Gerald Britton, MCSE-DP, MVP
LinkedIn Profile: http://ca.linkedin.com/in/geraldbritton


Re: Are PostgreSQL functions that return sets or tables evaluated lazily or eagerly?

2020-01-05 Thread Gerald Britton
Thank you all for the detailed explanations.  I think the most
disappointing is this bit:
>
>
> 2. Table function called in the FROM clause
> Table functions in the FROM clause, e.g. SELECT ... FROM myfunc();
> are always evaluated eagerly.


Which more or less matches my toy example.  OTOH Tom mentioned that marking
the function STABLE effectively makes it lazy since it is inlined and my
testing confirms that for SQL language functions.  Alas not for PL/pgSQL
functions, though.

Back to where I started in my top post:  I became interested in this due to
the doc note on returning a cursor and that it can be an efficient way to
handle large result sets.  I suppose that implies lazy evaluation.  Does
that mean that if I need plpgsql for a function for he language's power yet
want the results to be returned lazily, a cursor is the (only?) way to go?

On Fri, Jan 3, 2020 at 10:07 PM Andrew Gierth 
wrote:

> >>>>> "Gerald" == Gerald Britton  writes:
>
>  Gerald> That leads me to ask:
>
>  Gerald> If (and under what circumstances) PostgreSQL evaluates
>  Gerald> functions lazily (returning rows as requested by the caller) or
>  Gerald> eagerly (evaluation all rows before returning the first one)?
>
> This isn't trivial to answer because it depends significantly on the
> language the function is written in and how it was called. The first
> matching rule below controls what happens.
>
>
> 1. LANGUAGE SQL with inlining
>
> Table functions in language SQL are candidates for inlining, see
> https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions
>
> If an SQL-language function is inlined, then it behaves exactly as
> though the function body was written inline, which means it is evaluated
> as lazily as the query plan otherwise permits (for example, if there's a
> non-indexable ORDER BY clause, then clearly all the values have to be
> fetched before any are returned).
>
>
> 2. Table function called in the FROM clause
>
> Table functions in the FROM clause, e.g. SELECT ... FROM myfunc();
> are always evaluated eagerly.
>
>
> 3. LANGUAGE SQL without inlining, in the select-list
>
> If the final statement of an SQL function is a plain select with no
> wCTEs, then it is evaluated lazily: the first fetch will execute
> everything up to the first row of the final select, and subsequently one
> row will be fetched at a time. If the final statement is a DML statement
> with a RETURNING clause, or contains wCTEs, then it is evaluated
> eagerly.
>
>
> 4. LANGUAGE C / INTERNAL
>
> C-language functions (and therefore internal functions too) can choose
> whether to use value-per-call mode or materialize mode. Materialize mode
> is always "eager", but value-per-call mode is sometimes still eager (as
> in case 2 above); it can only be lazy if no preceding rule forced it to
> be otherwise.
>
> Most built-in table functions use value-per-call mode (a notable
> exception being the functions in the tablefunc module).
>
>
> 5. LANGUAGE PLPGSQL, PLPERL, PLTCL
>
> Plpgsql, plperl, and pltcl functions are always evaluated eagerly.
>
>
> 6. LANGUAGE plpython
>
> Plpython functions that return an iterator run in value-per-call mode,
> with a "next" call on the iterator for each row. To what extent this is
> a lazy or eager evaluation depends on the python code.
>
>
> 7. Other PL languages
>
> For non-core PL languages the documentation or source code may indicate
> whether the language uses materialize mode or value-per-call mode. (Most
> languages are probably not well equipped to do value-per-call mode. One
> that does allow it is pl/lua, which runs table functions as coroutines.)
>
>
> --
> Andrew (irc:RhodiumToad)
>


-- 
Gerald Britton, MCSE-DP, MVP
LinkedIn Profile: http://ca.linkedin.com/in/geraldbritton


Determine actual type of a pseudo-type argument

2020-01-05 Thread Gerald Britton
If I use a pseudo-type in the argument list of a function definition (SQL
or plpgsql), how can I determine the actual type used a runtime?

-- 
Gerald Britton, MCSE-DP, MVP
LinkedIn Profile: http://ca.linkedin.com/in/geraldbritton


Re: Determine actual type of a pseudo-type argument

2020-01-05 Thread Gerald Britton
E.g. anyelement, anyenum, anyarray are in a table of pseudo types in the
official docs.

I'll try pg_typeof...looks like a possibility.

On Sun, Jan 5, 2020, 5:37 PM Steve Baldwin  wrote:

> I'm not sure what you mean by pseudo-type, but does the pg_typeof()
> function help?
>
> Steve
>
> On Mon, Jan 6, 2020 at 9:34 AM Gerald Britton 
> wrote:
>
>> If I use a pseudo-type in the argument list of a function definition (SQL
>> or plpgsql), how can I determine the actual type used a runtime?
>>
>> --
>> Gerald Britton, MCSE-DP, MVP
>> LinkedIn Profile: http://ca.linkedin.com/in/geraldbritton
>>
>


Re: Are PostgreSQL functions that return sets or tables evaluated lazily or eagerly?

2020-01-19 Thread Gerald Britton
Love to see "return next" work like python yield! Anyone working on that?

On Sun, Jan 5, 2020, 5:46 PM Tom Lane  wrote:

> Gerald Britton  writes:
> > Back to where I started in my top post:  I became interested in this due
> to
> > the doc note on returning a cursor and that it can be an efficient way to
> > handle large result sets.  I suppose that implies lazy evaluation.  Does
> > that mean that if I need plpgsql for a function for he language's power
> yet
> > want the results to be returned lazily, a cursor is the (only?) way to
> go?
>
> Nope.  The docs' reference to a cursor only suggests that if you can
> express the function's result as a single SQL query, then opening a
> cursor for that query and returning the cursor name will work.  But
> if you need plpgsql to express the computation, that's not a terribly
> helpful suggestion.
>
> If you'd like to see some actual movement on the missing feature about
> lazy evaluation in FROM, you could help test/review the pending patch
> about it:
>
> https://commitfest.postgresql.org/26/2372/
>
> However, that still is only half of the problem, because you also need
> a PL that is prepared to cooperate, which I don't believe plpgsql is.
> I think (might be wrong) that a plpython function using "yield" can
> be made to compute its results lazily.
>
> regards, tom lane
>