Re: Is the PL/pgSQL refcursor useful in a modern three-tier app?

Bryn Llewellyn Wed, 22 Mar 2023 12:09:25 -0700

> laurenz.a...@cybertec.at wrote:
> 
> ...I understand that you ask questions to gain deeper understanding.
> 
>> b...@yugabyte.com wrote:
>> 
>> ...I had never come across use cases where [scrollability] was beneficial. I 
>> wanted, therefore, to hear about some. I thought that insights here would 
>> help me understand the mechanics.
> 
> I recently used cursor scrollability, so I can show you a use case:
> 
> github.com/cybertec-postgresql/db_migrator/blob/master/db_migrator--1.0.0.sql#L49
> 
> The goal is to get the query result count right away, without having to run a 
> second query for it: you declare the cursor, move to the end of the result 
> set, fetch the ROW_COUNT, then move back to the beginning of the result set 
> and start fetching the result rows.
> 
> ...I personally find that reading the PostgreSQL documentation gets you far, 
> but only so far: for deep understanding, you have to read the code. It is 
> usually well
> documented and readable, and I have come to see it as an extension of the 
> documentation that covers the details.


Thanks for the link to your SQL file at the line where you get the row count in 
the way that you describe. I saw that this is in the PL/pgSQL source text for 
function "materialize_foreign_table()" (~200 lines). And I saw that you use the 
cursor mechanism that we're discussing here in only one other function, 
"db_migrate_refresh()" (~480 lines). But this second one doesn't use the "move 
forward all" device to get a row count. I looked only at these two functions.

I noted that neither of these functions has a refcursor formal argument and 
that, rather, you open (i.e. create) each of the three cursors that you use 
within the two functions that uses them. I noted, too,  that for the three 
"select" statements that you use to open your refcursors, none of these has an 
"order by". I noted that your code implements "create table destination" and 
"insert into destination... select from source..." where order doesn't matter. 

However, source code famously reveals only what it does and not what the 
author's intention, and overall design philosophy, is. I looked at the README 
accounts for these two functions here:

github.com/cybertec-postgresql/db_migrator#materialize_foreign_table
github.com/cybertec-postgresql/db_migrator#db_migrate_refresh

But these accounts say nothing about the design of their implementations. The 
accounts, and the bigger picture, seem to imply that read consistency in the 
presence of concurrent activity from other sessions is not a concern. I'm 
guessing that this is accounted for upstream from how the code that I'm looking 
at operates—i.e. that the source database is extracted to staging tables like, 
say, export does so that your code operates as the only session that reads 
from, creates, and populates the tables that it references.

The upshot, therefore, is that I'm afraid that I can only guess at why you use 
"open, fetch, close" on a refcursor rather than an ordinary cursor for loop. 
After all, you use the technique only to traverse metadata tables about 
partitions, subpartitions, and columns. I'd be astonished if such tables have 
humongous numbers of rows (like hundreds of millions).

As a sanity test, I did this:

create function s.t_count()
  returns int
  set search_path = pg_catalog, pg_temp
  language plpgsql
as $body$
declare
  cur refcursor := 'cur';
  n int not null := 0;
begin
  open cur for select v from s.t;
  move forward all in cur;
  get diagnostics n = row_count;
  move absolute 0 in cur;
  return n;
end;
$body$;

Then I timed these two alternatives on a test table with ten million rows.

select count(*) from s.t;
select s.t_count();

They both got the same answer. But my function took about twice as long as the 
native count(*). Mind you, with "only" 10 million rows (and using a PG cluster 
in a VM on my Mac Book) the times were tiny: ~600ms versus ~300ms.

Then I tried this:

create procedure s.p(tab in text, mode in text)
  set search_path = pg_catalog, pg_temp
  security definer
  language plpgsql
as $body$
declare
  cur       constant refcursor not null := 'cur';
  stmt      constant text      not null := format('select k, v from s.%I',     
tab);
  cnt_stmt  constant text      not null := format('select count(*) from s.%I', 
tab);

  kk   int not null := 0;
  vv   int not null := 0;
  k    int;
  v    int;
  n    int not null := 0;
  cnt  int not null := 0;
begin
  case mode
    when 'naive' then
      execute cnt_stmt into cnt;
      for kk, vv in execute stmt loop
        n := n + 1;
        assert kk = n and vv = n*2;
      end loop;
      assert n = cnt;

    when 'refcursor' then
      open cur for execute stmt;
      move forward all in cur;
      get diagnostics cnt = row_count;
      move absolute 0 in cur;

      loop
        fetch cur into k, v;
        exit when not found;
        n := n + 1;
        assert k = n and v = n*2;
      end loop;
      close cur; -- Just as a formality.
      assert n = cnt;
  end case;
end;
$body$

I created my 10 million row table like this:

with g(v) as (select generate_series(1, 10*1000*1000))
insert into s.t(k, v) select g.v, (g.v)*2 from g;

And I timed each mode a few times to get a reasonable average. The naive mode 
took ~7.5 sec; and the refcursor mode took ~13 sec. I don't s'pose that the 
difference matters much (and anyway, in your use case, all those "create table" 
and "insert select" statements would dominate the times). But it seems safe to 
say that the refcursor mode doesn't _help_ performance. Moreover, because the 
naive mode uses the cursor mechanism under the covers, the memory consumption 
discussion is the same for both approaches.

I appreciate that, in the face of concurrent sessions making changes to the 
content of "s.t", and using the default "read committed" isolation level, the 
before-the-fact separate count(*) might disagree with the after-the-fact value 
from counting the rows that are traversed. And you need the count at the start 
of the loop so that you can create the right number of (sub)partitions. So I do 
see (because other considerations, including code complexity, are so similar) 
that the refcursor mode is more stylish. Maybe that's it.

Certainly, it was very useful to think about this and to try the tests that I 
did. So thank you very much for showing me your code.
————————————————————

Finally, I see how an understanding of internals helps the understanding of 
performance-related question. But I find it very hard to accept that I should 
read the C implementation of PostgreSQL in order to get the proper mental 
model, and the proper terms of art, that I need to understand it semantics (and 
the corresponding SQL and PL/pgSQL syntax). Having said that, I did a little 
test an saw that this:

move last in cur;
get diagnostics n = row_count;

reported just 1—in contrast to what you used:

move forward all in cur;

I've no idea how it's possible to navigate to the last result in the set 
without knowing how many there are. Maybe that fact is there internally—but 
with no explicit SQL or PL/pgSQL exposure of the value.

Re: Is the PL/pgSQL refcursor useful in a modern three-tier app?

Reply via email to