DDL tools for working with views in dependency order?

2021-11-29 Thread Wicher
Hello,

As part of analysis pipelines I'm dealing with stacks/trees of views — views 
selecting
from other views, often recombining it with table data — which at times undergo 
heavy
development. This often requires changing/discarding/renaming/… a view or many 
views
somewhere in the middle of the stack. Depending on whether the type signature 
of the view
is changing, a CREATE OR REPLACE to mutate views "in-place" may not be 
possible, so doing
refactoring often requires dropping & recreating views dependent on the mutated 
views.

Are there tools out there which allow for easily doing such work - rebuilding
parts of the dependency-tree of views as needed? I'm looking for something more 
convenient
than "put your view DDL statements in a carefully ordered & manually 
reference-checked
.sql file". I also want to have the state of the view tree outside of the 
database and
maintain/version it in concert with application code.

Furthermore, some of the views in the tree are materialized, and I'd like to be 
able to
issue some sort of "cascading refresh" at my convenience (so, not through a 
trigger), that
is, refresh all materialized views that directly or indirectly depend on data 
sourced from
certain tables (by-table resolution would be high enough, by-column resolution 
is not
necessary), and do so in dependency-order. Are there tools out there that allow 
me to do
such?

I'm currently working on my own spin on solving this mundane-sounding problem, 
and I'm
curious towards any existing approaches and any PostgreSQL-native primitives 
that may
be of aid. Please share your thoughts and comments!

Thanks and regards, Wicher




Re: How to reveal the codes of functions properly?

2021-11-30 Thread Wicher
On Tue, 30 Nov 2021 20:17:32 +
Shaozhong SHI  wrote:

> In what environment, that did not work.
> 
> 
> On Tuesday, 30 November 2021, Rob Sargent  wrote:
> 
> > On 11/30/21 9:30 AM, Shaozhong SHI wrote:
> >
> > Any one can shed the light on this?
> >
> > Regards,
> >
> > David
> >
> > \sf+ function_name
> >  

If you run that in a psql shell when psql has been passed `-E`, you'll see that 
under the
hood it runs:

SELECT n.nspname as "Schema",
  p.proname as "Name",
  pg_catalog.pg_get_function_result(p.oid) as "Result data type",
  pg_catalog.pg_get_function_arguments(p.oid) as "Argument data types",
 CASE p.prokind
  WHEN 'a' THEN 'agg'
  WHEN 'w' THEN 'window'
  WHEN 'p' THEN 'proc'
  ELSE 'func'
 END as "Type",
 CASE
  WHEN p.provolatile = 'i' THEN 'immutable'
  WHEN p.provolatile = 's' THEN 'stable'
  WHEN p.provolatile = 'v' THEN 'volatile'
 END as "Volatility",
 CASE
  WHEN p.proparallel = 'r' THEN 'restricted'
  WHEN p.proparallel = 's' THEN 'safe'
  WHEN p.proparallel = 'u' THEN 'unsafe'
 END as "Parallel",
 pg_catalog.pg_get_userbyid(p.proowner) as "Owner",
 CASE WHEN prosecdef THEN 'definer' ELSE 'invoker' END AS "Security",
 pg_catalog.array_to_string(p.proacl, E'\n') AS "Access privileges",
 l.lanname as "Language",
 p.prosrc as "Source code",
 pg_catalog.obj_description(p.oid, 'pg_proc') as "Description"
FROM pg_catalog.pg_proc p
 LEFT JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace
 LEFT JOIN pg_catalog.pg_language l ON l.oid = p.prolang
WHERE pg_catalog.pg_function_is_visible(p.oid)
  AND n.nspname <> 'pg_catalog'
  AND n.nspname <> 'information_schema'
ORDER BY 1, 2, 4;




Re: Advice on using materialized views

2021-12-07 Thread Wicher
On Mon, 06 Dec 2021 18:48:47 +
"Phil Endecott"  wrote:

> and 
> I need to modify the definition of a view that filters the "new" 
> values from the raw table each time the materialised view is 
> refreshed.

You won't necessarily need to rewrite the "recent data" view definitions, I 
think. What is
deemed "recent" depends on what's in the materialized views (it'd be anything 
newer than
whatever is in there). The good news is that you can simply query for that :-)
So trivially, in your "the data that is more recent than the stuff from the 
materialized
views" non-materialized view you'd use a definition like
SELECT  WHERE sometimestamp > (select max(sometimestamp) from 
the_materialized_view)
or something along those lines.




Re: Advice on using materialized views

2021-12-12 Thread Wicher
On Thu, 09 Dec 2021 16:06:27 +
"Phil Endecott"  wrote:

> Thanks both for your replies.
> 
> Wicher wrote:
> > On Mon, 06 Dec 2021 18:48:47 +
> > "Phil Endecott"  wrote:  
> >> and 
> >> I need to modify the definition of a view that filters the "new" 
> >> values from the raw table each time the materialised view is 
> >> refreshed.  
> >
> > You won't necessarily need to rewrite the "recent data" view definitions, I 
> > think. What is
> > deemed "recent" depends on what's in the materialized views (it'd be 
> > anything newer than
> > whatever is in there). The good news is that you can simply query for that 
> > :-)
> > So trivially, in your "the data that is more recent than the stuff from the 
> > materialized
> > views" non-materialized view you'd use a definition like
> > SELECT  WHERE sometimestamp > (select max(sometimestamp) from 
> > the_materialized_view)
> > or something along those lines.  
> 
> 
> I guess I was hoping that someone would suggest a more "magic"
> way to do this sort of thing. Actually I'm a bit surprised that
> materialised views don't event have a way to either
> 
> - Refresh a materialised view whenever a source table is modified;
> 
> - Refresh a materialised view whenever it is read, if a source table
> has changed since it was last refreshed.
> 
> Beyond that, I could imagine smart updates where e.g. if you
> modify source table rows with primary key K, then you only need
> to refresh materialised view rows derived from K.
> 
> I think this could all be done on top of triggers. I wonder, do any
> other databases do things like this automagically?
> 

Not too long ago I asking the list something similar but came up short:

https://www.postgresql.org/message-id/20211129102315.058595fe@tipi

Here's my own take on that problem area, tangent to your question. This
project aims to do (or make it easier to do) the things you mention:

https://git.sr.ht/~nullenenenen/DBSamizdat

It supports your first use case out of the box, and may make your second use 
case easier
to accommodate, give it a go :-)


There's a sweet spot for materialized views. But at some point 
(volume/computational
load/freshness requirements) it becomes necessary to use tables instead so that 
you can
indeed implement efficient partial recalculation. As far as I know. I too am 
curious
about other approaches.