Re: [GENERAL] Poor performance when using a window function in a view

Chris Hanks Fri, 01 Mar 2013 14:54:49 -0800

On Fri, Mar 1, 2013 at 6:37 AM, Merlin Moncure <mmonc...@gmail.com> wrote:


> On Fri, Mar 1, 2013 at 3:59 AM, Chris Hanks
> <christopher.m.ha...@gmail.com> wrote:
> > On Fri, Mar 1, 2013 at 1:21 AM, Raymond O'Donnell <r...@iol.ie> wrote:
> >>
> >> On 01/03/2013 00:19, Chris Hanks wrote:
> >> > On Thu, Feb 28, 2013 at 6:00 AM, Tom Lane <t...@sss.pgh.pa.us
> >> > <mailto:t...@sss.pgh.pa.us>> wrote:
> >> >
> >> >     Chris Hanks <christopher.m.ha...@gmail.com
> >> >     <mailto:christopher.m.ha...@gmail.com>> writes:
> >> >     > create or replace view values_view as
> >> >     > select fkey1, fkey3,
> >> >     >   (derived1 / max(derived1) over (partition by fkey1)) as
> >> > derived1,
> >> >     >   (derived2 / sum(derived1) over (partition by fkey1)) as
> derived2
> >> >     > from (
> >> >     >   select fkey1, fkey3,
> >> >     >     cast(sum((case when (value > 0.0) then 4 else 1 end)) as
> >> > double
> >> >     > precision) as derived1,
> >> >     >     sum((case when (value > 0.0) then (value * 4) else (value +
> 1)
> >> >     end)) as
> >> >     > derived2
> >> >     >   from values
> >> >     >   group by fkey1, fkey3
> >> >     > ) as t1;
> >> >
> >> >     > -- This query requires a sequential scan on values, though all
> the
> >> >     data it
> >> >     > needs could be found much more efficiently with an index scan.
> >> >     > explain analyze select * from values_view where fkey1 = 1263;
> >> >
> >> >     To use the outer WHERE clause as an index constraint, postgres
> would
> >> >     have to prove that scanning only the rows with fkey1 = 1263 would
> >> > still
> >> >     find all the rows that would get examined by the window functions
> >> > ---
> >> >     and in this case, it's not only the window functions that make
> that
> >> > less
> >> >     than obvious, but the grouped aggregates in the sub-select below
> >> > them.
> >> >     There's not nearly that amount of intelligence in the system about
> >> >     window functions, as yet.  So you'll have to write out the query
> >> >     longhand and put the WHERE clause at the lower level, if you want
> >> > this
> >> >     optimization to happen.
> >> >
> >> >                             regards, tom lane
> >> >
> >> >
> >> > Ok, that makes sense, thanks.
> >> >
> >> > Can anyone point me to an example of wrapping a function in a view,
> like
> >> > Merlin suggested? I'm not sure how that would work.
> >>
> >> Off the top of my head, I'd imagine it's as simple as:
> >>
> >>   create view ... as
> >>   select * from my_function(...);
> >>
> >> :-)
> >>
> >> Ray.
> >>
> >>
> >> --
> >> Raymond O'Donnell :: Galway :: Ireland
> >> r...@iol.ie
> >
> >
> >
> > Sorry, I don't understand. I'm able to make a function that takes an
> integer
> > and uses it in the subselect as "WHERE fkey1 = arg", and that works as I
> > expect it to and it's plenty fast. But I don't see how to write a view to
> > take advantage of this function - what arguments would go in
> > my_function(...) when I'm declaring the view?
>
> First let's clearly state the problem - create some test data:
> postgres=# create table foo as select v1, v2 from (select
> generate_series(1,10) v1) q1, (select generate_series(1,10000) v2) q2;
> postgres=# create table bar as select v1 from generate_series(1,10) v1;
>
> foo is data table, bar defines the key around which we have the data.
>
> create an index:
> postgres=# create index on foo(v1);
>
> simple lookup is optimized obviously:
> postgres=# explain select * from foo where v1 = 7;
>                               QUERY PLAN
> ----------------------------------------------------------------------
>  Index Scan using foo_v1_idx on foo  (cost=0.00..4.38 rows=1 width=8)
>    Index Cond: (v1 = 50)
>
> get indexed lookup for simple window function:
> postgres=# explain select v1, sum(v2) over(partition by v1 order by
> v2) from foo where v1 = 7;
>                                     QUERY PLAN
>
> ----------------------------------------------------------------------------------
>  WindowAgg  (cost=4.39..4.41 rows=1 width=8)
>    ->  Sort  (cost=4.39..4.39 rows=1 width=8)
>          Sort Key: v2
>          ->  Index Scan using foo_v1_idx on foo  (cost=0.00..4.38
> rows=1 width=8)
>                Index Cond: (v1 = 50)
>
> push to subquery (essentially what view does) and we lose the optimization:
> postgres=# explain select * from (select v1, sum(v2) over(partition by
> v1 order by v2) from foo) q  where v1 = 7;
>                                  QUERY PLAN
>
> -----------------------------------------------------------------------------
>  Subquery Scan on q  (cost=9747.82..12997.82 rows=1 width=12)
>    Filter: (q.v1 = 50)
>    ->  WindowAgg  (cost=9747.82..11747.82 rows=100000 width=8)
>          ->  Sort  (cost=9747.82..9997.82 rows=100000 width=8)
>                Sort Key: foo.v1, foo.v2
>                ->  Seq Scan on foo  (cost=0.00..1443.00 rows=100000
> width=8)
>
> let's work around it!
> postgres=# CREATE OR REPLACE FUNCTION getfoovals(
>   v1 INOUT int,
>   sumv2 OUT BIGINT) RETURNS SETOF RECORD AS
> $$
>   select
>     v1,
>     sum(v2)
>     over(partition by v1 order by v2)
>   from foo where foo.v1 = getfoovals.v1;
> $$ LANGUAGE SQL STABLE;
>
> select * from getfoovals(7);
>  v1 |  sumv2
> ----+----------
>   7 |        1
>   7 |        3
>   7 |        6
>   7 |       10
>   7 |       15
>   7 |       21
> <snip>
>
> abstract to view:
> postgres=# CREATE OR REPLACE VIEW foovals AS
>   select
>     v1,
>     (getfoovals(v1)).sumv2
>   from bar;
>
> postgres=#  select * from foovals where v1 = 7;
>
> Main problem with this technique is awkwardness around using column
> lest set returning function if it returns > 1 column...if you'r not
> careful you can get extra invocations of function.   you work hack
> around this to some degree via:
> *) offset 0 hacks
> *) return type coersion hacks
> *) 9.3 LATERAL feature completely nails it
>
> Can't use WITH because it fences of the optimization.
>
> This (window function optimization fencing) is probably #1 performance
> gotcha I hit in everyday coding after LATERAL and lack of better
> ability to inline simple SQL functions.
>
> merlin
>




Hi - thanks everyone for your input. I bit the bullet and just made a
materialized view for now. I got Merlin's suggestion working, though - I'm
going to revisit it when 9.3 comes out and I can use LATERAL.

Thanks again!
Chris

Re: [GENERAL] Poor performance when using a window function in a view

Reply via email to