On Fri, Mar 1, 2013 at 6:37 AM, Merlin Moncure <mmonc...@gmail.com> wrote:
> On Fri, Mar 1, 2013 at 3:59 AM, Chris Hanks > <christopher.m.ha...@gmail.com> wrote: > > On Fri, Mar 1, 2013 at 1:21 AM, Raymond O'Donnell <r...@iol.ie> wrote: > >> > >> On 01/03/2013 00:19, Chris Hanks wrote: > >> > On Thu, Feb 28, 2013 at 6:00 AM, Tom Lane <t...@sss.pgh.pa.us > >> > <mailto:t...@sss.pgh.pa.us>> wrote: > >> > > >> > Chris Hanks <christopher.m.ha...@gmail.com > >> > <mailto:christopher.m.ha...@gmail.com>> writes: > >> > > create or replace view values_view as > >> > > select fkey1, fkey3, > >> > > (derived1 / max(derived1) over (partition by fkey1)) as > >> > derived1, > >> > > (derived2 / sum(derived1) over (partition by fkey1)) as > derived2 > >> > > from ( > >> > > select fkey1, fkey3, > >> > > cast(sum((case when (value > 0.0) then 4 else 1 end)) as > >> > double > >> > > precision) as derived1, > >> > > sum((case when (value > 0.0) then (value * 4) else (value + > 1) > >> > end)) as > >> > > derived2 > >> > > from values > >> > > group by fkey1, fkey3 > >> > > ) as t1; > >> > > >> > > -- This query requires a sequential scan on values, though all > the > >> > data it > >> > > needs could be found much more efficiently with an index scan. > >> > > explain analyze select * from values_view where fkey1 = 1263; > >> > > >> > To use the outer WHERE clause as an index constraint, postgres > would > >> > have to prove that scanning only the rows with fkey1 = 1263 would > >> > still > >> > find all the rows that would get examined by the window functions > >> > --- > >> > and in this case, it's not only the window functions that make > that > >> > less > >> > than obvious, but the grouped aggregates in the sub-select below > >> > them. > >> > There's not nearly that amount of intelligence in the system about > >> > window functions, as yet. So you'll have to write out the query > >> > longhand and put the WHERE clause at the lower level, if you want > >> > this > >> > optimization to happen. > >> > > >> > regards, tom lane > >> > > >> > > >> > Ok, that makes sense, thanks. > >> > > >> > Can anyone point me to an example of wrapping a function in a view, > like > >> > Merlin suggested? I'm not sure how that would work. > >> > >> Off the top of my head, I'd imagine it's as simple as: > >> > >> create view ... as > >> select * from my_function(...); > >> > >> :-) > >> > >> Ray. > >> > >> > >> -- > >> Raymond O'Donnell :: Galway :: Ireland > >> r...@iol.ie > > > > > > > > Sorry, I don't understand. I'm able to make a function that takes an > integer > > and uses it in the subselect as "WHERE fkey1 = arg", and that works as I > > expect it to and it's plenty fast. But I don't see how to write a view to > > take advantage of this function - what arguments would go in > > my_function(...) when I'm declaring the view? > > First let's clearly state the problem - create some test data: > postgres=# create table foo as select v1, v2 from (select > generate_series(1,10) v1) q1, (select generate_series(1,10000) v2) q2; > postgres=# create table bar as select v1 from generate_series(1,10) v1; > > foo is data table, bar defines the key around which we have the data. > > create an index: > postgres=# create index on foo(v1); > > simple lookup is optimized obviously: > postgres=# explain select * from foo where v1 = 7; > QUERY PLAN > ---------------------------------------------------------------------- > Index Scan using foo_v1_idx on foo (cost=0.00..4.38 rows=1 width=8) > Index Cond: (v1 = 50) > > get indexed lookup for simple window function: > postgres=# explain select v1, sum(v2) over(partition by v1 order by > v2) from foo where v1 = 7; > QUERY PLAN > > ---------------------------------------------------------------------------------- > WindowAgg (cost=4.39..4.41 rows=1 width=8) > -> Sort (cost=4.39..4.39 rows=1 width=8) > Sort Key: v2 > -> Index Scan using foo_v1_idx on foo (cost=0.00..4.38 > rows=1 width=8) > Index Cond: (v1 = 50) > > push to subquery (essentially what view does) and we lose the optimization: > postgres=# explain select * from (select v1, sum(v2) over(partition by > v1 order by v2) from foo) q where v1 = 7; > QUERY PLAN > > ----------------------------------------------------------------------------- > Subquery Scan on q (cost=9747.82..12997.82 rows=1 width=12) > Filter: (q.v1 = 50) > -> WindowAgg (cost=9747.82..11747.82 rows=100000 width=8) > -> Sort (cost=9747.82..9997.82 rows=100000 width=8) > Sort Key: foo.v1, foo.v2 > -> Seq Scan on foo (cost=0.00..1443.00 rows=100000 > width=8) > > let's work around it! > postgres=# CREATE OR REPLACE FUNCTION getfoovals( > v1 INOUT int, > sumv2 OUT BIGINT) RETURNS SETOF RECORD AS > $$ > select > v1, > sum(v2) > over(partition by v1 order by v2) > from foo where foo.v1 = getfoovals.v1; > $$ LANGUAGE SQL STABLE; > > select * from getfoovals(7); > v1 | sumv2 > ----+---------- > 7 | 1 > 7 | 3 > 7 | 6 > 7 | 10 > 7 | 15 > 7 | 21 > <snip> > > abstract to view: > postgres=# CREATE OR REPLACE VIEW foovals AS > select > v1, > (getfoovals(v1)).sumv2 > from bar; > > postgres=# select * from foovals where v1 = 7; > > Main problem with this technique is awkwardness around using column > lest set returning function if it returns > 1 column...if you'r not > careful you can get extra invocations of function. you work hack > around this to some degree via: > *) offset 0 hacks > *) return type coersion hacks > *) 9.3 LATERAL feature completely nails it > > Can't use WITH because it fences of the optimization. > > This (window function optimization fencing) is probably #1 performance > gotcha I hit in everyday coding after LATERAL and lack of better > ability to inline simple SQL functions. > > merlin > Hi - thanks everyone for your input. I bit the bullet and just made a materialized view for now. I got Merlin's suggestion working, though - I'm going to revisit it when 9.3 comes out and I can use LATERAL. Thanks again! Chris