On Jan20, 2014, at 08:42 , David Rowley <dgrowle...@gmail.com> wrote: >> On Mon, Jan 20, 2014 at 2:45 PM, Florian Pflug <f...@phlo.org> wrote: >> * An assert that the frame end doesn't move backwards - I realized that >> it is after all easy to do that, if it's done after the loop which adds >> the new values, not before. > > I've applied this too, but I'm wondering why an elog for if the head moves > back, but an assert if the tail moves back?
When I put the frame head check in, I was concerned that the code might crash or loop endlessly if aggregatedbase was ever larger than frameheadpos, so I made it elog(), just for safety. The frame end check, OTOH, is done at the very end, so it doesn't protect against much, it just documents that it's not supposed to happen. But yeah, it's bit weird. Feel free to turn the frame end check into an elog() too if you want to. >> * I've also renamed INVFUNC to INVSFUNC. That's a pretty invasive change, and >> it's the last commit, so if you object to that, then you can merge up to >> eafa72330f23f7c970019156fcc26b18dd55be27 instead of >> de3d9148be9732c4870b76af96c309eaf1d613d7. > > > Seems like sfunc really should be tfunc then we could have invtfunc. I'd > probably > understand this better if I knew what the 's' was for in sfunc. I've not > applied > this just yet. Do you have a reason why you think it's better? My issue with just "invfunc" is mainly that it's too generic - it doesn't tell you what it's supposed to be the inverse of. I've always assumed that 's' in 'sfunc' and 'stype' stands for 'state', and that the naming is inspired by control theory, where the function which acts on the state space is often called S. > Thanks, yeah those really do need a review. I've lost a bit of direction with > them and I'm not quite sure just how much detail to go in to with it. I'd like > to explain a bit that users who need to use their numeric columns in window > aggregates might want to think about having a defined scale to the numeric > rather > than an undefined scale and explain that this is because the inverse > transition > function for numeric bails out if it loses the maximum seen dscale. Though it > does seem generally a good idea to have a defined scale, but then I guess > you've > got to have a bit of knowledge about the numbers you're storing in that case. > I'm not quite sure how to put that into words friendly enough for the > documents > just yet and or exactly where to put the words. So any ideas or patches you > have > around that would be great. Here's what I image the docs should look like, very roughly * In CREATE AGGREGATE, we should state the precise axioms we assume about forward and inverse transition functions. The last time I read the text there, it was a bit ambiguous about whether inverse transition functions assume commutativity, i.e. whether we assume that we can remove inputs other than the first one from the aggregation. Currently, all the inverse transition functions we have are, in fact, commutative, because all the forward transition function are also. But we could e.g. add an inverse to array_agg and string_agg, and those would obviously need this ordering guarantee. I'd also like us to explain that the "inverse" in "inverse transition function" shouldn't be taken too literally. For forward transition function F, inverse transition function G, and inputs a,b,... we *don't require that G(F(s,a),a) == s. We we do require is that if I is the initial state, then G(F(...F(F(I,a),b)...,z),a) == F(...F(I,b)...,z). (Well, actually we don't strictly require even that, the two states don't need to be identical, they just need to produce identical outputs when passed to the final function) * In CREATE AGGREGATE, we should also explain the NULL semantics you get with various combinations of strict and non-strict forward and inverse transition functions. I think a table listing all the combinations is in order there, with a column explaining the semantics you get. We should also clearly state that once you provide an inverse transition function, NULL isn't a valid state value anymore, except as an initial state, i.e. that the forward transition function must never return NULL in this case. * The window function page should explain the performance hazards when a moving frame head is involved. Ideally, we'd list which aggregates never have to restart, and which sometimes might, and what you can do to avoid that. We can also tell people to use EXPLAIN VERBOSE ANALYZE to check whether restarts are occurring. This is were we'd tell people to cast their numeric operands to a type with defined scale to avoid restarts, and were we'd state the SUM(float) *does* restart always due to precision loss issues. BTW, something that we haven't addressed at all is how inverse transition functions interact with what is called "ordered-set aggregates". I haven't wrapped my head around these fully, I think, so I'm still not sure if there's anything to do there or not. Can those even be used as window functions? Should we forbid ordered-set aggregates from specifying an inverse transition function? best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers