On Mon, Dec 5, 2022 at 9:48 PM Vik Fearing <v...@postgresfriends.org> wrote:
> On 12/6/22 05:22, David G. Johnston wrote: > > On Mon, Dec 5, 2022 at 8:46 PM Vik Fearing <v...@postgresfriends.org> > wrote: > > > >> On 12/5/22 18:56, David G. Johnston wrote: > >>> Also, maybe we should have any_value do something like compute a 50/50 > >>> chance that any new value seen replaces the existing chosen value, > >> instead > >>> of simply returning the first value all the time. Maybe even prohibit > >> the > >>> first value from being chosen so long as a second value appears. > >> > >> The spec says the result is implementation-dependent meaning we don't > >> even need to document how it is obtained, but surely behavior like this > >> would preclude future optimizations like the ones I mentioned? > >> > > > > So, given the fact that we don't actually want to name a function > > first_value (because some users are readily confused as to when the > concept > > of first is actually valid or not) but some users do actually wish for > this > > functionality - and you are proposing to implement it here anyway - how > > about we actually do document that we promise to return the first > non-null > > value encountered by the aggregate. We can then direct people to this > > function and just let them know to pretend the function is really named > > first_value in the case where they specify an order by. (last_value comes > > for basically free with descending sorting). > > I can imagine an optimization that would remove an ORDER BY clause > because it isn't needed for any other aggregate. I'm referring to the query: select any_value(v order by v) from (values (2),(1),(3)) as vals (v); // produces 1, per the documented implementation-defined behavior. Someone writing: select any_value(v) from (values (2),(1),(3)) as vals (v) order by v; Is not presently, nor am I saying, promised the value 1. I'm assuming you are thinking of the second query form, while the guarantee only needs to apply to the first. David J.