On Wed, Mar 13, 2024 at 7:52 AM Andrei Lepikhov <a.lepik...@postgrespro.ru> wrote:
> On 12/3/2024 22:20, Alexander Korotkov wrote:
> > On Mon, Mar 11, 2024 at 2:43 PM Andrei Lepikhov
> >> I think you are right. It is probably a better place than any other to
> >> remove duplicates in an array. I just think we should sort and remove
> >> duplicates from entry->consts in one pass. Thus, this optimisation
> >> should be applied to sortable constants.
> >
> > Ok.
> New version of the patch set implemented all we have agreed on for now.
> We can return MAX_SAOP_ARRAY_SIZE constraint and Alena's approach to
> duplicates deletion for non-sortable cases at the end.
> >
> >> Hmm, we already tried to do it at that point. I vaguely recall some
> >> issues caused by this approach. Anyway, it should be done as quickly as
> >> possible to increase the effect of the optimization.
> >
> > I think there were provided quite strong reasons why this shouldn't be
> > implemented at the parse analysis stage [1], [2], [3]. The
> > canonicalize_qual() looks quite appropriate place for that since it
> > does similar transformations.
> Ok. Let's discuss these reasons. In Robert's opinion [1,3], we should do
> the transformation based on the cost model. But in the canonicalize_qual
> routine, we still make the transformation blindly. Moreover, the second
> patch reduces the weight of this reason, doesn't it? Maybe we shouldn't
> think about that as about optimisation but some 'general form of
> expression'?
> Peter [2] worries about the possible transformation outcomes at this
> stage. But remember, we already transform clauses like ROW() IN (...) to
> a series of ORs here, so it is not an issue. Am I wrong?
> Why did we discard the attempt with canonicalize_qual on the previous
> iteration? - The stage of parsing is much more native for building SAOP
> quals. We can reuse make_scalar_array_op and other stuff, for example.
> During the optimisation stage, the only list partitioning machinery
> creates SAOP based on a list of constants. So, in theory, it is possible
> to implement. But do we really need to make the code more complex?
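For context, the transformation we are discussing rewrites a chain of equality ORs over the same column into a single scalar-array (SAOP) clause, roughly like this (a simplified sketch, not the patch's exact output; the table and column names are made up):

    -- as written by the user
    SELECT * FROM t WHERE x = 1 OR x = 2 OR x = 3;
    -- equivalent SAOP form produced by the transformation
    SELECT * FROM t WHERE x = ANY (ARRAY[1, 2, 3]);

And the ROW() IN (...) precedent mentioned above goes in the opposite direction: it is expanded during parse analysis into a series of ORs, roughly:

    SELECT * FROM t WHERE (a, b) IN ((1, 'x'), (2, 'y'));
    -- is expanded into something like
    SELECT * FROM t WHERE (a = 1 AND b = 'x') OR (a = 2 AND b = 'y');
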
As the patch currently does the OR-to-ANY transformation at the parse stage, the system catalog (including views, inheritance clauses, partial and expression indexes, and so on) would have a form that depends on the value of enable_or_transformation at the moment of DDL execution. I think this is rather wrong. enable_or_transformation should be a run-time optimization that affects the resulting query plan; its result shouldn't be persistent.

Regarding the ROW() IN (...) precedent:

1. AFAICS, this is not exactly an optimization. This transformation allows us to perform type matching individually for every value. Therefore, it allows executing some queries that would otherwise end up with an error.

2. I don't think this is an example of good design. It is rather a hack that is here for historical reasons, and we don't want to replicate that experience.

Given all of the above, I think moving the transformation to canonicalize_qual() would be the right way to go.

------
Regards,
Alexander Korotkov
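P.S. To make the catalog concern concrete, here is a minimal sketch (the GUC name is the one from the patch; the table and index names are made up):

    SET enable_or_transformation = on;
    CREATE INDEX t_partial_idx ON t (y) WHERE x = 1 OR x = 2;

With a parse-stage transformation, the stored index predicate would already be the ANY form, so pg_get_indexdef() would show something like "x = ANY ('{1,2}')" rather than the OR list as written, and what ends up in the catalog would depend on the GUC value at DDL time. With the transformation in canonicalize_qual(), the catalog keeps the original expression and the rewrite happens only during planning.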