Hi, Jian!

Thank you for your feedback.

On Tue, Oct 8, 2024 at 8:12 AM jian he <jian.universal...@gmail.com> wrote:
>
> On Mon, Oct 7, 2024 at 10:06 PM jian he <jian.universal...@gmail.com> wrote:
> >
> > assume v40 is the latest version.
>
> make_bitmap_paths_for_or_group
> {
>     /*
>      * First, try to match the whole group to the one index.
>      */
>     orargs = list_make1(ri);
>     indlist = build_paths_for_OR(root, rel,
>                                  orargs,
>                                  other_clauses);
>     if (indlist != NIL)
>     {
>         bitmapqual = choose_bitmap_and(root, rel, indlist);
>         jointcost = bitmapqual->total_cost;
>         jointlist = list_make1(bitmapqual);
>     }
>     /*
>      * Also try to match all containing clauses 'one-by-one.
>      */
>     foreach(lc, args)
>     {
>         orargs = list_make1(lfirst(lc));
>         indlist = build_paths_for_OR(root, rel,
>                                      orargs,
>                                      other_clauses);
>         if (indlist == NIL)
>         {
>             splitlist = NIL;
>             break;
>         }
>         bitmapqual = choose_bitmap_and(root, rel, indlist);
> }
>
> if other_clauses is not NIL, then "try to match all containing clauses
> 'one-by-one"
> the foreach loop "foreach(lc, args)" will apply other_clauses in
> build_paths_for_OR every time.
> then splitcost will obviously be higher than jointcost.

Some of other_clauses could match to some index column.  So, the
splitcost could be lower than jointcost.  Please check [1] test case,
but not it misses t_b_c_idx.  So the correct full script is following.

create table t (a int not null, b int not null, c int not null);
insert into t (select 1, 1, i from generate_series(1,10000) i);
insert into t (select i, 2, 2 from generate_series(1,10000) i);
create index t_a_b_idx on t (a, b);
create index t_b_c_idx on t (b, c);
create statistics t_a_b_stat (mcv) on a, b from t;
create statistics t_b_c_stat (mcv) on b, c from t;
vacuum analyze t;
explain select * from t where a = 1 and (b = 1 or b = 2) and c = 2;

Also, note its possible that splitlist != NULL, but jointlist == NULL.
Check [2] for example.

>
> if other_clauses is NIL.
> "foreach(lc, args)" will have list_length(args) startup cost.
> So overall, it looks like jointcost will alway less than splitcost,
> the only corner case would be both are zero.

If other_clauses is NIL, we could probably do a shortcut when
jointlist != NULL.  At least, I don't see the case why would we need
jointlist in this case at the first glance.  Will investigate that
futher.

>
> anyway, in make_bitmap_paths_for_or_group,
> above line "Pick the best option."  I added:
>
>     if (splitcost <= jointcost && splitcost != 0 && jointcost != 0)
>         elog(INFO, "%s:%d splitcost <= jointcost and both is not
> zero", __FILE_NAME__, __LINE__);
> and the regress tests passed.
> That means we don't need to iterate "((BoolExpr *)
> ri->orclause)->args"  in make_bitmap_paths_for_or_group
> ?

Indeed, the regression test coverage is lacking.  Your feedback is valuable.

Links.
1. 
https://www.postgresql.org/message-id/CAPpHfdtSXxhdv3mLOLjEewGeXJ%2BFtfhjqodn1WWuq5JLsKx48g%40mail.gmail.com
2. 
https://www.postgresql.org/message-id/CAPpHfduJtO0s9E%3DSHUTzrCD88BH0eik0UNog1_q3XBF2wLmH6g%40mail.gmail.com

------
Regards,
Alexander Korotkov
Supabase


Reply via email to