Hi, here's the new patch(s). Mostly the same, but trying to address your comments from earlier as well as clean up a few other things I noticed.
Cheers, Edmund On Fri, 9 Nov 2018 at 15:01, Edmund Horner <ejr...@gmail.com> wrote: > > On Tue, 6 Nov 2018 at 16:40, David Rowley <david.row...@2ndquadrant.com> > wrote: > > I've been looking over 0001 to 0003. I ran out of steam before 0004. > > Hi David, thanks for another big review with lots of improvements. > > > I like the design of the new patch. From what I threw so far at the > > selectivity estimation code, it seems pretty good. I also quite like > > the design in nodeTidscan.c for range scans. > > > > I didn't quite manage to wrap my head around the code that removes > > redundant quals from the tidquals. For example, with: > > > > postgres=# explain select * from t1 where ctid <= '(0,10)' and a = 0; > > QUERY PLAN > > -------------------------------------------------- > > Tid Scan on t1 (cost=0.00..3.19 rows=1 width=4) > > TID Cond: (ctid <= '(0,10)'::tid) > > Filter: (a = 0) > > (3 rows) > > > > and: > > > > postgres=# explain select * from t1 where ctid <= '(0,10)' or a = 20 > > and ctid >= '(0,0)'; > > QUERY PLAN > > ------------------------------------------------------------------------------ > > Tid Scan on t1 (cost=0.01..176.18 rows=12 width=4) > > TID Cond: ((ctid <= '(0,10)'::tid) OR (ctid >= '(0,0)'::tid)) > > Filter: ((ctid <= '(0,10)'::tid) OR ((a = 20) AND (ctid >= > > '(0,0)'::tid))) > > (3 rows) > > > > I understand why the 2nd query didn't remove the ctid quals from the > > filter, and I understand why the first query could. I just didn't > > manage to convince myself that the code behaves correctly for all > > cases. > > I agree it's not obvious. > > 1. We extract a set of tidquals that can be directly implemented by > the Tid scan. This set is of the form: "(CTID op ? AND ...) OR > (...)" (with some limitations). > 2. If they happened to come verbatim from the original RestrictInfos, > then they will be found in scan_clauses, and we can remove them. > 3. If they're not verbatim, i.e. the original RestrictInfos have > additional criteria that the Tid scan can't use, then tidquals won't > match anything in scan_clauses, and hence scan_clauses will be > unchanged. > 4. We do a bit of extra work for the common and useful case of "(CTID > op ? AND ...)". Since the top-level operation of the input quals is > an AND, it will typically be split into multiple RestrictInfo items. > We remove each part from scan_clauses. > > > 1. I see a few instances of: > > > > #define DatumGetItemPointer(X) ((ItemPointer) DatumGetPointer(X)) > > #define ItemPointerGetDatum(X) PointerGetDatum(X) > > > > in both tid.c and ginfuncs.c, and I see you have: > > > > + itemptr = (ItemPointer) DatumGetPointer(constval); > > > > Do you think it would be worth moving the macros out of tid.c and > > ginfuncs.c into postgres.h and use that macro instead? > > > > (I see the code in this file already did this, so it might not matter > > about this) > > I'm not sure about this one - - I think it's better as a separate > patch, since we'd also change ginfuncs.c. I have left it alone for > now. > > > 2. In TidCompoundRangeQualFromExpr() rlst is not really needed. You > > can just return MakeTidRangeQuals(found_quals); or return NIL. > > Yup, gone. > > > 3. Can you explain why this only needs to take place when list_length() == > > 1? > > > > /* > > * In the case of a compound qual such as "ctid > ? AND ctid < ? AND ...", > > * the various parts will have come from different RestrictInfos. So > > * remove each part separately. > > */ > > ... > > I've tried to improve the comment. > > > 4. Accidental change? > > > > - tidquals); > > + tidquals > > + ); > > > > 5. Shouldn't this comment get changed? > > > > - * NumTids number of tids in this scan > > + * NumRanges number of tids in this scan > > > > 6. There's no longer a field named NumTids > > > > - * TidList evaluated item pointers (array of size NumTids) > > + * TidRanges evaluated item pointers (array of size NumTids) > > > > 7. The following field is not documented in TidScanState: > > > > + bool tss_inScan; > > > > 8. Can you name this exprtype instead? > > > > + TidExprType type; /* type of op */ > > > > "type" is used by Node types to indicate their type. > > Yup, yup, yup, yup, yup. > > > 9. It would be neater this: > > > > if (expr->opno == TIDLessOperator || expr->opno == TIDLessEqOperator) > > tidopexpr->type = invert ? TIDEXPR_LOWER_BOUND : TIDEXPR_UPPER_BOUND; > > else if (expr->opno == TIDGreaterOperator || expr->opno == > > TIDGreaterEqOperator) > > tidopexpr->type = invert ? TIDEXPR_UPPER_BOUND : TIDEXPR_LOWER_BOUND; > > else > > tidopexpr->type = TIDEXPR_EQ; > > > > tidopexpr->exprstate = exprstate; > > > > tidopexpr->inclusive = expr->opno == TIDLessEqOperator || expr->opno > > == TIDGreaterEqOperator; > > > > as a switch: ... > > Yup, I think the switch is a bit nicer. > > > 10. I don't quite understand this comment: > > > > + * Create an ExprState corresponding to the value part of a TID comparison, > > + * and wrap it in a TidOpExpr. Set the type and inclusivity of the > > TidOpExpr > > + * appropriately, depending on the operator and position of the its > > arguments. > > > > I don't quite see how the code sets the inclusivity depending on the > > position of the arguments. > > > > Maybe the comment should be: > > > > + * For the given 'expr' build and return an appropriate TidOpExpr taking > > into > > + * account the expr's operator and operand order. > > I'll go with your wording. > > > 11. ScalarArrayOpExpr are commonly named "saop": ... > > Yup. > > > 12. You need to code SetTidLowerBound() with similar wraparound logic > > you have in SetTidUpperBound(). > > > > It's perhaps unlikely, but the following shows incorrect results. > > > > postgres=# select ctid from t1 where ctid > '(0,65535)' limit 1; > > ctid > > ------- > > (0,1) > > (1 row) > > > > -- the following is fine. > > > > Time: 1.652 ms > > postgres=# select ctid from t1 where ctid >= '(0,65535)' limit 1; > > ctid > > ------- > > (1,1) > > (1 row) > > > > Likely you can just upgrade to the next block when the offset is > > > MaxOffsetNumber. > > This is important, thanks for spotting it. > > I've tried to add some code to handle this case (and also that of > "ctid < '(0,0)'") with a couple of tests too. > > > 13. It looks like the previous code didn't make the assumption you're > > making in: > > > > + * A current-of TidExpr only exists by itself, and we should > > + * already have allocated a tidList entry for it. We don't > > + * need to check whether the tidList array needs to be > > + * resized. > > > > I'm not sure if it's a good idea to lock the executor code into what > > the grammar currently says is possible. The previous code didn't > > assume that. > > Fair enough, I've restored the previous code without the assumption. > > > 14. we pass 'false' to what? > > Obsolete comment (see reply to Alvaro). > > I've applied most of these, and I'll post a new patch soon.
v4-0001-Add-selectivity-and-nullness-estimates-for-CTID-syst.patch
Description: Binary data
v4-0003-Support-backward-scans-over-restricted-ranges-in-hea.patch
Description: Binary data
v4-0002-Support-range-quals-in-Tid-Scan.patch
Description: Binary data
v4-0004-Tid-Scan-results-are-ordered.patch
Description: Binary data