Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/23719 )

Change subject: IMPALA-14576, IMPALA-14577: add rewrite rules for geospatial relations
......................................................................


Patch Set 11:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/23719/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23719/8//COMMIT_MSG@48
PS8, Line 48:
> Are there any measurements about the perf gain? AddEnvIntersectsRule depend
I haven't run comprehensive benchmarks yet. I tried a similar rewrite on customer data with lat/lon points and intersecting polygons, and the difference was drastic when st_envintersects() was selective, even without skipping whole files: after the change, query time was dominated by Parquet page decompression, whereas before it, predicate evaluation dominated by far. For non-selective queries this likely adds a bit of overhead, but it should be minimal compared to the actual polygon intersection in Java.


http://gerrit.cloudera.org:8080/#/c/23719/8/fe/src/main/java/org/apache/impala/rewrite/AddEnvIntersectsRule.java
File fe/src/main/java/org/apache/impala/rewrite/AddEnvIntersectsRule.java:

http://gerrit.cloudera.org:8080/#/c/23719/8/fe/src/main/java/org/apache/impala/rewrite/AddEnvIntersectsRule.java@35
PS8, Line 35:
> nit: should be singular, or remove 'a' before geospatial
Done


http://gerrit.cloudera.org:8080/#/c/23719/8/fe/src/main/java/org/apache/impala/rewrite/AddEnvIntersectsRule.java@70
PS8, Line 70:  a
> nit: redundant space
Done


http://gerrit.cloudera.org:8080/#/c/23719/8/fe/src/main/java/org/apache/impala/rewrite/AddEnvIntersectsRule.java@75
PS8, Line 75:     return new CompoundPredicate(CompoundPredicate.Operator.AND, newPred, expr.clone());
            :   }
> Could be just:
Done


http://gerrit.cloudera.org:8080/#/c/23719/8/fe/src/main/java/org/apache/impala/rewrite/PointEnvIntersectsRule.java
File fe/src/main/java/org/apache/impala/rewrite/PointEnvIntersectsRule.java:

http://gerrit.cloudera.org:8080/#/c/23719/8/fe/src/main/java/org/apache/impala/rewrite/PointEnvIntersectsRule.java@39
PS8, Line 39: st_envintersects(CONST_GEOM, st_point(x, y))
> Would it make sense to have a rule for st_envintersects(CONST_POINT, GEOM)?
This rule only makes sense for points. Something similar could be applied to other geometry types constructed from coordinate columns, e.g. st_linestring(x1,y1,x2,y2), but this seems much more niche and I don't see exactly how to rewrite it, as both points can be outside the bounding rect while the line still crosses it. Something like min(x1,x2) >= minx ... could work, but this is not something we could push down anywhere at the moment.


http://gerrit.cloudera.org:8080/#/c/23719/8/fe/src/main/java/org/apache/impala/rewrite/PointEnvIntersectsRule.java@42
PS8, Line 42:
> nit: last AND is dangling
Done


http://gerrit.cloudera.org:8080/#/c/23719/8/testdata/workloads/functional-query/queries/QueryTest/geospatial-esri-planner.test
File testdata/workloads/functional-query/queries/QueryTest/geospatial-esri-planner.test:

http://gerrit.cloudera.org:8080/#/c/23719/8/testdata/workloads/functional-query/queries/QueryTest/geospatial-esri-planner.test@35
PS8, Line 35: Check that PointEnvIntersectsRule is applied
> nit: This also checks that NormalizeGeospatialRelationsRule was applied bef
Changed the order so that NormalizeGeospatialRelationsRule is not needed.



--
To view, visit http://gerrit.cloudera.org:8080/23719
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id65f646db6f1c89a74253e9ff755c39c400328be
Gerrit-Change-Number: 23719
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Xuebin Su <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Wed, 07 Jan 2026 15:25:45 +0000
Gerrit-HasComments: Yes
