This is an automated email from the ASF dual-hosted git repository.
jiayu pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/sedona-db.git
The following commit(s) were added to refs/heads/main by this push:
new 4948319 docs: Add barrier function documentation and use it in KNN
join example (#72)
4948319 is described below
commit 4948319a5f9281805f6c5d1c4768e20b568637f3
Author: Feng Zhang <[email protected]>
AuthorDate: Fri Sep 12 09:13:42 2025 -0700
docs: Add barrier function documentation and use it in KNN join example
(#72)
---
docs/reference/sql-joins.md | 34 ++++++++++++++++++++++++++++++++++
1 file changed, 34 insertions(+)
diff --git a/docs/reference/sql-joins.md b/docs/reference/sql-joins.md
index 1752ba2..3d43743 100644
--- a/docs/reference/sql-joins.md
+++ b/docs/reference/sql-joins.md
@@ -52,3 +52,37 @@ FROM cities AS cities_l
INNER JOIN cities AS cities_r
ON ST_KNN(cities_l.geometry, cities_r.geometry, 5, false)
```
+
+## Optimization Barrier
+
+Use the `barrier` function to prevent filter pushdown and control predicate
evaluation order in complex spatial joins. This function creates an
optimization barrier by evaluating boolean expressions at runtime.
+
+The `barrier` function takes a boolean expression as a string, followed by
pairs of variable names and their values that will be substituted into the
expression:
+
+```sql
+barrier(expression, var_name1, var_value1, var_name2, var_value2, ...)
+```
+
+The placement of filters relative to KNN joins changes the semantic meaning of
the query:
+
+- **Filter before KNN**: First filters the data, then finds K nearest
neighbors from the filtered subset. This answers "What are the K nearest
high-rated restaurants?"
+- **Filter after KNN**: First finds K nearest neighbors from all data, then
filters those results. This answers "Of the K nearest restaurants, which ones
are high-rated?"
+
+### Example
+
+Find the 3 nearest high-rated restaurants to luxury hotels, ensuring the KNN
join completes before filtering.
+
+```sql
+SELECT
+ h.name AS hotel,
+ r.name AS restaurant,
+ r.rating
+FROM hotels AS h
+INNER JOIN restaurants AS r
+ON ST_KNN(h.geometry, r.geometry, 3, false)
+WHERE barrier('rating > 4.0 AND stars >= 4',
+ 'rating', r.rating,
+ 'stars', h.stars)
+```
+
+With the barrier function, this query first finds the 3 nearest restaurants to
each hotel (regardless of rating), then filters to keep only those pairs where
the restaurant has rating > 4.0 and the hotel has stars >= 4. Without the
barrier, an optimizer might push the filters down, changing the query to first
filter for high-rated restaurants and luxury hotels, then find the 3 nearest
among those filtered sets.