zhangfengcdt opened a new issue, #2356:
URL: https://github.com/apache/sedona/issues/2356

   We need to implement a similar barrier udf function in sedona similar to the 
one in sedona db.
   
   ## Optimization Barrier
   
   Use the `barrier` function to prevent filter pushdown and control predicate 
evaluation order in complex spatial joins. This function creates an 
optimization barrier by evaluating boolean expressions at runtime.
   
   The `barrier` function takes a boolean expression as a string, followed by 
pairs of variable names and their values that will be substituted into the 
expression:
   
   ```sql
   barrier(expression, var_name1, var_value1, var_name2, var_value2, ...)
   
   ```
   
   The placement of filters relative to KNN joins changes the semantic meaning 
of the query:
   
   - **Filter before KNN**: First filters the data, then finds K nearest 
neighbors from the filtered subset. This answers "What are the K nearest 
high-rated restaurants?"
   - **Filter after KNN**: First finds K nearest neighbors from all data, then 
filters those results. This answers "Of the K nearest restaurants, which ones 
are high-rated?"
   
   ### Example
   
   Find the 3 nearest high-rated restaurants to luxury hotels, ensuring the KNN 
join completes before filtering.
   
   ```sql
   SELECT
       h.name AS hotel,
       r.name AS restaurant,
       r.rating
   FROM hotels AS h
   INNER JOIN restaurants AS r
   ON ST_KNN(h.geometry, r.geometry, 3, false)
   WHERE barrier('rating > 4.0 AND stars >= 4',
                 'rating', r.rating,
                 'stars', h.stars)
   
   ```
   
   With the barrier function, this query first finds the 3 nearest restaurants 
to each hotel (regardless of rating), then filters to keep only those pairs 
where the restaurant has rating > 4.0 and the hotel has stars >= 4. Without the 
barrier, an optimizer might push the filters down, changing the query to first 
filter for high-rated restaurants and luxury hotels, then find the 3 nearest 
among those filtered sets.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to