(spark) branch master updated: [SPARK-55289][SQL][FOLLOWUP] Fix flaky test in-order-by.sql by disabling broadcast join

dongjoon Sat, 07 Mar 2026 10:29:11 -0800

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new 411db440a446 [SPARK-55289][SQL][FOLLOWUP] Fix flaky test in-order-by.sql by disabling broadcast join
411db440a446 is described below

commit 411db440a446b9da06eded23a5e9c2af1aa87497
Author: Kent Yao <[email protected]>
AuthorDate: Sat Mar 7 10:28:51 2026 -0800

    [SPARK-55289][SQL][FOLLOWUP] Fix flaky test in-order-by.sql by disabling broadcast join
    
    ### What changes were proposed in this pull request?
    
    Applies the same fix as #54072 (SPARK-55289), which targeted `in-set-operations.sql`. This PR adds `--SET spark.sql.autoBroadcastJoinThreshold=-1` to `in-order-by.sql` to prevent OOM from BroadcastHashJoin accumulating hash tables on memory-constrained CI runners.
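
As a rough illustration of why `-1` turns broadcast joins off, the toy model below mimics a size-based broadcast decision. It is a sketch under stated assumptions, not Spark's planner: the helper `choose_join`, its return labels, and the 4 KiB table size are hypothetical, and real planning also weighs hints, join types, and other settings. The 10 MB constant mirrors the documented default of `spark.sql.autoBroadcastJoinThreshold`.

```python
# Toy model of a size-based broadcast decision (hypothetical helper, not Spark code).
DEFAULT_THRESHOLD_BYTES = 10 * 1024 * 1024  # documented default: 10 MB


def choose_join(estimated_size_bytes: int,
                threshold_bytes: int = DEFAULT_THRESHOLD_BYTES) -> str:
    """Return the join strategy this simplified model would pick."""
    # A side is broadcast only if its estimated size fits under the threshold.
    # No non-negative size can be <= -1, so -1 disables broadcasting entirely.
    if 0 <= estimated_size_bytes <= threshold_bytes:
        return "BroadcastHashJoin"
    # Simplification: the model falls back to a sort-merge join in every other case.
    return "SortMergeJoin"


small_table = 4 * 1024  # 4 KiB, an assumed size for a tiny test table
print(choose_join(small_table))                      # default threshold
print(choose_join(small_table, threshold_bytes=-1))  # broadcasting disabled
```

Under this model even a tiny table stops qualifying for broadcast once the threshold is `-1`, which is the effect the `--SET` line relies on.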
    
    ### Why are the changes needed?
    
    `in-order-by.sql` intermittently fails on CI with `SparkOutOfMemoryError` for the same root cause as `in-set-operations.sql`: complex correlated IN-subqueries whose multiple BroadcastHashJoin operations exceed the JVM heap under memory pressure.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No. Test-only change.
    
    ### How was this patch tested?
    
    Regenerated the golden files. The output differs only in the order of rows with identical sort keys, which is expected when switching from BroadcastHashJoin to SortMergeJoin.
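
The row reordering can be pictured with a small, self-contained sketch. It assumes, purely for illustration, that the sort key is the first column only and that the two join strategies hand rows to the sort in different orders. The row values are simplified from the golden file, the helper name `order_by_first_column_desc` is hypothetical, and Python's stable `sorted` stands in for the sort step, so this does not reproduce Spark's implementation.

```python
from operator import itemgetter

# Hypothetical (t1a, t1d) rows as they might arrive from two different join plans.
rows_from_broadcast_join = [("val1d", 22), ("val1a", 21), ("val1a", 10)]
rows_from_sort_merge_join = [("val1d", 22), ("val1a", 10), ("val1a", 21)]


def order_by_first_column_desc(rows):
    """Sort on the first column only; ties keep their arrival order (stable sort)."""
    return sorted(rows, key=itemgetter(0), reverse=True)


out_broadcast = order_by_first_column_desc(rows_from_broadcast_join)
out_sort_merge = order_by_first_column_desc(rows_from_sort_merge_join)

# Same rows and same key order, but the tied "val1a" rows come out swapped.
print(out_broadcast)
print(out_sort_merge)
```

Both outputs satisfy the same ordering on the key, so either is a valid result; only the golden file's exact row order changes.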
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    Yes, co-authored with GitHub Copilot.
    
    Closes #54663 from yaooqinn/SPARK-55289-followup.
    
    Authored-by: Kent Yao <[email protected]>
    Signed-off-by: Dongjoon Hyun <[email protected]>
---
 .../resources/sql-tests/inputs/subquery/in-subquery/in-order-by.sql     | 1 +
 .../sql-tests/results/subquery/in-subquery/in-order-by.sql.out          | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/sql/core/src/test/resources/sql-tests/inputs/subquery/in-subquery/in-order-by.sql b/sql/core/src/test/resources/sql-tests/inputs/subquery/in-subquery/in-order-by.sql
index 8bf49a1c2d99..7fbb1c12924f 100644
--- a/sql/core/src/test/resources/sql-tests/inputs/subquery/in-subquery/in-order-by.sql
+++ b/sql/core/src/test/resources/sql-tests/inputs/subquery/in-subquery/in-order-by.sql
@@ -1,5 +1,6 @@
 -- A test suite for ORDER BY in parent side, subquery, and both predicate subquery
 -- It includes correlated cases.
+--SET spark.sql.autoBroadcastJoinThreshold=-1
 
 -- Test sort operator with codegen on and off.
 --CONFIG_DIM1 spark.sql.codegen.wholeStage=true
diff --git a/sql/core/src/test/resources/sql-tests/results/subquery/in-subquery/in-order-by.sql.out b/sql/core/src/test/resources/sql-tests/results/subquery/in-subquery/in-order-by.sql.out
index d687b5938834..b06fd3dd6fc2 100644
--- a/sql/core/src/test/resources/sql-tests/results/subquery/in-subquery/in-order-by.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/subquery/in-subquery/in-order-by.sql.out
@@ -227,8 +227,8 @@ struct<t1a:string,t1b:smallint,t1c:int,t1d:bigint,t1e:float,t1f:double,t1g:decim
 -- !query output
 val1d  NULL    16      22      17.0    25.0    2600    2014-06-04 01:01:00     NULL
 val1d  NULL    16      19      17.0    25.0    2600    2014-07-04 01:02:00.001 NULL
-val1a  16      12      21      15.0    20.0    2000    2014-06-04 01:02:00.001 2014-06-04
 val1a  16      12      10      15.0    20.0    2000    2014-07-04 01:01:00     2014-07-04
+val1a  16      12      21      15.0    20.0    2000    2014-06-04 01:02:00.001 2014-06-04
 
 
 -- !query


