ZihuanLing commented on PR #1302:
URL: https://github.com/apache/datafusion-ballista/pull/1302#issuecomment-3239033243

   @milenkovicm Hi, I've applied this patch to my code and tried to run TPC-DS 
queries 44 and 54, but I hit this error:
   
   Error: ArrowError(ExternalError(Execution("Job mAxRhlg failed: Job failed 
due to stage 18 failed: Task failed due to runtime execution error: 
DataFusionError(Internal(\"Invalid NestedLoopJoinExec, the output partition 
count of the left child must be 1,consider using CoalescePartitionsExec or the 
EnforceDistribution rule\"))\n")), None)
   
   Here are some logs that may be helpful (only part of the logs):
   ```
   =========UnResolvedStage[stage_id=18.0, children=2]=========
   Inputs{16: StageOutput { partition_locations: {}, complete: false }, 17: 
StageOutput { partition_locations: {}, complete: false }}
   ShuffleWriterExec: partitions:None
     NestedLoopJoinExec: join_type=Inner, filter=CAST(d_month_seq@0 AS Int64) 
>= date_dim.d_month_seq + Int64(1)@1, projection=[c_customer_sk@0, 
ss_ext_sales_price@1, d_month_seq@2]
       CoalescePartitionsExec
         UnresolvedShuffleExec: partitions=Hash([UnKnownColumn { name: 
"d_date_sk@3" }], 4)
       AggregateExec: mode=FinalPartitioned, gby=[date_dim.d_month_seq + 
Int64(1)@0 as date_dim.d_month_seq + Int64(1)], aggr=[]
         CoalesceBatchesExec: target_batch_size=4096
           UnresolvedShuffleExec: partitions=Hash([Column { name: 
"date_dim.d_month_seq + Int64(1)", index: 0 }], 4)
   
   ...
   
   =========UnResolvedStage[stage_id=18.0, children=2]=========
   Inputs{17: StageOutput { partition_locations: {}, complete: false }, 16: 
StageOutput { partition_locations: {}, complete: false }}
   ShuffleWriterExec: partitions:None
     NestedLoopJoinExec: join_type=Inner, filter=CAST(d_month_seq@0 AS Int64) 
>= date_dim.d_month_seq + Int64(1)@1, projection=[c_customer_sk@0, 
ss_ext_sales_price@1, d_month_seq@2]
       CoalescePartitionsExec
         UnresolvedShuffleExec: partitions=Hash([UnKnownColumn { name: 
"d_date_sk@3" }], 4)
       AggregateExec: mode=FinalPartitioned, gby=[date_dim.d_month_seq + 
Int64(1)@0 as date_dim.d_month_seq + Int64(1)], aggr=[]
         CoalesceBatchesExec: target_batch_size=4096
           UnresolvedShuffleExec: partitions=Hash([Column { name: 
"date_dim.d_month_seq + Int64(1)", index: 0 }], 4)
   
   ...
   
   =========ResolvedStage[stage_id=18.0, partitions=1]=========
   ShuffleWriterExec: partitions:Some(Hash([Column { name: "p_promo_sk", index: 
0 }], 4))
     DataSourceExec: file_groups={1 group: [[opt/lzh/10G/promotion.dat]]}, 
projection=[p_promo_sk], file_type=csv, has_header=false
   
   ...
   
   =========UnResolvedStage[stage_id=18.0, children=1]=========
   Inputs{17: StageOutput { partition_locations: {}, complete: false }}
   ShuffleWriterExec: partitions:None
     SortExec: expr=[return_ratio@1 ASC NULLS LAST], 
preserve_partitioning=[true]
       ProjectionExec: expr=[ss_item_sk@0 as item, 
CAST(sum(coalesce(sr.sr_return_quantity,Int64(0)))@1 AS Decimal128(15, 4)) / 
CAST(sum(coalesce(sts.ss_quantity,Int64(0)))@2 AS Decimal128(15, 4)) as 
return_ratio, CAST(sum(coalesce(sr.sr_return_amt,Int64(0)))@3 AS Decimal128(15, 
4)) / CAST(sum(coalesce(sts.ss_net_paid,Int64(0)))@4 AS Decimal128(15, 4)) as 
currency_ratio]
         AggregateExec: mode=FinalPartitioned, gby=[ss_item_sk@0 as 
ss_item_sk], aggr=[sum(coalesce(sr.sr_return_quantity,Int64(0))), 
sum(coalesce(sts.ss_quantity,Int64(0))), 
sum(coalesce(sr.sr_return_amt,Int64(0))), 
sum(coalesce(sts.ss_net_paid,Int64(0)))]
           CoalesceBatchesExec: target_batch_size=4096
             UnresolvedShuffleExec: partitions=Hash([Column { name: 
"ss_item_sk", index: 0 }], 4)
   
   ...
   
   
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

