Thanks for the reply!
To clarify, for issue 2, it can still break a query apart into multiple jobs
without AQE; I had turned AQE off in my posted example.
For 1, an end user just needs to turn a knob on or off to use stage-level
scheduling for Spark SQL; I am considering adding a comp…
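To make the knob concrete, here is a minimal sketch of what I have in mind;
the config name below is made up purely for illustration and does not exist
in Spark today:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().master("local[*]").getOrCreate()

    // Hypothetical knob, illustration only; no such conf exists in Spark.
    spark.conf.set("spark.sql.stageLevelScheduling.enabled", "true")

    // The SQL user writes a plain query; per-stage resource profiles would
    // be chosen and attached internally by the planner, never surfaced in
    // the user-facing API.
    spark.sql("SELECT count(*) FROM range(1000000)").show()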
See the original SPIP for why we only support the RDD API:
https://issues.apache.org/jira/browse/SPARK-27495
The main problem is exactly what you are referring to. The RDD level is not
exposed to the user when using the SQL or DataFrame API. This is on purpose,
and the user shouldn't have to know anything about the underlying RDDs.
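For reference, this is roughly what the existing RDD-level hook from that
SPIP looks like; a minimal sketch, assuming Spark 3.1+ with dynamic
allocation enabled on YARN/K8s/Standalone:

    import org.apache.spark.resource.{ExecutorResourceRequests,
      ResourceProfileBuilder, TaskResourceRequests}
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().getOrCreate()
    val sc = spark.sparkContext

    // Build a ResourceProfile describing what the stage needs.
    val execReqs = new ExecutorResourceRequests().cores(4).memory("8g")
    val taskReqs = new TaskResourceRequests().cpus(2)
    val profile = new ResourceProfileBuilder()
      .require(execReqs).require(taskReqs).build()

    // withResources() is only defined on RDDs, which is exactly why SQL and
    // DataFrame users cannot reach it without dropping out of the optimizer.
    val counts = sc.parallelize(1 to 1000000)
      .map(i => (i % 100, 1))
      .withResources(profile)
      .reduceByKey(_ + _)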
Thanks for the clarification, Tom!
A bit more background on what we want to do: we have proposed a fine-grained
(stage-level) resource optimization approach in VLDB 2022
(https://www.vldb.org/pvldb/vol15/p3098-lyu.pdf) and would like to try it on
Spark. Our approach can recommend the resource configuration…
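For contrast, the resource knobs Spark exposes to a SQL workload today are
coarse and fixed for the whole application; a stage-level optimizer would
pick these per stage instead. A sketch of the status quo, using only
existing confs:

    import org.apache.spark.sql.SparkSession

    // Application-wide settings: every stage of every query gets the same
    // executor shape and shuffle parallelism, however different their needs.
    val spark = SparkSession.builder()
      .config("spark.executor.cores", "4")
      .config("spark.executor.memory", "8g")
      .config("spark.sql.shuffle.partitions", "200")
      .getOrCreate()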
+1 to documenting it; a seed argument would be great if possible.
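Concretely, both hash() and xxhash64() hash with a fixed internal seed of 42
and expose no seed parameter today; a small example of the current behavior:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, hash, xxhash64}

    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    val df = spark.range(3).toDF("id")

    // hash() is Murmur3-based and xxhash64() is xxHash64-based; both use
    // the hardcoded seed 42, and neither function accepts a seed argument.
    df.select(hash(col("id")), xxhash64(col("id"))).show()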
From: Sean Owen
Sent: Monday, September 26, 2022 5:26:26 PM
To: Nicholas Gustafson
Cc: dev
Subject: Re: Why are hash functions seeded with 42?
Oh yeah, I get why we love to pick 42 for random things. I'm guessing…