I'm not 100% sure I understand the question. Assuming that by "both" you
mean SPARK-26283 [1] and SPARK-29322 [2]: if you're asking about the fix,
then yes, only the master branch, as the fix for SPARK-26283 was not ported
back to branch-2.4. If you're asking about the issue (problem), then maybe
no, according to the a
I just looked at the PR. I think there is some follow-up work that needs to
be done, e.g. we shouldn't create a top-level package
org.apache.spark.sql.dynamicpruning.
On Wed, Oct 02, 2019 at 1:52 PM, Maryann Xue <maryann@databricks.com> wrote:
>
> There is no internal write-up, but I think we should at least give some
> up-to-date description on that JIRA entry.
There is no internal write-up, but I think we should at least give some
up-to-date description on that JIRA entry.
On Wed, Oct 2, 2019 at 3:13 PM Reynold Xin wrote:
> No, there is no separate write-up internally.
>
> On Wed, Oct 2, 2019 at 12:29 PM Ryan Blue wrote:
>
>> Thanks for the pointers,
No, there is no separate write-up internally.
On Wed, Oct 2, 2019 at 12:29 PM Ryan Blue wrote:
> Thanks for the pointers, but what I'm looking for is information about the
> design of this implementation, like what requires this to be in spark-sql
> instead of spark-catalyst.
>
> Even a high-leve
The reason it's in spark-sql is simply that HadoopFsRelation, which the
rule tries to match, is in spark-sql.
We should probably update the high-level description in the JIRA. I'll work
on that shortly.
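To make that concrete, here is a minimal sketch (the rule object and its
body are made up for illustration, not the actual DPP rule) of why a rule
that pattern-matches HadoopFsRelation has to live in sql/core rather than
catalyst: LogicalRelation and HadoopFsRelation are both defined under
org.apache.spark.sql.execution.datasources, which catalyst cannot see.

  import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
  import org.apache.spark.sql.catalyst.rules.Rule
  import org.apache.spark.sql.execution.datasources.{HadoopFsRelation, LogicalRelation}

  object MatchesFileScans extends Rule[LogicalPlan] {
    override def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
      case rel @ LogicalRelation(fs: HadoopFsRelation, _, _, _) =>
        // A real rule would inspect fs.partitionSchema here and inject
        // pruning predicates; this sketch just passes the node through.
        rel
    }
  }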
On Wed, Oct 2, 2019 at 2:29 PM Ryan Blue wrote:
> Thanks for the pointers, but what I'm
Thanks for the pointers, but what I'm looking for is information about the
design of this implementation, like what requires this to be in spark-sql
instead of spark-catalyst.
Even a high-level description, like what the optimizer rules are and what
they do would be great. Was there one written up
> It lists 3 cases for how a filter is built, but nothing about the overall
> approach or design that helps when trying to find out where it should be
> placed in the optimizer rules.
The overall idea/design of DPP can be simply put as using the result of one
side of the join to prune partitions of the other side.
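As a self-contained illustration (the table and column names here are made
up, and this assumes a Spark 3.0 build with DPP enabled by default), a
selective filter on the small dim side lets Spark skip partitions of the
large fact side at runtime:

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder()
    .master("local[*]").appName("dpp-demo").getOrCreate()
  import spark.implicits._

  // A partitioned fact table and a small dimension table.
  Seq((1, 10), (2, 20), (3, 30)).toDF("part_key", "value")
    .write.mode("overwrite").partitionBy("part_key").saveAsTable("fact")
  Seq((1, "keep")).toDF("part_key", "label")
    .write.mode("overwrite").saveAsTable("dim")

  // Only fact partitions whose part_key survives the dim filter should be
  // scanned; the dim-side join result is what drives the pruning.
  spark.sql(
    """SELECT f.value, d.label
      |FROM fact f JOIN dim d ON f.part_key = d.part_key
      |WHERE d.label = 'keep'""".stripMargin).explain()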
Whoever created the JIRA years ago didn't describe dpp correctly, but the
linked jira in Hive was correct (which unfortunately is much more terse than
any of the patches we have in Spark:
https://issues.apache.org/jira/browse/HIVE-9152 ). Henry R's description was
also correct.
On Wed, Oct 02,
Where can I find a design doc for dynamic partition pruning that explains
how it works?
The JIRA issue, SPARK-11150, doesn't seem to describe dynamic partition
pruning (as pointed out by Henry R.) and doesn't have any comments about
the implementation's approach. And the PR description also doesn'
Thank you for the investigation and making a fix.
So, both issues are only on the master (3.0.0) branch?
Bests,
Dongjoon.
On Wed, Oct 2, 2019 at 00:06 Jungtaek Lim wrote:
> FYI: patch submitted - https://github.com/apache/spark/pull/25996
>
> On Wed, Oct 2, 2019 at 3:25 PM Jungtaek Lim wrote:
The dynamic partition pruning rule generates "hidden" filters that will be
converted to real predicates at runtime, so it doesn't matter where we run
the rule.
For PruneFileSourcePartitions, I'm not quite sure. It seems to me it's
better to run it before join reorder.
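(For anyone following along: the feature is gated by a Spark 3.0 conf, so
the two behaviors can be compared directly; the snippet below assumes a
spark-shell session.)

  // The "hidden" filter is only a placeholder in the optimized plan; physical
  // planning turns it into a real IN predicate, which is why correctness
  // doesn't depend on where the rule sits among the optimizer batches.
  spark.conf.set("spark.sql.optimizer.dynamicPartitionPruning.enabled", "true")
  // Disable it to compare the resulting plans:
  spark.conf.set("spark.sql.optimizer.dynamicPartitionPruning.enabled", "false")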
On Sun, Sep 29, 2019 at 5:51 AM Rya
FYI: patch submitted - https://github.com/apache/spark/pull/25996
On Wed, Oct 2, 2019 at 3:25 PM Jungtaek Lim wrote:
> I need to do full manual test to make sure, but according to experiment
> (small UT) "closeFrameOnFlush" seems to work.
>
> There was relevant change on master branch SPARK-2628
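For context, a small sketch of what the "closeFrameOnFlush" experiment
quoted above looks like against the zstd-jni API (the actual event-log
wiring in Spark is more involved; this only shows the flush behavior):

  import java.io.ByteArrayOutputStream
  import com.github.luben.zstd.ZstdOutputStream

  val buffer = new ByteArrayOutputStream()
  val out = new ZstdOutputStream(buffer)
  out.setCloseFrameOnFlush(true) // flush() finishes the current zstd frame
  out.write("event log line\n".getBytes("UTF-8"))
  out.flush()
  // The bytes flushed so far now form a complete frame, so a reader (e.g.
  // the history server tailing an in-progress log) can decode
  // buffer.toByteArray before the stream is closed.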
Hi Jungtaek,
Thanks a lot for your very prompt response!
> Looks like it's missing, or intended to force custom streaming sources to
> be implemented as DSv2.
That's exactly my understanding: no more DSv1 data sources. That, however,
is not consistent with the official message, is it? Spark 2.4.4 does not