r 27, 2025 at 6:53 PM Asif Shahid wrote:
>
>> Hi Experts,
>> Could you please allow me to pick your brain on the following:
>>
>> For Hive tables (managed), the scan operator is FileSourceScanExec.
>> Is there any particular reason why its underlying HadoopFSRelat
Hi Experts,
Could you please allow me to pick your brain on the following:
For Hive tables (managed), the scan operator is FileSourceScanExec.
Is there any particular reason why the FileFormat field of its underlying
HadoopFsRelation does not implement an interface like
SupportsRuntimeFiltering?
Li
I am not 100% sure, but I think you should run:
./dev/lint-scala
Some time back I also ran the target you mentioned, and it resulted in a
plethora of files being modified.
At that time I realized that the target was run only for some specific
modules (I think connect..).
Regards
Asif
On Fri
end functional test, the attached bugrepro.patch and BugTest
can be used, but I cannot productize them due to the nature of the code.
The bugrepro.patch with BugTest will pass if both of the above PRs are
included.
Regards
Asif
On Sun, Feb 16, 2025 at 9:43 AM Asif Shahid wrote:
> Hi.
> Ok . d
Hi.
OK, did the final check-in. Please feel free to review.
Regards
Asif
On Sat, Feb 15, 2025 at 6:42 PM Asif Shahid wrote:
> Please hold off on reviewing the patch, as I need to do one more check-in.
> I have still left a window for a race by releasing the read lock early, for
> the case of f
utor sides..? Then there
> might be a condition where shuffle files could be lost before
> the checksum is communicated to the driver/executors?
> Regards
> Asif
>
>
> On Thu, Feb 13, 2025 at 7:39 PM Asif Shahid wrote:
>
>> The bugrepro patch , when applied on current mas
condition where shuffle files could be lost before
the checksum is communicated to the driver/executors?
Regards
Asif
On Thu, Feb 13, 2025 at 7:39 PM Asif Shahid wrote:
> The bugrepro patch, when applied on current master, fails
> with incorrect results.
> On the PR branch, it
nfirm my understanding of the
>> problem, plus educate internal users and platform team:
>> https://issues.apache.org/jira/browse/SPARK-38388. Checksum approach was
>> brought up in that JIRA too and I feel that is the balanced way to look at
>> this problem.
>>
The bugrepro patch, when applied on current master, fails with
incorrect results.
On the PR branch, it passes.
The test runs 100 iterations.
Regards
Asif
On Thu, Feb 13, 2025 at 7:35 PM Asif Shahid wrote:
> Hi,
> Following up on this issue.
> The
race.
Regards
Asif
On Sun, Jan 26, 2025 at 11:19 PM Asif Shahid wrote:
> Shouldn't it be possible to determine from static data whether the output will be
> deterministic? Expressions already have a deterministic flag. So when an
> attribute is created from an alias, it will be possible to kno
er.
>- ...
>
>
> On Tue, Jan 28, 2025 at 1:53 PM Asif Shahid wrote:
>
>> I am genuinely curious to know how commits which are
>> reliably failing the build end up in master. Is there some race window
>> where two conflicting PRs in terms o
I am genuinely curious to know how commits which are
reliably failing the build end up in master. Is there some race window
where two PRs that conflict in terms of logic tend to mess up the final
state in master?
I have seen in the past few months, while syncing up my open PRs, f
>
> Maybe we should do it at runtime: if Spark retries a shuffle stage but the
> data becomes different (e.g. use checksum to check it), then Spark should
> retry all the partitions of this stage. I'll look into this repro after I'm
> back from the national holiday.
>
> On
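The runtime idea quoted above (compare a checksum of the retried shuffle output with the original, and retry every partition of the stage if they differ) can be sketched roughly as follows. This is only an illustration in plain Scala, not Spark's implementation: the object name, `checksum`, `needsFullRetry`, and the encoding of rows as strings are all assumptions made for the example.

```scala
import java.nio.charset.StandardCharsets
import java.util.zip.CRC32

// Hypothetical sketch, not Spark code: names and row encoding are assumptions.
object ShuffleChecksumSketch {
  // Order-sensitive CRC32 over the rows of one map-output partition.
  def checksum(rows: Seq[String]): Long = {
    val crc = new CRC32()
    rows.foreach(r => crc.update(r.getBytes(StandardCharsets.UTF_8)))
    crc.getValue
  }

  // previous: checksums recorded by the first attempt, keyed by partition id.
  // retried:  checksums produced when some of those partitions were recomputed.
  // Returns true when any recomputed partition differs from what was recorded,
  // i.e. the stage output changed and all partitions would need a retry.
  def needsFullRetry(previous: Map[Int, Long], retried: Map[Int, Long]): Boolean =
    retried.exists { case (partition, sum) =>
      previous.get(partition).exists(_ != sum)
    }

  def main(args: Array[String]): Unit = {
    val firstAttempt = Map(0 -> checksum(Seq("a", "b")), 1 -> checksum(Seq("c")))
    val sameRetry    = Map(0 -> checksum(Seq("a", "b")))
    val changedRetry = Map(0 -> checksum(Seq("a", "c")))
    println(needsFullRetry(firstAttempt, sameRetry))
    println(needsFullRetry(firstAttempt, changedRetry))
  }
}
```

Because the checksum here is order-sensitive, a partition whose rows arrive in a different order would also count as changed; whether that is desirable depends on the operator consuming the shuffle.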
G, Encoders.STRING)).toDF("pkRight",
> "strright")
>
>
> innerDf.write.format("parquet").partitionBy("strright").saveAsTable("inner")
>
> val innerInnerDf = spark.createDataset(
> Seq((1L, "111"), (2L, "
personal note: thanks for your interest. This is a very rare
attitude.
Regards
Asif
On Sun, Jan 26, 2025, 9:45 PM Ángel wrote:
> Hi Asif,
>
> Could you provide an example (code + dataset) to analyze this? Looks
> interesting ...
>
>
> Regards,
> Ángel
>
> On Sun, Jan 26
as that an issue is incorrect.
But I think that AttributeRef should have a boolean method which tells
whether the value it represents comes from an indeterminate source or not.
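As a rough illustration of that suggestion, the toy expression tree below carries a deterministic flag and copies the "indeterminate source" information onto the attribute created from an alias. These are not Spark's Catalyst classes: every type and field here (`Expr`, `Rand`, `Coalesce`, `Alias`, `AttributeRef`, `fromIndeterminateSource`) is a hypothetical stand-in for the example.

```scala
// Hypothetical sketch, not Catalyst: all names below are assumptions.
object DeterminismSketch {
  sealed trait Expr { def deterministic: Boolean }

  case class Literal(value: Any) extends Expr { val deterministic: Boolean = true }
  case class Column(name: String) extends Expr { val deterministic: Boolean = true }
  case object Rand extends Expr { val deterministic: Boolean = false }

  // A composite expression is deterministic only if all its children are.
  case class Coalesce(children: Seq[Expr]) extends Expr {
    val deterministic: Boolean = children.forall(_.deterministic)
  }

  // Reading an attribute is itself deterministic, but the value it refers to
  // may have been produced by a non-deterministic expression upstream.
  case class AttributeRef(name: String, fromIndeterminateSource: Boolean) extends Expr {
    val deterministic: Boolean = true
  }

  case class Alias(child: Expr, name: String) extends Expr {
    val deterministic: Boolean = child.deterministic
    // The static flag of the child is carried onto the attribute it produces.
    def toAttribute: AttributeRef =
      AttributeRef(name, fromIndeterminateSource = !child.deterministic)
  }

  def main(args: Array[String]): Unit = {
    val salted = Alias(Coalesce(Seq(Column("pk"), Rand)), "joinKey")
    val plain  = Alias(Column("pk"), "joinKey")
    println(salted.toAttribute.fromIndeterminateSource)
    println(plain.toAttribute.fromIndeterminateSource)
  }
}
```

With a flag like this, a planner could decide from static information alone whether a join or shuffle key depends on a non-deterministic value.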
Regards
Asif
On Fri, Jan 24, 2025 at 5:18 PM Asif Shahid wrote:
> Hi,
> While testing a use case where the query
Hi,
I was testing a use case where the query had an outer join such that the
joining key of the left outer table had either a valid value or a random value
(salting to avoid skew).
The case was reported to produce incorrect results in case of node failure
with retry.
On debugging the code, have found followi
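The findings are cut off in this excerpt, so the sketch below only illustrates one plausible mechanism, assumed here rather than taken from the thread: if a missing join key is replaced by a random salt, re-evaluating that salt on a retry can send the same row to a different shuffle partition. It is plain Scala with no Spark dependency; `partitionOf`, `saltKeys`, `attempt`, and the use of a seed to stand in for a task attempt are all hypothetical.

```scala
import scala.util.Random

// Hypothetical simulation, not Spark code: names and seeding are assumptions.
object SaltRetryDemo {
  // Stand-in for a hash partitioner: key hash modulo the partition count.
  def partitionOf(key: Long, numPartitions: Int): Int =
    java.lang.Math.floorMod(java.lang.Long.hashCode(key), numPartitions)

  // Rows with a valid key keep it; rows without one get a random salt.
  def saltKeys(keys: Seq[Option[Long]], rng: Random): Seq[Long] =
    keys.map(_.getOrElse(rng.nextLong()))

  // One attempt of the map side: the partition each row is routed to.
  // A different seed models the salt being re-evaluated on a retry.
  def attempt(keys: Seq[Option[Long]], seed: Long, numPartitions: Int): Seq[Int] =
    saltKeys(keys, new Random(seed)).map(partitionOf(_, numPartitions))

  def main(args: Array[String]): Unit = {
    val keys = Seq(Some(1L), None, Some(2L), None, None)
    println(s"first attempt: ${attempt(keys, seed = 1L, numPartitions = 4)}")
    println(s"retry attempt: ${attempt(keys, seed = 2L, numPartitions = 4)}")
  }
}
```

Rows with a valid key land in the same partition on every attempt, while salted rows may move. If only some partitions are recomputed after a node failure, a moved row could be duplicated or dropped downstream.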