Hi All,
> Also I am investigating a performance regression in some TPC-DS queries
> (q88 for instance) that is caused by a recent commit in 3.1 ...
I have found that the perf regression is caused by the Hadoop config:
io.file.buffer.size = 4096
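As a rough illustration of why a 4096-byte buffer can hurt scan-heavy queries, the sketch below counts how many reads reach the underlying stream for two buffer sizes. It uses Python's io.BufferedReader as a stand-in and is not Hadoop's or Spark's actual I/O path; CountingRaw, raw_reads, the 1 MiB payload, and the 65536 comparison value are all assumptions made up for the example.

```python
import io


class CountingRaw(io.RawIOBase):
    """In-memory raw stream that counts how often it is asked for data."""

    def __init__(self, data):
        super().__init__()
        self._src = io.BytesIO(data)
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, b):
        # Every call here stands in for one trip to the underlying file system.
        self.calls += 1
        return self._src.readinto(b)


def raw_reads(buffer_size, payload, chunk=128):
    """Read the payload in small chunks and return the number of raw reads."""
    raw = CountingRaw(payload)
    reader = io.BufferedReader(raw, buffer_size=buffer_size)
    while reader.read(chunk):
        pass
    return raw.calls


payload = bytes(1024 * 1024)      # 1 MiB of zeros (hypothetical input)
small = raw_reads(4096, payload)  # the buffer size named in the message
large = raw_reads(65536, payload) # an arbitrary larger size for comparison
print(small, large)
```

The smaller buffer needs many more trips to the underlying stream for the same data, which is the general mechanism by which a low buffer size can slow down large sequential reads.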
Before the commit
https://github.com/apache/spark/comm
Yeah, agreed. I changed it. Thanks for the heads up, Tom.
On Wed, Feb 3, 2021 at 8:31 AM, Tom Graves wrote:
> OK, thanks for the update. That is marked as an improvement; if it's a
> blocker, can we mark it as such and describe why? I searched JIRAs and
> didn't see any critical or blockers open.
>
> Tom
> On
OK, thanks for the update. That is marked as an improvement; if it's a blocker,
can we mark it as such and describe why? I searched JIRAs and didn't see any
critical or blockers open.
Tom
On Tuesday, February 2, 2021, 05:12:24 PM CST, Hyukjin Kwon
wrote:
There is one here: https://github
There is one here: https://github.com/apache/spark/pull/31440. It looks like
several issues are being identified (to confirm that this is an issue in OSS
too) and fixed in parallel.
There has been a bit of unexpected delay here, as several more issues were
found. I will try to file and share relevant JIRAs as
Just curious if we have an update on the next RC? Is there a JIRA for the TPC-DS
issue?
Thanks,
Tom
On Wednesday, January 27, 2021, 05:46:27 PM CST, Hyukjin Kwon
wrote:
Just to share the current status, most of the known issues were resolved. Let
me know if there are some more.
One thing le
Hi devs,
In Spark Structured Streaming, we need a state store for state management for
stateful operators such as streaming aggregates, joins, etc. We have one and
only one state store implementation now. It is an in-memory hashmap which is
backed up in an HDFS-compliant file system at the end of every micr