+1, and sorry for the late response... :(
Anyway, happy new year, all!
Bests,
Takeshi
On Tue, Jan 7, 2020 at 2:50 AM Dongjoon Hyun
wrote:
> Thank you all.
>
> I'll start to check and prepare the 2.4.5 release.
>
> Bests,
> Dongjoon.
>
> On Sun, Jan 5, 2020 at 22:51 Xiao Li wrote:
>
>> +1
>>
>> Xiao
>>
Can this perhaps exist as a utility function outside Spark?
On Tue, Jan 07, 2020 at 12:18 AM, Enrico Minack < m...@enrico.minack.dev >
wrote:
>
>
>
> Hi Devs,
>
>
>
> I'd like to get your thoughts on this Dataset feature proposal. Comparing
> datasets is a central operation when regression testing your code changes.
“Where can I find information on how to run standard performance
tests/benchmarks?”
The gold standard is spark-sql-perf, and in particular the TPC-DS benchmark.
Most of the big optimization teams are using this as the primary benchmark.
One word of warning is that most groups have also extended
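For reference, running the TPC-DS queries through spark-sql-perf looks roughly like the following. This is a sketch based on the spark-sql-perf README; names such as `resultLocation` and the timeout value are placeholders, and the exact API may differ between versions of that library.

```scala
import com.databricks.spark.sql.perf.tpcds.TPCDS

// Assumes TPC-DS data has already been generated and registered as tables
// in the current database (spark-sql-perf provides tooling for that step).
val tpcds = new TPCDS(sqlContext = spark.sqlContext)

val resultLocation = "/tmp/tpcds-results"  // placeholder path
val experiment = tpcds.runExperiment(
  tpcds.tpcds2_4Queries,          // the TPC-DS v2.4 query set
  iterations = 1,
  resultLocation = resultLocation)

// Block until the run finishes (timeout in seconds)
experiment.waitForFinish(10 * 60 * 60)
```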
Could we use R version 3.6.1 if we have concerns about the quality of 3.6.2?
On Thu, Dec 26, 2019 at 8:14 PM Hyukjin Kwon wrote:
> I was randomly googling out of curiosity, and seems indeed that's the
> problem (
> https://r.789695.n4.nabble.com/Error-in-rbind-info-getNamespaceInfo-env-quot-S3me
I think it's simply because as[T] is lazy. You will see the right schema if
you do `df.as[T].map(identity)`.
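A minimal sketch of the behavior being described (the column name `age` and the local SparkSession setup are illustrative, not from the original thread):

```scala
import org.apache.spark.sql.SparkSession

case class Person(name: String)

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// A DataFrame with one column beyond Person's fields
val df = Seq(("alice", 42)).toDF("name", "age")

// as[Person] is lazy: no projection happens yet, so the
// printed schema still contains both name and age
df.as[Person].printSchema()

// map(identity) forces (de)serialization through the Person
// encoder, so the schema is now derived from T: name only
df.as[Person].map(identity).printSchema()
```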
On Tue, Jan 7, 2020 at 4:42 PM Enrico Minack wrote:
> Hi Devs,
>
> I'd like to propose a stricter version of as[T]. Given the interface def
> as[T](): Dataset[T], it is counter-intuitive that the schema of the
> returned Dataset[T] is not agnostic to the schema of the originating
> Dataset.
Hi Devs,
I'd like to propose a stricter version of as[T]. Given the interface def
as[T](): Dataset[T], it is counter-intuitive that the schema of the
returned Dataset[T] is not agnostic to the schema of the originating
Dataset. The schema should always be derived only from T.
I am proposing
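One way such a stricter conversion could be sketched today, outside Spark itself, is to project onto exactly the fields declared by T's encoder before calling as[T]. The helper name `asStrict` is hypothetical and not part of Spark's API:

```scala
import org.apache.spark.sql.{DataFrame, Dataset, Encoder}
import org.apache.spark.sql.functions.col

// Hypothetical helper (not Spark API): derive the result schema
// solely from T by selecting only the encoder's declared fields.
def asStrict[T](df: DataFrame)(implicit enc: Encoder[T]): Dataset[T] =
  df.select(enc.schema.fieldNames.map(col).toSeq: _*).as[T]
```

With this, extra columns in the originating DataFrame are dropped eagerly, so the schema of the returned Dataset[T] depends only on T.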
Hi Devs,
I'd like to get your thoughts on this Dataset feature proposal.
Comparing datasets is a central operation when regression testing your
code changes.
It would be super useful if Spark's Datasets provide this transformation
natively.
https://github.com/apache/spark/pull/26936
Regards,
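For context, a crude approximation of such a comparison can already be built from existing Dataset operations. This is only a sketch, not the API proposed in the linked PR; `diff` is a hypothetical name, and `exceptAll` (Spark 2.4+) is used rather than `except` so that duplicate rows are compared by multiplicity:

```scala
import org.apache.spark.sql.Dataset

// Hypothetical utility: rows present only on the left,
// and rows present only on the right.
def diff[T](left: Dataset[T], right: Dataset[T]): (Dataset[T], Dataset[T]) =
  (left.exceptAll(right), right.exceptAll(left))
```

A native transformation could go further, e.g. matching rows by key and reporting per-column changes, which plain set difference cannot express.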
1. Where can I find information on how to run standard performance
tests/benchmarks?
2. Are performance degradations to existing queries disallowed in a new major
Spark version, even when they can be fixed by rewriting them as equivalent queries?
On Thu, Jan 2, 2020 at 3:05 PM Brett Marcott
wrote:
> Thanks for the response