log4j2.properties file load fails when upgrading from 3.3.0 to 3.5.1

2024-06-25 Thread Edgar H
Hi all, I hope you can shed some light on this matter: I've been trying to work around the issue for some time, but I can't make it click. I've been trying to go from Spark 3.3.0 up to 3.5.1 and logs are currently a mess. Using log4j2 fails when loading the file and still using the log4j

Building a ML pipeline with no training

2022-07-20 Thread Edgar H
Morning everyone, The question may seem too broad, but I will try to summarize as much as possible: I'm used to working with Spark SQL, DFs and such on a daily basis, easily grouping, getting extra counters and using functions or UDFs. However, I've come to a scenario where I need to make some predictions

Re: [Spark SQL] Null when trying to use corr() with a Window

2022-02-28 Thread Edgar H
t a correlation per group right? > because it's over the sums by ID within the group. Then currentRow is > wrong; needs to be unbounded preceding and following. > > > On Mon, Feb 28, 2022 at 9:22 AM Edgar H wrote: > >> The window is defined as you said yes, unboundedPrecedi

Re: [Spark SQL] Null when trying to use corr() with a Window

2022-02-28 Thread Edgar H
't you want the correlation across the group? otherwise > this answer is 'right' for what you're doing it seems. > > On Mon, Feb 28, 2022 at 7:49 AM Edgar H wrote: > >> My bad completely, missed the example by a mile sorry for that, let me >> chang

Re: [Spark SQL] Null when trying to use corr() with a Window

2022-02-28 Thread Edgar H
t of the correlation calculate itself? On Mon, 28 Feb 2022 at 14:12, Sean Owen () wrote: > You're computing correlations of two series of values, but each series has > one value, a sum. Correlation is not defined in this case (both variances > are undefined). This is sample correl

[Spark SQL] Null when trying to use corr() with a Window

2022-02-28 Thread Edgar H
Morning all, I've been struggling with this for a while and can't seem to understand what I'm doing wrong. I want to apply the corr function over the following DataFrame: val sampleColumns = Seq("group", "id", "count1", "count2", "orderCount") val sampleSet =
