Re: Dataframe.fillna from 1.3.0

2015-04-20 Thread Reynold Xin
Ah, I see. You can do something like df.select(coalesce(df("a"), lit(0.0))) On Mon, Apr 20, 2015 at 1:44 PM, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote: > From PySpark it seems to me that the fillna is relying on Java/Scala code, > that's why I was wondering. > Thank you for answerin

Re: Dataframe.fillna from 1.3.0

2015-04-20 Thread Olivier Girardot
From PySpark it seems to me that the fillna is relying on Java/Scala code, that's why I was wondering. Thank you for answering :) On Mon, Apr 20, 2015 at 10:22 PM, Reynold Xin wrote: > You can just create fillna function based on the 1.3.1 implementation of > fillna, no? > > > On Mon, Apr 20, 20

Re: Dataframe.fillna from 1.3.0

2015-04-20 Thread Reynold Xin
You can just create a fillna function based on the 1.3.1 implementation of fillna, no? On Mon, Apr 20, 2015 at 2:48 AM, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote: > a UDF might be a good idea, no? > > On Mon, Apr 20, 2015 at 11:17 AM, Olivier Girardot < > o.girar...@lateral-thoughts.co

Re: How to use Spark Streaming .jar file that I've built using a different branch than master?

2015-04-20 Thread Emre Sevinc
Apparently, after *only* building Spark Streaming, I also have to: mvn --projects assembly/ -DskipTests clean install so that my test project uses the new version when I pass it to spark-submit. -- Emre Sevinç On Mon, Apr 20, 2015 at 10:58 AM, Emre Sevinc wrote: > Hello, > > I'm building

Re: [sql] Dataframe how to check null values

2015-04-20 Thread Ted Yu
I found: https://issues.apache.org/jira/browse/SPARK-6573 > On Apr 20, 2015, at 4:29 AM, Peter Rudenko wrote: > > Sounds very good. Is there a jira for this? Would be cool to have in 1.4, > because currently cannot use dataframe.describe function with NaN values, > need to filter manually al

Re: [sql] Dataframe how to check null values

2015-04-20 Thread Peter Rudenko
Sounds very good. Is there a jira for this? Would be cool to have in 1.4, because currently cannot use dataframe.describe function with NaN values, need to filter manually all the columns. Thanks, Peter Rudenko On 2015-04-02 21:18, Reynold Xin wrote: Incidentally, we were discussing this yeste

Re: Dataframe.fillna from 1.3.0

2015-04-20 Thread Olivier Girardot
A UDF might be a good idea, no? On Mon, Apr 20, 2015 at 11:17 AM, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote: > Hi everyone, > let's assume I'm stuck in 1.3.0, how can I benefit from the *fillna* API > in PySpark, is there any efficient alternative to mapping the records > myself?

Dataframe.fillna from 1.3.0

2015-04-20 Thread Olivier Girardot
Hi everyone, let's assume I'm stuck on 1.3.0. How can I benefit from the *fillna* API in PySpark? Is there an efficient alternative to mapping the records myself? Regards, Olivier.
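
For reference, the "mapping the records myself" fallback mentioned in this message could look roughly like the sketch below. It is plain Python and does not touch Spark at all: fill_none is a hypothetical helper name, and records are assumed to be plain dicts (for example, rows converted to dicts before mapping). It is not the 1.3.1 fillna implementation, only an illustration of per-record null replacement.

```python
def fill_none(record, value, subset=None):
    """Return a copy of `record` (a dict) with None entries replaced by `value`.

    `fill_none` is a hypothetical helper, not a Spark API. If `subset` is
    given, only those keys are considered; other keys are left untouched.
    """
    keys = list(record.keys()) if subset is None else subset
    filled = dict(record)  # copy so the input record is not mutated
    for key in keys:
        if key in filled and filled[key] is None:
            filled[key] = value
    return filled


# Made-up sample records standing in for DataFrame rows.
rows = [{"a": 1.0, "b": None}, {"a": None, "b": 2.0}]

# Replace every missing value with 0.0.
filled_rows = [fill_none(r, 0.0) for r in rows]

# Replace missing values only in column "a".
only_a = [fill_none(r, 0.0, subset=["a"]) for r in rows]
```

A function like this would run once per record in Python, which is the cost the question is trying to avoid; the column-expression approach suggested elsewhere in the thread keeps the work on the JVM side.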

Re: Addition of new Metrics for killed executors.

2015-04-20 Thread Archit Thakur
Hi Twinkle, We have a use case where we want to debug how and why an executor got killed. It could be because of a stack overflow, GC, or any other unexpected scenario. In the driver UI there is no information about killed executors, so I was just curious how do people usually

Re: How to use Spark Streaming .jar file that I've built using a different branch than master?

2015-04-20 Thread Emre Sevinc
I thought it was spark-submit that was configuring and arranging everything related to classpath (am I wrong?), e.g. that's how I used Spark so far. Is there a way to do it using spark-submit? -- Emre On Mon, Apr 20, 2015 at 11:06 AM, Akhil Das wrote: > I think you can override the SPARK_CLASSP

Re: How to use Spark Streaming .jar file that I've built using a different branch than master?

2015-04-20 Thread Akhil Das
I think you can override the SPARK_CLASSPATH with your newly built jar. Thanks Best Regards On Mon, Apr 20, 2015 at 2:28 PM, Emre Sevinc wrote: > Hello, > > I'm building a different version of Spark Streaming (based on a different > branch than master) in my application for testing purposes, bu

How to use Spark Streaming .jar file that I've built using a different branch than master?

2015-04-20 Thread Emre Sevinc
Hello, I'm building a different version of Spark Streaming (based on a different branch than master) in my application for testing purposes, but it seems like spark-submit is ignoring my newly built Spark Streaming .jar, and using an older version. Here's some context: I'm on a different branch:

Re: Addition of new Metrics for killed executors.

2015-04-20 Thread twinkle sachdeva
Hi Archit, What is your use case and what kind of metrics are you planning to add? Thanks, Twinkle On Fri, Apr 17, 2015 at 4:07 PM, Archit Thakur wrote: > Hi, > > We are planning to add new Metrics in Spark for the executors that got > killed during the execution. Was just curious, why this in