Referencing a scala/java PipelineStage from pyspark - constructor issues with HasInputCol

2020-08-17 Thread Aviad Klein
Hi, I've posted the same problem on Stack Overflow and can't seem to find answers. I have custom Spark PipelineStages written in Scala that are specific to my organization. They work well in Scala Spark. However, when I try to wrap them as shown here, so I can use them in pyspark, I get weird

Block fetching fails due to change in local address

2020-08-17 Thread Samik R
Hello, I recently faced a strange problem. I was running a job on my laptop with deploy mode as client and context as local[*]. Partway through, I lost the connection to my router, and when the connection came back, the laptop was assigned a different internal IP address. The j

Driver Information

2020-08-17 Thread Amit Sharma
Hi, I have a 20-node cluster. I run multiple batch jobs. In the spark-submit file, driver memory=2g and executor memory=4g, and I have 8 GB workers. I have the questions below: 1. Is there any way to know, for each batch job, which worker is the driver node? 2. Will the driver node be part of one of the executors

Re: Referencing a scala/java PipelineStage from pyspark - constructor issues with HasInputCol

2020-08-17 Thread Sean Owen
Looks like you are building vs Spark 3 and running on Spark 2, or something along those lines. On Mon, Aug 17, 2020 at 4:02 AM Aviad Klein wrote: > Hi, I've referenced the same problem on stack overflow and can't seem to > find answers. > > I have custom spark pipelinestages written in scala tha

Re: Referencing a scala/java PipelineStage from pyspark - constructor issues with HasInputCol

2020-08-17 Thread Aviad Klein
Hi Owen, it's omitted from what I pasted but I'm using spark 2.4.4 on both. On Mon, Aug 17, 2020 at 4:37 PM Sean Owen wrote: > Looks like you are building vs Spark 3 and running on Spark 2, or > something along those lines. > > On Mon, Aug 17, 2020 at 4:02 AM Aviad Klein > wrote: > >> Hi, I've

Re: Referencing a scala/java PipelineStage from pyspark - constructor issues with HasInputCol

2020-08-17 Thread Sean Owen
Hm, next guess: you need a no-arg constructor this() on FooTransformer? Also consider extending UnaryTransformer. On Mon, Aug 17, 2020 at 9:08 AM Aviad Klein wrote: > Hi Owen, it's omitted from what I pasted but I'm using spark 2.4.4 on both. > > On Mon, Aug 17, 2020 at 4:37 PM Sean Owen wrote:
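
[Editor's sketch] The suggestion above, a no-arg constructor plus UnaryTransformer, might look roughly like the following. This is only a minimal sketch, not the original poster's code: the String input/output types, the uppercase transform, and the "FooTransformer" uid prefix are all assumptions made for illustration. It needs Spark ML on the classpath.

```scala
import org.apache.spark.ml.UnaryTransformer
import org.apache.spark.ml.util.Identifiable
import org.apache.spark.sql.types.{DataType, StringType}

// Hypothetical transformer: the real FooTransformer's logic is not shown in the thread.
class FooTransformer(override val uid: String)
    extends UnaryTransformer[String, String, FooTransformer] {

  // No-arg constructor, so the stage can be instantiated by class name alone
  // (for example when a Python wrapper creates the JVM object).
  def this() = this(Identifiable.randomUID("FooTransformer"))

  // Assumed transform for illustration only: uppercase the input string.
  override protected def createTransformFunc: String => String = _.toUpperCase

  // Type of the output column produced by the transform above.
  override protected def outputDataType: DataType = StringType
}
```

UnaryTransformer already mixes in HasInputCol and HasOutputCol and provides setInputCol, setOutputCol and copy, so the class only has to supply the transform function and the output type.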

Re: Referencing a scala/java PipelineStage from pyspark - constructor issues with HasInputCol

2020-08-17 Thread chris
Hi, I took your code and ran it on spark 2.4.5 and it works ok for me. My first thought, like Sean, is that you have some Spark ML version mismatch somewhere. Chris > On 17 Aug 2020, at 16:18, Sean Owen wrote: > > > Hm, next guess: you need a no-arg constructor this() on FooTransformer? als

How to migrate DataSourceV2 into Spark 3.0.0

2020-08-17 Thread Rafael Kyrdan
Hey guys, I’m trying to migrate my package, where I’m using DataSourceV2, to Spark 3.0.0. Unfortunately, neither the migration guide nor the JIRA ticket under which this API was refactored says anything about how to do it. https://issues.apache.org/jira/browse/SPARK-25390 I was suggested to send my ques