RE: Accessing DataFrame inside UserDefinedFunction.

2017-11-05 Thread knowsnothing
Thank you for your response, Anurag. I am not sure I get your point. Are you suggesting that the UDF somehow serializes not only the reference to the Dataset, but also all of its data?

RE: Accessing DataFrame inside UserDefinedFunction.

2017-11-05 Thread Anurag Verma
This is expected. You are not accessing the Dataset dict when calling the UDF countPositiveSimilarity. The dict DataFrame, as it existed when the UDF was created, is encoded into the UDF. If you change dict later on, the changes will not be picked up automatically by countPositiveSimilarity.
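
The behaviour described above can be illustrated with a plain-Python analogy. This is only a sketch and involves no Spark at all: the factory function make_count_positive_similarity, the (word, score) row layout, and the sample values are all hypothetical and are not taken from the original code. It only shows how a function that captured a snapshot at creation time keeps answering from that snapshot.

```python
# Plain-Python analogy only (no Spark): the row layout and the factory
# function are hypothetical, chosen just to show snapshot capture.

def make_count_positive_similarity(dict_rows):
    # Copy the rows at creation time; later changes to dict_rows are not seen.
    snapshot = tuple(dict_rows)

    def count_positive_similarity(word):
        # Count snapshot rows for this word that have a positive score.
        return sum(1 for w, score in snapshot if w == word and score > 0)

    return count_positive_similarity


dict_rows = [("spark", 0.9), ("spark", -0.2), ("udf", 0.4)]
count_positive_similarity = make_count_positive_similarity(dict_rows)

before = count_positive_similarity("spark")
dict_rows.append(("spark", 0.7))  # change the source after the function exists
after = count_positive_similarity("spark")

# Rebuilding the function is what picks up the new row.
rebuilt = make_count_positive_similarity(dict_rows)("spark")
```

In this sketch, before and after are equal because the function still reads the old snapshot, while rebuilt reflects the appended row. Under the same assumption, a UDF would need to be recreated after dict changes for the new data to be used.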