Re: S3 read/write from PySpark

2020-08-06 Thread Stephen Coy
Hi Daniel, It looks like …BasicAWSCredentialsProvider has become org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider. However, the way that the username and password are provided appears to have changed, so you will probably need to look into that. Cheers, Steve C On 6 Aug 2020, at 11:15 a
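For readers hitting the same error: a minimal sketch of the relevant S3A configuration, assuming the standard hadoop-aws property names (`fs.s3a.aws.credentials.provider`, `fs.s3a.access.key`, `fs.s3a.secret.key`); the placeholder key values are obviously yours to fill in:

```
# spark-defaults.conf fragment (sketch, not verified against every Hadoop version)
spark.hadoop.fs.s3a.aws.credentials.provider  org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider
spark.hadoop.fs.s3a.access.key                <your-access-key>
spark.hadoop.fs.s3a.secret.key                <your-secret-key>
```

With SimpleAWSCredentialsProvider the access and secret keys come from these properties rather than being embedded elsewhere, which is the "way that the username and password are provided" change mentioned above.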

Re: S3 read/write from PySpark

2020-08-06 Thread Daniel Stojanov
Hi, Thanks for your help. Problem solved, but I thought I should add something in case this problem is encountered by others. Both responses are correct; BasicAWSCredentialsProvider is gone, but simply making the substitution leads to the traceback just below. java.lang.NoSuchMethodError: 'void c

join doesn't work

2020-08-06 Thread nt
I'm using the streamline pulsar connector; each dataset receives data properly, but I cannot get the join to work. Dataset datasetPolicyWithWtm = datasetPolicy.withWatermark("__publishTime", "5 minutes").as("pol"); Dataset datasetPhoneWithWtm = datasetPhone.withWatermark("__publishTime", "5 minute
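A common cause of a stream-stream join producing nothing is joining on keys alone: Spark also needs a time-range condition relating the two event-time columns so it can bound the join state. A hedged Scala sketch along the lines of the snippet above — the Pulsar source options, the `policyId` join key, and the topic names are assumptions, since the question only shows the watermark calls:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

val spark = SparkSession.builder().appName("join-sketch").getOrCreate()

// Assumed Pulsar source configuration; adjust to your connector's options.
val datasetPolicy = spark.readStream
  .format("pulsar")
  .option("service.url", "pulsar://localhost:6650")
  .option("topic", "policy")
  .load()

val datasetPhone = spark.readStream
  .format("pulsar")
  .option("service.url", "pulsar://localhost:6650")
  .option("topic", "phone")
  .load()

val pol = datasetPolicy.withWatermark("__publishTime", "5 minutes").alias("pol")
val pho = datasetPhone.withWatermark("__publishTime", "5 minutes").alias("pho")

// Join on a shared key AND constrain the event times relative to each other;
// without the time-range condition Spark cannot decide when state is complete.
val joined = pol.join(
  pho,
  expr("""
    pol.policyId = pho.policyId AND
    pho.__publishTime BETWEEN pol.__publishTime - INTERVAL 5 minutes
                          AND pol.__publishTime + INTERVAL 5 minutes
  """))
```

The watermarks bound how late data may arrive; the BETWEEN clause bounds how far apart matching events may be, and together they let Spark discard old state and emit results.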

[SPARK-SQL] How to return GenericInternalRow from spark udf

2020-08-06 Thread Amit Joshi
Hi, I have a Spark UDF written in Scala that takes a couple of columns, applies some logic, and outputs an InternalRow. A Spark schema of StructType is also present. But when I try to return the InternalRow from the UDF there is an exception java.lang.

Re: [SPARK-SQL] How to return GenericInternalRow from spark udf

2020-08-06 Thread Sean Owen
The UDF should return the result value you want, not a whole Row. In Scala, it figures out the schema of the UDF's result from the signature. On Thu, Aug 6, 2020 at 7:56 AM Amit Joshi wrote: > > Hi, > > I have a spark udf written in scala that takes a couple of columns and applies > some logic and o
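The advice above can be sketched as follows: instead of constructing a GenericInternalRow, return a plain Scala value such as a case class, and Spark derives the struct schema from the function's signature. The names (`Result`, `myUdf`, `colA`, `colB`) are illustrative, not from the original thread:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

// A case class stands in for the StructType; Spark maps its fields
// (total, label) to a struct column automatically.
case class Result(total: Long, label: String)

object UdfExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("udf-sketch")
      .getOrCreate()
    import spark.implicits._

    // The UDF returns an ordinary value, never an InternalRow.
    val myUdf = udf((a: Long, b: String) => Result(a * 2, b.toUpperCase))

    val df = Seq((1L, "x"), (2L, "y")).toDF("colA", "colB")
    df.withColumn("res", myUdf($"colA", $"colB")).show(false)

    spark.stop()
  }
}
```

The resulting `res` column is a struct with fields `total` and `label`, with no manual schema or InternalRow handling required.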