Re: Issue with UDF Int Conversion - Str to Int

2020-03-23 Thread Vipul Rajan
Hi Ayan,
You don't have to bother with conversion at all. All functions that should work on number columns would still work as long as all values in the column are numbers:
scala> df2.printSchema
root
 |-- id: string (nullable = false)
 |-- id2: string (nullable = false)
scala> df2.show
+---+---

Re: Issue with UDF Int Conversion - Str to Int

2020-03-23 Thread ayan guha
Awesome. I did not know about the conv function, so thanks for that.
On Tue, 24 Mar 2020 at 1:23 am, Enrico Minack wrote:
> Ayan,
>
> no need for UDFs, the SQL API provides all you need (sha1, substring, conv):
> https://spark.apache.org/docs/2.4.5/api/python/pyspark.sql.html
>
> >>> df.select(conv

Re: Issue with UDF Int Conversion - Str to Int

2020-03-23 Thread Enrico Minack
Ayan,
no need for UDFs, the SQL API provides all you need (sha1, substring, conv):
https://spark.apache.org/docs/2.4.5/api/python/pyspark.sql.html
>>> df.select(conv(substring(sha1(col("value_to_hash")), 33, 8), 16, 10).cast("long").alias("sha2long")).show()
+--+
|  sha2long|
+
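For readers checking the arguments in that expression, here is a minimal plain-Python sketch of the same arithmetic, with no Spark involved. It assumes UTF-8 input; the function name `sha1_substring_conv` is hypothetical and only for illustration. Spark's `substring(str, pos, len)` counts positions from 1, so `pos=33, len=8` selects the last 8 of the 40 hex characters in a SHA-1 digest. `conv(..., 16, 10)` returns a string, which is why the original expression adds `.cast("long")`.

```python
import hashlib


def sha1_substring_conv(value: str) -> int:
    """Plain-Python mirror of conv(substring(sha1(x), 33, 8), 16, 10).cast("long")."""
    # sha1(...) in Spark yields a 40-character hex string; hexdigest() does the same.
    digest = hashlib.sha1(value.encode("utf-8")).hexdigest()
    # substring(digest, 33, 8) is 1-based, so it maps to the 0-based slice [32:40].
    chunk = digest[32:40]
    # conv(chunk, 16, 10) produces the decimal representation as a string ...
    decimal_text = str(int(chunk, 16))
    # ... and cast("long") turns that string into a 64-bit integer.
    return int(decimal_text)


if __name__ == "__main__":
    print(sha1_substring_conv("abc"))
```

Because the slice covers characters 33 through 40, it is equivalent to taking the last 8 characters of the digest.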

Re: Issue with UDF Int Conversion - Str to Int

2020-03-23 Thread ayan guha
Thanks a lot. Will try.
On Mon, Mar 23, 2020 at 8:16 PM Jacob Lynn wrote:
> You are overflowing the integer type, which goes up to a max value
> of 2147483647 (2^31 - 1). Change the return type of `sha2Int2` to
> `LongType()` and it works as expected.
>
> On Mon, Mar 23, 2020 at 6:15 AM ayan guh

Re: Issue with UDF Int Conversion - Str to Int

2020-03-23 Thread Jacob Lynn
You are overflowing the integer type, which goes up to a max value of 2147483647 (2^31 - 1). Change the return type of `sha2Int2` to `LongType()` and it works as expected.
On Mon, Mar 23, 2020 at 6:15 AM ayan guha wrote:
> Hi
>
> I am trying to implement simple hashing/checksum logic. The key lo
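The overflow described in this reply can be checked with plain arithmetic. Eight hex characters encode 32 unsigned bits, so the parsed value can be as large as 4294967295, while a signed 32-bit integer (Spark's `IntegerType`) stops at 2147483647. The sketch below is illustrative only, and the helper name `fits_in_int32` is hypothetical.

```python
INT32_MAX = 2**31 - 1            # largest value a signed 32-bit integer can hold
MAX_8_HEX = int("ffffffff", 16)  # largest value 8 hex characters can encode


def fits_in_int32(hex8: str) -> bool:
    """Return True if the 8-character hex string parses to a value within int32 range."""
    return int(hex8, 16) <= INT32_MAX


if __name__ == "__main__":
    print(INT32_MAX, MAX_8_HEX)
    print(fits_in_int32("7fffffff"), fits_in_int32("80000000"))
```

Any 8-character chunk whose first hex digit is 8 or higher falls outside the 32-bit signed range, which is roughly half of all possible hash suffixes. A 64-bit `LongType` holds every such value.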

Issue with UDF Int Conversion - Str to Int

2020-03-22 Thread ayan guha
Hi
I am trying to implement simple hashing/checksum logic. The key logic is -
1. Generate sha1 hash
2. Extract last 8 chars
3. Convert 8 chars to Int (using base 16)
Here is the cut down version of the code:
---
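The original code is cut off in this archive view, so the following is not the poster's code. It is a minimal plain-Python sketch of the three steps listed above, assuming UTF-8 encoded input. The function name `sha1_last8_to_int` is hypothetical.

```python
import hashlib


def sha1_last8_to_int(value: str) -> int:
    """Hash a string with SHA-1 and interpret the last 8 hex characters as an integer."""
    # Step 1: generate the SHA-1 hash as a 40-character hex string.
    digest = hashlib.sha1(value.encode("utf-8")).hexdigest()
    # Step 2: extract the last 8 characters.
    last8 = digest[-8:]
    # Step 3: convert those 8 characters to an integer using base 16.
    return int(last8, 16)


if __name__ == "__main__":
    print(sha1_last8_to_int("abc"))
```

Python integers are unbounded, so this function never overflows by itself. The limit only matters once the result is stored in a fixed-width column type.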