Hi Ayan,
You don't have to bother with conversion at all. Any function that works
on numeric columns will still work, as long as every value in the column
is a number:
scala> df2.printSchema
root
 |-- id: string (nullable = false)
 |-- id2: string (nullable = false)

scala> df2.show
+---+---+
| id|id2|
...
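For example, arithmetic on string columns holding numeric text goes through
an implicit cast. A minimal PySpark sketch (the data is made up for
illustration):

from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()
df2 = spark.createDataFrame([("1", "10"), ("2", "20")], ["id", "id2"])

# Both columns are strings, yet addition works: Spark implicitly casts
# the strings to double before adding.
df2.select((col("id") + col("id2")).alias("total")).show()
# +-----+
# |total|
# +-----+
# | 11.0|
# | 22.0|
# +-----+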
Awesome! I did not know about the conv function, so thanks for that.
On Tue, 24 Mar 2020 at 1:23 am, Enrico Minack wrote:
Ayan,
No need for UDFs; the SQL API provides all you need (sha1, substring, conv):
https://spark.apache.org/docs/2.4.5/api/python/pyspark.sql.html
>>> df.select(conv(substring(sha1(col("value_to_hash")), 33, 8), 16, 10).cast("long").alias("sha2long")).show()
+---------+
| sha2long|
+---------+
...
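For a self-contained run, something like this should work (the DataFrame
construction is made up for illustration):

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, conv, sha1, substring

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("some value",)], ["value_to_hash"])

# sha1() yields 40 hex chars; substring(col, 33, 8) takes the last 8
# (substring is 1-indexed), and conv(..., 16, 10) re-bases them to decimal.
# conv returns a string, and 8 hex chars can reach 0xFFFFFFFF = 4294967295,
# beyond the 32-bit int max, hence the cast to "long".
df.select(
    conv(substring(sha1(col("value_to_hash")), 33, 8), 16, 10)
    .cast("long")
    .alias("sha2long")
).show()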
Thanks a lot. Will try.
On Mon, Mar 23, 2020 at 8:16 PM Jacob Lynn wrote:
You are overflowing the integer type, which goes up to a max value
of 2147483647 (2^31 - 1). Change the return type of `sha2Int2` to
`LongType()` and it works as expected.
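To see the boundary concretely, a quick plain-Python check:

# 8 hex chars can encode up to 0xFFFFFFFF, which does not fit in a
# 32-bit signed int; a 64-bit LongType holds it comfortably.
print(2**31 - 1)            # 2147483647, IntegerType max
print(int("ffffffff", 16))  # 4294967295, largest 8-hex-char value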
On Mon, Mar 23, 2020 at 6:15 AM ayan guha wrote:
Hi
I am trying to implement simple hashing/checksum logic. The key logic is:
1. Generate a sha1 hash
2. Extract the last 8 chars
3. Convert those 8 chars to an Int (base 16)
Here is a cut-down version of the code:
---
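The snippet is cut off here, but a minimal sketch of the logic described
above (the name sha2Int2 and the IntegerType return type are taken from
Jacob's reply; everything else is illustrative) might look like:

import hashlib

from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import IntegerType

spark = SparkSession.builder.getOrCreate()

def sha2Int2(s):
    # 1. sha1 hash, 2. last 8 hex chars, 3. parse as a base-16 int
    return int(hashlib.sha1(s.encode("utf-8")).hexdigest()[-8:], 16)

# IntegerType is the bug Jacob points out: values above 2^31 - 1 do not
# fit a 32-bit int; declaring LongType() instead works as expected.
sha2Int2_udf = udf(sha2Int2, IntegerType())

df = spark.createDataFrame([("some value",)], ["value_to_hash"])
df.select(sha2Int2_udf("value_to_hash").alias("hashed")).show()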