Thanks Enrico. I meant one hash of each single row, in an extra column, something like this:
val newDs = typedRows.withColumn("hash", hash(typedRows.columns.map(col): _*))
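A side note on what `hash` returns: Spark's `hash` is Murmur3-based and yields an Int per row. The sketch below uses the Scala standard library's `MurmurHash3` to illustrate the same family of hash outside Spark; it will not reproduce Spark's exact values, since Spark uses its own seed and byte layout.

```scala
import scala.util.hashing.MurmurHash3

// Order-sensitive Murmur3 hash over a row's column values.
// Illustrative only: Spark's hash() uses its own seed and byte
// encoding, so the numbers will not match Spark's output.
val row = Seq("1", "alice", "2020-03-02")
val rowHash: Int = MurmurHash3.orderedHash(row)
println(rowHash)
```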
On Mon, Mar 2, 2020 at 3:51 PM Enrico Minack wrote:
Well, then apply md5 on all columns:
ds.select(ds.columns.map(col) ++
  ds.columns.map(column => md5(col(column)).as(s"$column hash")): _*
).show(false)
Enrico
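If the goal is a single md5 per row rather than one per column, the usual trick is to join the column values and digest the joined string — in Spark, something like `md5(concat_ws("|", ds.columns.map(col): _*))`. The plain-Scala sketch below shows the same idea with the JVM's `MessageDigest`; `rowMd5` is a hypothetical helper, and it assumes "|" never occurs in the data.

```scala
import java.security.MessageDigest

// Hypothetical helper: hex MD5 of a row's column values joined with "|".
// Assumes "|" never occurs in the data; pick a safer separator otherwise.
def rowMd5(values: Seq[String]): String =
  MessageDigest.getInstance("MD5")
    .digest(values.mkString("|").getBytes("UTF-8"))
    .map("%02x".format(_))
    .mkString

println(rowMd5(Seq("1", "alice", "2020-03-02")))
```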
On 02.03.20 at 11:10, Chetan Khatri wrote:
Thanks Enrico
I want to compute a hash of all the column values in the row.
On Fri, Feb 28, 2020 at 7:28 PM Enrico Minack
wrote:
This computes the md5 hash of a given column id of Dataset ds:
ds.withColumn("id hash", md5($"id")).show(false)
Test with this Dataset ds:
import org.apache.spark.sql.types._
val ds = spark.range(10).select($"id".cast(StringType))
Available functions are md5, sha, sha1, sha2 and hash:
https://spark.apa
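For a feel of what the stronger variants produce: `sha2(col, 256)` in Spark hex-encodes the SHA-256 of each value's UTF-8 bytes, which the JVM's `MessageDigest` can reproduce outside Spark. A minimal sketch (`sha256Hex` is just an illustrative helper):

```scala
import java.security.MessageDigest

// Hex-encoded SHA-256 of a string's UTF-8 bytes, the same digest
// family that sha2(col, 256) applies per value in Spark.
def sha256Hex(s: String): String =
  MessageDigest.getInstance("SHA-256")
    .digest(s.getBytes("UTF-8"))
    .map("%02x".format(_))
    .mkString

println(sha256Hex("abc"))
// ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```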
Hi Chetan,
Would the sql function `hash` do the trick for your use case?
Best,
On Fri, Feb 28, 2020 at 1:56 PM Chetan Khatri
wrote:
Hi Spark Users,
How can I compute a hash of each row and store it in a new column of a DataFrame? Could someone help me?
Thanks