Thanks Russell for checking this out!
This is a good example of a function, replace, which is available in Spark SQL but unfortunately not in the PySpark API nor in the Scala API.
Another alternative mentioned is regexp_replace, but as developers looking for a replace function we tend to skip the regex version, since it is not what we are usually looking for, and only later realise there is no built-in replace utility function and that we have to fall back on the regexp alternative.
So, to give an example, it is currently possible to do something like this:
scala> val df = Seq("aaa zzz").toDF
df: org.apache.spark.sql.DataFrame = [value: string]
scala> df.select(expr("replace(value, 'aaa', 'bbb')")).show()
+------------------------+
|replace(value, aaa, bbb)|
+------------------------+
| bbb zzz|
+------------------------+
But not this:
df.select(replace('value, "aaa", "ooo")).show()
as the replace function is not available in the functions module of either PySpark or Scala.
And this is the output from my local prototype, which I think would be good to see in the official API:
scala> df.select(replace('value, "aaa", "ooo")).show()
+----------------------------------+
|regexp_replace(value, aaa, ooo, 1)|
+----------------------------------+
| ooo zzz|
+----------------------------------+
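As the prototype's output (regexp_replace(value, aaa, ooo, 1)) suggests, such a helper could simply delegate to the regexp machinery while escaping the search string so it is matched literally. A minimal plain-Scala sketch of that idea (literalReplace is a hypothetical helper for illustration, not the Spark API):

```scala
import java.util.regex.{Matcher, Pattern}

// Literal replace built on top of a regex engine, mirroring how a
// `replace` function could delegate to regexp_replace under the hood:
// Pattern.quote escapes regex metacharacters in the search string, and
// Matcher.quoteReplacement escapes `$` and `\` in the replacement.
def literalReplace(input: String, search: String, replacement: String): String =
  input.replaceAll(Pattern.quote(search), Matcher.quoteReplacement(replacement))

println(literalReplace("aaa zzz", "aaa", "ooo")) // ooo zzz
println(literalReplace("a.c x", "a.c", "y"))     // y x (the dot is not a wildcard)
```

The quoting step is what distinguishes this from calling regexp_replace directly: users get literal-string semantics without having to know about regex escaping.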
WDYT?