To: Mich Talebzadeh
Cc: user@spark
Subject: Re: Hive REGEXP_REPLACE use or equivalent in Spark
You might be better off using the CSV loader in this case.
https://github.com/databricks/spark-csv
Input:
[csingh ~]$ hadoop fs -cat test.csv
360,10/02/2014,"?2,500.00",?0.00,"?2,500.00"
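A minimal sketch of loading that file with the spark-csv package, assuming Spark 1.x with com.databricks:spark-csv on the classpath (e.g. started via spark-shell --packages ...); the path "test.csv" is taken from the sample above:

```scala
import org.apache.spark.sql.SQLContext

// sc is the SparkContext already available in spark-shell
val sqlContext = new SQLContext(sc)

val df = sqlContext.read
  .format("com.databricks.spark.csv")   // spark-csv data source
  .option("header", "false")            // the sample file has no header row
  .option("inferSchema", "false")       // keep all columns as strings for now
  .load("test.csv")

df.show()
```

The advantage over hand-splitting is that the quoted field "?2,500.00" (which contains a comma) is parsed as a single column.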
*From:* Andrew Ehrlich [mailto:and...@aehrlich.
*Sent:* 19 February 2016 01:22
To: Mich Talebzadeh
Cc: User
Subject: Re: Hive REGEXP_REPLACE use or equivalent in Spark
Use the Scala method .split(",") to split the string into a collection of
strings, and try using .replaceAll() on the field with the "?" to remove it.
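The suggestion above can be sketched in plain Scala. Note that a naive split(",") would break the quoted field "?2,500.00" in two, since it contains a comma, so this only illustrates cleaning one field:

```scala
// Strip everything that is not a digit or a dot from a money field,
// removing both the stray "?" and the thousands separator.
val field = "?2,500.00"
val cleaned = field.replaceAll("[^\\d.]", "")  // regex: keep digits and '.'
println(cleaned)  // prints 2500.00
```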
On Thu, Feb 18, 2016 at 2:09 PM, Mich Talebzadeh
wrote:
> Hi,
>
> What is the equivalent of this Hive statement in Spark
>
>
>
> select "?2,500.00", REGEXP_REPLACE("?2,500.00",'[^\\d\\.]','');
Hi,
What is the equivalent of this Hive statement in Spark
select "?2,500.00", REGEXP_REPLACE("?2,500.00",'[^\\d\\.]','');
+------------+----------+
| _c0        | _c1      |
+------------+----------+
| ?2,500.00  | 2500.00  |
+------------+----------+
Basically I want to get rid of the "?" and "," characters so the value can be treated as a number.
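For the record, Spark SQL exposes the same regexp_replace, both in SQL and (since Spark 1.5) as a DataFrame function, so one hedged answer is a sketch like the following; the DataFrame df and column name "c2" are assumptions for illustration:

```scala
import org.apache.spark.sql.functions.regexp_replace

// DataFrame API: same pattern as the Hive statement, applied to a column.
// df and the column name "c2" are hypothetical here.
val cleanedDf = df.withColumn("amount", regexp_replace(df("c2"), "[^\\d.]", ""))

// Equivalent via SQL, mirroring the Hive statement directly:
// sqlContext.sql("""select regexp_replace("?2,500.00", '[^\\d\\.]', '')""")
```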