I would like to remove Unicode characters that are outside the Basic
Multilingual Plane [1].

I thought
select regexp_replace(some_column,"[^\\u0000-\\uffff]","\ufffd") from
my_table
would work, but while the regexp itself matches, the replacement string
does not work. (I can paste in the literal �, which you may or may not be
able to see here, but somehow that did not feel right.)
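
To make the intent concrete, here is a minimal sketch in Python of the
behaviour I am after. It is not Hive code: the name replace_non_bmp is
made up for illustration, and since Hive's regexp_replace uses Java
regular expressions, the escaping rules there may well differ.

```python
import re

# Matches any code point above U+FFFF, i.e. outside the
# Basic Multilingual Plane.
NON_BMP = re.compile(r"[^\u0000-\uffff]")

# U+FFFD REPLACEMENT CHARACTER, written as an escape so the
# source file itself stays plain ASCII.
REPLACEMENT = "\ufffd"


def replace_non_bmp(text: str) -> str:
    """Return text with every non-BMP character replaced by U+FFFD.

    Hypothetical helper, shown only to illustrate the desired result.
    """
    return NON_BMP.sub(REPLACEMENT, text)


if __name__ == "__main__":
    # U+1F600 is outside the BMP; the other characters are inside it.
    sample = "a\U0001F600b"
    print(replace_non_bmp(sample).encode("unicode_escape").decode("ascii"))
```

Characters inside the BMP should pass through unchanged, and each
character outside it should become a single U+FFFD.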

I saw Dean's previous post on using octals [2], but I think \ufffd is
outside the allowable range.

Cheers,
Tom


[1]
http://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane
[2] http://grokbase.com/t/hive/dev/131a4n562y/unicode-character-as-delimiter
