I would like to remove Unicode characters that are outside the Basic Multilingual Plane [1].
I thought

    select regexp_replace(some_column,"[^\\u0000-\\uffff]","\ufffd") from my_table

would work, but while the regexp does work, the replacement string does not. (I can paste in the literal �, which you may or may not be able to see here, but somehow that did not feel right.)

I saw Dean's previous post on using octals [2], but I think \ufffd is outside the allowable range.

Cheers,
Tom

[1] http://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane
[2] http://grokbase.com/t/hive/dev/131a4n562y/unicode-character-as-delimiter
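
To make the intended behaviour concrete, here is a minimal sketch in plain Java of the replacement the query is aiming for. It only shows what the regexp should do on the Java side (Hive's regexp_replace takes Java regular expression syntax); it says nothing about how Hive parses the "\ufffd" literal in the query itself. The class name BmpFilter and the method replaceNonBmp are made up for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hive: shows the intended replacement only.
class BmpFilter {
    // Java regexes match by code point, so this class matches any single
    // character above the Basic Multilingual Plane (a full surrogate pair).
    private static final Pattern NON_BMP = Pattern.compile("[^\\u0000-\\uFFFF]");

    // U+FFFD REPLACEMENT CHARACTER, built from its numeric value so that
    // no escape sequence has to survive a query parser.
    private static final String REPLACEMENT = String.valueOf((char) 0xFFFD);

    static String replaceNonBmp(String input) {
        return NON_BMP.matcher(input).replaceAll(REPLACEMENT);
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the BMP and is stored as two UTF-16 code units.
        String emoji = new String(Character.toChars(0x1F600));
        String cleaned = replaceNonBmp("a" + emoji + "b");
        System.out.println(cleaned.length());
        System.out.println(Integer.toHexString(cleaned.charAt(1)));
    }
}
```

If this logic had to run inside Hive, one option (an assumption on my part, not something tested here) would be to wrap it in a custom UDF so that the replacement character never needs to appear as a string literal in the query.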