Re: Compare string with German umlauts

Steven Schveighoffer via Digitalmars-d-learn Mon, 18 May 2020 07:30:57 -0700

On 5/18/20 9:44 AM, Martin Tschierschke wrote:

Hi,
I have to find a certain line in a file, with a text containing umlauts.


How do you do this?

The following was not working:

foreach(i,line; file){
  if(line=="My text with ö oe, ä ae or ü"){
    writeln("found it at line",i);
  }
}

I ended up using line.canFind("with part of the text without umlaut").

It solved the problem, but what is the right way to use umlauts (encode them) inside the program?

Using == on strings is going to compare the exact bits for equality. In Unicode, things can be encoded differently to make the same grapheme. For example, ö can be a single code point, the o with a diaeresis (U+00F6). But you could also encode it with 2 code points: a standard o, and then a diaeresis combining character (U+006F, U+0308).
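To make the difference concrete, here is a minimal sketch (the variable names are made up for illustration, and it assumes the text is UTF-8, as D's string type is). It shows that the two spellings of ö differ in code units and code points, yet both form a single grapheme:

```d
import std.range : walkLength;
import std.uni : byGrapheme;

void main()
{
    // Precomposed: one code point, U+00F6.
    string composed = "\u00F6";
    // Decomposed: o (U+006F) followed by combining diaeresis (U+0308).
    string decomposed = "o\u0308";

    // .length counts UTF-8 code units (bytes).
    assert(composed.length == 2);
    assert(decomposed.length == 3);

    // walkLength on a string decodes it and counts code points.
    assert(composed.walkLength == 1);
    assert(decomposed.walkLength == 2);

    // Both are a single grapheme (what the reader sees as one character).
    assert(composed.byGrapheme.walkLength == 1);
    assert(decomposed.byGrapheme.walkLength == 1);

    // The bytes differ, so == says the strings are different.
    assert(composed != decomposed);
}
```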

What you need is to normalize the data for comparison: https://dlang.org/phobos/std_uni.html#normalize
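A rough sketch of what that could look like, assuming both the line from the file and the search text are valid UTF-8. The helper name sameText is hypothetical, not something from Phobos; the only library call used is std.uni.normalize with the NFC form:

```d
import std.uni : normalize, NFC;

// Hypothetical helper: compare two strings after normalizing both to NFC,
// so precomposed and decomposed spellings of the same text compare equal.
bool sameText(const(char)[] a, const(char)[] b)
{
    return normalize!NFC(a) == normalize!NFC(b);
}

void main()
{
    // Same visible text, two different encodings of the ö.
    string composed = "sch\u00F6n";    // ö as U+00F6
    string decomposed = "scho\u0308n"; // o + combining diaeresis U+0308

    // Plain == compares the raw bytes and fails.
    assert(composed != decomposed);

    // After normalization the two compare equal.
    assert(sameText(composed, decomposed));

    // Genuinely different text still compares unequal.
    assert(!sameText(composed, "schon"));
}
```

In the loop from the question, this would mean calling something like sameText(line, "My text with ö oe, ä ae or ü") instead of using == directly. Normalization only helps if the mismatch really is composed versus decomposed characters; if the file is in another encoding such as Latin-1, it would have to be converted to UTF-8 first.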


For more reference: https://en.wikipedia.org/wiki/Combining_character

-Steve
