Hi,

What's the easiest way to split a stream into words?
Words are just that: words, but - here is the caveat - the splitting must
support Unicode.
So Michael and Michaël are both words.

I tried the regexpr unit (the obvious choice), but that does not seem to do the trick:

{$mode objfpc}
{$H+}
uses cwstring, sysutils, classes, regexpr;

Var
  Split : TStringList;
  S : String;
  R : TRegexpr;

begin
  Split:=TStringList.Create;
  R:=TRegExpr.Create;
  try
    // Read the whole input file into a single string.
    Split.LoadFromFile(ParamStr(1));
    S:=Split.Text;
    Split.Clear;
    R.Expression:='[\w]+';
    R.Split(S,Split);
    for S in Split do
      Writeln('Found: ',S);
  finally
    // Release both objects, also when an exception occurs.
    R.Free;
    Split.Free;
  end;
end.

It simply prints nonsense...

Michael.
_______________________________________________
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
