Re: Workaround to a Unicode bug needed

2010-09-14 Thread Dilbert
On 6 sep, 15:25, shawnhco...@gmail.com (Shawn H Corey) wrote:
> On Mon, 2010-09-06 at 15:10 +0200, Pierre Nugues wrote:
> >
> > I wrote a simple tokenizer for texts containing Latin9 characters. It
> > does not behave as expected with the Swedish text below and I would
> > like to find a workaround.

Re: Workaround to a Unicode bug needed

2010-09-06 Thread Pierre Nugues
Dear Shawn, Thank you for your answer. However, this does not seem to work. I used two versions of Perl, the standard Mac installation (5.8.8) and ActivePerl 5.12.1, and neither produces the correct output. Here is what the output should be, one word per line. I only show the first words. Some

Re: Workaround to a Unicode bug needed

2010-09-06 Thread Shawn H Corey
On Mon, 2010-09-06 at 15:10 +0200, Pierre Nugues wrote:
>
> I wrote a simple tokenizer for texts containing Latin9 characters. It
> does not behave as expected with the Swedish text below and I would
> like to find a workaround.

Add these lines to the top of your program:

use strict;
use warnings;