On 2/17/06, Baskaran Sankaran <[EMAIL PROTECTED]> wrote:

> File: Sample_Hin.txt
>
> दूसरे राज्य पुनर्गठन आयोग के गठन का यही सही वक्त है।

> The sample files were created in Windows in Unicode (both English & Hindi)
> and I am able to open them in notepad and wordpad. But, the output as you
> see is garbage and somehow it misses the utf8. This apart, a blank space is
> added for every character in both English and Hindi.

I've done a little experimenting, and I think you're right and Perl is
wrong here. At least, Perl seemingly disagrees with some common tools
about what a utf8 file is. I confess that I don't know enough about
utf8 to be certain.

If you don't get any better responses soon, you could use perlbug to
file a bug report. It is best if you can include a (small) utf8 file,
such as the first few lines of your Sample_Hin.txt file. But it's
important that the exact file contents be part of the bug report, not
just the text. One way is to include a URL where the files can be
downloaded. But if the files are small enough, you can
convert them to a textual form (such as a hex dump) and include them
with your bug report.
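
If you don't have a hex dump tool handy on Windows, here is a rough
sketch of one in Python (not Perl, just so it doesn't depend on the
behavior in question). The function names hex_dump and guess_bom are
made up for this example. The second function checks the first bytes
for a byte order mark. As far as I know, Notepad's "Unicode" save
option writes UTF-16LE with a byte order mark rather than UTF-8, but
that is an assumption about your files, not something this thread
confirms, so it seems worth checking.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Return a plain-text hex dump: one offset plus `width` bytes per line."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        lines.append(f"{offset:08x}  {hex_part}")
    return "\n".join(lines)


def guess_bom(data: bytes) -> str:
    """Report which byte order mark, if any, starts the data.

    Simplified on purpose: UTF-32 marks are not distinguished, so a
    UTF-32LE file would be reported as UTF-16LE here.
    """
    if data.startswith(b"\xef\xbb\xbf"):
        return "UTF-8 BOM"
    if data.startswith(b"\xff\xfe"):
        return "UTF-16LE BOM"
    if data.startswith(b"\xfe\xff"):
        return "UTF-16BE BOM"
    return "no BOM"
```

To use it, read the file in binary mode, for example
data = open("Sample_Hin.txt", "rb").read()[:64], then print
guess_bom(data) and hex_dump(data) and paste the result into the
report.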

Good luck with it!

--Tom Phoenix
Stonehenge Perl Training
