subject:"Questions regarding utf\-8"

Re: [debian-devel] Questions regarding utf-8

2003-05-17 Thread Rüdiger Kuhlmann

>--[Andreas Metzler]--<[EMAIL PROTECTED]> > Bob Hilliard <[EMAIL PROTECTED]> wrote: > > Andreas Metzler <[EMAIL PROTECTED]> writes: > > glyphs iconv returns? My locale is C. What locale are you using? > [...] > de_AT (uses ISO-8859-1 as charset). > LANG=de_AT, everything else is unset: > *promp

Re: Questions regarding utf-8

2003-05-16 Thread Andreas Metzler

Bob Hilliard <[EMAIL PROTECTED]> wrote:
> Andreas Metzler <[EMAIL PROTECTED]> writes:
>> *prompt* echo ö§ | recode latin1..ascii
>> "oSS
>> *prompt* echo ö§ | iconv -f latin1 -t ascii//TRANSLIT ; echo $?
>> oe?
>> --
>> »oe« is much better than »"o« and »SS« is no usable replacement

Re: Questions regarding utf-8

2003-05-16 Thread Bob Hilliard

Andreas Metzler <[EMAIL PROTECTED]> writes:
> *prompt* echo ö§ | recode latin1..ascii
> "oSS
> *prompt* echo ö§ | iconv -f latin1 -t ascii//TRANSLIT ; echo $?
> oe?
> --
> »oe« is much better than »"o« and »SS« is no usable replacement for
> »§« (I do not think there is one), it wou

Re: Questions regarding utf-8

2003-05-16 Thread Matthias Urlichs

Hi,

John Darrington wrote:
> Given a text file, it will attempt to guess the natural language in
> which it was written. I'm sure it would be fairly simple to modify it to
> guess the charset. If you point me to a reasonably large set of example
> files, I'll see what I can do.

You could use you

Re: Questions regarding utf-8

2003-05-15 Thread era eriksson

On Fri, 09 May 2003 02:31:43 +0200, Martin v. Löwis wrote:
> Bob Hilliard wrote:
> > 1. How can I determine what character encoding is used in a
> > document without manually scanning the entire file?

First off, for the examples you mentioned (foldoc and the jargon file) the iso-8859-
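For readers wondering what a first automated pass at question 1 might look like, here is a minimal sketch. It is not taken from the thread: the function name `guess_encoding` and its three labels are made up for illustration. It relies on one well-known property, namely that text in an 8-bit charset such as ISO-8859-1 almost never happens to be valid UTF-8 once it contains non-ASCII bytes. It cannot tell the ISO-8859 variants apart.

```python
def guess_encoding(data: bytes) -> str:
    """Coarse guess only: 'ascii', 'utf-8', or 'unknown-8bit'.

    Hypothetical helper for illustration; it does not identify
    which legacy 8-bit charset a file uses.
    """
    # Pure 7-bit data is valid in ASCII, UTF-8 and ISO-8859-x alike.
    if all(b < 0x80 for b in data):
        return "ascii"
    try:
        # Strict decoding rejects malformed UTF-8 byte sequences.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return "unknown-8bit"
    return "utf-8"


if __name__ == "__main__":
    print(guess_encoding("ö§".encode("latin-1")))
    print(guess_encoding("ö§".encode("utf-8")))
```

Telling one 8-bit charset from another would need statistics over the text or outside knowledge about the document, which is where the language-guessing suggestions elsewhere in this thread come in.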

Re: Questions regarding utf-8

2003-05-15 Thread John Darrington

I have a neural net program ( http://www.nongnu.org/libann/doc/libann_6.html#SEC26 ) which does something similar: Given a text file, it will attempt to guess the natural language in which it was written. I'm sure it would be fairly simple to modify it to guess the charset. If you point me to a

Re: Questions regarding utf-8

2003-05-15 Thread Andreas Metzler

Bob Hilliard <[EMAIL PROTECTED]> wrote:
> Thanks to all who replied to my recent question on this subject.
> Andreas Metzler <[EMAIL PROTECTED]> wrote:
>> With glibc I'd use
>> iconv --from=SRC-ENCODING --to=DST-ENCODING//TRANSLIT
>> if it is acceptable to change the length of strings. This

Re: Questions regarding utf-8

2003-05-14 Thread Bob Hilliard

Thanks to all who replied to my recent question on this subject.

Andreas Metzler <[EMAIL PROTECTED]> wrote:
> With glibc I'd use
> iconv --from=SRC-ENCODING --to=DST-ENCODING//TRANSLIT
> if it is acceptable to change the length of strings. This will replace
> e.g. the Euro-Symbol with "
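To make the "change the length of strings" point concrete, here is a small sketch of transliteration to ASCII. It is an illustration only, not glibc's implementation: the `TRANSLIT` table and the `to_ascii` function are hypothetical, and the real `//TRANSLIT` tables are far larger and locale-dependent. Characters without an entry fall back to `?`, which mirrors the `oe?` result quoted elsewhere in this thread.

```python
# Hypothetical, deliberately tiny replacement table for illustration.
# It is NOT the table glibc's iconv uses for //TRANSLIT.
TRANSLIT = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}


def to_ascii(text: str, fallback: str = "?") -> str:
    """Replace each non-ASCII character with an ASCII stand-in."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)  # already ASCII, keep unchanged
        else:
            # One input character may become several output characters,
            # so the result can be longer than the input.
            out.append(TRANSLIT.get(ch, fallback))
    return "".join(out)


if __name__ == "__main__":
    print(to_ascii("ö§"))
```

Because one character can expand to two, anything that depends on fixed column widths or byte offsets has to be rechecked after the conversion.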

8 matches
