>--[Andreas Metzler]--<[EMAIL PROTECTED]>
> Bob Hilliard <[EMAIL PROTECTED]> wrote:
> > Andreas Metzler <[EMAIL PROTECTED]> writes:
> > glyphs iconv returns? My locale is C. What locale are you using?
> [...]
> de_AT (uses ISO-8859-1 as charset).
> LANG=de_AT, everything else is unset:
> *promp
Bob Hilliard <[EMAIL PROTECTED]> wrote:
> Andreas Metzler <[EMAIL PROTECTED]> writes:
>> *prompt* echo ö§ | recode latin1..ascii
>> "oSS
>> *prompt* echo ö§ | iconv -f latin1 -t ascii//TRANSLIT ; echo $?
>> oe?
>> --
>> »oe« is much better than »"o« and »SS« is no usable replacement
Andreas Metzler <[EMAIL PROTECTED]> writes:
> *prompt* echo ö§ | recode latin1..ascii
> "oSS
> *prompt* echo ö§ | iconv -f latin1 -t ascii//TRANSLIT ; echo $?
> oe?
> --
> »oe« is much better than »"o« and »SS« is no usable replacement for
> »§« (I do not think there is one), it wou
Hi, John Darrington wrote:
> Given a text file, it will attempt to guess the natural language in
> which it was written. I'm sure it would be fairly simple to modify it to
> guess the charset. If you point me to a reasonably large set of example
> files, I'll see what I can do.
You could use you
On Fri, 09 May 2003 02:31:43 +0200, Martin v. Löwis wrote:
> Bob Hilliard wrote:
> > 1. How can I determine what character encoding is used in a
> > document without manually scanning the entire file?
First off, for the examples you mentioned (foldoc and the jargon file)
the iso-8859-
I have a neural net program
(http://www.nongnu.org/libann/doc/libann_6.html#SEC26) which does
something similar:
Given a text file, it will attempt to guess the natural language in which it was written.
I'm sure it would be fairly simple to modify it to guess the charset. If you point me
to a
Bob Hilliard <[EMAIL PROTECTED]> wrote:
> Thanks to all who replied to my recent question on this subject.
> Andreas Metzler <[EMAIL PROTECTED]> wrote:
>> With glibc I'd use
>> iconv --from=SRC-ENCODING --to=DST-ENCODING//TRANSLIT
>> if it is acceptable to change the length of strings. This
Thanks to all who replied to my recent question on this subject.
Andreas Metzler <[EMAIL PROTECTED]> wrote:
> With glibc I'd use
> iconv --from=SRC-ENCODING --to=DST-ENCODING//TRANSLIT
> if it is acceptable to change the length of strings. This will replace
> e.g. the Euro-Symbol with "
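
For anyone who wants to try the //TRANSLIT suggestion quoted above, here is a minimal sketch of how it might be wrapped in a shell function. The name to_ascii is hypothetical and not from the thread. It assumes an iconv that understands the //TRANSLIT suffix (glibc's does). As the recode/iconv comparison earlier in the thread suggests, the replacement characters can differ with the locale and the iconv implementation, so no output is shown here.

```shell
#!/bin/sh
# Hypothetical helper (the name is an assumption, not from the thread):
# convert stdin from the given source encoding to ASCII.
# Usage: to_ascii SRC-ENCODING < input > output
to_ascii() {
    src=${1:?usage: to_ascii SRC-ENCODING}
    # //TRANSLIT asks iconv to approximate characters that have no ASCII
    # equivalent instead of stopping at the first one. Replacements may be
    # longer than the original character, so string lengths can change.
    iconv -f "$src" -t 'ASCII//TRANSLIT'
}

# Bytes 0xF6 and 0xA7 are ö and § in ISO-8859-1 (latin1).
status=0
printf 'sch\366n \247 1\n' | to_ascii ISO-8859-1 || status=$?
# A nonzero status means iconv reported a problem with the conversion.
echo "iconv exit status: $status"
```

Plain ASCII input should pass through unchanged, so the function is safe to run over files that only occasionally contain 8-bit characters.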