Hi, first of all, you seem to have some misunderstandings about what
UTF-8 and the other Unicode encodings are. If you're interested and
comfortable with low-level details, I advise you to learn exactly how
they work. The relevant portions of the Unicode specification
(unicode.org) are neither very long nor exceedingly hard to understand,
but maybe you can find a more accessible description.

Most of all, UTF-8 is (normally) absolutely indistinguishable from
plain US-ASCII until you use characters that are not in US-ASCII; so,
for example, most English files will be bit-for-bit identical whether
written in US-ASCII or UTF-8. That is presumably why file -i reports
us-ascii for a file that Vim considers utf-8: both are right.

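A quick way to convince yourself of this is the following Python sketch
(Python is chosen here only for illustration; the sample strings and the
helper is_pure_ascii are made up for the example). It encodes the same
text both ways and compares the bytes:

```python
# Pure-ASCII text yields exactly the same bytes under both encodings.
english = "Hello, Vim!"
ascii_bytes = english.encode("ascii")
utf8_bytes = english.encode("utf-8")
print(ascii_bytes == utf8_bytes)  # True

# One non-ASCII character is enough to make the difference visible.
german = "K\u00f6hler"  # "Köhler"
german_utf8 = german.encode("utf-8")
print(german_utf8)                    # b'K\xc3\xb6hler'
print(len(german), len(german_utf8))  # 6 7


def is_pure_ascii(data: bytes) -> bool:
    """Hypothetical helper: True if every byte is below 0x80."""
    return all(b < 0x80 for b in data)


print(is_pure_ascii(utf8_bytes), is_pure_ascii(german_utf8))  # True False
```

Data for which is_pure_ascii holds is valid US-ASCII and valid UTF-8 at
the same time, so a tool that labels it us-ascii does not contradict one
that treats it as utf-8.
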
Then, there are many fairly complex issues in how files are read,
converted and written by the various parts of the system. Vim is an
especially problematic part; I made an attempt at understanding it
in the message
https://www.mail-archive.com/[email protected]/msg57383.html and
the other messages of that thread. But you probably won't get much out
of it until you know at least how UTF-8 is encoded.

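To give a first idea of what the encodings look like, here is a minimal
Python sketch that prints the bytes of three sample characters in UTF-8,
UTF-16 and UTF-32. The helper hex_bytes and the choice of characters are
assumptions for illustration, and the big-endian codec variants are used
only so that Python does not prepend a BOM:

```python
def hex_bytes(text: str, encoding: str) -> str:
    """Hypothetical helper: the encoded bytes as space-separated hex."""
    return text.encode(encoding).hex(" ")  # hex(sep) needs Python 3.8+


# U+0041 "A", U+00F6 "o with diaeresis", U+20AC "euro sign"
for ch in ("A", "\u00f6", "\u20ac"):
    print(
        f"U+{ord(ch):04X}",
        "utf-8:", hex_bytes(ch, "utf-8"),
        "| utf-16:", hex_bytes(ch, "utf-16-be"),
        "| utf-32:", hex_bytes(ch, "utf-32-be"),
    )
# U+0041 utf-8: 41 | utf-16: 00 41 | utf-32: 00 00 00 41
# U+00F6 utf-8: c3 b6 | utf-16: 00 f6 | utf-32: 00 00 00 f6
# U+20AC utf-8: e2 82 ac | utf-16: 20 ac | utf-32: 00 00 20 ac
```

UTF-8 uses one byte for ASCII characters and more for everything else,
whereas UTF-16 needs at least two bytes and UTF-32 always four bytes per
character.
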
Finally, if you really want to be sure of having all your files encoded
in Unicode (in UTF-8 or other encodings), then I applaud you and share
your concern, and I suggest the way I do it (yes, there actually is
a way):
https://www.mail-archive.com/[email protected]/msg57385.html .
The BOM (byte order mark) mentioned there is a byte sequence that can be
placed at the beginning of text files and will be interpreted by
Unicode-aware software as a sort of invisible declaration that the file
is in a certain Unicode encoding.

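As a rough illustration, the following Python sketch checks the first
bytes of some data against the BOM constants from the standard codecs
module. The function name detect_bom and the labels it returns are
hypothetical; real software may apply further heuristics when no BOM is
present:

```python
import codecs

# Longer BOMs come first: the UTF-32 LE BOM (ff fe 00 00) begins with
# the UTF-16 LE BOM (ff fe), so the order of the checks matters.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return the encoding announced by a leading BOM, or None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


# The "utf-8-sig" codec writes the UTF-8 BOM (ef bb bf) before the text.
print(detect_bom("hi".encode("utf-8-sig")))  # utf-8
print(detect_bom("hi".encode("utf-8")))      # None
```

Without a BOM, pure-ASCII content gives such a check nothing to go on,
which is the situation described above.
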
By the way, all of this means that it's not ASCII that is "deprecated",
but the various complementary or alternative encodings that were (and
still partly are) used to support non-English characters.

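The trouble with those encodings can be sketched in Python as follows.
The sample bytes and the helper decodes_as_utf8 are assumptions chosen
for illustration; ISO-8859-1 and KOI8-R merely stand in for the many
legacy 8-bit encodings:

```python
# "Köhler" as an ISO-8859-1 (Latin-1) editor would store it on disk.
raw = b"K\xf6hler"

# The same bytes mean different text depending on the assumed encoding.
as_latin1 = raw.decode("latin-1")
as_koi8r = raw.decode("koi8_r")
print(as_latin1 == "K\u00f6hler")  # True
print(as_latin1 == as_koi8r)       # False


def decodes_as_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if the bytes are well-formed UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The legacy bytes are not valid UTF-8, while the UTF-8 form is.
print(decodes_as_utf8(raw))                            # False
print(decodes_as_utf8("K\u00f6hler".encode("utf-8")))  # True
```

The ASCII letters survive in every one of these encodings; only the
byte above 0x7F is ambiguous.
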
Kind regards,
Gabriele
P.S. I'm not sure I'll be able to reply further in the next few days;
I'm in a complex situation.

'Johannes Köhler' via vim_use wrote:
Beloved vim'er!
Until recently I never came up with the idea of thinking about the
text encoding of the files on my hard disk.
I used Unicode as a setting in my locales, still assuming
that my files are UTF-8 encoded.
BUT, after a file crash, while playing around with an
old ext2 filesystem and GNU tar, I had a file header
without a file in my inodes. Like a condenser without
payload :) AND, out of curiosity, I probed a bit with Vim
files and UTF-8 (but on btrfs) on an up-to-date Arch Linux.
Then I realized that there are three encoding views:
keyboard, display (terminal), Vim. Like decoding pipes to
an encoded socket. The encoded socket, the file itself,
works partly inconsistently together with Vim, xterm and
the Unix tool file.
Setting: I create a file using an xterm console and touch.
Then I open it with Vim.
Vim: enc & fenc = utf-8
BUT file -i: us-ascii
The file ends up with 2 bytes per character, yet like
us-ascii inside a Unicode container. However, I would
like to have real Unicode and not an endianness
of us-ascii using 2 bytes instead of 1 byte.
Then, in Vim, I change the encoding to ucs-2 with :set fenc=ucs-2. I
read in the Vim documentation that ucs-2 and utf-8 are similar on Linux.
Now :write, Vim tells me [converted] and
file (sometimes) tells me utf-8 as expected. The file
size increases to 4 bytes per character, as expected
for ucs-4. Then rereading in Vim shows me unreadable content.
I have to ++enc it back to ucs-2. So, inside Vim, ucs-2 and utf-8 seem
to be different. And on Linux, ucs-2 uses
file space like ucs-4.
Imaginary reasoning: my system-wide (or kernel-level)
utf-8 differs from real Unicode utf-8 by endianness
abuse. Maybe because of compatibility...
That is why the file tool works inconsistently
(it partly reports binary stuff instead of a text encoding).
Is there a way to ensure working with true utf-8
or, better, utf-16 files? The aim is to work with source
files in Unicode to exclude the deprecated ASCII...
Sincerely
-kefko
--
To view this discussion on the web visit
https://groups.google.com/d/msgid/vim_use/fd8feffe-891b-5a14-223c-9ebdf99841ac%40tiscali.it.