Hello,

On 08.03.2015 02:53, Xiaodong Gong wrote:
> the encoding type of parent location is must be utf 8,utf16e,according
> to the draft

Yes, the SPEC for VPC/VHD specifies the character encoding to use, which
is good for being portable.

> ascii is the encoding type to store the string of parent location in
> memery and to use fopen()

No: For the (Linux) kernel the filename is a sequence of 8 bit bytes,
where only '\0'=end_of_string and '/'=path_separator are handled
specially. All other bytes have no special meaning and are passed in and
out as is.

Only the applications are doing the character encoding. Normally this is
not a problem as you setup your system once with one encoding (nowadays
UTF-8) and use that consistently: If you enter ä on the keyboard,
the kernels input layer returns \u00E4 as the two-byte UTF-8 sequence
> $ echo -n ä | xxd -g 1
> 0000000: c3 a4
Any application can either just pass the byte sequence around as a CLOB
(or use any other encoding internally - but then it must know that the
input-encoding is UTF-8), but when again doing any system call, they
will again pass that same byte sequence as the file-name, which the
kernel will store on disk.
If you take that disk to another computer, which does NOT use Unicode,
you have a problem: If, for example, that one is still using the old
ISO-8859-1 encoding used in western Europe, you file will be named
differently:
> $ echo -n ä | iconv -f ISO-8859-1 -t UTF-8
> ä

(The reverse is even more painful, as not any ISO-8859-1 character
sequence is a valid UTF-8 byte sequence - several years back when I
moved from my old ISO-8859-1 to a more modern UTF-8 setup, I had to
rename lots of files to be readable again)

You can even test that locally on one system by creating a file
containing an umlaut in its name and then to display that in a non-UTF-8
terminal / environment:
> $ touch ä
> $ LANG=C ls -NQ
> "\303\244"

> ascii need to translate to other encoding type according to LANG when to
> show the information of the vhd file using the qemu-info and so on

No: your assumption that ASCII is used is IMHO wrong: ASCII is only 7
bit, but the kernel interface is 8 bit. The terminal input- and output
layer nowadays are UTF-8, so as long as you're working on the console
everything is fine. If you mix in GUIs and libraries doing their own
encoding/decoding, things get more interesting.

But when you do explicit character conversion like you do for VHD, you
must honor the user configured character encoding of the environment
yourself, that is use LC_CTYPE for any conversion from input, for output
which includes file names.

I checked xen/tools/blktap2/vpc/lib/libvhd.c #
vhd_initialize_header_parent_name()
which also (wrongly) assumes ASCII. Because of the creating a snapshot
using vhd-utils is also broken:

> $ /usr/bin/vhd-util create -n ä.vhd -s 1
> $ /usr/bin/vhd-util snapshot -n snap.vhd -p ä.vhd ; echo $?
> 84

Next I checked
<https://technet.microsoft.com/de-de/library/gg318052%28v=ws.10%29.aspx>
to create a VHD using umlauts with Windows 7:

> cmd # as Admin
> diskpart
> create vdisk file="C:\ä.vhd" maximum=2000 type=expandable
> create vdisk file="C:\snap.vhd" parent="C:\ä.vhd"

But vhd-utils from Xen is broken:

> $ /usr/bin/vhd-util read -n snap.vhd -p
> VHD Header Summary:
...
> Parent name         : failed to read name
...
> VHD Parent Locators:
> --------------------
> locator:            : 0
....
> failed to read parent name

With the attached patch it works:

> VHD Header Summary:
> -------------------
...
> Parent name         : /ä.vhd
...
> VHD Parent Locators:
> --------------------
> locator:            : 0
>        code         : PLAT_CODE_W2KU
...
>        decoded name : /ä.vhd
> 
> locator:            : 1
>        code         : PLAT_CODE_W2RU
...
>        decoded name : ./ä.vhd

Hope that clarified things.

Philipp
conectixŽÓ½win Wi2k}}ß?ÿÿì×» ÄMéN¿å¦Ÿ¢I"ºcxsparseÿÿÿÿÿÿÿÿè ÿÿ㈚(ƒûøG’ā¨~nÇ5ŽÓ5C:\ä.vhdW2kuW2ruC:\ä.vhdÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ.\ä.vhdconectixŽÓ½win Wi2k}}ß?ÿÿì×» ÄMéN¿å¦Ÿ¢I"º
From a6acdfff6a48dccf7df4c3d3fab94ea548fa1ff2 Mon Sep 17 00:00:00 2001
Message-Id: <a6acdfff6a48dccf7df4c3d3fab94ea548fa1ff2.1425811906.git.h...@univention.de>
From: Philipp Hahn <h...@univention.de>
Date: Sun, 8 Mar 2015 10:55:38 +0100
Subject: [PATCH] VHD: Fix locale aware character encoding handling
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Organization: Univention GmbH, Bremen, Germany
To: xen-de...@lists.xen.org

ASCII is 7 bit only, which does not work in UTF-8 environments:
> failed to read parent name

Setup locale in vhd-util to parse LC_CTYPE and use the right codeset
when doing file name encoding and decoding.

Increase allocation for UTF-8 buffer as one UTF-16 character might use
twice as much space in UTF-8 (or more).

Don't check outbytesleft==0 as one UTF-8 characters get encoded into
1..8 bytes, so it's perfectly fine (and expected) for the output to have
remaining bytes left.

Test-case:
$ ./vhd-util create -n ä.vhd -s 1
$ ./vhd-util snapshot -n snap.vhd -p ä.vhd ; echo $?

See
<http://unix.stackexchange.com/questions/48689/effect-of-lang-on-terminal>
for more information about the details of handling the encoding right.

Signed-off-by: Philipp Hahn <h...@univention.de>
---
 tools/blktap2/vhd/lib/libvhd.c | 27 +++++++++++++++++++--------
 tools/blktap2/vhd/vhd-util.c   |  3 +++
 2 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/tools/blktap2/vhd/lib/libvhd.c b/tools/blktap2/vhd/lib/libvhd.c
index 95eb5d6..1fd5b4e 100644
--- a/tools/blktap2/vhd/lib/libvhd.c
+++ b/tools/blktap2/vhd/lib/libvhd.c
@@ -37,6 +37,7 @@
 #include <iconv.h>
 #include <sys/mman.h>
 #include <sys/stat.h>
+#include <langinfo.h>
 
 #include "libvhd.h"
 #include "relative-path.h"
@@ -1296,6 +1297,7 @@ vhd_macx_encode_location(char *name, char **out, int *outlen)
 	size_t ibl, obl;
 	char *uri, *uri_utf8, *uri_utf8p, *ret;
 	const char *urip;
+	char *codeset;
 
 	err     = 0;
 	ret     = NULL;
@@ -1304,7 +1306,7 @@ vhd_macx_encode_location(char *name, char **out, int *outlen)
 	len     = strlen(name) + strlen("file://");
 
 	ibl     = len;
-	obl     = len;
+	obl     = len * 2;
 
 	urip = uri = malloc(ibl + 1);
 	uri_utf8 = uri_utf8p = malloc(obl);
@@ -1312,7 +1314,8 @@ vhd_macx_encode_location(char *name, char **out, int *outlen)
 	if (!uri || !uri_utf8)
 		return -ENOMEM;
 
-	cd = iconv_open("UTF-8", "ASCII");
+	codeset = nl_langinfo(CODESET);
+	cd = iconv_open("UTF-8", codeset);
 	if (cd == (iconv_t)-1) {
 		err = -errno;
 		goto out;
@@ -1325,7 +1328,7 @@ vhd_macx_encode_location(char *name, char **out, int *outlen)
 	    (char **)
 #endif
 	    &urip, &ibl, &uri_utf8p, &obl) == (size_t)-1 ||
-	    ibl || obl) {
+	    ibl) {
 		err = (errno ? -errno : -EIO);
 		goto out;
 	}
@@ -1357,6 +1360,7 @@ vhd_w2u_encode_location(char *name, char **out, int *outlen)
 	size_t ibl, obl;
 	char *uri, *uri_utf16, *uri_utf16p, *tmp, *ret;
 	const char *urip;
+	char *codeset;
 
 	err     = 0;
 	ret     = NULL;
@@ -1404,7 +1408,8 @@ vhd_w2u_encode_location(char *name, char **out, int *outlen)
 	 * MICROSOFT_COMPAT
 	 * little endian unicode here 
 	 */
-	cd = iconv_open("UTF-16LE", "ASCII");
+	codeset = nl_langinfo(CODESET);
+	cd = iconv_open("UTF-16LE", codeset);
 	if (cd == (iconv_t)-1) {
 		err = -errno;
 		goto out;
@@ -1415,7 +1420,7 @@ vhd_w2u_encode_location(char *name, char **out, int *outlen)
 	    (char **)
 #endif
 	    &urip, &ibl, &uri_utf16p, &obl) == (size_t)-1 ||
-	    ibl || obl) {
+	    ibl) {
 		err = (errno ? -errno : -EIO);
 		goto out;
 	}
@@ -1447,11 +1452,13 @@ vhd_macx_decode_location(const char *in, char *out, int len)
 	iconv_t cd;
 	char *name;
 	size_t ibl, obl;
+	char *codeset;
 
 	name = out;
 	ibl  = obl = len;
 
-	cd = iconv_open("ASCII", "UTF-8");
+	codeset = nl_langinfo(CODESET);
+	cd = iconv_open(codeset, "UTF-8");
 	if (cd == (iconv_t)-1) 
 		return NULL;
 
@@ -1479,11 +1486,13 @@ vhd_w2u_decode_location(const char *in, char *out, int len, char *utf_type)
 	iconv_t cd;
 	char *name, *tmp;
 	size_t ibl, obl;
+	char *codeset;
 
 	tmp = name = out;
 	ibl = obl  = len;
 
-	cd = iconv_open("ASCII", utf_type);
+	codeset = nl_langinfo(CODESET);
+	cd = iconv_open(codeset, utf_type);
 	if (cd == (iconv_t)-1) 
 		return NULL;
 
@@ -2450,6 +2459,7 @@ vhd_initialize_header_parent_name(vhd_context_t *ctx, const char *parent_path)
 	size_t ibl, obl;
 	char *ppath, *dst;
 	const char *pname;
+	char *codeset;
 
 	err   = 0;
 	pname = NULL;
@@ -2459,7 +2469,8 @@ vhd_initialize_header_parent_name(vhd_context_t *ctx, const char *parent_path)
 	 * MICROSOFT_COMPAT
 	 * big endian unicode here 
 	 */
-	cd = iconv_open(UTF_16BE, "ASCII");
+	codeset = nl_langinfo(CODESET);
+	cd = iconv_open(UTF_16BE, codeset);
 	if (cd == (iconv_t)-1) {
 		err = -errno;
 		goto out;
diff --git a/tools/blktap2/vhd/vhd-util.c b/tools/blktap2/vhd/vhd-util.c
index 944a59e..13f1835 100644
--- a/tools/blktap2/vhd/vhd-util.c
+++ b/tools/blktap2/vhd/vhd-util.c
@@ -28,6 +28,8 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
+#include <langinfo.h>
+#include <locale.h>
 
 #include "libvhd.h"
 #include "vhd-util.h"
@@ -114,6 +116,7 @@ main(int argc, char *argv[])
 	if (setrlimit(RLIMIT_CORE, &rlim) < 0)
 		fprintf(stderr, "setrlimit failed: %d\n", errno);
 #endif
+	setlocale(LC_CTYPE, "");
 
 	ret = 0;
 
-- 
1.9.1

Reply via email to