Hi, Martin!

First, set a UTF-8 binmode on STDOUT; it is good practice whenever you
print Unicode characters.
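
A minimal sketch of what I mean, assuming you want every handle you print to encoded as UTF-8 (the variable name $char is only for illustration):

```perl
use strict;
use warnings;
use charnames ':full';   # needed for \N{...} on perls older than 5.16

# Encode everything written to STDOUT and STDERR as UTF-8.
binmode STDOUT, ':encoding(UTF-8)';
binmode STDERR, ':encoding(UTF-8)';

# One character, U+00FC; it is written out as the two bytes C3 BC.
my $char = "\N{LATIN SMALL LETTER U WITH DIAERESIS}";
print "$char\n";
```

Note that Test::More writes diag output through its own copies of the handles, so those may need the same treatment separately.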

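Second, and this is the main problem: the umlaut character does not have
ord 195; ü is U+00FC, which has ord 252. Moreover, the way you construct the
umlaut gives you not a single character but a two-byte string (195 and 188,
its UTF-8 encoding), because without "use utf8" Perl reads the literal in
your source as raw bytes.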
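
To make the difference visible, here is a small sketch. It assumes the source file is UTF-8 encoded and has no "use utf8", and it writes the bytes as escapes so the result does not depend on how this mail is encoded; $bytes and $char are illustrative names:

```perl
use strict;
use warnings;
use Encode qw(decode);

# What the literal "ü" becomes in a UTF-8 source file without "use utf8":
# the two bytes of the UTF-8 encoding of U+00FC.
my $bytes = "\xC3\xBC";

# Decoding those bytes yields the single character U+00FC.
my $char = decode('UTF-8', $bytes);

printf "bytes: length=%d ord=%d\n", length $bytes, ord $bytes;  # length=2 ord=195
printf "char:  length=%d ord=%d\n", length $char,  ord $char;   # length=1 ord=252
```

The 195 in your test output is the first of those two bytes; the 252 is the decoded character that came back from the XML.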

You can test it with this simple program:
https://gist.github.com/elcamlost/e44616785cf475bea10d

The general question of what counts as one character in a Unicode string is
described in detail on the Effective Perl Programming site (see
http://www.effectiveperlprogramming.com/2011/06/treat-unicode-strings-as-grapheme-clusters/
).

So your tests are correct, and they fail for a good reason.

If you construct your umlaut character as suggested in the gist (my
$CHAR_UMLAUT = "\N{LATIN SMALL LETTER U WITH DIAERESIS}";), your tests will
work as expected.
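
As a rough sketch of the corrected comparison, without XML::Rabbit so that it runs on its own: $from_xml is a hypothetical stand-in for $node->umlaut, on the assumption (which your own test output suggests, since it reports 252) that the parser hands back decoded characters.

```perl
use strict;
use warnings;
use charnames ':full';   # needed for \N{...} on perls older than 5.16
use Encode qw(decode);
use Test::More;

# The expected value as one real character, independent of source encoding.
my $CHAR_UMLAUT = "\N{LATIN SMALL LETTER U WITH DIAERESIS}";

# Hypothetical stand-in for $node->umlaut: the UTF-8 bytes from the XML
# document, decoded into a character string.
my $from_xml = decode('UTF-8', "\xC3\xBC");

is($from_xml, $CHAR_UMLAUT, 'umlaut in xml');
is(ord($from_xml), ord($CHAR_UMLAUT), 'ord of umlaut');

done_testing(2);
```

Putting "use utf8;" at the top of the test file and keeping the literal "ü" should have the same effect, as long as the file really is saved as UTF-8.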

You can find a complete example in this gist:
https://gist.github.com/elcamlost/007c398c901881763c0b






On Wed, 23 Sep 2015 at 12:26, Martin Barth <mar...@senfdax.de> wrote:

> Hello,
>
> i'm struggling around with umlauts in my xml files, which i want to
> parse with XML::Rabbit.
> I've got the same behaviour with __DATA__ or when i'm reading a xml file
> via MyNode->new(file => ....);
>
> And i've got non idea what i am doing wrong :(
> (ps: yes, the testcase is utf8 encoded acording to the file command)
>
>  % perl xml_rabbit.t
> #
> # 195
> not ok 1 - umlaut in xml
> #   Failed test 'umlaut in xml'
> #   at xml_rabbit.t line 18.
> #          got: '�'
> #     expected: 'ü'
> not ok 2 - ord of umlaut
> #   Failed test 'ord of umlaut'
> #   at xml_rabbit.t line 19.
> #          got: '195'
> #     expected: '252'
> 1..2
> # Looks like you failed 2 tests of 2.
>
>
>  % cat xml_rabbit.t
> #!/usr/bin/env perl
>
> package MyNode;
> use XML::Rabbit::Root;
> has_xpath_value umlaut  => '/x/umlaut';
>
> package main;
> use Test::More;
>
> my $xml = do{local $/; <DATA>};
> my $node = MyNode->new(xml => $xml);
>
> diag $node->umlaut;
> diag ord "ü";
> is($node->umlaut, "ü", "umlaut in xml");
> is(ord("ü"), ord($node->umlaut), "ord of umlaut");
>
> done_testing(2);
>
> __DATA__
> <?xml version="1.0" encoding="UTF-8"?>
> <x>
>     <umlaut>ü</umlaut>
> </x>
>  % perl -v
>
> This is perl 5, version 20, subversion 1 (v5.20.1) built for x86_64-linux
>
>
>
> --
> To unsubscribe, e-mail: beginners-unsubscr...@perl.org
> For additional commands, e-mail: beginners-h...@perl.org
> http://learn.perl.org/
>
>
>
