Hello,
On 2/5/20 7:11 PM, H. Andrew Black wrote:
Hi, Hussein.
Using XXE version 9.2.0 (or earlier versions, too, like 8.3.0 and 7.5.0), we've noticed
an odd thing with graphic image file names. In the Attributes Editor tool, we use the
Choose File option to find the image file. Some file names that contain an acute i (í)
display correctly and some do not. The ones that do not give the red message of
"Cannot display image: (file location) (The system cannot find the file
specified)." In both cases, the acute i is converted to %C3%AD which is the UTF-8
code for an acute i.
I'm attaching a minimal working example of this problem for a DocBook section
file. Unzip it and load the XML file into XXE and, hopefully, you'll see that
the first image renders fine but the second does not.
Is there something we're doing incorrectly with the second file?
Thanks,
--Andy
On 2/6/20 5:40 PM, H. Andrew Black wrote:
On 2/6/2020 2:47 AM, Hussein Shafie wrote:
Sorry but we have never received your initial post
That must be because I included a zip file as an attachment that had a
minimal working example. I've put it here:
https://drive.google.com/file/d/1QyuP5LwJheEr-jsD_zQqXvCAc6bcBmpJ/view?usp=sharing.
Thank you. I've reproduced this Java bug both on my Linux box and on my
VirtualBox running Windows 8.1 32-bit.
Yes, this is a Java bug because we basically use File.toURI to convert a
file path to a URL and because File.toURI translates "nzír.png"
to "nzrí.png" which are clearly two different filenames.
("On 2/5/2020 11:11 AM, H. Andrew Black wrote:").
I'm not sure I'll be able to reproduce this issue on any of our computers.
Which operating system/file system (name, version) sometimes uses NFD
and sometimes NFC to represent its file names? That seems to be a weird
idea, inevitably leading to a lot of trouble (not just with XXE).
Mine is Windows 10. A user is on some form of Linux. He just wrote to
me the following:
"The decomposed (NFD) form is preferred for our work, because High tone
is not a part of a vowel, but rather something that goes with it. This
is reflected in our virtual keyboards, and allows searching for high
tone, irrespective of vowel ... This is different from the normal NFC
case, like French é, which is its own vowel, not e and high tone, etc."
So for him, he always uses NFD at least internally in his programs (and
maybe in his text files so he can do searches). The file name at issue
was automatically generated by a program called Praat
(http://www.fon.hum.uva.nl/praat/) based on the information he keyed
into it. That is, Praat created the file name using NFD and his Linux
file system accepted it without converting it to NFC. (In the NFD
sample file in the zip file above, I copied just a small portion of the
Praat-generated file name and renamed my PNG file to use it.)
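
For reference, the NFC/NFD difference described above can be made concrete
with a minimal Java sketch. It is not from the thread: the class name, the
helper percentEncode and the sample name nzír.png are assumptions chosen for
illustration. It shows that the two forms of the same visible name are
different strings with different UTF-8 bytes.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

public class NormalizationForms {
    /** Percent-encodes every UTF-8 byte that is not an ASCII letter, digit, '.', '-' or '_'. */
    public static String percentEncode(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            boolean unreserved = (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z')
                    || (v >= '0' && v <= '9') || v == '.' || v == '-' || v == '_';
            if (unreserved) {
                sb.append((char) v);
            } else {
                sb.append(String.format("%%%02X", v));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical file name: precomposed acute i (NFC) versus i + combining acute (NFD).
        String nfc = Normalizer.normalize("nz\u00EDr.png", Normalizer.Form.NFC);
        String nfd = Normalizer.normalize(nfc, Normalizer.Form.NFD);
        System.out.println(percentEncode(nfc)); // nz%C3%ADr.png
        System.out.println(percentEncode(nfd)); // nzi%CC%81r.png
        System.out.println(nfc.equals(nfd));    // false
    }
}
```

A file system that stores names as raw bytes treats these as two different
files, which is why a URL carrying %C3%AD cannot locate a file whose name was
stored with i%CC%81.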
So, what to do? I see at least two possibilities:
1. Our user finds a way to convert his *file names* to use NFC and not
NFD. Then they always load correctly in XXE. If, however, he needs to
be able to do searches on file names where the tone (NFD) is crucial,
then that will be a problem for him. If, instead, he only needs to
search on the *contents* of files, then converting file names to NFC
should work well for him (assuming he can figure out how to do that).
2. See if XXE can keep the NFD/NFC distinction when getting the file
name. One possibility might be at
https://stackoverflow.com/questions/43380362/java-differentiate-between-files-in-unicode-nfc-and-nfd.
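
Possibility 1 above (renaming files so that their names use NFC) could be
sketched roughly as follows. This is only an illustration under stated
assumptions: the class NfcRenamer and its methods are hypothetical, it handles
a single directory without recursion, and it assumes a file system that stores
names as given (some file systems apply their own normalization, in which case
the rename may have no effect).

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;

public class NfcRenamer {
    /** Returns the NFC form of a file name. */
    public static String toNfc(String name) {
        return Normalizer.normalize(name, Normalizer.Form.NFC);
    }

    /** Renames every entry of dir whose name is not already NFC; returns the number renamed. */
    public static int renameToNfc(Path dir) throws IOException {
        // Collect the entries first so the directory is not modified while it is being listed.
        List<Path> entries = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                entries.add(entry);
            }
        }
        int renamed = 0;
        for (Path entry : entries) {
            String name = entry.getFileName().toString();
            String nfc = toNfc(name);
            if (nfc.equals(name)) {
                continue; // already NFC
            }
            Path target = entry.resolveSibling(nfc);
            if (Files.exists(target)) {
                System.err.println("Skipping " + entry + ": target already exists");
                continue; // never overwrite an existing file
            }
            Files.move(entry, target);
            renamed++;
        }
        return renamed;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java NfcRenamer <directory>");
            return;
        }
        System.out.println(renameToNfc(Paths.get(args[0])) + " file(s) renamed");
    }
}
```

Any XML document that already references the old NFD names would need its
fileref attributes updated as well; the sketch does not do that.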
I've implemented the above Java bug workaround (that is, use Path.toUri,
not File.toURI) in XXE 9.3.0alpha. It works fine on my Linux box and
not at all on my VirtualBox running Windows 8.1 32-bit.
Of course, I'll test this workaround on a real Windows 10 computer (and
also on a real Mac) and see whether it works. If it only works on Linux and not
on Windows, I'm not sure we'll keep this workaround (because this would
mean that it's not a real workaround).
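
The two conversions discussed above can be compared side by side with a small
Java sketch. It is not XXE's actual code: the class UriComparison, its methods
and the sample name are assumptions for illustration. No expected output is
shown because, as reported above, the result differs between platforms. Note
also that URI.toASCIIString(), used here to make the percent-escapes visible,
may apply its own normalization to any non-ASCII characters still present in
the URI.

```java
import java.io.File;
import java.nio.file.InvalidPathException;
import java.nio.file.Paths;

public class UriComparison {
    /** Converts a file name to a URI string using the older java.io.File API. */
    public static String viaFile(String name) {
        return new File(name).toURI().toASCIIString();
    }

    /** Converts a file name to a URI string using the java.nio.file.Path API. */
    public static String viaPath(String name) {
        return Paths.get(name).toUri().toASCIIString();
    }

    public static void main(String[] args) {
        // Hypothetical NFD name: 'i' followed by U+0301 COMBINING ACUTE ACCENT.
        String nfdName = "nzi\u0301r.png";
        try {
            System.out.println("File.toURI: " + viaFile(nfdName));
            System.out.println("Path.toUri: " + viaPath(nfdName));
        } catch (InvalidPathException e) {
            // Some platform/locale combinations cannot represent the name as a Path.
            System.out.println("Name not representable as a Path: " + e.getMessage());
        }
    }
}
```

If the Path.toUri line keeps i%CC%81 while the File.toURI line shows %C3%AD,
the workaround is effective on that platform.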
Thanks,
--Andy
--
XMLmind XML Editor Support List
xmleditor-support@xmlmind.com
https://www.xmlmind.com/mailman/listinfo/xmleditor-support