On Sunday, March 15, 2020 3:08:05 PM CDT Jack Ostroff wrote: > Something seems odd. I only get a result with > https://l10n.kde.org/docs/doc-primer/check-docs.html (note the added > "l" at then end)
I took a look at the html source in page https://l10n.kde.org/docs/doc-primer/index.html About half-way through the document I see this: <a href="check-docs.htm✺l">Checking and Viewing the Documents</a> where "✺" is actually an unprintable character -- I'm just guessing here, but I think it's X'EFBFBD', based on John Haynes's original post in this thread. That's not a valid code point in UTF-8. I have no idea how it got inserted into the html code, which is generated by an XML processor from a .docbook source document. All that stuff really ought to be straight 7-bit ASCII characters, for the most part. The URL for the destination page was rendered correctly (without 3 bytes of hex garbage) by the same XML processor. Cosmic rays, maybe? I posted a picture of the way Firefox renders this bit of code on my web site, just in case anyone wants to look at it. Visit https://davidcbryant.net/images/WeirdUTF-8.png if you're curious. Interestingly, I can't even copy and paste the unprintable character. I suppose that's because it's not a valid UTF-8 character ... when scanning text input, the "copy" function probably stops when it hits something unrecognizable. David Bryant Canyon Lake, Texas
