> On 30 Dec 2024, at 14:06, Hans Hagen <j.ha...@xs4all.nl> wrote:
> 
> On 12/28/2024 3:51 AM, Bruce Horrocks wrote:
>>> On 27 Dec 2024, at 19:21, Hans Hagen <j.ha...@xs4all.nl> wrote:
>>> 
>>> On 12/27/2024 8:11 PM, Bruce Horrocks wrote:
>>>>> On 27 Dec 2024, at 10:03, Wolfgang Schuster 
>>>>> <wolfgang.schuster.li...@gmail.com> wrote:
>>>>> 
>>>>> Bruce Horrocks schrieb am 27.12.2024 um 00:09:
>>>>>> Trying to include a URL as a clickable link using \goto but in the 
>>>>>> generated PDF the underlying link is corrupted.
>>>>>> 
>>>>>> \setupinteraction[state=start]
>>>>>> \starttext
>>>>>> \goto{https://www.mclpcb.com/blog/polyimide-pcb-material-information-fr4-vs-polyimide-pcb/}
>>>>>> [https://www.mclpcb.com/blog/polyimide-pcb-material-information-fr4-vs-polyimide-pcb/)]
>>>>>> \stoptext
>>>>>> 
>>>>>> Becomes: 
>>>>>> https://www.mclpcb.com/blog/polyimide%C2%ADpcb%C2%ADmaterial%C2%ADinformation%C2%ADfr4-vs%C2%ADpolyimide-
>>>>>> Note hyphens changed and the URL has been truncated.
>>>>> 
>>>>> To get working links use url(...) for the second argument and to ensure 
>>>>> it breaks at proper points in text use \hyphenatedurl{...} for the first 
>>>>> argument.
>>>>> 
>>>>> \goto{\hyphenatedurl{...}}[url(...)]
>>>> Thanks Wolfgang and Hans - it was the missing url(...) where I seem to be 
>>>> incapable of cutting and pasting properly anymore!
>>>> I'm surprised that anything got put into the PDF - if it had failed to do 
>>>> anything then I think I would have realised. Oh well, live and learn. :-)
>>> 
>>> try without the url ... and \nopdfcompression .. there is no annot in the 
>>> pdf file ... it's likely your pdf viewer trying to be smart by interpreting 
>>> the page (text) stream (i really don't like such features)
>> You're right except that it is the Mac's built-in "url detector" that is 
>> going wrong as several PDF viewers get it wrong in the same way.
>> But there is something quirky about the typesetting that Context is doing 
>> because, in the following MWE, link-detection goes wrong for the first but 
>> works for the second.
>> \setuppapersize[A3]
>> \starttext
>> https://www.mclpcb.com/blog/polyimide-pcb-material-information-fr4-vs-polyimide-pcb/
>> \par
>> \setupalign[nothyphenated]
>> https://www.mclpcb.com/blog/polyimide-pcb-material-information-fr4-vs-polyimide-pcb/
>> \stoptext
>> If I open the pdf created by the above in LibreOffice Writer then all the 
>> hyphens in the first url are removed apart from the one after "fr4". These 
>> removed hyphens are the same ones that get converted to %C2%AD in the bad 
>> URL created by the Mac. Similarly a select-all, copy then paste into a text 
>> editor also has the hyphens missing as per LibreOffice Writer. The second URL is 
>> fine.
>> It can go to the very end of your "things to look at" list as it's hardly 
>> urgent and the work-around is as shown. I just note it here for the mailing 
>> list in case someone in the future wonders why urls copied from their PDFs 
>> sometimes don't work.
> 
> The problem is the following:
> 
> 1 - when we hyphenate hyphens get injected based on patterns
> 2 - when we use a - it becomes a discretionary unless we disable hyphenation
> 3 - a \- becomes a discretionary anyway
> 
> so that's the input. Then we render a - or not and there are two variants 
> then: 0x2D and 0xAD. Currently we mark your -'s as 0xAD in the pdf but i can 
> limit that to syllable discretionaries only (case 1), so i'll do that instead.
> 
> Then there is the cut'n'paste as well as interpretation as url (or whatever 
> we don't know about) in viewers ... and that is a real inconsistent mess. So, 
> in the end, what works at your end can be different at mine. One can argue 
> that soft hyphens should connect snippets when cutting (so across lines) 
> and disappear when in an url.
> 
> Whatever we do, it will never be perfect.
> 

It hadn't occurred to me that the hyphenation mechanism caused *all* hard 
hyphens present in the source to be replaced with the soft-hyphen character. 
Aside from URLs, this means that any text with a natural hyphen, such as a name 
like Olivia Newton-John, won't copy and paste correctly from the PDF.
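
For the archive: the %C2%AD in the broken link is simply the percent-encoded 
UTF-8 form of U+00AD (soft hyphen). Here is a minimal Python sketch of that, 
assuming the link detector just percent-encodes whatever text it extracts from 
the page (I don't know what it really does internally; the variable names are 
mine):

```python
from urllib.parse import quote

SOFT_HYPHEN = "\u00ad"  # U+00AD, the 0xAD variant written to the PDF

# Assumption for illustration: every hyphen in this URL fragment ended up
# in the PDF text as a soft hyphen instead of 0x2D.
original = "polyimide-pcb-material"
as_extracted = original.replace("-", SOFT_HYPHEN)

# U+00AD is two bytes in UTF-8 (0xC2 0xAD), so percent-encoding
# turns each soft hyphen into %C2%AD.
encoded = quote(as_extracted)
print(encoded)  # polyimide%C2%ADpcb%C2%ADmaterial
```

That matches the pattern in the corrupted link I reported at the start of the 
thread.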

I appreciate that when TeX was being designed, cutting and pasting from a PDF 
wasn't something that could be foreseen, but is there any reason why 0x2Ds 
can't be left unchanged and hyphens added by Context use 0xAD? Seems too simple.
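
To make the suggestion concrete, here is a toy model in Python. It has nothing 
to do with how ConTeXt is actually implemented: the function names and the 
"typed" / "pattern" labels are invented purely for illustration, and it 
deliberately leaves out the \- case since I'm not sure how that should be 
treated.

```python
HYPHEN_MINUS = "\u002d"  # 0x2D, a hyphen typed in the source
SOFT_HYPHEN = "\u00ad"   # 0xAD, a hyphen injected by hyphenation patterns


def hyphen_for_pdf(origin):
    """Pick the character recorded in the PDF text for a hyphen.

    `origin` is a hypothetical label: "typed" or "pattern".
    """
    if origin == "typed":
        return HYPHEN_MINUS
    if origin == "pattern":
        return SOFT_HYPHEN
    raise ValueError(f"unknown hyphen origin: {origin!r}")


def text_layer(tokens):
    """Join (text, origin) pairs; origin is None for ordinary text."""
    return "".join(
        text if origin is None else hyphen_for_pdf(origin)
        for text, origin in tokens
    )


# A typed hyphen survives as 0x2D, so the name copies intact.
name = text_layer([("Newton", None), ("-", "typed"), ("John", None)])
# A pattern-based break point is recorded as a soft hyphen.
word = text_layer([("hyphen", None), ("-", "pattern"), ("ation", None)])

print(name)        # Newton-John
print(repr(word))  # 'hyphen\xadation'
```

With that split a viewer could drop or keep the soft hyphens as it sees fit 
without ever touching a hyphen that was in the source.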

—
Bruce Horrocks
Hampshire, UK
