
[Lynx-dev] fix for decoding utf-8 in CDATA sections

2023-07-27 Hiltjo Posthuma

Hi,

I use lynx to convert HTML to plain-text, but noticed an issue where part of
the output is missing with UTF-8 in CDATA sections.

Below is a small test-case to reproduce it:

Works correctly: a’b

Doesn't work correctly:

The byte sequence for this UTF-8 code point is:

printf '\342\200\231'
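For reference, the three octal bytes in the printf command above are the UTF-8 encoding of U+2019 (RIGHT SINGLE QUOTATION MARK), the same character that appears in "a’b". The following is only a minimal Python sketch of how such a three-byte sequence decodes; it is not taken from the lynx sources, and the helper name decode_three_byte is hypothetical.

```python
# Illustrative sketch only: decode one three-byte UTF-8 sequence by hand.
# decode_three_byte is a hypothetical helper, not part of lynx.

def decode_three_byte(seq: bytes) -> int:
    """Return the code point encoded by a three-byte UTF-8 sequence."""
    if len(seq) != 3:
        raise ValueError("expected exactly three bytes")
    lead, c1, c2 = seq
    # A three-byte lead byte has the bit pattern 1110xxxx.
    if lead & 0xF0 != 0xE0:
        raise ValueError("not a three-byte lead byte")
    # Each continuation byte has the bit pattern 10xxxxxx.
    if c1 & 0xC0 != 0x80 or c2 & 0xC0 != 0x80:
        raise ValueError("invalid continuation byte")
    # Concatenate the 4 + 6 + 6 payload bits.
    return ((lead & 0x0F) << 12) | ((c1 & 0x3F) << 6) | (c2 & 0x3F)


# Same octal escapes as in: printf '\342\200\231'
raw = b"\342\200\231"

code_point = decode_three_byte(raw)
print(f"U+{code_point:04X}")              # U+2019
print(raw.decode("utf-8") == "\u2019")    # True
```

The built-in raw.decode("utf-8") gives the same result; the manual version only shows that the first byte (octal 342) announces two continuation bytes, so a decoder that handles these bytes one at a time has to keep that state until the sequence is complete.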

Re: [Lynx-dev] fix for decoding utf-8 in CDATA sections

2023-10-03 Hiltjo Posthuma

On Thu, Jul 27, 2023 at 10:25:13PM +0200, Hiltjo Posthuma wrote:
> Hi,
>
> I use lynx to convert HTML to plain-text, but noticed an issue where part of
> the output is missing with UTF-8 in CDATA sections.
>
> Below is a small test-case to reproduce it:
>
> Works corre