On 2016-11-07, Jean-Marc Lasgouttes wrote:
> On 06/11/2016 at 14:30, Jean-Marc Lasgouttes wrote:
>> This is a more radical approach than what I have in mind, and I do not
>> know whether it is safe. My idea was to modify the Row building code and
>> replace the character with some visual cue (in addition to the row
>> breaking), because I am not confident in sending this character to Qt
>> string rendering functions.

>> I'll propose something shortly.

> Finally, I convinced myself that your approach is correct if we want to
> keep the breaks. In the following patch I add some on-screen hints of
> what is going on. I could use a color for the characters, but I am not
> sure what to do, these are actual characters, not insets. A solution
> could be to add a frame around the characters.

I'd rather convert them to the "usual LyX representations" 

  \begin_inset Newline newline
  \end_inset

and

  \begin_layout <type of the last layout>
  
whenever possible. In my understanding,
http://unicode.org/versions/Unicode5.2.0/ch05.pdf recommends just this:
interpret these characters as unambiguous representations of a line break
and paragraph break.
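
A rough sketch of that conversion, for illustration only: this is not LyX's
import code, and the function name to_lyx_paragraphs and its layout argument
(standing in for the "type of the last layout") are made up here.

```python
LINE_SEP = "\u2028"   # LINE SEPARATOR
PARA_SEP = "\u2029"   # PARAGRAPH SEPARATOR

# The Newline inset as written in the message above.
NEWLINE_INSET = "\n\\begin_inset Newline newline\n\\end_inset\n"


def to_lyx_paragraphs(text, layout="Standard"):
    """Turn U+2029 into layout boundaries and U+2028 into Newline insets.

    Hypothetical helper: `layout` is the layout type reused for every
    paragraph produced by the split.
    """
    paragraphs = []
    for para in text.split(PARA_SEP):
        # Each LINE SEPARATOR inside a paragraph becomes one Newline inset.
        body = NEWLINE_INSET.join(para.split(LINE_SEP))
        paragraphs.append(f"\\begin_layout {layout}\n{body}\n\\end_layout\n")
    return "\n".join(paragraphs)


if __name__ == "__main__":
    print(to_lyx_paragraphs("first line\u2028second line\u2029next paragraph"))
```

Neither separator survives in the result, so nothing would have to be passed
on to the Qt string rendering functions.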

> The next problem is running LaTeX. By default, these characters are not 
> accepted. Could our local latex+unicode experts tell us whether it makes 
> any sense to handle these characters in LaTeX or whether nobody cares 
> and they should be ignored on output?

> I suspect that adding them to lib/unicodesymbols would do more harm than 
> good.

We could handle them similarly to other characters that have a corresponding
LyX inset, e.g. spaces:

\begin_inset space ~
\end_inset

corresponds to

0x00a0 "~"                        "" "notermination=both" "~" "" # NO-BREAK SPACE

or to other special characters like
\SpecialChar nobreakdash or \SpecialChar softhyphen

0x2011 "\\nobreakdash-"           "amsmath" "notermination=text" "" "" # NON-BREAKING HYPHEN

0x00ad "\\-"                      "" "notermination=text" "" "" # SOFT HYPHEN

In both cases, lib/unicodesymbols has "fallback definitions" in case the
literal Unicode character is still in the document.

As the meaning of LINE SEPARATOR and PARAGRAPH SEPARATOR is clear from
http://unicode.org/versions/Unicode5.2.0/ch05.pdf
we can transform them to the corresponding LaTeX representation:

0x2028 "\\\\"                      "" "" "" "" # LINE SEPARATOR
0x2029 "\\par"                     "" "" "" "" # PARAGRAPH SEPARATOR
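
To show what such entries would do on output, here is a minimal sketch. It is
not the LyX implementation: parse_entries and to_latex are made-up names, the
column order (code point, text command, text preamble, flags, math command,
math preamble) is assumed from the lines quoted above, and appending "{}"
after a command that ends in a letter is only a simplified stand-in for the
real termination handling.

```python
import shlex

# Sample entries in the layout quoted above; the raw string keeps the
# backslashes exactly as they would appear in the file.
SAMPLE = r'''
0x00ad "\\-"    "" "notermination=text" "" "" # SOFT HYPHEN
0x2028 "\\\\"   "" "" "" "" # LINE SEPARATOR
0x2029 "\\par"  "" "" "" "" # PARAGRAPH SEPARATOR
'''


def parse_entries(text):
    """Map code point -> (text command, flags) for each entry line."""
    table = {}
    for line in text.splitlines():
        # shlex unquotes the fields, resolves "\\" to "\" and drops the
        # trailing "#" comment.
        fields = shlex.split(line, comments=True)
        if len(fields) < 4:
            continue  # blank or incomplete line
        table[int(fields[0], 16)] = (fields[1], fields[3])
    return table


def to_latex(text, table):
    """Replace every character that has an entry by its text command."""
    out = []
    for ch in text:
        entry = table.get(ord(ch))
        if entry is None:
            out.append(ch)
            continue
        command, flags = entry
        # Assumption: a command ending in a letter must be terminated so
        # that it does not swallow the next character ("\par" + "c").
        if command[-1:].isalpha() and "notermination" not in flags:
            command += "{}"
        out.append(command)
    return "".join(out)


if __name__ == "__main__":
    print(to_latex("a\u2028b\u2029c", parse_entries(SAMPLE)))
```

With the two proposed entries, a literal LINE SEPARATOR would come out as
\\ and a literal PARAGRAPH SEPARATOR as \par, mirroring the fallbacks for the
no-break space and the hyphens.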


Günter
