Re: What should $\delta$ be exported as for plain text?

Georg Baum Tue, 23 Jun 2015 13:59:42 -0700

Guenter Milde wrote:

> On 2015-06-21, Georg Baum wrote:
> 
>> If the user entered \delta in a math formula, then this is a small
>> mathematical delta. If he rather needs a text greek delta, he should
>> use δ U+03b4 in the first place.
> 
> This is not as straightforward: we need to be consistent. I can only agree
> to the above, if the following is made true, too:
> 
>   If the user entered x in a math formula, then this is a small
>   mathematical x (1D465 MATHEMATICAL ITALIC SMALL X).
>   If he rather needs a text x, he should use x in a \mathrm or \textrm box
>   in the first place.


Yes, why not do this as well?

> However, practicality beats purity. We need to consider the pros and cons
> of using mathematical alphanumerical characters:
> 
>   +1 clear distinction of variables from other symbols (sin x vs. sin 𝑥)
> 
>   -1 poor support for these "exotic" characters in many
>   fonts/applications:
>      while writing this text, I see just "sin ?" in my editor, while with
>      "sin α", "what I see is what I mean".

This could be fixed by an external converter. If LyX leaves out the 
additional information that this alpha is a math alpha in the first place, 
then it cannot be recovered for the cases where it is needed.

>> Therefore my preference would be:
> 
>> If a math symbol is representable as a math symbol in Unicode
>> (mathcommand in unicodesymbols is not empty, and textcommand is empty),
>> then use 6, else 2.
> 
> Why should this depend on whether textcommand is empty?
> What would you write for
>  0x00b1 "\\textpm" "textcomp" "force" "\pm" # ± PLUS-MINUS SIGN
> say?

I had in mind the different occurrences of \delta as mathcommand. Which one
should be chosen by an automatic algorithm? But you are probably right, we
need a different algorithm for multiple occurrences.

>> It has been decided a long time ago that our plain text output is utf8,
>> and many insets do already take advantage of that.
> 
> However, there are limits:

Of course. Trying to be too clever never works.

> We could, e.g., emulate overstriking, underlines etc. with combining
> Unicode characters but this may stand in the way in some use cases.
> 
> We could write TIPA either in the Latin transcription (tipa shortcuts) or
> using Unicode characters.

Unicode transports more information. If somebody needs TIPA shortcuts, he
can use an external converter. This is not possible if LyX outputs TIPA
shortcuts in the first place, and Unicode TIPA symbols are wanted.
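
The asymmetry can be illustrated with a minimal sketch of such an external
converter. The table below is a deliberately tiny, hypothetical subset of the
TIPA shortcuts, and the function name is made up for this example; a real
converter would need the full table from the TIPA documentation.

```python
# Hypothetical, incomplete table: Unicode IPA character -> TIPA shortcut.
IPA_TO_TIPA = {
    "\u0259": "@",  # schwa
    "\u0283": "S",  # esh
    "\u0292": "Z",  # ezh
    "\u014b": "N",  # eng
    "\u03b8": "T",  # theta
    "\u00f0": "D",  # eth
    "\u025b": "E",  # open e
    "\u0254": "O",  # open o
}


def ipa_to_tipa_shortcuts(text: str) -> str:
    """Replace every known IPA character by its shortcut, keep the rest."""
    return "".join(IPA_TO_TIPA.get(ch, ch) for ch in text)


# Unicode -> shortcuts is a plain table lookup ...
print(ipa_to_tipa_shortcuts("\u0283\u0259"))   # esh + schwa
# ... but the result no longer tells an esh from a literal capital S:
print(ipa_to_tipa_shortcuts("S\u0283"))
```

Once both an esh and a capital S have become "S", no converter can tell them
apart again, which is why the Unicode form is the safer thing to emit.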

> We could write complex formulas as pure ASCII LaTeX code, in mixed
> representations or as "2d representation", with and without delimiters
> and/or backslashes:
> 
> a)    $\tan \alpha = \frac{\sin\alpha}{\cos\alpha}$
> 
> b1)   $\tan 𝛼 = \frac{\sin 𝛼}{\cos 𝛼}$
> 
> b2)   tan 𝛼 = \frac{sin 𝛼}{cos 𝛼}
> 
> b3)   tan 𝛼 = sin 𝛼/cos 𝛼
>    
> c1)                sin \alpha
>       tan \alpha = ----------
>                    cos \alpha
> 
> c2)           sin 𝛼
>       tan 𝛼 = -----
>               cos 𝛼
> 
> The most striking advantage of a) is that it is deterministic.

There is some dead code for c) (drawT and metricsT methods), but IMHO it is 
not surprising that it is dead, since it is very complicated to get it 
right.

> For b), an option would be to use "ASCIIMath" as output format.
> http://www1.chapman.edu/~jipsen/mathml/asciimath.html
> http://www.wjagray.co.uk/maths/ASCIIMathTutorial.html
> 
> However, there are too many cases, where b) or c) are too limited
> (cases, matrices, indices), so we would end up in a mixed representation
> with a lot of corner cases to decide.
> 
>> If the user needs more basic plain text, he can still define a new
>> output format and use e.g. recode to replace the unwanted symbols. This
>> is not something that needs to be done in LyX.
> 
> But this would be reverse-engineering.

No, I don't mean reverse engineering. I mean things like replacing a
mathematical x with a plain one etc. This is a deterministic procedure, and
I have seen such converters, but unfortunately I forgot which ones can do that.
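
A minimal sketch of such a deterministic replacement, assuming Python and its
standard unicodedata module (this is not one of the converters mentioned
above, and the function name is made up for this example). It relies on the
fact that the characters in the Mathematical Alphanumeric Symbols block
(U+1D400..U+1D7FF) carry a compatibility mapping to their plain counterparts,
which NFKC normalization applies.

```python
import unicodedata

MATH_ALNUM_FIRST = 0x1D400
MATH_ALNUM_LAST = 0x1D7FF


def flatten_math_alphanumerics(text: str) -> str:
    """Replace mathematical letters and digits by their plain counterparts.

    Only characters in the Mathematical Alphanumeric Symbols block are
    normalized, so all other characters are passed through untouched.
    """
    out = []
    for ch in text:
        if MATH_ALNUM_FIRST <= ord(ch) <= MATH_ALNUM_LAST:
            out.append(unicodedata.normalize("NFKC", ch))
        else:
            out.append(ch)
    return "".join(out)


# MATHEMATICAL ITALIC SMALL X and MATHEMATICAL ITALIC SMALL ALPHA
print(flatten_math_alphanumerics("sin \U0001D465, tan \U0001D6FC"))
```

The opposite direction is not deterministic: once the output contains only a
plain x, nothing says whether it was a variable or ordinary text.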

> Post-processing the output becomes
> a nightmare if you cannot predict the format emitted by LyX. With a), you
> can use existing "LaTeX math to anything" converters.

I believe you try to make the complicated cases work, while I try to
concentrate on the simple ones. IMHO, if a "LaTeX math to anything"
converter is applied to plain text output from LyX, then the workflow is
suboptimal. In this case, the converter should be applied to LaTeX output in
the first place, which would give better results. We should not forget that
the LyX->plain text conversion is a lossy one. It is impossible to convert
all possible LyX content correctly.

My motivation for making the simple cases work is that I have been writing 
some text recently with a lot of rather simple formulas which would be 
perfectly representable in plain text.

Maybe we need a switch like for HTML export, so that the user can decide 
whether he wants Unicode if possible or always LaTeX (but I would really
like to see a use case for LaTeX in plain text where a true LaTeX export is 
_not_ better).
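
What such a switch could look like, as a rough sketch only: the function
name, the prefer_unicode parameter and the two-entry symbol table are all
hypothetical and do not correspond to existing LyX code. Commands without a
known Unicode math symbol are left as LaTeX, which is the "if possible" part.

```python
import re

# Hypothetical, tiny table: LaTeX math command -> Unicode math symbol.
MATH_UNICODE = {
    "\\alpha": "\U0001D6FC",  # MATHEMATICAL ITALIC SMALL ALPHA
    "\\delta": "\U0001D6FF",  # MATHEMATICAL ITALIC SMALL DELTA
}


def export_math(latex: str, prefer_unicode: bool = True) -> str:
    """Export a formula as plain text.

    prefer_unicode=False: always emit the LaTeX source between $ delimiters.
    prefer_unicode=True:  replace every command found in MATH_UNICODE and
                          keep all other commands as LaTeX.
    """
    if not prefer_unicode:
        return "$" + latex + "$"
    return re.sub(
        r"\\[A-Za-z]+",
        lambda m: MATH_UNICODE.get(m.group(0), m.group(0)),
        latex,
    )


print(export_math(r"\tan \alpha"))                  # Unicode where possible
print(export_math(r"\delta", prefer_unicode=False))  # always LaTeX
```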


Georg
