Mael Hilléreau wrote:
So inset-type would be a nice higher level, because it will allow me
to easily do what I usually want; but we still need to account for
exceptions, which inset-type can't do. (Don't say "we can have a
special 'ignore spelling' inset": I think it will be hard to correctly
implement the latex output method for such an inset.
It would be simpler than for branches, no?
You almost convinced me here. However, I just tried this with branches
now, and it turns out that exactly where I expected trouble, there are
in fact bugs with the interaction of the branch inset and BiDi text,
*because* the branch is in an inset. Nothing major, it's a convoluted
scenario that I tried, I don't really think anyone will want to try it
with branches. But nonetheless, I still contend that doing this with an
inset is more complicated and error-prone than doing it with character
attributes.
What should happen in terms of latex output in this case is absolutely
nothing: it should be as if the inset weren't there; but I think that
it would be hard to achieve this "nothing" in the current
architecture, because there are too many things which *do* happen when
a new inset is started --- just look at the relevant code...)
Is that true for e.g. notes or comments as well?
I think I didn't explain myself clearly. What I meant is, I'm thinking
of text which goes like "abc def ghi", and now someone comes and says:
"def" is not a word, and so decides to mark it as ignore-spelling. So he
puts an inset around it, and then we have "abc [def] ghi". But the
output of "abc [def] ghi" should look *exactly* the same as the output
of "abc def ghi". I think that so far we all agree on this.
The same thing, BTW, should hold for branches. Say that the 'def' is
only in a branch. Well, the output of that branch should look as if the
text were "abc def ghi", without the inset there. Right?
The problem is, this is not working --- even now with branches, as I
just found out thanks to your question --- in certain cases which
involve Bidi text (and maybe other kinds of transitions). In other
words, in these situations, for "abc def ghi" I get one output, and for
"abc [def] ghi" (in which the branch is activated, of course) I get
*different* output. So I'm not saying that we should now go and
implement branches as character attributes rather than insets (though
that may actually not be a bad idea...;) ); but if we're doing this
again in another situation, I say we keep it simple this time.
Regarding character-styles, I have two half-objections to using this:
(a) I'm not really sure that character styles are where the concept of
ignoring the spell-checker belongs. I see character styles as a tool
for semantic markup, whereas ignore spelling is not, IMO --- although I
agree this may be debatable --- semantic, but technical. And mixing
concepts is a bad idea, even if today I can't point to a specific
reason why.
Perhaps one could say that ignoring spellcheck shouldn't be confused
with a font attribute...
Very true, I think I said as much myself last night. But I still think
it is slightly more appropriate here than in an inset.
In the end, there's no one good reason why I prefer character attributes
--- if there were such a reason, it would be easy to convince everyone
of it ;) --- but it's just a feeling of "clunkiness" that surrounds the
more complicated solutions, once we start dealing with more and more
cases...
(b) I'm not familiar enough with character styles, so you'll have to
help me here: is it possible to define a character style which leaves
everything exactly as it is, and only changes a single attribute (in
our case: the spelling)?
Of course.
If so, then OK (that's why this is only a half-objection ;) ); if not,
though, then we'd need to theoretically double the number of character
styles: for every existing style, we would now have one with spelling
and one without...
Is embedding disallowed for them?
Right, that's actually the important question: whether or not more than
one character style can be applied to the same text. If not, then we're
still stuck...
If yes, forget about them; the special
inset could be displayed in an inline form. I mean something like this:
          ________
This is a | word | not spellchecked.
          --------
                ____________________________________________
The end of this | phrase is not spellchecked, and its length
_______________                 _____________________________
makes it larger than screen. |
------------------------------
Screen redrawing doesn't need anything more, frame and background are
already visible enough. You'd like a green underline? Then what
happens for already blue underlined text, can we stack those lines?
If no, further coding will be necessary.
This is really a non-issue, Mael. Attached is a patch which will deal
with this part of the problem.
(The patch assumes a Font attribute and method called
ignore_spelling(), along the lines of what I suggested last night at
http://permalink.gmane.org/gmane.editors.lyx.devel/91874. To tell the
truth, I think I would prefer to have another new function, something
like 'paintSpecialMarkings' which would call both paintForeignMark and
paintIgnoreSpelling --- but this is just a proof-of-concept patch. And
no, don't expect it to compile! ;) ).
How can overlapping underlines be managed?
Oh, the patch I submitted doesn't show enough of the context; but if
you look at the surrounding code, you'll see that all I did was
to copy the functions for language to our case; and then you can see
that for the language, the line is drawn at: y = yo_ + 1 + desc, whereas
in this patch it's drawn at: y = yo_ + 2 + desc. We have to see how it
looks on screen, but really, this is not a major problem...
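To make the offsets concrete, here is a minimal sketch of the idea, not the patch itself. The enum and the function markingY are hypothetical names introduced for illustration; yo stands in for yo_ from the painter, and desc is assumed to be the font descent. Only the +1 and +2 offsets come from the text above.

```cpp
// Hypothetical kinds of per-character marking drawn under the text.
enum class Marking { ForeignLanguage, IgnoreSpelling };

// Vertical position of the marking line. The +1 / +2 offsets are the
// ones quoted above; everything else is an assumption for illustration.
int markingY(Marking m, int yo, int desc)
{
    switch (m) {
    case Marking::ForeignLanguage:
        return yo + 1 + desc;   // existing foreign-language mark
    case Marking::IgnoreSpelling:
        return yo + 2 + desc;   // proposed ignore-spelling mark, one pixel lower
    }
    return yo + desc;           // not reached for the two kinds above
}
```

With these offsets the two lines never share a pixel row, so text carrying both markings would simply show two stacked lines. Whether that looks acceptable still has to be checked on screen.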
Then this is applied at note creation time too.
What about my old document that I created without this option?
Should I "remake" all notes?
It is usually ok that new features are only available after
you get the new LyX. :-)
Ok... but it's better if you can make new features available in new
documents. In fact, this could be possible with a per-character
approach by clicking on a document setting (find all notes in the
doc and mark them as not spellchecked)...
Almost any way we deal with this will entail a format change, and
together with that will be a lyx2lyx function (or XSLT?) to convert to
the new format. At that point, *if* we decide (and this is a big if,
I'm not at all sure I think this is correct) that every note should be
marked as ignore-spelling, then we could do that at the time of the
conversion.
Indeed. I see no difference between low- and high-level approaches here.
Settings applied at inset creation time keep
the spellchecker logic (and screen display logic) relatively simple.
Ok, but it makes creation less simple...
Indeed. It is all about "where to put the complexity/slowness".
Screen drawing should be kept fast and simple - it is slow enough
as it is already. At least on some machines. Actually,
put an ERT in a table cell and it is usually slow enough anywhere.
We could just use insets and layouts in a way similar to the way
they're used normally. The special inset wouldn't require much more
processing, even though its display could differ a bit from normal
insets (e.g. lines not broken).
The main window contents get painted often - not the
place where you want any extra complexity.
I don't understand.
It's a question of adding complexity once (when the inset is created)
versus added complexity at screen painting, which happens every time
the cursor moves, i.e., all the time...
But the truth is, I'm much more worried about the next point than
about the efficiency.
Ok.
If something extra happens at note creation time, then the delay
will be small because only that one note inset gets a treatment.
If this has to happen at spellcheck time, then we get all these delays
at the same time. Perhaps this doesn't matter much for efficiency,
but there is code maintainability too.
But what you propose is to replace one piece of type-level information
(nospellcheck) by two processes (one at creation time, the other at
a click on a document setting), the spellchecker test being required in
all cases (however, doing this test on a per-type basis would require
less computation than on a per-word or per-character basis). In
addition, until now we have only considered notes. What about comments,
branches and layouts? The degree of abstraction is just too low to deal
with all these things, and maintainability is clearly worse!! I'd prefer
to maintain one piece of information rather than n processes!
The point is, though, that the ignore spell check will have a very
clear, simple interface: a character attribute. When we later decide
that actually this or that situation should or should not be marked
ignore-spelling, we don't have to touch the core again (the core, in this
case, being the spell-checker itself; the lyx buffer; the file format;
the painting; etc.) --- we just have small methods here and there
which are in a one-to-one relation with the new situation. That's much
more maintainable than adding longer and longer lists of situations to
what I called "the core".
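A toy model of that split might look like the sketch below. None of these names exist in LyX: Char, Paragraph, makeParagraph, markIgnoreSpelling and wordsToCheck are all hypothetical, and the word splitting is deliberately naive (spaces only). It is only meant to show the shape of the interface, with the "core" reading one boolean and the per-situation code setting it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-character record: the character plus one attribute.
struct Char {
    char c = ' ';
    bool ignore_spelling = false;
};

using Paragraph = std::vector<Char>;

// Build a paragraph with no character marked.
Paragraph makeParagraph(std::string const & s)
{
    Paragraph par;
    for (char c : s) {
        Char ch;
        ch.c = c;
        par.push_back(ch);
    }
    return par;
}

// A "small method" for one situation: mark the range [from, to).
void markIgnoreSpelling(Paragraph & par, std::size_t from, std::size_t to)
{
    for (std::size_t i = from; i < to && i < par.size(); ++i)
        par[i].ignore_spelling = true;
}

// The "core": collect the words the spell-checker should look at.
// It consults only the attribute and never needs to know why it was set.
std::vector<std::string> wordsToCheck(Paragraph const & par)
{
    std::vector<std::string> words;
    std::string cur;
    bool skip = false;  // true if any character of the current word is marked
    for (Char const & ch : par) {
        if (ch.c == ' ') {
            if (!cur.empty() && !skip)
                words.push_back(cur);
            cur.clear();
            skip = false;
        } else {
            cur += ch.c;
            skip = skip || ch.ignore_spelling;
        }
    }
    if (!cur.empty() && !skip)
        words.push_back(cur);
    return words;
}
```

For "abc def ghi" with "def" marked, wordsToCheck returns only "abc" and "ghi". A later decision such as "every note is ignore-spelling" would then be one more small function that calls markIgnoreSpelling, leaving wordsToCheck untouched.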
Hence a low-level approach is more maintainable than a high-level
approach. I learned something today ;)
In general, I think that one of the most complicated areas in software
development, where the most errors creep in, is the interfaces between
different (sub)components. So as a general rule, keeping the interfaces
as simple as possible can usually help in this respect...
Anyway, don't forget that a char-based approach won't address all needs;
remember the 4 needs I mentioned in a previous post
(http://permalink.gmane.org/gmane.editors.lyx.devel/91840). Only point
2. is fully addressed.
Certainly. All along I've said that I like your patches, and we *will*
need some sort of higher level in order to make this *easy* to use. But
once the basis is in, I think this should be possible, and not much more
complicated than your original patch.
Mael.
Dov