On 25/11/2022 at 20:28, Luca Fascione wrote:
> On Fri, 25 Nov 2022, 18:11 Jean Abou Samra, <j...@abou-samra.fr> wrote:
>> What makes you think Pygments can’t do this? You can do (?<=\w+)\d+
>
> Nothing but my not remembering lookaheads/lookbehinds, which I may argue aren't very common constructs. In fact, aside from Perl, I'm not even sure what precedent they have (no, Python doesn't count). Besides, this has nothing to do with Pygments: this is the regex matching engine doing its thing, and Pygments just gratefully receives the benefit.
Well, reusing a feature found in the underlying tools is not bad design, it is good design that shares functionality instead of reinventing the wheel. (Sorry, I co-maintain Pygments, which is why I am a bit sensitive to this "bad design" criticism.)
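A side note for anyone trying the lookbehind above with Python's re module, which is what Pygments compiles its rules with: lookbehind patterns there must have a fixed width, so a variable-width form like (?<=\w+) is rejected at compile time. A fixed-width variant still works. The sketch below assumes, purely for illustration, that a pitch ends in one of the letters a to g; the names DURATION_AFTER_PITCH and find_durations are made up for this example.

```python
import re

# Fixed-width lookbehind: a run of digits immediately preceded by a note letter.
# The class [a-g] is an illustrative assumption, not LilyPond's full pitch syntax.
DURATION_AFTER_PITCH = re.compile(r"(?<=[a-g])\d+")


def find_durations(source):
    """Return the digit runs that directly follow a note letter."""
    return DURATION_AFTER_PITCH.findall(source)


# Check whether the variable-width form compiles with the re module.
try:
    re.compile(r"(?<=\w+)\d+")
    variable_width_ok = True
except re.error:
    variable_width_ok = False

durations = find_durations("c4 d8 \\barNumberCheck 1 e16")
```

Here the "1" after \barNumberCheck is not picked up, because it is preceded by a space rather than a note letter.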
>> and things like that. You could also arrange things so that the regex parsing a pitch leaves you in a state of the lexer where something special will happen for \d+
>
> This does sound like Pygments code. Interesting, I wasn't aware you could mess with the state of the lexer to that depth.
Hrrm... It's not an advanced feature, it's really the basic way Pygments lexers work. You have a set of states, and the lexer has a state stack; each state tries its regex-based rules in turn, and a matching rule can push a state onto the stack or pop one off it. This example would be done as

    tokens = {
        "root": [
            ...
            # [a-z]+ rather than \w+, which would also swallow the digits
            (r"[a-z]+", Token.Pitch, "after_note"),
            ...
        ],
        "after_note": [
            (r"\d+", Token.Duration, "#pop"),
            ...
            default("#pop"),
        ],
        ...
    }

In simple cases (if there is no complex stuff in the "after_note" state), you can also get along with

    tokens = {
        "root": [
            ...
            (r"([a-z]+)(\d*)", bygroups(Token.Pitch, Token.Duration)),
            ...
        ],
        ...
    }

which in hindsight may be closer to what you were thinking of originally.
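To make the state-stack mechanism concrete, here is a toy model of it using only the standard re module. It is a sketch of the idea, not Pygments' actual implementation: the rule table, the token names and the lex function are all made up for illustration, and the empty-pattern rule stands in for default("#pop").

```python
import re

# Each rule is (regex, token name, action). The action is None, "#pop",
# or the name of a state to push. Rules and token names are illustrative only.
RULES = {
    "root": [
        (r"[a-g]+", "Pitch", "after_note"),
        (r"\d+", "Number", None),
        (r"\\\w+", "Command", None),
        (r"\s+", "Whitespace", None),
    ],
    "after_note": [
        (r"\d+", "Duration", "#pop"),
        (r"", None, "#pop"),  # matches the empty string: leave the state
    ],
}


def lex(text):
    """Tokenize text by trying the rules of the state on top of the stack."""
    stack = ["root"]
    pos = 0
    tokens = []
    while pos < len(text):
        for pattern, token, action in RULES[stack[-1]]:
            match = re.compile(pattern).match(text, pos)
            if match is None:
                continue
            if token is not None:
                tokens.append((token, match.group()))
            pos = match.end()
            if action == "#pop":
                stack.pop()
            elif action is not None:
                stack.append(action)
            break
        else:
            # No rule matched: emit one character as an error and move on.
            tokens.append(("Error", text[pos]))
            pos += 1
    return tokens


result = lex("c4 \\barNumberCheck 1")
```

With this table, the "4" in "c4" comes out as a Duration because the lexer is in the "after_note" state when it sees it, while the "1" after \barNumberCheck is seen in "root" and comes out as a plain Number.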
>> However, durations don’t always follow a pitch, as in
>>
>>     \tuplet 3/2 8. { … }
>>
>> which is the reason why we don’t want to do that.
>
> Does LilyPond's parser even know that's a duration? Isn't that just a bare string that \tuplet internally interprets as a duration?
\tuplet is defined (in ly/music-functions-init.ly) as

    tuplet =
    #(define-music-function (ratio tuplet-span music)
       (fraction? (ly:duration? '()) ly:music?)
       ...)

When the parser sees "8", it notes that this could be either a number or a duration, so it tries the different variants against the predicate ly:duration?. The function receives an argument of the right type thanks to the predicate it declares for this argument. If you wanted to do that in Pygments, you would have to know the signature of every LilyPond music function and which predicates match numbers or durations, not to mention the problem of user-written functions.
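As a rough sketch of what that would cost a highlighter, the lookup could look something like the following. Everything here is hypothetical: the SIGNATURES table and classify_bare_digits are invented for illustration, only the \tuplet entry is taken from the definition above, the \barNumberCheck entry is an assumption, and optional arguments are ignored entirely.

```python
# Hypothetical table: function name -> predicate names for its argument slots.
# Only the \tuplet entry comes from the quoted definition; the rest is assumed.
SIGNATURES = {
    "\\tuplet": ["fraction?", "ly:duration?", "ly:music?"],
    "\\barNumberCheck": ["integer?"],
}


def classify_bare_digits(function_name, arg_index):
    """Guess how bare digits in the given argument slot should be highlighted."""
    predicates = SIGNATURES.get(function_name)
    if predicates is None or arg_index >= len(predicates):
        # Typically a user-written function that the table cannot know about.
        return "Unknown"
    if predicates[arg_index] == "ly:duration?":
        return "Duration"
    return "Number"


tuplet_slot = classify_bare_digits("\\tuplet", 1)
bar_check_slot = classify_bare_digits("\\barNumberCheck", 0)
user_slot = classify_bare_digits("\\myFunction", 0)
```

The table has to be maintained by hand for every built-in function, and the last call shows where it gives up: a function defined in the user's own file has no entry.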
> When implementing this kind of simplistic syntax highlighting (that is, highlighting not assisted by awareness of the language's semantics, as you'd have in Visual Studio or Qt Creator, say), there's always the problem of how much of the common libraries you reimplement by hand. I'm not sure how Frescobaldi does its thing, for example; a lot of it seems quite magic to me (or the result of a huge labour of love... I mean, that program is just brilliant).
>
> Anyway, whatever Frescobaldi does, I wonder if we could mimic it for Pygments...
What Frescobaldi does is here:

https://github.com/frescobaldi/python-ly/blob/master/ly/lex/lilypond.py

1500+ lines of code, obviously a lot of work and dedication. Nevertheless, it has to make assumptions too. For example, if you enter this in Frescobaldi:

    \version "2.22.2"

    {
      \barNumberCheck 1
      \tweak duration-log 2 c'1
    }

... you will notice that the "1" after \barNumberCheck is highlighted in the same color as the duration in "c'1", in spite of it being a number like the "2" in "\tweak duration-log 2 ...".

On the reasons not to reuse Frescobaldi's code for syntax highlighting in the documentation, see

https://lists.gnu.org/archive/html/lilypond-devel/2022-10/msg00207.html

Jean