Egun on Ekaitz, Ekaitz Zarraga <eka...@elenq.tech> skribis:
> - What it is true is if people were using `peg-as-peg` for their > things, which isn't even exported by the module, their code would > have problems. But we can't predict that. Since ‘peg-as-peg’ has always been private, that’s fine. > From 81944c2037e0d6d8c972def2ceb1aa4c51a61431 Mon Sep 17 00:00:00 2001 > From: Ekaitz Zarraga <eka...@elenq.tech> > Date: Wed, 11 Sep 2024 21:19:26 +0200 > Subject: [PATCH v3 1/2] PEG: Add full support for PEG + some extensions > > This commit adds support for PEG as described in: > > <https://bford.info/pub/lang/peg.pdf> > > It adds support for the missing features (comments, underscores in > identifiers and escaping) while keeping the extensions (dashes in > identifiers, < and <--). > > The naming system tries to be as close as possible to the one proposed > in the paper. > > * module/ice-9/peg/string-peg.scm: Rewrite PEG parser. > * test-suite/tests/peg.test: Fix import [...] > +(define (Primary->defn lst for-syntax) > + (let ((value (second lst))) > + (case (car value) > + ('DOT #'peg-any) > + ('Identifier (Identifier->defn value for-syntax)) > + ('Expression (Expression->defn value for-syntax)) > + ('Literal (Literal->defn value for-syntax)) > + ('Class (Class->defn value for-syntax))))) I get these compiler warnings: --8<---------------cut here---------------start------------->8--- ice-9/peg/string-peg.scm:258:7: warning: duplicate datum quote in clause ((quote NOT) (quasisyntax (not-followed-by (unsyntax (Suffix->defn (third lst) for-syntax))))) of case expression (case (car suffix) ((quote AND) (quasisyntax (followed-by (unsyntax (Suffix->defn (third lst) for-syntax))))) ((quote NOT) (quasisyntax (not-followed-by (unsyntax (Suffix->defn (third lst) for-syntax))))) (else (Suffix->defn suffix for-syntax))) ice-9/peg/string-peg.scm:277:9: warning: duplicate datum quote in clause ((quote STAR) (quasisyntax (* (unsyntax out)))) of case expression (case (caar extra) ((quote QUESTION) (quasisyntax (? (unsyntax out)))) ((quote STAR) (quasisyntax (* (unsyntax out)))) ((quote PLUS) (quasisyntax (+ (unsyntax out))))) ice-9/peg/string-peg.scm:278:9: warning: duplicate datum quote in clause ((quote PLUS) (quasisyntax (+ (unsyntax out)))) of case expression (case (caar extra) ((quote QUESTION) (quasisyntax (? (unsyntax out)))) ((quote STAR) (quasisyntax (* (unsyntax out)))) ((quote PLUS) (quasisyntax (+ (unsyntax out))))) ice-9/peg/string-peg.scm:284:7: warning: duplicate datum quote in clause ((quote Identifier) (Identifier->defn value for-syntax)) of case expression (case (car value) ((quote DOT) (syntax peg-any)) ((quote Identifier) (Identifier->defn value for-syntax)) ((quote Expression) (Expression->defn value for-syntax)) ((quote Literal) (Literal->defn value for-syntax)) ((quote NotInClass) (NotInClass->defn value for-syntax)) ((quote Class) (Class->defn value for-syntax))) ice-9/peg/string-peg.scm:285:7: warning: duplicate datum quote in clause ((quote Expression) (Expression->defn value for-syntax)) of case expression (case (car value) ((quote DOT) (syntax peg-any)) ((quote Identifier) (Identifier->defn value for-syntax)) ((quote Expression) (Expression->defn value for-syntax)) ((quote Literal) (Literal->defn value for-syntax)) ((quote NotInClass) (NotInClass->defn value for-syntax)) ((quote Class) (Class->defn value for-syntax))) ice-9/peg/string-peg.scm:286:7: warning: duplicate datum quote in clause ((quote Literal) (Literal->defn value for-syntax)) of case expression (case (car value) ((quote DOT) (syntax peg-any)) ((quote Identifier) (Identifier->defn value for-syntax)) ((quote Expression) (Expression->defn value for-syntax)) ((quote Literal) (Literal->defn value for-syntax)) ((quote NotInClass) (NotInClass->defn value for-syntax)) ((quote Class) (Class->defn value for-syntax))) ice-9/peg/string-peg.scm:287:7: warning: duplicate datum quote in clause ((quote NotInClass) (NotInClass->defn value for-syntax)) of case expression (case (car value) ((quote DOT) (syntax peg-any)) ((quote Identifier) (Identifier->defn value for-syntax)) ((quote Expression) (Expression->defn value for-syntax)) ((quote Literal) (Literal->defn value for-syntax)) ((quote NotInClass) (NotInClass->defn value for-syntax)) ((quote Class) (Class->defn value for-syntax))) ice-9/peg/string-peg.scm:288:7: warning: duplicate datum quote in clause ((quote Class) (Class->defn value for-syntax)) of case expression (case (car value) ((quote DOT) (syntax peg-any)) ((quote Identifier) (Identifier->defn value for-syntax)) ((quote Expression) (Expression->defn value for-syntax)) ((quote Literal) (Literal->defn value for-syntax)) ((quote NotInClass) (NotInClass->defn value for-syntax)) ((quote Class) (Class->defn value for-syntax))) --8<---------------cut here---------------end--------------->8--- And indeed, the correct syntax is: (case value (DOT …) (Identifier …) …) or: (match value ('DOT …) ('Identifier …) …) The former returns *unspecified* when passed a value not matched by any clause, whereas the latter throws an error. So I would recommend ‘match’. At any rate, that makes me wonder whether this code path is tested at all. As written, ‘Primary->defn’ would always return *unspecified*. Is there a test we could add? > +(define (Range->defn lst for-syntax) > (cond > - ((list? el) > - (cond > - ((eq? (car el) 'peg-literal) > - (peg-literal->defn el for-syntax)) > - ((eq? (car el) 'peg-charclass) > - (peg-charclass->defn el for-syntax)) > - ((eq? (car el) 'peg-nonterminal) > - (datum->syntax for-syntax (string->symbol (cadr el)))))) > - ((string? el) > + ((= 2 (length lst)) > + (second (second lst))) > + ((= 3 (length lst)) > + #`(range > + #,(Char->defn (second lst) for-syntax) > + #,(Char->defn (third lst) for-syntax))))) Keep in mind that ‘length’ is O(N), and that car/first, second/cadr are frowned upon. I would write it as: (match lst ((x y) y) ((x y z) #`(range …))) (Ideally with identifiers more meaningful than x, y, and z. :-)) > ;; Transforms the nonterminals defined in the PEG parser written as a PEG to > the nonterminals defined in the PEG parser written with S-expressions. > (define (grammar-transform x) > @@ -69,7 +77,7 @@ > (peg:tree (match-pattern (@@ (ice-9 peg) peg-grammar) (@@ (ice-9 peg) > peg-as-peg))) > (tree-map > grammar-transform > - (peg:tree (match-pattern grammar (@@ (ice-9 peg) peg-as-peg))))))) > + (peg:tree (match-pattern (@@ (ice-9 peg) peg-grammar) (@@ (ice-9 peg) > peg-as-peg))))))) What happened to the ‘grammar’ binding? I can’t see where it was coming from. > From 64a17be08581465d11185b4a0ca636354d2f944c Mon Sep 17 00:00:00 2001 > From: Ekaitz Zarraga <eka...@elenq.tech> > Date: Fri, 11 Oct 2024 14:24:30 +0200 > Subject: [PATCH v3 2/2] PEG: Add support for `not-in-range` and [^...] > > Modern PEG supports inversed class like `[^a-z]` that would get any > character not in the `a-z` range. This commit adds support for that and > also for a new `not-in-range` PEG pattern for scheme. > > * module/ice-9/peg/codegen.scm (cg-not-in-range): New function. > * module/ice-9/peg/string-peg.scm: Add support for `[^...]` > * test-suite/tests/peg.test: Test it. > * doc/ref/api-peg.texi: Document accordingly. This one LGTM. In addition to the issues mentioned above, could you add an entry in the ‘NEWS’ file, probably under a new “New interfaces and functionality” heading? Thanks, Ludo’.