Hi Guilherme,

You wrote that Java spec? Cool! Also, very nice example of a PHP EBNF! I think PHP needs a canonical one of those, and that the parser should be rewritten to follow said EBNF. That's what I'm dreaming of, at least :)
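To make that dream a bit more concrete, here is a minimal Python sketch of a hand-written recursive-descent parser in which each method mirrors one rule of the EBNF quoted below (only NamespaceIdentifier and UseStatement). The tokenizer, the Parser class and parse_use are hypothetical names of mine, and I collapsed the two SimpleUseStatement alternatives into the namespace form to keep it short. It says nothing about how the Zend parser actually works.

```python
import re

# Hypothetical tokenizer: identifiers, the namespace separator, "," and ";".
TOKEN_RE = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|\\|,|;)")
IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")
KEYWORDS = {"use", "as"}


def tokenize(source):
    """Split the source into a flat list of token strings."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, literal):
        if self.peek() != literal:
            raise SyntaxError(f"expected {literal!r}, got {self.peek()!r}")
        self.pos += 1

    def identifier(self):
        token = self.peek()
        if token is None or token in KEYWORDS or not IDENT_RE.match(token):
            raise SyntaxError(f"expected identifier, got {token!r}")
        self.pos += 1
        return token

    # NamespaceIdentifier ::= identifier {"\" identifier}
    def namespace_identifier(self):
        parts = [self.identifier()]
        while self.peek() == "\\":
            self.pos += 1
            parts.append(self.identifier())
        return "\\".join(parts)

    # Simplified: NamespaceIdentifier ["as" NamespaceIdentifier]
    def simple_use_statement(self):
        name = self.namespace_identifier()
        alias = None
        if self.peek() == "as":
            self.pos += 1
            alias = self.namespace_identifier()
        return (name, alias)

    # UseStatement ::= "use" SimpleUseStatement {"," SimpleUseStatement} ";"
    def use_statement(self):
        self.expect("use")
        items = [self.simple_use_statement()]
        while self.peek() == ",":
            self.pos += 1
            items.append(self.simple_use_statement())
        self.expect(";")
        return items


def parse_use(source):
    """Parse one use statement and return a list of (name, alias) pairs."""
    parser = Parser(tokenize(source))
    items = parser.use_statement()
    if parser.peek() is not None:
        raise SyntaxError("trailing tokens after use statement")
    return items
```

With this, parse_use("use Foo\\Bar as Baz, Qux;") returns [('Foo\\Bar', 'Baz'), ('Qux', None)], and a malformed statement such as "use Foo\\;" raises SyntaxError. The point is only that the code and the EBNF can be read side by side, which is what would make the grammar checkable.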

Cheers
Rune

On Sat, Jan 1, 2011 at 5:23 PM, guilhermebla...@gmail.com <guilhermebla...@gmail.com> wrote:
> As a final note, I'd like to mention that even PHP grammar being quite
> simple, it is light-years more complex (due to the lack of
> standardization) than other languages.
>
> You can compare this initial description I wrote to the Java
> Specification and get your own conclusions:
> http://java.sun.com/docs/books/jls/second_edition/html/syntax.doc.html
>
> Cheers,
>
> On Sat, Jan 1, 2011 at 2:20 PM, guilhermebla...@gmail.com
> <guilhermebla...@gmail.com> wrote:
>> Hi all,
>>
>> PHP grammar is far from being complex. It is possible to describe most
>> of the syntax with a simple explanation.
>> Example:
>>
>> * We can separate a program into several statements.
>> * There're a couple of items that cannot be declared into different
>>   places (namespace, use), so consider them as top-statements.
>> * Also, Namespace declaration may contain multiple statements if you
>>   define them under brackets.
>> * UseStatement can only be used inside a namespace or inside global scope.
>> * Finally, we support Classes.
>>
>> Now we can describe a good portion of PHP grammar:
>>
>> /* Terminals */
>> identifier
>> char
>> string
>> integer
>> float
>> boolean
>>
>> /* Grammar Rules */
>> Literal ::= string | char | integer | float | boolean
>>
>> Qualifier ::= ("private" | "public" | "protected") ["static"]
>>
>> /* Identifiers */
>> NamespaceIdentifier ::= identifier {"\" identifier}
>> ClassIdentifier ::= identifier
>> MethodIdentifier ::= identifier
>> FullyQualifiedClassIdentifier ::= [NamespaceIdentifier] ClassIdentifier
>>
>> /* Root grammar */
>> Program ::= {TopStatement} {Statement}
>>
>> TopStatement ::= NamespaceDeclaration | UseStatement | CommentStatement
>> Statement ::= ClassDeclaration | FunctionDeclaration | ...
>>
>> /* Namespace Declaration */
>> NamespaceDeclaration ::= InlineNamespaceDeclaration |
>>     ScopeNamespaceDeclaration
>> InlineNamespaceDeclaration ::= SimpleNamespaceDeclaration ";"
>>     {UseDeclaration} {Statement}
>> ScopeNamespaceDeclaration ::= SimpleNamespaceDeclaration "{"
>>     {UseDeclaration} {Statement} "}"
>> SimpleNamespaceDeclaration ::= "namespace" NamespaceIdentifier
>>
>> /* Use Statement */
>> UseStatement ::= "use" SimpleUseStatement {"," SimpleUseStatement} ";"
>> SimpleUseStatement ::= SimpleNamespaceUseStatement | SimpleClassUseStatement
>> SimpleNamespaceUseStatement ::= NamespaceIdentifier ["as"
>>     NamespaceIdentifier]
>> SimpleClassUseStatement ::= FullyQualifiedClassIdentifier ["as"
>>     ClassIdentifier]
>>
>> /* Comment Declaration */
>> CommentStatement ::= InlineCommentStatement | MultilineCommentStatement
>> InlineCommentStatement ::= ("//" | "#") string
>> MultilineCommentStatement ::= SimpleMultilineCommentStatement |
>>     DocBlockStatement
>> SimpleMultilineCommentStatement ::= "/*" {"*" string} "*/"
>> DocBlockStatement ::= "/**" {"*" string} "*/"
>>
>> /* Class Declaration */
>> ClassDeclaration ::= SimpleClassDeclaration "{" {ClassMemberDeclaration} "}"
>> SimpleClassDeclaration ::= ["abstract"] "class" ClassIdentifier
>>     ["extends" FullyQualifiedClassIdentifier] ["implements"
>>     FullyQualifiedClassIdentifier {"," FullyQualifiedClassIdentifier}]
>>
>> ClassMemberDeclaration ::= ConstDeclaration | PropertyDeclaration |
>>     MethodDeclaration
>> ConstDeclaration ::= [DocBlockStatement] "const" identifier "=" Literal ";"
>> PropertyDeclaration ::= [DocBlockStatement] Qualifier Variable ["=" Literal]
>>     ";"
>> MethodDeclaration ::= [DocBlockStatement] (PrototypeMethodDeclaration
>>     | ComplexMethodDeclaration)
>>
>> PrototypeMethodDeclaration ::= "abstract" Qualifier "function"
>>     MethodIdentifier "(" {ArgumentDeclaration} ");"
>> ComplexMethodDeclaration ::= ["final"] Qualifier "function"
>>     MethodIdentifier "(" {ArgumentDeclaration} ")" "{"
>>     {Statement} "}"
>> ArgumentDeclaration ::= SimpleArgumentDeclaration {","
>>     SimpleArgumentDeclaration}
>> SimpleArgumentDeclaration ::= [TypeHint] Variable ["=" Literal]
>> TypeHint ::= ArrayTypeHint | FullyQualifiedClassIdentifier
>> ArrayTypeHint ::= "array"
>>
>> Now it is easy to continue the work and add missing rules. =)
>>
>> Cheers,
>>
>> On Sat, Jan 1, 2011 at 12:46 PM, Rune Kaagaard <rumi...@gmail.com> wrote:
>>>> There has never been a language grammar, so there's been nothing to refer
>>>> to at all. As for why no one's made one more recently, for fun I snagged
>>>> the .l and .y files from trunk and W3C's version of EBNF from XML. In two
>>>> hours of hacking away, I managed to come up with this sort-of beginning to
>>>> a grammar, which I'm certain contains several errors, and only hints at a
>>>> syntax:
>>>
>>> I wanted to take your EBNF for a spin so I converted it to a format
>>> that the python module "simpleparse" could read. I ironed out a couple
>>> of kinks and fixed a bug. You can see it here:
>>>
>>> http://code.google.com/p/php-snow/source/browse/branches/php-ebnf/gwynne-raskind-example/php.ebnf
>>>
>>> Then I created a prettyprinter to output the parsetree of some very
>>> simple PHP code. See it here:
>>>
>>> http://code.google.com/p/php-snow/source/browse/branches/php-ebnf/gwynne-raskind-example/parse_example.py
>>>
>>> and the output is here:
>>>
>>> http://code.google.com/p/php-snow/source/browse/branches/php-ebnf/gwynne-raskind-example/parse_example.output
>>>
>>>> Considering what it takes JUST to define namespaces, halt_compiler, basic
>>>> blocks, and the idea of a conditional statement... well, suffice to say
>>>> the "expr" production alone would be triple the size of this. It doesn't
>>>> help that there's no way I'm immediately aware of to check whether a
>>>> grammar like this is accurate.
>>>
>>> Thanks a lot for the example, that does not look so bad :) PHP syntax
>>> is not simple so of course the EBNF will not be either. But still any
>>> EBNF would be a lot better than none!
>>>
>>> Testability is a real issue and makes for a nice catch-22. A
>>> hypothetical roadmap could _maybe_ look like this:
>>>
>>> 1) Create the EBNF and reference implementation while comparing it to
>>>    a stable release.
>>> 2) Rewrite the Zend implementation to read from the EBNF.
>>> 3) Repeat for all current releases.
>>>
>>> It's tough to try to guess about things you don't really understand.
>>> Looks like major work though!
>>>
>>>> Nonetheless, it's a significant undertaking to deal with the complexity of
>>>> the language. There are dozens of tiny little edge cases in PHP's parsing
>>>> that require bunches of extra parser rules. An example from above is the
>>>> difference between using "statement" and "inner-statement" for the two
>>>> different forms of "if". Because "statement" includes basic blocks and
>>>> labels, the rule disallows writing "if: { xyz; } endif;", since apparently
>>>> Zend doesn't support arbitrary basic blocks. All those cases wreak havoc
>>>> on the grammar. In its present form, it will never reduce down to
>>>> something nearly as small as Python's.
>>>
>>> Just to have a solid, complete maintained EBNF would be a _major_ leap
>>> forward!
>>>
>>> Thanks for your cool reply!
>>>
>>> Cheers
>>> Rune
>>>
>>> --
>>> PHP Internals - PHP Runtime Development Mailing List
>>> To unsubscribe, visit: http://www.php.net/unsub.php
>>
>> --
>> Guilherme Blanco
>> Mobile: +55 (16) 9215-8480
>> MSN: guilhermebla...@hotmail.com
>> São Paulo - SP/Brazil