As a final note, I'd like to mention that even though PHP grammar is quite simple, it is light-years more complex (due to the lack of standardization) than the grammar of other languages.
You can compare this initial description I wrote to the Java
Specification and draw your own conclusions:
http://java.sun.com/docs/books/jls/second_edition/html/syntax.doc.html

Cheers,

On Sat, Jan 1, 2011 at 2:20 PM, guilhermebla...@gmail.com
<guilhermebla...@gmail.com> wrote:
> Hi all,
>
> PHP grammar is far from being complex. It is possible to describe most
> of the syntax with a simple explanation.
> Example:
>
> * We can separate a program into several statements.
> * There're a couple of items that cannot be declared into different
> places (namespace, use), so consider them as top-statements.
> * Also, Namespace declaration may contain multiple statements if you
> define them under brackets.
> * UseStatement can only be used inside a namespace or inside global scope.
> * Finally, we support Classes.
>
> Now we can describe a good portion of PHP grammar:
>
> /* Terminals */
> identifier
> char
> string
> integer
> float
> boolean
>
> /* Grammar Rules */
> Literal ::= string | char | integer | float | boolean
>
> Qualifier ::= ("private" | "public" | "protected") ["static"]
>
> /* Identifiers */
> NamespaceIdentifier ::= identifier {"\" identifier}
> ClassIdentifier ::= identifier
> MethodIdentifier ::= identifier
> FullyQualifiedClassIdentifier ::= [NamespaceIdentifier] ClassIdentifier
>
> /* Root grammar */
> Program ::= {TopStatement} {Statement}
>
> TopStatement ::= NamespaceDeclaration | UseStatement | CommentStatement
> Statement ::= ClassDeclaration | FunctionDeclaration | ...
>
> /* Namespace Declaration */
> NamespaceDeclaration ::= InlineNamespaceDeclaration |
> ScopeNamespaceDeclaration
> InlineNamespaceDeclaration ::= SimpleNamespaceDeclaration ";"
> {UseDeclaration} {Statement}
> ScopeNamespaceDeclaration ::= SimpleNamespaceDeclaration "{"
> {UseDeclaration} {Statement} "}"
> SimpleNamespaceDeclaration ::= "namespace" NamespaceIdentifier
>
> /* Use Statement */
> UseStatement ::= "use" SimpleUseStatement {"," SimpleUseStatement} ";"
> SimpleUseStatement ::= SimpleNamespaceUseStatement | SimpleClassUseStatement
> SimpleNamespaceUseStatement ::= NamespaceIdentifier ["as" NamespaceIdentifier]
> SimpleClassUseStatement ::= FullyQualifiedClassIdentifier ["as"
> ClassIdentifier]
>
> /* Comment Declaration */
> CommentStatement ::= InlineCommentStatement | MultilineCommentStatement
> InlineCommentStatement ::= ("//" | "#") string
> MultilineCommentStatement ::= SimpleMultilineCommentStatement |
> DocBlockStatement
> SimpleMultilineCommentStatement ::= "/*" {"*" string} "*/"
> DocBlockStatement ::= "/**" {"*" string} "*/"
>
> /* Class Declaration */
> ClassDeclaration ::= SimpleClassDeclaration "{" {ClassMemberDeclaration} "}"
> SimpleClassDeclaration ::= ["abstract"] "class" ClassIdentifier
> ["extends" FullyQualifiedClassIdentifier] ["implements"
> FullyQualifiedClassIdentifier {"," FullyQualifiedClassIdentifier}]
>
> ClassMemberDeclaration ::= ConstDeclaration | PropertyDeclaration |
> MethodDeclaration
> ConstDeclaration ::= [DocBlockStatement] "const" identifier "=" Literal ";"
> PropertyDeclaration ::= [DocBlockStatement] Qualifier Variable ["=" Literal]
> ";"
> MethodDeclaration ::= [DocBlockStatement] (PrototypeMethodDeclaration
> | ComplexMethodDeclaration)
>
> PrototypeMethodDeclaration ::= "abstract" Qualifier "function"
> MethodIdentifier "(" {ArgumentDeclaration} ");"
> ComplexMethodDeclaration ::= ["final"] Qualifier "function"
> MethodIdentifier "(" {ArgumentDeclaration} ")" "{" {Statement} "}"
> ArgumentDeclaration ::= SimpleArgumentDeclaration {","
> SimpleArgumentDeclaration}
> SimpleArgumentDeclaration ::= [TypeHint] Variable ["=" Literal]
> TypeHint ::= ArrayTypeHint | FullyQualifiedClassIdentifier
> ArrayTypeHint ::= "array"
>
>
> Now it is easy to continue the work and add missing rules. =)
>
>
>
> Cheers,
>
> On Sat, Jan 1, 2011 at 12:46 PM, Rune Kaagaard <rumi...@gmail.com> wrote:
>>> There has never been a language grammar, so there's been nothing to refer
>>> to at all. As for why no one's made one more recently, for fun I snagged
>>> the .l and .y files from trunk and W3C's version of EBNF from XML. In two
>>> hours of hacking away, I managed to come up with this sort-of beginning to
>>> a grammar, which I'm certain contains several errors, and only hints at a
>>> syntax:
>>
>> I wanted to take your EBNF for a spin so I converted it to a format
>> that the python module "simpleparse" could read. I ironed out a couple
>> of kinks and fixed a bug. You can see it here:
>>
>> http://code.google.com/p/php-snow/source/browse/branches/php-ebnf/gwynne-raskind-example/php.ebnf
>>
>> Then I created a prettyprinter to output the parsetree of some very
>> simple PHP code. See it here:
>>
>> http://code.google.com/p/php-snow/source/browse/branches/php-ebnf/gwynne-raskind-example/parse_example.py
>>
>> and the output is here:
>>
>> http://code.google.com/p/php-snow/source/browse/branches/php-ebnf/gwynne-raskind-example/parse_example.output
>>
>>> Considering what it takes JUST to define namespaces, halt_compiler, basic
>>> blocks, and the idea of a conditional statement... well, suffice to say the
>>> "expr" production alone would be triple the size of this. It doesn't help
>>> that there's no way I'm immediately aware of to check whether a grammar
>>> like this is accurate.
>>
>> Thanks a lot for the example, that does not look so bad :) PHP syntax
>> is not simple so of course the EBNF will not be either. But still any
>> EBNF would be a lot better than none!
>>
>> Testability is a real issue and makes for a nice catch-22. A
>> hypothetical roadmap could _maybe_ look like this:
>>
>> 1) Create the EBNF and reference implementation while comparing it to
>> a stable release.
>> 2) Rewrite the Zend implementation to read from the EBNF.
>> 3) Repeat for all current releases.
>>
>> It's tough to try to guess about things you don't really understand.
>> Looks like major work though!
>>
>>> Nonetheless, it's a significant undertaking to deal with the complexity of
>>> the language. There are dozens of tiny little edge cases in PHP's parsing
>>> that require bunches of extra parser rules. An example from above is the
>>> difference between using "statement" and "inner-statement" for the two
>>> different forms of "if". Because "statement" includes basic blocks and
>>> labels, the rule disallows writing "if: { xyz; } endif;", since apparently
>>> Zend doesn't support arbitrary basic blocks. All those cases wreak havoc on
>>> the grammar. In its present form, it will never reduce down to something
>>> nearly as small as Python's.
>>
>> Just to have a solid, complete maintained EBNF would be a _major_ leap
>> forward!
>>
>> Thanks for your cool reply!
>>
>> Cheers
>> Rune
>>
>> --
>> PHP Internals - PHP Runtime Development Mailing List
>> To unsubscribe, visit: http://www.php.net/unsub.php
>>
>>
>
>
>
> --
> Guilherme Blanco
> Mobile: +55 (16) 9215-8480
> MSN: guilhermebla...@hotmail.com
> São Paulo - SP/Brazil
>

--
Guilherme Blanco
Mobile: +55 (16) 9215-8480
MSN: guilhermebla...@hotmail.com
São Paulo - SP/Brazil

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
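
On the testability point raised in the thread: one cheap way to sanity-check a few of the draft rules quoted above (NamespaceIdentifier, UseStatement, SimpleNamespaceUseStatement) is to hand-write a small recursive-descent recognizer for them. The Python below is only a rough sketch under stated assumptions. The tokenizer, the Parser class and parse_use are hypothetical names made up for this illustration; they are not part of PHP, the Zend parser or simpleparse. The alias follows the draft rule as written (a NamespaceIdentifier), which may be more permissive than what PHP itself accepts.

```python
import re

# Hypothetical tokenizer for this sketch only: identifiers, "\", "," and ";".
TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|\\|,|;)")
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")
KEYWORDS = {"use", "as"}


def tokenize(source):
    """Split source into tokens; raise ValueError on anything unexpected."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """Recursive-descent recognizer for a subset of the draft grammar."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, literal):
        if self.peek() != literal:
            raise ValueError(f"expected {literal!r}, got {self.peek()!r}")
        self.pos += 1

    def identifier(self):
        token = self.peek()
        if token is None or token in KEYWORDS or not IDENT.match(token):
            raise ValueError(f"expected identifier, got {token!r}")
        self.pos += 1
        return token

    def namespace_identifier(self):
        # NamespaceIdentifier ::= identifier {"\" identifier}
        parts = [self.identifier()]
        while self.peek() == "\\":
            self.pos += 1
            parts.append(self.identifier())
        return "\\".join(parts)

    def simple_use_statement(self):
        # SimpleNamespaceUseStatement ::=
        #     NamespaceIdentifier ["as" NamespaceIdentifier]
        name = self.namespace_identifier()
        alias = None
        if self.peek() == "as":
            self.pos += 1
            alias = self.namespace_identifier()
        return (name, alias)

    def use_statement(self):
        # UseStatement ::= "use" SimpleUseStatement {"," SimpleUseStatement} ";"
        self.expect("use")
        items = [self.simple_use_statement()]
        while self.peek() == ",":
            self.pos += 1
            items.append(self.simple_use_statement())
        self.expect(";")
        return items


def parse_use(source):
    """Parse exactly one UseStatement and return its (name, alias) pairs."""
    parser = Parser(tokenize(source))
    items = parser.use_statement()
    if parser.peek() is not None:
        raise ValueError(f"trailing input: {parser.peek()!r}")
    return items


if __name__ == "__main__":
    print(parse_use(r"use Foo\Bar as Baz, Qux;"))
    # prints [('Foo\\Bar', 'Baz'), ('Qux', None)]
```

Each method mirrors one production, so a disagreement between the draft EBNF and real PHP code shows up as a ValueError in a specific rule. That does not prove the grammar accurate, but it gives something concrete to run against sample code.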