As a final note, I'd like to mention that even though PHP's grammar is
quite simple, it is light-years more complex (due to the lack of
standardization) than that of other languages.

You can compare the initial description I wrote with the Java Language
Specification and draw your own conclusions:
http://java.sun.com/docs/books/jls/second_edition/html/syntax.doc.html


Cheers,

On Sat, Jan 1, 2011 at 2:20 PM, guilhermebla...@gmail.com
<guilhermebla...@gmail.com> wrote:
> Hi all,
>
> PHP grammar is far from being complex. It is possible to describe most
> of the syntax with a simple explanation.
> Example:
>
> * We can separate a program into several statements.
> * A couple of items (namespace, use) can only be declared in specific
> places, so consider them top-statements.
> * Also, a namespace declaration may contain multiple statements if you
> define them inside braces.
> * UseStatement can only be used inside a namespace or in the global scope.
> * Finally, we support Classes.
>
> Now we can describe a good portion of PHP grammar:
>
> /* Terminals */
> identifier
> char
> string
> integer
> float
> boolean
>
> /* Grammar Rules */
> Literal ::= string | char | integer | float | boolean
>
> Qualifier ::= ("private" | "public" | "protected") ["static"]
>
> /* Identifiers */
> NamespaceIdentifier ::= identifier {"\" identifier}
> ClassIdentifier ::= identifier
> MethodIdentifier ::= identifier
> FullyQualifiedClassIdentifier ::= [NamespaceIdentifier "\"] ClassIdentifier
>
> /* Root grammar */
> Program ::= {TopStatement} {Statement}
>
> TopStatement ::= NamespaceDeclaration | UseStatement | CommentStatement
> Statement ::= ClassDeclaration | FunctionDeclaration | ...
>
> /* Namespace Declaration */
> NamespaceDeclaration ::= InlineNamespaceDeclaration |
>     ScopeNamespaceDeclaration
> InlineNamespaceDeclaration ::= SimpleNamespaceDeclaration ";"
>     {UseStatement} {Statement}
> ScopeNamespaceDeclaration ::= SimpleNamespaceDeclaration "{"
>     {UseStatement} {Statement} "}"
> SimpleNamespaceDeclaration ::= "namespace" NamespaceIdentifier
>
> /* Use Statement */
> UseStatement ::= "use" SimpleUseStatement {"," SimpleUseStatement} ";"
> SimpleUseStatement ::= SimpleNamespaceUseStatement | SimpleClassUseStatement
> SimpleNamespaceUseStatement ::= NamespaceIdentifier ["as" NamespaceIdentifier]
> SimpleClassUseStatement ::= FullyQualifiedClassIdentifier
>     ["as" ClassIdentifier]
>
> /* Comment Declaration */
> CommentStatement ::= InlineCommentStatement | MultilineCommentStatement
> InlineCommentStatement ::= ("//" | "#") string
> MultilineCommentStatement ::= SimpleMultilineCommentStatement |
>     DocBlockStatement
> SimpleMultilineCommentStatement ::= "/*" {"*" string} "*/"
> DocBlockStatement ::= "/**" {"*" string} "*/"
>
> /* Class Declaration */
> ClassDeclaration ::= SimpleClassDeclaration "{" {ClassMemberDeclaration} "}"
> SimpleClassDeclaration ::= ["abstract"] "class" ClassIdentifier
>     ["extends" FullyQualifiedClassIdentifier]
>     ["implements" FullyQualifiedClassIdentifier
>         {"," FullyQualifiedClassIdentifier}]
>
> ClassMemberDeclaration ::= ConstDeclaration | PropertyDeclaration |
>     MethodDeclaration
> ConstDeclaration ::= [DocBlockStatement] "const" identifier "=" Literal ";"
> PropertyDeclaration ::= [DocBlockStatement] Qualifier Variable
>     ["=" Literal] ";"
> MethodDeclaration ::= [DocBlockStatement] (PrototypeMethodDeclaration
>     | ComplexMethodDeclaration)
>
> PrototypeMethodDeclaration ::= "abstract" Qualifier "function"
>     MethodIdentifier "(" [ArgumentDeclaration] ")" ";"
> ComplexMethodDeclaration ::= ["final"] Qualifier "function"
>     MethodIdentifier "(" [ArgumentDeclaration] ")" "{" {Statement} "}"
> ArgumentDeclaration ::= SimpleArgumentDeclaration {","
>     SimpleArgumentDeclaration}
> SimpleArgumentDeclaration ::= [TypeHint] Variable ["=" Literal]
> TypeHint ::= ArrayTypeHint | FullyQualifiedClassIdentifier
> ArrayTypeHint ::= "array"
>
>
> Now it is easy to continue the work and add missing rules. =)
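
As a rough illustration of how directly rules like these map onto code, here
is a minimal Python sketch of a recursive-descent recognizer for only the
UseStatement and NamespaceIdentifier rules quoted above. The names tokenize,
UseParser and parse_use, and the token regex, are hypothetical and chosen for
illustration; they come from neither PHP nor the grammar. The sketch also
simplifies in two ways: it treats both SimpleUseStatement alternatives as one
(each reduces to a backslash-separated name), and it accepts a single
identifier after "as".

```python
import re

# Assumption: identifiers are ASCII letters, digits and underscores.
# Keywords are tokenized as identifiers and recognized by value.
TOKEN = re.compile(r"\s*(?:([A-Za-z_][A-Za-z0-9_]*)|(\\|,|;))")
KEYWORDS = {"use", "as"}
PUNCTUATION = {"\\", ",", ";"}


def tokenize(source):
    """Split source into identifiers and the punctuation \\ , ; (whitespace skipped)."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1) or match.group(2))
        pos = match.end()
    return tokens


class UseParser:
    """One method per grammar rule; each consumes tokens or raises SyntaxError."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, value):
        if self.peek() != value:
            raise SyntaxError(f"expected {value!r}, got {self.peek()!r}")
        self.pos += 1

    def identifier(self):
        token = self.peek()
        if token is None or token in PUNCTUATION or token in KEYWORDS:
            raise SyntaxError(f"expected identifier, got {token!r}")
        self.pos += 1
        return token

    def namespace_identifier(self):
        # NamespaceIdentifier ::= identifier {"\" identifier}
        parts = [self.identifier()]
        while self.peek() == "\\":
            self.pos += 1
            parts.append(self.identifier())
        return "\\".join(parts)

    def simple_use_statement(self):
        # Simplified: name ["as" identifier]
        name = self.namespace_identifier()
        alias = None
        if self.peek() == "as":
            self.pos += 1
            alias = self.identifier()
        return (name, alias)

    def use_statement(self):
        # UseStatement ::= "use" SimpleUseStatement {"," SimpleUseStatement} ";"
        self.expect("use")
        imports = [self.simple_use_statement()]
        while self.peek() == ",":
            self.pos += 1
            imports.append(self.simple_use_statement())
        self.expect(";")
        return imports


def parse_use(source):
    """Parse exactly one use statement and return a list of (name, alias) pairs."""
    parser = UseParser(tokenize(source))
    result = parser.use_statement()
    if parser.peek() is not None:
        raise SyntaxError("trailing tokens after ';'")
    return result


print(parse_use("use Foo\\Bar as Baz, Qux;"))
# [('Foo\\Bar', 'Baz'), ('Qux', None)]
```

Each method mirrors one production, so adding a missing rule means adding one
more method. A malformed input such as "use Foo\;" raises SyntaxError.
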
>
>
>
> Cheers,
>
> On Sat, Jan 1, 2011 at 12:46 PM, Rune Kaagaard <rumi...@gmail.com> wrote:
>>> There has never been a language grammar, so there's been nothing to refer 
>>> to at all. As for why no one's made one more recently, for fun I snagged 
>>> the .l and .y files from trunk and W3C's version of EBNF from XML. In two 
>>> hours of hacking away, I managed to come up with this sort-of beginning to 
>>> a grammar, which I'm certain contains several errors, and only hints at a 
>>> syntax:
>>
>> I wanted to take your EBNF for a spin so I converted it to a format
>> that the python module "simpleparse" could read. I ironed out a couple
>> of kinks and fixed a bug. You can see it here:
>>
>> http://code.google.com/p/php-snow/source/browse/branches/php-ebnf/gwynne-raskind-example/php.ebnf
>>
>> Then I created a prettyprinter to output the parsetree of some very
>> simple PHP code. See it here:
>>
>> http://code.google.com/p/php-snow/source/browse/branches/php-ebnf/gwynne-raskind-example/parse_example.py
>>
>> and the output is here:
>>
>> http://code.google.com/p/php-snow/source/browse/branches/php-ebnf/gwynne-raskind-example/parse_example.output
>>
>>> Considering what it takes JUST to define namespaces, halt_compiler, basic 
>>> blocks, and the idea of a conditional statement... well, suffice to say the 
>>> "expr" production alone would be triple the size of this. It doesn't help 
>>> that there's no way I'm immediately aware of to check whether a grammar 
>>> like this is accurate.
>>
>> Thanks a lot for the example, that does not look so bad :) PHP syntax
>> is not simple so of course the EBNF will not be either. But still any
>> EBNF would be a lot better than none!
>>
>> Testability is a real issue and makes for a nice catch-22. A
>> hypothetical roadmap could _maybe_ look like this:
>>
>> 1) Create the EBNF and reference implementation while comparing it to
>> a stable release.
>> 2) Rewrite the Zend implementation to read from the EBNF.
>> 3) Repeat for all current releases.
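
Step 1 of the roadmap above (comparing an EBNF-based implementation against a
stable release) could be approached as a differential test: run the same
corpus through a reference recognizer and a candidate recognizer and collect
the disagreements. The Python sketch below is only an illustration under
stated assumptions. The function names are hypothetical, the two toy
recognizers are stand-ins rather than real parsers, and php_lint_accepts
assumes a php binary on the PATH whose "php -l" syntax check exits with
status 0 only when the file parses.

```python
import os
import subprocess
import tempfile


def differential_test(corpus, reference_accepts, candidate_accepts):
    """Return (snippet, reference_result, candidate_result) for every disagreement."""
    disagreements = []
    for snippet in corpus:
        expected = reference_accepts(snippet)
        actual = candidate_accepts(snippet)
        if expected != actual:
            disagreements.append((snippet, expected, actual))
    return disagreements


def php_lint_accepts(snippet, php_binary="php"):
    """Possible reference recognizer (assumption: `php -l` exits 0 only on valid syntax)."""
    with tempfile.NamedTemporaryFile("w", suffix=".php", delete=False) as handle:
        handle.write("<?php\n" + snippet)
        path = handle.name
    try:
        result = subprocess.run([php_binary, "-l", path], capture_output=True)
        return result.returncode == 0
    finally:
        os.unlink(path)


# Toy stand-ins so the harness runs without PHP installed (hypothetical rules).
def toy_reference(snippet):
    return snippet.strip().endswith(";")


def toy_candidate(snippet):
    return snippet.endswith(";")


corpus = ["use Foo;", "use Foo; ", "use Foo"]
print(differential_test(corpus, toy_reference, toy_candidate))
# [('use Foo; ', True, False)]
```

In a real run, php_lint_accepts would replace toy_reference and a recognizer
generated from the EBNF would replace toy_candidate. An empty result over a
large corpus gives some evidence, though not proof, that the grammar matches
the release.
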
>>
>> It's tough to try to guess about things you don't really understand.
>> Looks like major work though!
>>
>>> Nonetheless, it's a significant undertaking to deal with the complexity of 
>>> the language. There are dozens of tiny little edge cases in PHP's parsing 
>>> that require bunches of extra parser rules. An example from above is the 
>>> difference between using "statement" and "inner-statement" for the two 
>>> different forms of "if". Because "statement" includes basic blocks and 
>>> labels, the rule disallows writing "if: { xyz; } endif;", since apparently 
>>> Zend doesn't support arbitrary basic blocks. All those cases wreak havoc on 
>>> the grammar. In its present form, it will never reduce down to something 
>>> nearly as small as Python's.
>>
>> Just having a solid, complete, maintained EBNF would be a _major_ leap
>> forward!
>>
>> Thanks for your cool reply!
>>
>> Cheers
>> Rune
>>
>> --
>> PHP Internals - PHP Runtime Development Mailing List
>> To unsubscribe, visit: http://www.php.net/unsub.php
>>
>>
>
>
>
> --
> Guilherme Blanco
> Mobile: +55 (16) 9215-8480
> MSN: guilhermebla...@hotmail.com
> São Paulo - SP/Brazil
>



