2017-03-24 19:35 GMT+01:00 Fleshgrinder <p...@fleshgrinder.com>:

> On 3/24/2017 4:23 PM, Andrea Faulds wrote:
> > Hi Nikita,
> >
> > Nikita Popov wrote:
> >
> >> I'd like to add a new TOKEN_AS_OBJECT flag to token_get_all(), which
> >> returns an array of PhpToken objects, rather than the mix of plain
> >> strings
> >> and arrays we currently have. The PhpToken class is defined as:
> >>
> >> class PhpToken {
> >>     public $type;
> >>     public $text;
> >>     public $line;
> >> }
> >
> > Rather than adding a flag to token_get_all() to return objects, you
> > could potentially instead make an equivalent static method on PhpToken
> > (PhpToken::getAll() perhaps). That would avoid mixing “object-oriented”
> > and “procedural” styles, though I don't know if it matters. It seems
> > cleaner to me.
> >
> > Thanks!
> >
>
> I am also against adding another flag because it violates the single
> responsibility principle. However, adding a `getAll` method to the
> `PhpToken` data class also seems very wrong, because it violates the
> single responsibility once again.
>
> The real and proper solution is to have an actual PHP Lexer that is
> capable of creating a token stream.
>
>     class Lexer {
>         public function __construct(string $source);
>         public static function fromFile(string $path): self;
>         public function tokenize(): TokenStream;
>     }
>

If PhpToken::getAll() violates the SRP, then Lexer::fromFile() is just the
same sort of violation.

Regards, Niklas


>     final class TokenStream implements IteratorAggregate {
>         public function getIterator(): Generator<Token>
>         public function toArray(): Token[]
>     }
>
>     final class Token {
>         // Ideally this would be an enum, but...
>         public const OPEN_TAG;
>         public const CLOSE_TAG;
>         // ...
>
>         // I hope these are not mutable!
>         public int $category;  // type
>         public string $lexeme; // text
>         public int $line;
>         public int $column;    // we don't have that :(
>
>         /** @see token_name */
>         public function getCategoryName(): string;
>
>         // We could add `is*` methods for the various categories here.
>         public function isOpenTag(): bool;
>         public function isCloseTag(): bool;
>         // ...
>     }
>
> With this in place, we're ready for the future. The `TokenStream` can
> easily use a generator over the internal array. Or we offer the
> `toArray` method only at the beginning. Users will have to call it, but
> that overhead is tiny, considering that we can extend upon it in the
> future without introducing any kind of breaking change.
>
> Obviously this could go into the namespace `PHP\Parser`, or we prefix
> everything with `PHP` (I'd be for the former).
>
> On a side note, going for `Php` instead of `PHP` is inconsistent with
> the naming that is currently dominating the PHP core:
>
> https://secure.php.net/manual/en/indexes.functions.php
>
> There are already some things that violate it, e.g. `Phar` instead of
> `PHAR`, but most things keep acronyms intact (`DOM`, `XML`, `PDO`, ...).
> Introducing even more inconsistency to PHP seems like a very bad idea to
> me. I know that this is considered bikeshedding by many people, but
> consistency is very important and we are already lacking it on too many
> levels as it is.
>
> --
> Richard "Fleshgrinder" Fussenegger
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php
>
>

Reply via email to