2017-03-24 19:35 GMT+01:00 Fleshgrinder <p...@fleshgrinder.com>: > On 3/24/2017 4:23 PM, Andrea Faulds wrote: > > Hi Nikita, > > > > Nikita Popov wrote: > > > >> I'd like to add a new TOKEN_AS_OBJECT flag to token_get_all(), which > >> returns an array of PhpToken objects, rather than the mix of plain > >> strings > >> and arrays we currently have. The PhpToken class is defined as: > >> > >> class PhpToken { > >> public $type; > >> public $text; > >> public $line; > >> } > > > > Rather than adding a flag to token_get_all() to return objects, you > > could potentially instead make an equivalent static method on PhpToken > > (PhpToken::getAll() perhaps). That would avoid mixing “object-oriented” > > and “procedural” styles, though I don't know if it matters. It seems > > cleaner to me. > > > > Thanks! > > > > I am also against adding another flag because it violates the single > responsibility principle. However, adding a `getAll` method to the > `PhpToken` data class also seems very wrong, because it violates the > single responsibility once again. > > The real and proper solution is to have an actual PHP Lexer that is > capable of creating a token stream. > > class Lexer { > public function __construct(string $source); > public static function fromFile(string $path): self; > public function tokenize(): TokenStream; > } >
If PhpToken::getAll() violates the SRP, then Lexer::fromFile() is just the same sort of violation. Regards, Niklas > final class TokenStream implements IteratorAggregate { > public function getIterator(): Generator<Token> > public function toArray(): Token[] > } > > final class Token { > // Ideally this would be an enum, but... > public const OPEN_TAG; > public const CLOSE_TAG; > // ... > > // I hope these are not mutable! > public int $category; // type > public string $lexeme; // text > public int $line; > public int $column; // we don't have that :( > > /** @see token_name */ > public function getCategoryName(): string; > > // We could add `is*` methods for the various categories here. > public function isOpenTag(): bool; > public function isCloseTag(): bool; > // ... > } > > With this in place, we're ready for the future. The `TokenStream` can > easily use a generator over the internal array. Or we offer the > `toArray` method only at the beginning. Users will have to call it, but > that overhead is tiny, considering that we can extend upon it in the > future without introducing any kind of breaking change. > > Obviously this could go into the namespace `PHP\Parser`, or we prefix > everything with `PHP` (I'd be for the former). > > On a side note, going for `Php` instead of `PHP` is inconsistent with > the naming that is currently dominating the PHP core: > > https://secure.php.net/manual/en/indexes.functions.php > > There are already some things that violate it, e.g. `Phar` instead of > `PHAR`, but most things keep acronyms intact (`DOM`, `XML`, `PDO`, ...). > Introducing even more inconsistency to PHP seems like a very bad idea to > me. I know that this is considered bikeshedding by many people, but > consistency is very important and we are already lacking it on too many > levels as it is. > > -- > Richard "Fleshgrinder" Fussenegger > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: http://www.php.net/unsub.php > >