RFC: REPLACE THE FLEX-BASED SCANNER WITH AN RE2C [1] BASED LEXER

Situation:
The current flex-based lexer depends on an outdated and unsupported flex version. The alternatives are to update to a newer version of flex or to switch to re2c, which we already use for several things (serializing, PDO SQL scanning, date/time parsing). Moving to a newer flex version would be much easier, but switching to re2c promises a much faster lexer. Even without any re2c-specific optimizations we already see a scanner performance increase of around 20%, and running the tests shows an overall speedup of 2%. It is arguable whether that alone is enough, but re2c has further advantages. First, re2c can scan any type of input (ASCII, UTF-8, UTF-16, UTF-32). Second, it allows for better integration with Lemon [2], which would be the next step. Third, we can switch to a reentrant scanner.

Current state:
Flex has been fully replaced by re2c in Zend. We have also switched to an mmap-based lexer approach for now. However, we had to drop multibyte support as well as the encoding declare. The current state can be checked out from Scott's Subversion repository [3], and you can follow the development on his Trac setup [4]. To build PHP with re2c, you need to grab re2c from its SourceForge Subversion repository [5]. You can also review the changes in a patch created on Sunday, 2nd March against a PHP checkout from 14th February [6].

Further steps:
- Commit this to PHP 5.3.
- Sync to HEAD.
- Add pecl/intl to 5.3.
- Discuss/recreate multibyte support with libintl.

Future steps:
- Replace bison with Lemon in PHP 5.4 or HEAD.

Time Frame:
- Commit to 5.3 between the 5th and the 15th of March.
- Sync to HEAD a couple of days later.
- Move pecl/intl to ext (depends on the 5.3 RMs' decision).
- After that is done, decide about multibyte support.

Along with the commit to the 5.3 branch there will be a new re2c version available.

Marcus Boerger
Nuno Lopes
Scott MacVicar

[1] http://re2c.org/
[2] http://www.hwaci.com/sw/lemon/
[3] svn://whisky.macvicar.net/php-re2c
[4] http://trac.macvicar.net/php-re2c/
[5] https://re2c.svn.sourceforge.net/svnroot/re2c/trunk/re2c
[6] http://php.net/~helly/php-re2c-20080302.diff.txt

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php