Hi!
be much easier, switching to re2c promises a much faster lexer. Actually,
without any specific re2c optimizations we already get around a 20% scanner
I think 20% faster is very cool.
However, as I understand it, re2c is not a standard tool found everywhere.
So what happens if you want to use it on some exotic system where re2c
is not readily available as maintainer-supported software? Also, flex
is available on Windows, for example as part of Cygwin, while I don't see
re2c there.
I understand this can be of low importance since we keep generated files
in our repositories, but I think we still have to keep it in mind.
I also understand that the current patch requires a non-release version of
re2c - maybe we should have a release version available, at least before
we make PHP depend on it?
Current state:
Flex has been fully replaced by re2c in Zend. We have also switched to an
mmap-based lexer approach for now. However, we had to drop multibyte support
Were the stream support issues solved?
as well as the encoding declare. The current state can be checked out from
Scott's subversion repository [3] and you can follow the development on his
Trac setup [4]. When you want to build php with re2c, then you need to grab
re2c from its sourceforge subversion repository [5]. You can also check out
the changes in a patch created Sunday 2nd March against a PHP checkout from
14th February [6].
Further steps:
Commit this to PHP 5.3. Synch to HEAD. Add pecl/intl to 5.3. Discuss/recreate
multibyte support with libintl.
Note - pecl/intl does nothing towards multibyte support etc., at least
for now. If there are volunteers to change that, it can be discussed, but
so far it is meant for entirely different things (mostly locale-dependent
functionality).
So, I think that before the re2c parser can be merged, the issue with
multibyte compatibility must be solved - otherwise users who rely on it
will be unable to use newer PHP. As cool as 20% faster is, I think we
can't drop support for such a feature, especially not in 5.3.
Commit to 5.3 between the 5th and the 15th of March. Synch to HEAD a couple
of days later. Moving pecl/libintl to ext (depends on the 5.3 RMs decision).
After that is done, decide about multibyte support. Along with the commit to
the 5.3 branch there will be a new re2c version available.
I think we first need to figure out what happens to multibyte support,
and not commit anything before we have it figured out. Multibyte support
is an important piece of functionality for some PHP users, and it works
now. Breaking it without providing any alternative is a problem,
especially now that 5.3 is mostly ready for the release cycle and solving
the multibyte problems with re2c may take an undefined amount of time, as
far as I understand. I do not think it would be acceptable to release 5.3
without multibyte support, so the options here are either to merge it now
and have 5.3 wait until MB is figured out, or to try to figure it out
before the commit and, if we can't within a reasonable time, go forward
with 5.3 and defer the parser change to 5.4.
Again, while I think the speedup is great and congratulate Marcus, Nuno
and Scott on their great work, I think we should keep in mind that we have
a working parser right now, and changing it in an incompatible way is very
high-risk and should not be done hastily.
--
Stanislav Malyshev, Zend Software Architect
[EMAIL PROTECTED] http://www.zend.com/
(408)253-8829 MSN: [EMAIL PROTECTED]
--
PHP Internals - PHP Runtime Development Mailing List