Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-23 Thread Marcus Boerger
Hello Stanislav, cool, care to change the code snippet into a test as I've done for Rui's snippet? marcus Sunday, March 23, 2008, 5:06:53 AM, you wrote: >> is broken code and not a single test. If this is not going to change as in >> we are not getting any .phpt files for this feature then

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-22 Thread Stanislav Malyshev
is broken code and not a single test. If this is not going to change as in we are not getting any .phpt files for this feature then there are two As I understand the theory of the thing should be pretty simple, you set input encoding (by config or declare) and internal encoding, and then when

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-22 Thread Marcus Boerger
Hello Alan, Andi, Rui, my impression still is that not a single person uses this crap. I only hear of people claiming they have heard that people use it. But what I see is broken code and not a single test. If this is not going to change as in we are not getting any .phpt files for this feature

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-06 Thread Marcus Boerger
Hello Marcus, Tuesday, March 4, 2008, 7:29:28 PM, you wrote: > Hello Andi, > Tuesday, March 4, 2008, 7:51:07 AM, you wrote: >> Hi Marcus, Johannes, and all, >> First of all let me say that I have no conceptual problem with replacing >> the scanner with re2c. If it's cleaner, performs better an

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-05 Thread Stanislav Malyshev
Hi! Even though I do agree that delaying the release every 2-3 months is bad, I believe this particular case deserves some special treatment. Why? We have a perfectly working parser now and no immediate need to replace it. I agree that the new parser is faster and better, but we are perfectly capa

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-05 Thread Jani Taskinen
On Tue, 2008-03-04 at 20:17 +0100, Hannes Magnusson wrote: > I'll hunt you all down and make you eat 1kg of vegetables each day > after the 5.3 release until proper documentation and upgrade guides > have been written. I already eat that many vegetables a day... what's my punishment? :-p (and Pierr

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-05 Thread Antony Dovgal
On 04.03.2008 21:28, Stanislav Malyshev wrote: > Hi! > >> Right. >> Please take more time if needed, no need to rush and release something >> half-working. >> If it takes several months to prepare 5.3 release, let it be so. > > With this approach we would never release 5.3 - each couple of month

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-04 Thread Hannes Magnusson
On Tue, Mar 4, 2008 at 8:38 PM, Andi Gutmans <[EMAIL PROTECTED]> wrote: > Why do you say it's not documented? > http://www.aconus.com/~oyaji/www/apache_linux_php.htm > http://tinyurl.com/2o8pq2 According to the latter link, our windows binaries don't enable zend-multibyte, is this true? -Hanne

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-04 Thread Hannes Magnusson
On Tue, Mar 4, 2008 at 8:38 PM, Andi Gutmans <[EMAIL PROTECTED]> wrote: > OK just kidding and I agree it would be nice to have it better > documented in the mainstream docs. As it applies mostly to the Asian > users though (Chinese/Japanese) who usually seek localized docs it's > probably not a

RE: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-04 Thread Andi Gutmans
> -----Original Message----- > From: Hannes Magnusson [mailto:[EMAIL PROTECTED] > Sent: Tuesday, March 04, 2008 11:18 AM > To: Stas Malyshev > Cc: Antony Dovgal; Marcus Boerger; Andi Gutmans; > internals@lists.php.net > Subject: Re: [PHP-DEV] [RFC] Replace the flex-based scan

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-04 Thread Stanislav Malyshev
Hi! Improving on that statement: The coolest feature ever is worth absolutely nothing unless it is documented. I agree with the intent - documentation is *very* important. Even though, people use undocumented features too (probably cursing the lazy developers on the way ;) BTW, as far as I

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-04 Thread Hannes Magnusson
On Tue, Mar 4, 2008 at 7:28 PM, Stanislav Malyshev <[EMAIL PROTECTED]> wrote: > The best idea is worth nothing for the users unless it's part of the > release. Improving on that statement: The coolest feature ever is worth absolutely nothing unless it is documented. Don't care if it's a new lan

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-04 Thread Marcus Boerger
Hello Andi, Tuesday, March 4, 2008, 7:51:07 AM, you wrote: > Hi Marcus, Johannes, and all, > First of all let me say that I have no conceptual problem with replacing > the scanner with re2c. If it's cleaner, performs better and a better > maintained piece of software (let's hope Marcus doesn't g

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-04 Thread Stanislav Malyshev
Hi! Right. Please take more time if needed, no need to rush and release something half-working. If it takes several months to prepare 5.3 release, let it be so. With this approach we would never release 5.3 - each couple of months somebody would have a cool idea which would only require init

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-04 Thread Stanislav Malyshev
Hi! We can definitely work towards RE2C in parallel and as Stas said the engine hasn't really been changing very much recently to make this hard (we finished our todos for 5.3). We could even branch off PHP 5.4 right Small correction - we still have a couple of todo items. I think we'll have

RE: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-04 Thread Andi Gutmans
> -----Original Message----- > From: Marcus Boerger [mailto:[EMAIL PROTECTED] > Sent: Tuesday, March 04, 2008 1:39 AM > To: Andi Gutmans > Cc: internals@lists.php.net > Subject: Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an > re2c [1] based lexer > > This

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-04 Thread Scott MacVicar
Marcus Boerger wrote: This sounds like we are going to make the same mistake over and over and over again. Who is forcing a hard timeline on us? Why are we late in the development? I don't get it at all. We haven't done all steps that were on our radar for 5.3. Now that we finally found time to add

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-04 Thread Antony Dovgal
On 04.03.2008 12:38, Marcus Boerger wrote: > This sounds like we are going to make the same mistake over and over and over > again. Who is forcing a hard timeline on us? Why are we late in the > development? I don't get it at all. Right. Please take more time if needed, no need to rush and release s

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-04 Thread Marcus Boerger
Hello Andi, Tuesday, March 4, 2008, 7:51:07 AM, you wrote: > Hi Marcus, Johannes, and all, > First of all let me say that I have no conceptual problem with replacing > the scanner with re2c. If it's cleaner, performs better and a better > maintained piece of software (let's hope Marcus doesn't g

RE: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Andi Gutmans
Hi Marcus, Johannes, and all, First of all let me say that I have no conceptual problem with replacing the scanner with re2c. If it's cleaner, performs better and a better maintained piece of software (let's hope Marcus doesn't get run over) then we can move to re2c. There are a few important thi

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Pierre Joye
Hi Stan, On Mon, Mar 3, 2008 at 10:27 PM, Stanislav Malyshev <[EMAIL PROTECTED]> wrote: > Hi! > > > > intl (and related changes) is almost the only reason why one will upgrade to > > 5.3.x. There is no core (as in zend engine) for 95% of our users. Sorry I was not clear. I did not say that there is no

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Stanislav Malyshev
Hi! intl (and related changes) is almost the only reason why one will upgrade to 5.3.x. There is no core (as in zend engine) for 95% of our users. From NEWS: - Added and improved PHP syntax and semantics: . Added NOWDOC. (Gwynne Raskind, Stas, Dmitry) . Added "?:" operator. (Marcus) . Added sup

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Stanislav Malyshev
Hi! In PHP 6, not 5.3. http://wiki.pooteeweet.org/PhP53#toc3 item 2. ITYM item 1. But that's *extension*, not *engine core*. I'm of course all for having pecl/intl joined :) CVS does merging on its own when there are no conflicts. I am talking about real merge support. As I said, since

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Marcus Boerger
Hello Pierre, Monday, March 3, 2008, 9:31:37 PM, you wrote: > Hi Marcus, > On Mon, Mar 3, 2008 at 9:16 PM, Marcus Boerger <[EMAIL PROTECTED]> wrote: >> Hello Stanislav, >> >> >> Monday, March 3, 2008, 8:48:38 PM, you wrote: >> >> > Hi! >> >> >> It is clearer but it is not a problem. New featu

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Pierre Joye
Hi Marcus, On Mon, Mar 3, 2008 at 9:16 PM, Marcus Boerger <[EMAIL PROTECTED]> wrote: > Hello Stanislav, > > > Monday, March 3, 2008, 8:48:38 PM, you wrote: > > > Hi! > > >> It is clearer but it is not a problem. New features may introduce new > >> dependencies. Having a dependency on libicu wh

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Steph Fox
No one was considering any such move. Having pecl/intl shipped per default as symlinked into ext would be as much optional as --enable-zend-multibyte or --enable-mbstring are right now. This will be more like bringing in zip to 5.2. However it is completely off-topic as it is just one possible cau

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Marcus Boerger
Hello Stanislav, Monday, March 3, 2008, 8:56:41 PM, you wrote: > Hi! >> interface but that wasn't really a good idea. So I came up with a new >> interface and all that this would break is stuff like Phar (well there is > If it breaks phar, it may break others too... Anyway, good description >

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Marcus Boerger
Hello Stanislav, Monday, March 3, 2008, 8:48:38 PM, you wrote: > Hi! >> It is clearer but it is not a problem. New features may introduce new >> dependencies. Having a dependency on libicu while we introduce intl >> and other features related to unicode or i18n. I would agree if we >> were talki

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Steph Fox
Is it clearer why I think PHP 5.x and 6 are different and why I think ICU dependency in the 5.3 core might be a problem? FWIW... I also think that bringing in ICU in 5.3 so late in the cycle - or actually at all in 5.3 - is not such a bright idea. 'so late in the cycle'? We haven't had a beta

[PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Marcus Boerger
Hello everyone, sorry for the crosspost. But recent discussions about: '[RFC] Replace the flex-based scanner with an re2c [1] based lexer' revealed one big issue. During the development of said RFC we dropped --enable-multibyte-support and interaction between engine and ext/mbstring using declar

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Derick Rethans
On Mon, 3 Mar 2008, Stanislav Malyshev wrote: > 4. We expect people to upgrade from 5.2.x to 5.3.x without changing their > systems. > > Is it clearer why I think PHP 5.x and 6 are different and why I think ICU > dependency in the 5.3 core might be a problem? FWIW... I also think that bringing i

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Pierre Joye
On Mon, Mar 3, 2008 at 8:48 PM, Stanislav Malyshev <[EMAIL PROTECTED]> wrote: > Hi! > > > > It is clearer but it is not a problem. New features may introduce new > > dependencies. Having a dependency on libicu while we introduce intl > > and other features related to unicode or i18n. I would agr

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Stanislav Malyshev
Hi! interface but that wasn't really a good idea. So I came up with a new interface and all that this would break is stuff like Phar (well there is If it breaks phar, it may break others too... Anyway, good description of what was changed won't hurt. 2. PHP 5.x is same-major branch, where y

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread George Schlossnagle
On Mar 3, 2008, at 2:48 PM, Stanislav Malyshev wrote: Hi! It is clearer but it is not a problem. New features may introduce new dependencies. Having a dependency on libicu while we introduce intl and other features related to unicode or i18n. I would agree if we were talking about 5.2.x. pe

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Stanislav Malyshev
Hi! It is clearer but it is not a problem. New features may introduce new dependencies. Having a dependency on libicu while we introduce intl and other features related to unicode or i18n. I would agree if we were talking about 5.2.x. pecl/intl is an extension, there's no surprise that you nee

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Marcus Boerger
Hello Stanislav, Monday, March 3, 2008, 7:59:41 PM, you wrote: > Hi! >> 1) If mmap is supported, then use it >> 2) If mmap is not supported or does not work then read the whole stream >> 3) If that is not possible read char by char > Why should it read the whole stream into memory? The file cou

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Pierre Joye
Hi, On Mon, Mar 3, 2008 at 7:59 PM, Stanislav Malyshev <[EMAIL PROTECTED]> wrote: > Just curious who you were answering to... Anyway, to be clear: > 1. PHP 6 is major version with its major feature being Unicode support. > 2. PHP 5.x is same-major branch, where you are not expected to have to

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Stanislav Malyshev
Hi! 1) If mmap is supported, then use it 2) If mmap is not supported or does not work then read the whole stream 3) If that is not possible read char by char Why should it read the whole stream into memory? The file could be very big, maybe it would make more sense to read it in chunks?
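
The fallback order discussed in this message can be illustrated with a small sketch. It is written in Python for brevity and is not the C code used by the scanner. The function name `load_source` and the `chunk_size` parameter are hypothetical. As an assumption, it replaces the "read the whole stream" and "char by char" steps with the chunked read suggested in the reply.

```python
import mmap


def load_source(path, chunk_size=8192):
    """Return (strategy, data) for the file at `path`.

    Hypothetical helper: tries mmap first, then falls back to reading
    the file in fixed-size chunks.
    """
    with open(path, "rb") as fh:
        # 1) Try to map the file. mmap raises ValueError for an empty
        #    file and OSError where mapping is not available.
        try:
            with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as view:
                # Copying the view keeps the sketch simple; a real lexer
                # would scan the mapping in place instead.
                return "mmap", view[:]
        except (ValueError, OSError):
            pass

        # 2) Fallback: read in chunks rather than one byte at a time.
        fh.seek(0)
        chunks = []
        while True:
            block = fh.read(chunk_size)
            if not block:
                break
            chunks.append(block)
        return "chunked", b"".join(chunks)
```

Note that the chunked branch still ends up holding the whole file in memory once the pieces are joined; a scanner that wants to avoid this would have to consume each chunk as it arrives, which is the harder part of the question raised above.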

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Stanislav Malyshev
Hi! Since there's no documentation about zend-multibyte stuff I spent some time searching for other resources about it, but except for bug reports I found nothing where it was required. I'm sure there are some but comments like "TODO: support widechars" in the code give me the impression that it does

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Alan Knowles
a few replaces with this file should be a good testcase - probably worth testing * comments with these characters in them. both /* and // * string with these characters in them. lynx -source 'http://smontagu.damowmow.com/genEncodingTest.cgi?family=windows&codepage=950' | grep test | grep -v tes

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Lukas Kahwe Smith
On 03.03.2008, at 00:48, Alan Knowles wrote: Can you clarify the Multibyte issues: - I presume this means that it can handle ASCII/UTF8/16 etc. but will not handle things like BIG5/GB encoding in source code - this may be a bit of an issue around here.. At first I also thought that this

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Johannes Schlüter
Hi, On Sun, 2008-03-02 at 14:47 -0800, Stanislav Malyshev wrote: > Hi! > > > be much easier, switching to re2c promises a much faster lexer. Actually, > > without any specific re2c optimizations we already get around a 20% scanner > > I think 20% faster is very cool. > However, as I understand r

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Marcus Boerger
Hello Derick, ok, for now I changed to not issue any error at all. marcus Monday, March 3, 2008, 11:28:31 AM, you wrote: > On Mon, 3 Mar 2008, Marcus Boerger wrote: >> actually you get a message (E_COMPILE_WARNING) that this is not >> supported. Maybe we could turn this into an E_NOTICE th

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Marcus Boerger
Hello Alan, be my hero then :-) Could you generate a few tests for the multibyte support so that we know how it is used right now and what we need to take care of? marcus Monday, March 3, 2008, 12:48:44 AM, you wrote: > Can you clarify the Multibyte issues: > - I presume this means that it ca

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Marcus Boerger
Hello Stanislav, Monday, March 3, 2008, 5:39:35 AM, you wrote: > Hi! >>> Were the stream support issues solved? >> >> We completely dropped multibyte support. The reason is that the way we were > I wasn't asking about multibyte (that we discuss below), but about other > streams - I think I me

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Derick Rethans
On Mon, 3 Mar 2008, Marcus Boerger wrote: > actually you get a message (E_COMPILE_WARNING) that this is not > supported. Maybe we could turn this into an E_NOTICE though. No, I don't get any warning/notice/ whatever with PHP 5.3: [EMAIL PROTECTED]:~$ php-5.3dev -derror_reporting=65535 foo

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Marcus Boerger
Hello Derick, actually you get a message (E_COMPILE_WARNING) that this is not supported. Maybe we could turn this into an E_NOTICE though. marcus Monday, March 3, 2008, 9:28:01 AM, you wrote: > On Sun, 2 Mar 2008, Marcus Boerger wrote: >> However, we had to drop multibyte support as well as

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Johannes Schlüter
Hi Derick, On Mon, 2008-03-03 at 09:28 +0100, Derick Rethans wrote: > On Sun, 2 Mar 2008, Marcus Boerger wrote: > > > However, we had to drop multibyte support as well as the encoding > > declare. > > Just wondering, why did you have to drop the "declare(encoding=...)" ? > It's just ignored in

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-03 Thread Derick Rethans
On Sun, 2 Mar 2008, Marcus Boerger wrote: > However, we had to drop multibyte support as well as the encoding > declare. Just wondering, why did you have to drop the "declare(encoding=...)" ? It's just ignored in PHP 5.x - and it is useful to have for migrating php 5.3 apps to 6. So can you at

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-02 Thread Stanislav Malyshev
I don't think this part is a concern since we have required re2c for quite a while now to build many critical parts of PHP. People who Ok, great then - only issue remaining is the multibyte support. -- Stanislav Malyshev, Zend Software Architect [EMAIL PROTECTED] http://www.zend.com/ (408)2

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-02 Thread Stanislav Malyshev
Hi! Were the stream support issues solved? We completely dropped multibyte support. The reason is that the way we were I wasn't asking about multibyte (that we discuss below), but about other streams - I think I mentioned it on IRC last time re2c parser was discussed. I remember re2c used

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-02 Thread Alan Knowles
Can you clarify the Multibyte issues: - I presume this means that it can handle ASCII/UTF8/16 etc. but will not handle things like BIG5/GB encoding in source code - this may be a bit of an issue around here.. Regards Alan Marcus Boerger wrote: RFC: REPLACE THE FLEX-BASED SCANNER WITH AN RE2

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-02 Thread Marcus Boerger
Hello Rasmus, Monday, March 3, 2008, 12:25:52 AM, you wrote: > Stanislav Malyshev wrote: >> Hi! >> >>> be much easier, switching to re2c promises a much faster lexer. >>> Actually, >>> without any specific re2c optimizations we already get around a 20% >>> scanner >> >> I think 20% faster is ve

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-02 Thread Pierre Joye
Hi Stan, On Sun, Mar 2, 2008 at 11:47 PM, Stanislav Malyshev <[EMAIL PROTECTED]> wrote: > Hi! > > > > be much easier, switching to re2c promises a much faster lexer. Actually, > > without any specific re2c optimizations we already get around a 20% scanner > > I think 20% faster is very cool. >

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-02 Thread Marcus Boerger
Hello Stanislav, Sunday, March 2, 2008, 11:47:57 PM, you wrote: > Hi! >> be much easier, switching to re2c promises a much faster lexer. Actually, >> without any specific re2c optimizations we already get around a 20% scanner > I think 20% faster is very cool. > However, as I understand re2c is

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-02 Thread Rasmus Lerdorf
Stanislav Malyshev wrote: Hi! be much easier, switching to re2c promises a much faster lexer. Actually, without any specific re2c optimizations we already get around a 20% scanner I think 20% faster is very cool. However, as I understand re2c is not a standard tool found everywhere. So what

Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-02 Thread Stanislav Malyshev
Hi! be much easier, switching to re2c promises a much faster lexer. Actually, without any specific re2c optimizations we already get around a 20% scanner I think 20% faster is very cool. However, as I understand re2c is not a standard tool found everywhere. So what happens if you wanted to us

[PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

2008-03-02 Thread Marcus Boerger
RFC: REPLACE THE FLEX-BASED SCANNER WITH AN RE2C [1] BASED LEXER Situation: The current flex-based lexer depends on an outdated and unsupported flex version. Alternatives include either updating to a newer version of flex or using re2c, which we already use for a variety of things (serializing, pd