That's what I call 'overquoting'.

On 30.04.2009 16:15, Richard Quadling wrote:
> 2009/4/30 Scott MacVicar <scott...@php.net>:
>> [^] is a special case to write a portable match any character in re2c.
>>
>> Scott
>>
>> Dmitry Stogov wrote:
>>> Hi Matt,
>>>
>>> Does this patch fix EOF handling issues related to mmap()? (e.g. parsing
>>> of files with size 4096, 8192, ...). Now we have two dirty fixes to
>>> handle them correctly.
>>>
>>> The patch is quite big to understand it quickly. I'll probably take a
>>> look on weekend.
>>>
>>> -ANY_CHAR [^\x00]
>>> +ANY_CHAR [^]
>>>
>>> Is [^] a correct regular expression?
>>>
>>> Thanks. Dmitry.
>>>
>>> Matt Wilmas wrote:
>>>> Hi Dmitry, Brian, all,
>>>>
>>>> Here's a scanner patch that I mentioned awhile ago, with a possible
>>>> way to work around the re2c EOF handling issues.
>>>>
>>>> The primary change is to do a "manual scan" like I talked about in
>>>> areas that match large amounts and can contain NULL bytes
>>>> (strings/comments, which are now scanned faster too), as is done for
>>>> inline HTML. I called it a "diet" :-) because it removes my
>>>> complicated string regex patterns from a couple years ago, which
>>>> doesn't make the .l file much smaller after adding the manual scan
>>>> code (easier to understand...?), but it does result in a ~34k
>>>> reduction of 5.3's generated .c file...
>>>>
>>>> This fixes Bug #46817, as well as a better, more proper fix for the
>>>> older Bug #42767, both related to ending comments.
>>>>
>>>> Now inline HTML chunks aren't broken up when a tag starting with "s"
>>>> is encountered (<script> for JS, <span>, etc.), since it's unlikely to
>>>> be a long PHP <script> tag.
>>>>
>>>> If an opening PHP <SCRIPT> tag was used with a capital "S", it was
>>>> missed if it wasn't the first thing scanned:
>>>>
>>>> var_dump(token_get_all("HTML...
>>>> <SCRIPT language=php>"));
>>>>
>>>> Single-line comments with a Windows newline didn't include the full \r\n:
>>>>
>>>> var_dump(token_get_all("<?php // Comment\r\n?>"));
>>>>
>>>> Finally, part of the optimized scanning is that, for double quoted
>>>> strings, when the first variable is encountered (making it
>>>> non-constant), the amount that's been scanned up to that point is
>>>> remembered, which can then be skipped over (up to the variable) after
>>>> returning the quote token. Previously that initial part of the string
>>>> was rescanned -- the cost dependent on how far "into" the string the
>>>> first var is.
>>>>
>>>>
>>>> I think that's about all -- I'll send another message if I forgot to
>>>> mention anything... Just wanted to send this along quick for to you
>>>> guys to look at or whatever. It was basically done last week, I just
>>>> had to do a couple finishing touches and verify that everything was OK.
>>>>
>>>> http://realplain.com/php/scanner_diet.diff (Merged changes, but didn't
>>>> test yet.)
>>>> http://realplain.com/php/scanner_diet_5_3.diff
>>>>
>>>>
>>>> Thanks,
>>>> Matt
>>>
>>
>> --
>> PHP Internals - PHP Runtime Development Mailing List
>> To unsubscribe, visit: http://www.php.net/unsub.php
>>
>>
>
> Aha - bottom of section at http://re2c.org/manual.html#lbAJ
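
For anyone skimming the quoted description: the "remember how far we scanned" idea for double-quoted strings might look roughly like the C sketch below. It is only an illustration under my own assumptions, not the code from the patch. The names dq_scan_result, scan_double_quoted and is_label_start are made up, and the variable detection is simplified. It walks the buffer by explicit length, so NUL bytes inside the string are not treated as end of input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type for illustration; not from the PHP scanner. */
typedef struct {
    size_t end;       /* offset of the closing quote, or len if unterminated */
    size_t first_var; /* offset of the first variable start, or end if none */
} dq_scan_result;

/* Simplified check for a character that may start a variable name. */
static int is_label_start(unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           c == '_' || c >= 0x7f;
}

/*
 * Scan the body of a double-quoted string by hand.
 * buf points just past the opening quote; len is the number of bytes left.
 * The loop is bounded by len, so embedded NUL bytes are ordinary data.
 */
dq_scan_result scan_double_quoted(const char *buf, size_t len)
{
    dq_scan_result r;
    size_t i = 0;
    int seen_var = 0;

    assert(buf != NULL || len == 0);
    r.first_var = 0;

    while (i < len) {
        unsigned char c = (unsigned char)buf[i];

        if (c == '\\' && i + 1 < len) {
            i += 2;               /* skip the escaped character */
            continue;
        }
        if (c == '"') {
            break;                /* closing quote */
        }
        if (!seen_var && i + 1 < len) {
            unsigned char n = (unsigned char)buf[i + 1];
            if ((c == '$' && (is_label_start(n) || n == '{')) ||
                (c == '{' && n == '$')) {
                seen_var = 1;     /* string is not constant */
                r.first_var = i;  /* remember how far we got */
            }
        }
        i++;
    }

    r.end = i;
    if (!seen_var) {
        r.first_var = i;          /* constant string: nothing to skip to */
    }
    return r;
}
```

A caller could return the opening quote token and then resume at first_var instead of rescanning the constant prefix, which is the saving the quoted message describes.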

--
Wbr,
Antony Dovgal