Edward Z. Yang wrote:
> My proposal is to introduce a new filter (for the filter extension)
> which performs codepoint sanitization appropriate for HTML/XML contexts
> (alternatively, this could be an option on the FILTER_DEFAULT filter,
> which would be for Unicode strings, I assume). This filter
Chris Stockton wrote:
> I think that internal string handling should be very respectful of the
> specification, as you said. Perhaps, for code points which are not valid for a
> separate specification, protocol, etc., the conversion should be done in the
> functions dealing with those formats. Like if extensio
I think that internal string handling should be very respectful of the
specification, as you said. Perhaps, for code points which are not valid for a
separate specification, protocol, etc., the conversion should be done in the
functions dealing with those formats. Like if extension family xmlfoo does
not like
In PHP 6, incoming user data will automatically be in (unicode) form.
(That is, assuming that the JIT functionality for converting gets
implemented).
One of the implementation details I'd like to consider involves non-XML
and/or non-SGML codepoints inside markup. As per the Unicode
specification,
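To make the idea of codepoint sanitization for XML contexts concrete, here is a minimal sketch in C. It checks code points against the "Char" production of XML 1.0 and replaces anything outside it. The function names are hypothetical and are not part of the filter extension, and substituting U+FFFD (rather than stripping or rejecting the input) is an assumption made for illustration; the proposal quoted here does not say which behaviour is intended.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if cp matches the XML 1.0 "Char" production, 0 otherwise:
 * #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] */
static int is_xml10_char(uint32_t cp)
{
    return cp == 0x9 || cp == 0xA || cp == 0xD ||
           (cp >= 0x20 && cp <= 0xD7FF) ||
           (cp >= 0xE000 && cp <= 0xFFFD) ||
           (cp >= 0x10000 && cp <= 0x10FFFF);
}

/* Hypothetical helper: replaces every disallowed code point in place with
 * U+FFFD (an assumed policy) and returns how many were replaced. */
static size_t sanitize_xml10(uint32_t *cps, size_t len)
{
    size_t replaced = 0;
    for (size_t i = 0; i < len; i++) {
        if (!is_xml10_char(cps[i])) {
            cps[i] = 0xFFFD;
            replaced++;
        }
    }
    return replaced;
}
```

The sketch works on already-decoded code points, so surrogates (U+D800 to U+DFFF) and the noncharacters U+FFFE and U+FFFF are rejected along with most C0 control characters.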
IMO, this would probably be the easiest and best way to handle the conversions.
Rob
Andrei Zmievski wrote:
Maybe. An alternate way would be to add a modifier to 's' that makes it
accept a converter to use for conversion.
if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s>", &str,
&str_len, U
Maybe. An alternate way would be to add a modifier to 's' that makes it
accept a converter to use for conversion.
    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s>", &str,
                              &str_len, UG(utf8_conv)) == FAILURE) {
        return;
    }
This does mean that the caller will have to instantiate t
Hello Andrei,
Don't we have a char left for UTF-8 (maybe '8')? It would be a case that
we will have to use very often, and checking for a string in braces will
take some time.
best regards
marcus
Friday, July 21, 2006, 9:39:32 PM, you wrote:
> Awesome.
> I am planning to add "s(encoding)" supp
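The trade-off raised here, a single-character specifier versus "s(encoding)", can be illustrated with a small standalone sketch. This is not the actual zend_parse_parameters() implementation; parse_string_spec() is a hypothetical helper written only to show the extra scanning that a parenthesised encoding name requires compared with a bare 's'.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: parses one string specifier at spec, either "s" or
 * "s(name)". Copies the encoding name into enc (left empty for a bare "s").
 * Returns the number of characters consumed, or 0 if the spec is malformed. */
static size_t parse_string_spec(const char *spec, char *enc, size_t enc_size)
{
    if (enc_size == 0 || spec[0] != 's') {
        return 0;
    }
    enc[0] = '\0';
    if (spec[1] != '(') {
        return 1;                      /* bare 's': one character, no scan */
    }
    const char *close = strchr(spec + 2, ')');
    if (close == NULL) {
        return 0;                      /* unterminated "s(" */
    }
    size_t n = (size_t)(close - (spec + 2));
    if (n == 0 || n >= enc_size) {
        return 0;                      /* empty or over-long encoding name */
    }
    memcpy(enc, spec + 2, n);
    enc[n] = '\0';
    return n + 3;                      /* 's' + '(' + name + ')' */
}
```

A bare 's' costs one comparison, while "s(utf-8)" needs a scan for the closing parenthesis and a copy of the name (and, in a real implementation, presumably a converter lookup), which is the overhead a dedicated single-character specifier would avoid.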
I probably won't get to it this weekend. Might have it done during
OSCON next week, so it's up to you.
-Andrei
On Jul 22, 2006, at 6:30 AM, Rob Richards wrote:
Andrei Zmievski wrote:
Awesome.
I am planning to add "s(encoding)" support to parameter parsing,
by the way, so getting strings
Andrei Zmievski wrote:
Awesome.
I am planning to add "s(encoding)" support to parameter parsing, by
the way, so getting strings in UTF-8 encoding will be a bit easier.
Would probably need to change the relevant portions of your commits.
Any idea when this should be ready, or should I just go a
Awesome.
I am planning to add "s(encoding)" support to parameter parsing, by
the way, so getting strings in UTF-8 encoding will be a bit easier.
Would probably need to change the relevant portions of your commits.
-Andrei
On Jul 21, 2006, at 5:45 PM, Rob Richards wrote:
Almost done with
Almost done with DOM (3 more files to go), so hopefully by Monday. This
one will need a lot of testing though.
Rob
Andrei Zmievski wrote:
Great! I'll put a slide about this into my talk for OSCON.
What're your plans for the rest of the XML extensions?
-Andrei
Great! I'll put a slide about this into my talk for OSCON.
What're your plans for the rest of the XML extensions?
-Andrei
On Jul 20, 2006, at 6:15 PM, Rob Richards wrote:
Andrei Zmievski wrote:
Hey Rob,
Looks good. Have you tested the filesystem (filename) related
functions with non-ASCI
Andrei Zmievski wrote:
Hey Rob,
Looks good. Have you tested the filesystem (filename) related
functions with non-ASCII filenames? Try making a file called
"informaçon.xml" for example, set unicode.filesystem_encoding=utf-8
(or whatever encoding your filesystem uses) and see if you can read it
Hey Rob,
Looks good. Have you tested the filesystem (filename) related functions
with non-ASCII filenames? Try making a file called "informaçon.xml" for
example, set unicode.filesystem_encoding=utf-8 (or whatever encoding
your filesystem uses) and see if you can read it.
-Andrei
On Jul 19,
Andrei Zmievski wrote:
Rob,
I have not tested the patch, but it looks good to me on cursory
overview. I assume it passes your tests?
The only comment I have is regarding the usage of 't' and 'T'
specifiers. Since you always have to pass binary UTF-8 strings to
libxml, we should always use 's'
Rob,
I have not tested the patch, but it looks good to me on cursory
overview. I assume it passes your tests?
The only comment I have is regarding the usage of 't' and 'T'
specifiers. Since you always have to pass binary UTF-8 strings to
libxml, we should always use 's' specifier and let PHP d
Had some feedback about a problem with the attached file, so here's also a
link to the diff.
http://www.ctindustries.net/patches/xmlunicode.diff.txt
Rob
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Attached is a patch for my initial cut for unicode and XML (made against
the /ext directory).
I started with XMLReader since it was the smallest.
The code can probably be optimized a bit, but I want to make sure this
is how it should be because the changes made here will be the changes
needed f