On December 15, 2003 06:36 pm, Rasmus Lerdorf wrote:
> As Stig says, the correct solution would be to always store the encoding
> of the string right alongside the length of the string in the guts of PHP.
> Anything short of that is going to be a hack. PHP6 here we come...
Then here is our first
On Mon, 15 Dec 2003, Derek Ford wrote:
> I see no example of him implying he wanted to "dismiss" multibyte users,
> he simply suggested mb_* versions of the string manipulation functions
> and pointed available facilities that people can use already. I support
> that idea, as having a mb_ versio
Stig S. Bakken wrote:
On Sun, 2003-12-14 at 00:28, Ilia Alshanetsky wrote:
On December 13, 2003 05:52 pm, Moriyoshi Koizumi wrote:
I haven't denied it. That said, multibyte facility is not so fancy
as XML, but quite essential so as to enable most applications to work
well under every envir
On December 15, 2003 10:36 am, Moriyoshi Koizumi wrote:
> Well, the legacy users of PHP4 will significantly suffer for
> PHP5's new features.
How so? PHP 5 does break BC (especially for objects) but this is something
that was talked about for years and the consensus is/was that the change is
for
On Tue, 16 Dec 2003, Moriyoshi Koizumi wrote:
>
> On 2003/12/16, at 0:42, Derick Rethans wrote:
>
> > On Tue, 16 Dec 2003, Moriyoshi Koizumi wrote:
> >
> >>> If you were designing a new language you wouldn't have legacy users
> >>> who'd suffer (significantly) because of features added for other
>
On 2003/12/16, at 0:42, Derick Rethans wrote:
On Tue, 16 Dec 2003, Moriyoshi Koizumi wrote:
If you were designing a new language you wouldn't have legacy users
who'd suffer (significantly) because of features added for other
users.
Well, the legacy users of PHP4 will significantly suffer for
PHP5
On Tue, 16 Dec 2003, Moriyoshi Koizumi wrote:
> > If you were designing a new language you wouldn't have legacy users
> > who'd suffer (significantly) because of features added for other
> > users.
>
> Well, the legacy users of PHP4 will significantly suffer for
> PHP5's new features.
Uh? Where d
On 2003/12/16, at 0:32, Ilia Alshanetsky wrote:
On December 15, 2003 05:37 am, Stig S. Bakken wrote:
So you think the right solution is to dismiss multibyte users and direct
them to the hacks (mbstring etc) that have been used previously instead
of thinking ahead?
IMHO calling multibyte a hack w
On December 15, 2003 05:37 am, Stig S. Bakken wrote:
> So you think the right solution is to dismiss multibyte users and direct
> them to the hacks (mbstring etc) that have been used previously instead
> of thinking ahead?
IMHO calling multibyte a hack would be great disservice to the developers o
On Sun, 2003-12-14 at 00:28, Ilia Alshanetsky wrote:
> On December 13, 2003 05:52 pm, Moriyoshi Koizumi wrote:
> > I haven't denied it. That said, multibyte facility is not so fancy
> > as XML, but quite essential so as to enable most applications to work
> > well under every environment.
>
> Bull
On Fri, 2003-12-12 at 23:28, Ilia Alshanetsky wrote:
> On December 12, 2003 04:18 pm, Moriyoshi Koizumi wrote:
> > I disagree, because of the following reasons:
> >
> > 1) Not a few people *actually* use fgetcsv() commonly
> > with multibyte characters indeed. Regarding this,
> > applicatio
On December 13, 2003 05:52 pm, Moriyoshi Koizumi wrote:
> I haven't denied it. That said, multibyte facility is not so fancy
> as XML, but quite essential so as to enable most applications to work
> well under every environment.
Bullshit. Only applications that need to support multibyte strings nee
On 2003/12/14, at 7:33, Ilia Alshanetsky wrote:
Percentages aside you cannot deny the fact that not every application needs
multibyte support (whether this is a majority or 50/50 does not matter).
If a user needs to use multibyte they may need to do a little searching to
find a provider that su
On December 13, 2003 04:46 pm, Moriyoshi Koizumi wrote:
> > The critical point of this entire discussion is about NOT forcing choices on
> > people who do not want/need them. There is no good reason to force multibyte
> > version of fgetcsv() on every single user, when there are not one but
On 2003/12/14, at 6:46, Moriyoshi Koizumi wrote:
I made is the best portable and the fastest code, it's probable
that there are a far better code.
s/there are/there'd be/
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
On 2003/12/14, at 6:19, Ilia Alshanetsky wrote:
On December 13, 2003 03:53 pm, Moriyoshi Koizumi wrote:
Could a quarter be a minority?
Unless the rules of mathematics had changed 25% is still a minority. You also
forget that there are plenty of people who compile extensions and never end
up usin
On December 13, 2003 03:53 pm, Moriyoshi Koizumi wrote:
> Could a quarter be a minority?
Unless the rules of mathematics had changed 25% is still a minority. You also
forget that there are plenty of people who compile extensions and never end
up using them.
The critical point of this entire dis
On 2003/12/14, at 5:55, Ilia Alshanetsky wrote:
On December 13, 2003 03:27 pm, Moriyoshi Koizumi wrote:
As a sidenote, this unrealistic statistics appear to be quite unreal.
phpinfo() => 186,000 (pages) [1]
phpinfo() mbstring => 8,330
phpinfo() Server API Configure Command => 16,800
phpinfo() Ser
On December 13, 2003 03:27 pm, Moriyoshi Koizumi wrote:
> As a sidenote, this unrealistic statistics appear to be quite unreal.
>
> phpinfo() => 186,000 (pages) [1]
> phpinfo() mbstring => 8,330
> phpinfo() Server API Configure Command => 16,800
> phpinfo() Server API Configure Command mbstring =>
On 2003/12/13, at 9:36, Ilia Alshanetsky wrote:
There is a good chance you are correct. However my assumption is not without
basis, please consider the following statistic:
Google finds 185,000 (or so) phpinfo() pages, when mbstring is added to the
search query only 8150 pages are found. That l
On 2003/12/14, at 1:07, Rasmus Lerdorf wrote:
On Sat, 13 Dec 2003, Jan Schneider wrote:
I have to agree. While in the past it helped mb users to turn on overloading
if they wanted to use our framework, it will now break it. This is because
we now explicitly use the str*() function for byte-wise
On Sat, 13 Dec 2003, Jan Schneider wrote:
> With the current implementation and assuming that mbstring overloading is
> turned off, I can. This is not documented, but I'd still consider a change
> of this behaviour a huge BC break.
The documentation states "characters" and nowhere does it say the si
Quoting Rasmus Lerdorf <[EMAIL PROTECTED]>:
> On Sat, 13 Dec 2003, Jan Schneider wrote:
> > Maybe. Due to PHP lacking byte stream functions, working with str* is the
> > only solution atm.
>
> And my contention is that there is no way to do this right now. If you
> rely on a str*() function t
On Sat, 13 Dec 2003, Jan Schneider wrote:
> Maybe. Due to PHP lacking byte stream functions, working with str* is the
> only solution atm.
And my contention is that there is no way to do this right now. If you
rely on a str*() function to do this your application is broken since you
cannot reas
Quoting Rasmus Lerdorf <[EMAIL PROTECTED]>:
> On Sat, 13 Dec 2003, Jan Schneider wrote:
> > I have to agree. While in the past it helped mb users to turn on overloading
> > if they wanted to use our framework, it will now break it. This is because
> > we now explicitly use the str*() functi
On Sat, 13 Dec 2003, Jan Schneider wrote:
> I have to agree. While in the past it helped mb users to turn on overloading
> if they wanted to use our framework, it will now break it. This is because
> we now explicitly use the str*() function for byte-wise string
> manipulation and their mb_*() equ
Quoting Derick Rethans <[EMAIL PROTECTED]>:
> On Sat, 13 Dec 2003, Moriyoshi Koizumi wrote:
>
> > Overloading is evil, because functions like substr() are often
> > used to splice a certain length of octets byte-wise while mb_substr()
> > treats the sequence of octets on a character-basis. And o
Quoting Moriyoshi Koizumi <[EMAIL PROTECTED]>:
> > The cool thing that mbstring provides is transparent overloading of some
> > of the common string manipulation functions. This means that at least for
> > a subset of applications, even though they may not have been written with
> >
On Fri, 12 Dec 2003, Ilia Alshanetsky wrote:
> On December 12, 2003 08:54 pm, Moriyoshi Koizumi wrote:
> > And overloading
> > cannot be turned on in scripts, this prevents us from writing portable
> > scripts.
>
> Not entirely true, while you cannot enable it from within a script you can
> enable i
On Fri, 12 Dec 2003, Rasmus Lerdorf wrote:
> We need to move towards a uniform platform that works for everyone without
> putting undue strain on either side.
Sure we do, but not at a 200-250% performance loss.
Derick
On Sat, 13 Dec 2003, Moriyoshi Koizumi wrote:
> Overloading is evil, because functions like substr() are often
> used to splice a certain length of octets byte-wise while mb_substr()
> treats the sequence of octets on a character-basis. And overloading
> cannot be turned on in scripts, this preven
On 2003/12/13, at 11:12, Rasmus Lerdorf wrote:
On Sat, 13 Dec 2003, Moriyoshi Koizumi wrote:
Overloading is evil, because functions like substr() are often
used to splice a certain length of octets byte-wise while mb_substr()
treats the sequence of octets on a character-basis.
I don't know about t
On Sat, 13 Dec 2003, Moriyoshi Koizumi wrote:
> Overloading is evil, because functions like substr() are often
> used to splice a certain length of octets byte-wise while mb_substr()
> treats the sequence of octets on a character-basis.
I don't know about this happening often. In singlebyte apps
On 2003/12/13, at 11:09, Ilia Alshanetsky wrote:
On December 12, 2003 08:54 pm, Moriyoshi Koizumi wrote:
And overloading
cannot be turned on in scripts, this prevents us from writing portable
scripts.
Not entirely true, while you cannot enable it from within a script you can
enable it for a particu
On December 12, 2003 08:54 pm, Moriyoshi Koizumi wrote:
> And overloading
> cannot be turned on in scripts, this prevents us from writing portable
> scripts.
Not entirely true, while you cannot enable it from within a script you can
enable it for a particular directory via .htaccess or equivalent.
> Here what you get are UTF-8 version of these, which
> is unwanted.
This is the part I don't understand.
> You might say the products will suffice even if they
> are UTF-8 encoded, however the conversion is sometimes irreversible,
> so we then need to avoid any conversion stuff.
>
> ("irreversibl
On 2003/12/13, at 10:47, Rasmus Lerdorf wrote:
On Sat, 13 Dec 2003, Steph wrote:
If you get multibyte data from a form and want to perform string operations on
it such as strlen(), ereg(), etc... would you not need mb_* or iconv_*
functions?
In such a case, yes, I do. But I don't think that's dir
On 2003/12/13, at 10:43, Steph wrote:
The real question is, what does mbstring do that iconv fails to do? And why
do we need to restore the initial form in any sense? And if iconv were
built-in, would mbstring still be needed, and if so, why and where?
I wrote we need to restore it to the initi
On Sat, 13 Dec 2003, Steph wrote:
> > > If you get multibyte data from a form and want to perform string operations on
> > > it such as strlen(), ereg(), etc... would you not need mb_* or iconv_*
> > > functions?
> >
> > In such a case, yes, I do. But I don't think that's directly related
> >
> > If you get multibyte data from a form and want to perform string operations on
> > it such as strlen(), ereg(), etc... would you not need mb_* or iconv_*
> > functions?
>
> In such a case, yes, I do. But I don't think that's directly related
> to the issue..?
The real question is, what do
On 2003/12/13, at 10:35, Ilia Alshanetsky wrote:
On December 12, 2003 08:11 pm, Moriyoshi Koizumi wrote:
Which input?
If you get multibyte data from a form and want to perform string operations on
it such as strlen(), ereg(), etc... would you not need mb_* or iconv_*
functions?
In such a case, ye
On December 12, 2003 08:11 pm, Moriyoshi Koizumi wrote:
> Which input?
If you get multibyte data from a form and want to perform string operations on
it such as strlen(), ereg(), etc... would you not need mb_* or iconv_*
functions?
Ilia
On 2003/12/13, at 10:11, Moriyoshi Koizumi wrote:
That is needed when the encoding in which a script is written and
the one the form uses to submit to the script.
A few more words were missing:
That is needed when the encoding in which a script is written and
the one the form uses to submit to th
On 2003/12/13, at 10:13, Ilia Alshanetsky wrote:
It also seems to me that if you are going to be multibyte inputs you'd have
either iconv or mbstring extension enabled.
Which input? If you are talking about form inputs, we rarely need the
functionality (mbstring.encoding_conversion).
That is need
On December 12, 2003 07:54 pm, Moriyoshi Koizumi wrote:
> On 2003/12/13, at 9:47, Ilia Alshanetsky wrote:
> > Without mbstring enabled, you would not be able to effectively work with
> > multibyte characters. Therefore even if fgetcsv() would work as you may expect
> > with multibyte strings
On 2003/12/13, at 9:56, Ilia Alshanetsky wrote:
On December 12, 2003 03:15 pm, Moriyoshi Koizumi wrote:
If we limited the support to UTF-8 or EUC encoding only, we'd be
able to drastically gain much better performance. But it won't
actually solve practical problems where it is in action.
Could ico
On 2003/12/13, at 9:47, Ilia Alshanetsky wrote:
Without mbstring enabled, you would not be able to effectively work with
multibyte characters. Therefore even if fgetcsv() would work as you may expect
with multibyte strings, that data would not be manageable in most cases.
That's a bogus argument
On December 12, 2003 03:15 pm, Moriyoshi Koizumi wrote:
> If we limited the support to UTF-8 or EUC encoding only, we'd be
> able to drastically gain much better performance. But it won't
> actually solve practical problems where it is in action.
Could iconv stream filters be used to convert vario
Before we get too far offtopic ( fgetcsv() ) let me just quickly summarize my
position. I think it's great that multi-byte support exists in PHP, we have
mbstring, iconv and recode extensions that all help make PHP work with
multibyte strings.
As with most things with PHP the user has a choice o
On December 12, 2003 07:02 pm, Rasmus Lerdorf wrote:
> Ilia, we need to try to avoid this sort of thinking. This "vast majority"
> is most likely only a "vocal majority" these days. It is very likely that
> the non-mb users are actually the "few" and if we continue along your way
> of thinking th
blimey, I just agreed with Rasmus..
> -----Original Message-----
> From: Rasmus Lerdorf [mailto:[EMAIL PROTECTED]
> Sent: 13 December 2003 00:03
> To: Ilia Alshanetsky
> Cc: Moriyoshi Koizumi; PHP Internals
> Subject: Re: [PHP-DEV] Re: Regarding the latest patch on fgetcsv(
Zitat von Ilia Alshanetsky <[EMAIL PROTECTED]>:
> On December 12, 2003 05:38 pm, Moriyoshi Koizumi wrote:
> > And I don't think fgetcsv() is an exception, since htmlentities() can
> > be referred to as an example that is placed in core and
> > supports multibyte strings. As I mentioned, purging th
> Why does a vast majority of users have to endure degradation in performance
> for functionality that is needed by a few? It's as simple as that. Same
> argument applies to basename().
Ilia, we need to try to avoid this sort of thinking. This "vast majority"
is most likely only a "vocal majo
On 2003/12/13, at 8:23, Ilia Alshanetsky wrote:
On December 12, 2003 05:38 pm, Moriyoshi Koizumi wrote:
And I don't think fgetcsv() is an exception, since htmlentities() can
be referred to as an example that is placed in core and
supports multibyte strings. As I mentioned, purging that kind of
fun
On December 12, 2003 05:38 pm, Moriyoshi Koizumi wrote:
> And I don't think fgetcsv() is an exception, since htmlentities() can
> be referred to as an example that is placed in core and
> supports multibyte strings. As I mentioned, purging that kind of
> functionality into the mbstring extension do
On 2003/12/13, at 7:28, Ilia Alshanetsky wrote:
On December 12, 2003 04:18 pm, Moriyoshi Koizumi wrote:
I disagree, because of the following reasons:
1) Not a few people *actually* use fgetcsv() commonly
with multibyte characters indeed. Regarding this,
applications made by those who don'
On December 12, 2003 04:18 pm, Moriyoshi Koizumi wrote:
> I disagree, because of the following reasons:
>
> 1) Not a few people *actually* use fgetcsv() commonly
> with multibyte characters indeed. Regarding this,
> applications made by those who don't use
> such characters don't (and w
On Fri, 12 Dec 2003, Derick Rethans wrote:
> On Sat, 13 Dec 2003, Moriyoshi Koizumi wrote:
>
> > On 2003/12/13, at 4:42, Ilia Alshanetsky wrote:
> >
> > > On a related note I should mention that fgetcsv() in 4.3.X is currently 2.5
> > > times faster than its equivalent in 5.X.
> >
> > I don
On 2003/12/13, at 5:45, Derick Rethans wrote:
I would call that rather unacceptable actually. Isn't it possible to create
a new function for this which handles this MB 'crap' (and the same for
basename) so that we don't have to lose performance because of those
issues?
Don't you think "crap" sounds
On 2003/12/13, at 5:51, Ilia Alshanetsky wrote:
How about we add mb_fgetcsv(), which would have full multi-byte support
(including delimiters). I'd imagine for people who need to parse multi-byte
csv files, full functionality is more important than speed. As for the
fgetcsv() in ext/standard/, we
On Sat, 13 Dec 2003, Moriyoshi Koizumi wrote:
> On 2003/12/13, at 4:42, Ilia Alshanetsky wrote:
>
> > On a related note I should mention that fgetcsv() in 4.3.X is currently 2.5
> > times faster than its equivalent in 5.X.
>
> I don't know why you're mentioning this at this time,
> but I can
How about we add mb_fgetcsv(), which would have full multi-byte support
(including delimiters). I'd imagine for people who need to parse multi-byte
csv files, full functionality is more important than speed. As for the
fgetcsv() in ext/standard/, we can port the 4.3.X code (copy & paste really)
On 2003/12/13, at 5:09, Ilia Alshanetsky wrote:
I'm mentioning this now because we are considering changes to the function in
the development branch, which is a fine time to resolve any deficiencies.
Okay, fine :)
The added functionality, which if I understand correctly is support for
multibyte d
On December 12, 2003 02:40 pm, Moriyoshi Koizumi wrote:
> I don't know why you're mentioning this at this time,
> but I can say it is a sort of necessary evil :) Because the HEAD
> version is capable of handling various encodings, and
> less intricate IMO. Rather, I was surprised about that result,
On 2003/12/13, at 4:42, Ilia Alshanetsky wrote:
On a related note I should mention that fgetcsv() in 4.3.X is currently 2.5
times faster than its equivalent in 5.X.
I don't know why you're mentioning this at this time,
but I can say it is a sort of necessary evil :) Because the HEAD
version is c
On December 12, 2003 02:02 pm, Rasmus Lerdorf wrote:
> I agree that it would be a good idea to provide a mechanism to do that,
> but at this point I don't think we should be changing the behaviour of
> fgetcsv() in either the stable branch or the HEAD branch. I'd add a new
> binary-safe version
On 2003/12/13, at 4:02, Rasmus Lerdorf wrote:
On Fri, 12 Dec 2003, Ilia Alshanetsky wrote:
That said, the whole space trimming behavior seems a little unusual since it
will corrupt content especially if said content contains binary data. IMHO
the data read by fgetcsv() should be fetched in such
On Fri, 12 Dec 2003, Ilia Alshanetsky wrote:
> That said, the whole space trimming behavior seems a little unusual since it
> will corrupt content especially if said content contains binary data. IMHO
> the data read by fgetcsv() should be fetched in such a manner so that the
> original string c
On December 12, 2003 01:36 pm, Moriyoshi Koizumi wrote:
> What do you think of this?
I'll apply a fix momentarily, it wouldn't do to break BC in stable branch.
That said, the whole space trimming behavior seems a little unusual since it
will corrupt content especially if said content contains b