Re: String representation

2000-12-21 Thread Nicholas Clark
On Thu, Dec 21, 2000 at 05:36:05PM +, Nick Ing-Simmons wrote: > Nicholas Clark <[EMAIL PROTECTED]> writes: > >> > >> where it is possible to get "smart" when one arg is a "special case" of > >> the other. > > > >> And similarly numbers must be convertible to "complex long double" or > >> wha

Re: String representation

2000-12-21 Thread Nick Ing-Simmons
Nicholas Clark <[EMAIL PROTECTED]> writes: >> >> where it is possible to get "smart" when one arg is a "special case" of >> the other. > >> And similarly numbers must be convertible to "complex long double" or >> whatever is the top of the built-in tree? (NV I guess - complex is >> overkill.)

Re: String representation

2000-12-21 Thread Nicholas Clark
On Wed, Dec 20, 2000 at 11:07:39PM +, Nick Ing-Simmons wrote: > The snag is that there are common pairs > e.g. concat(utf8,ascii) / concat(ascii,utf8) > or > plus(NV,IV) / plus(IV,NV) > > where it is possible to get "smart" when one arg is a "special case" of > the other. >
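
The "special case" idea in the pairs above can be sketched roughly as follows. This is only an illustration in Python, not code from the thread: the type tags, the `WIDER` table, and the `common_type`, `concat`, and `plus` helpers are all hypothetical names, and it assumes that ASCII data is already valid UTF-8 and that an IV can be widened to an NV.

```python
# Hypothetical type tags: "ascii" is treated as a special case of "utf8",
# and "IV" (integer) as a special case of "NV" (floating point).
WIDER = {"ascii": "utf8", "IV": "NV"}


def common_type(a, b):
    """Return the type both operands can be promoted to, or None."""
    if a == b:
        return a
    if WIDER.get(a) == b:  # a is a special case of b
        return b
    if WIDER.get(b) == a:  # b is a special case of a
        return a
    return None


def concat(left, right):
    """Concatenate two (type_tag, bytes) pairs, promoting the narrower one."""
    (lt, lv), (rt, rv) = left, right
    target = common_type(lt, rt)
    if target is None:
        raise TypeError(f"no common string type for {lt} and {rt}")
    # ASCII bytes are valid UTF-8 as-is, so promotion needs no re-encoding.
    return (target, lv + rv)


def plus(left, right):
    """Add two (type_tag, number) pairs, promoting IV to NV when needed."""
    (lt, lv), (rt, rv) = left, right
    target = common_type(lt, rt)
    if target is None:
        raise TypeError(f"no common numeric type for {lt} and {rt}")
    result = lv + rv
    return (target, float(result) if target == "NV" else result)
```

With one promotion table, concat(utf8,ascii) and concat(ascii,utf8) take the same path, so the mirrored pair does not need two separate routines.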

Re: String representation

2000-12-21 Thread Philip Newton
On 18 Dec 00, at 15:21, Nick Ing-Simmons wrote: > There needs to be a hierarchy of _repertoires_ such that: > > ASCII is subset of Native is subset of wchar_t is subset of UNICODE. But we can't even rely on that. I can imagine a couple of Native encodings around that fiddle with ASCII (for exam

Re: String representation

2000-12-21 Thread Nick Ing-Simmons
Philip Newton <[EMAIL PROTECTED]> writes: >On 18 Dec 00, at 15:21, Nick Ing-Simmons wrote: > >> There needs to be a hierarchy of _repertoires_ such that: >> >> ASCII is subset of Native is subset of wchar_t is subset of UNICODE. > >But we can't even rely on that. I can imagine a couple of Native

Re: String representation

2000-12-20 Thread Nick Ing-Simmons
David Mitchell <[EMAIL PROTECTED]> writes: >The problem is "what are the (types of) the arguments passed > >I don't really see why types of args are (in general) a problem. Hmm, you may be right at the level of your example, which may indeed be typical of pp_(). Perhaps PerlIO is so bother so

Re: String representation

2000-12-19 Thread Nicholas Clark
On Tue, Dec 19, 2000 at 06:11:06PM +, David Mitchell wrote: > Since in real life the types of args are often the same, this will usually > be a win. I found that you have to make an effort to make them the same, else generally enough of them aren't that decision making code outweighs speed ga

Re: String representation

2000-12-19 Thread David Mitchell
Nick Ing-Simmons <[EMAIL PROTECTED]> wrote: > David Mitchell <[EMAIL PROTECTED]> writes: > >Nick Ing-Simmons <[EMAIL PROTECTED]> wrote: > >> What are string functions in your view? > >> m// > >> s/// > >> join() > >> substr > >> index > >> lc, lcfirst, ... > >> & | ~ > >> ++ > >>

Re: String representation

2000-12-18 Thread Kai Henningsen
[EMAIL PROTECTED] (Jarkko Hietaniemi) wrote on 15.12.00 in <[EMAIL PROTECTED]>: > On Fri, Dec 15, 2000 at 12:13:01PM +, Simon Cozens wrote: > > IMHO, the first thing we need to design and code is the API and runtime > > library, since everything else builds on top of that, and we can design

Re: String representation

2000-12-18 Thread Jarkko Hietaniemi
> >> As I pointed out on p5p even EBCDIC machines can use that model - but > >> the downside is that ord('A') == 65 which will break backward compatibility > >> with EBCDIC scripts. > > > >Maybe we need $ENV{PERL_ENCODING} to control ord() and chr(), too? > > That was my suggestion last week
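
To make the ord('A') concern concrete, here is a small Python sketch of an environment-controlled ord()/chr() pair. It is an illustration only: `native_ord` and `native_chr` are hypothetical helpers, reading `PERL_ENCODING` here merely mirrors the suggestion quoted above, and Python's `cp037` codec stands in for "an EBCDIC code page".

```python
import os


def _pick_encoding(encoding):
    """Use the explicit encoding, else the (hypothetical) PERL_ENCODING setting."""
    return encoding or os.environ.get("PERL_ENCODING")


def native_ord(ch, encoding=None):
    """Return the code of one character in the chosen single-byte encoding.

    With no encoding selected, fall back to the Unicode code point.
    """
    encoding = _pick_encoding(encoding)
    if not encoding:
        return ord(ch)
    encoded = ch.encode(encoding)
    if len(encoded) != 1:
        raise ValueError(f"{ch!r} is not a single byte in {encoding}")
    return encoded[0]


def native_chr(code, encoding=None):
    """Inverse of native_ord for the chosen single-byte encoding."""
    encoding = _pick_encoding(encoding)
    if not encoding:
        return chr(code)
    return bytes([code]).decode(encoding)
```

Under these assumptions, native_ord("A", "cp037") gives 0xC1, the value an EBCDIC script would expect, while the Unicode model gives 65.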

Re: String representation

2000-12-18 Thread Jarkko Hietaniemi
> At worst we have to write a "worst case" override entry for each op and > then work what it needs back - this is exemplified by PerlIO_getpos() > the "position" arg had to stop being an Fpos_t and become an SV * > so that stdio could stuff an Fpos_t in it, but a transcoding layer > could put th

Re: String representation

2000-12-18 Thread Nick Ing-Simmons
Jarkko Hietaniemi <[EMAIL PROTECTED]> writes: >On Mon, Dec 18, 2000 at 03:21:05PM +, Nick Ing-Simmons wrote: >> Simon Cozens <[EMAIL PROTECTED]> writes: >> > >> >So, before we start even thinking about what we need, it's time to look at the >> >vexed question of string representation. How do w

Re: String representation

2000-12-18 Thread Nick Ing-Simmons
David Mitchell <[EMAIL PROTECTED]> writes: >Nick Ing-Simmons <[EMAIL PROTECTED]> wrote: >> What are string functions in your view? >> m// >> s/// >> join() >> substr >> index >> lc, lcfirst, ... >> & | ~ >> ++ >> vec >> '.' >> '.=' >> >> It rapidly gets out of hand. > >Per

Re: String representation

2000-12-18 Thread Nick Ing-Simmons
Nicholas Clark <[EMAIL PROTECTED]> writes: >On Fri, Dec 15, 2000 at 11:18:00AM -0600, Jarkko Hietaniemi wrote: > >> As painful as it may sound (codingwise) I would urge to spare some >> thought to using (internally) UTF-32 for those encodings for which >> UTF-8 would be *longer* than the UTF-32 (m

Re: String representation

2000-12-18 Thread Nick Ing-Simmons
David Mitchell <[EMAIL PROTECTED]> writes: >> Personally I would not use such a beast > >But with different encodings implemented by different SV types - each with their >own vtable - surely most of this will "come out in the wash", by the correct >method automatically being called. I thought tha

Re: String representation

2000-12-18 Thread Jarkko Hietaniemi
On Mon, Dec 18, 2000 at 03:21:05PM +, Nick Ing-Simmons wrote: > Simon Cozens <[EMAIL PROTECTED]> writes: > > > >So, before we start even thinking about what we need, it's time to look at the > >vexed question of string representation. How do we do Unicode without getting > >into the horrendous

Re: String representation

2000-12-18 Thread David Mitchell
Nick Ing-Simmons <[EMAIL PROTECTED]> wrote:
> e.g.
>
> if (SvENCODING(sv_a) != SvENCODING(sv_b))
>  {
>   if (SvENCODING(sv_a)->is_superset_of(SvENCODING(sv_b))
>    {
>     sv_upgrade_to(sv_b,SvENCODING(sv_a));
>    }
>   elsif if (SvENCODING(sv_b)->is_superset_of(SvENCODIN
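
The quoted fragment is cut off, but the upgrade-to-superset idea it starts can be sketched in Python as below. Everything here is an assumption made for illustration: the `Str` class, the `SUPERSETS` table, and the encoding names are hypothetical, and the final error branch is a guess at what happens when neither side contains the other, which the fragment does not show.

```python
# Hypothetical repertoire table: each encoding maps to the set of encodings
# whose repertoire it contains (itself included).
SUPERSETS = {
    "ascii": {"ascii"},
    "native": {"ascii", "native"},
    "unicode": {"ascii", "native", "unicode"},
    "other": {"other"},  # deliberately unrelated to the rest
}


class Str:
    """Toy stand-in for a string value that carries an encoding tag."""

    def __init__(self, encoding, text):
        self.encoding = encoding
        self.text = text


def is_superset_of(a, b):
    """True if encoding a can represent everything encoding b can."""
    return b in SUPERSETS[a]


def unify(sv_a, sv_b):
    """Upgrade the narrower operand in place; return the shared encoding."""
    if sv_a.encoding != sv_b.encoding:
        if is_superset_of(sv_a.encoding, sv_b.encoding):
            sv_b.encoding = sv_a.encoding
        elif is_superset_of(sv_b.encoding, sv_a.encoding):
            sv_a.encoding = sv_b.encoding
        else:
            # Assumption: the truncated original does not show this case.
            raise ValueError("neither encoding is a superset of the other")
    return sv_a.encoding
```

A set-based table rather than a strict ranking leaves room for encodings that do not fit a single chain, which is the objection raised elsewhere in the thread.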

Re: String representation

2000-12-18 Thread Nicholas Clark
On Fri, Dec 15, 2000 at 11:18:00AM -0600, Jarkko Hietaniemi wrote: > As painful as it may sound (codingwise) I would urge to spare some > thought to using (internally) UTF-32 for those encodings for which > UTF-8 would be *longer* than the UTF-32 (mainly the Asian scripts). most CPUs can load a
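
The size trade-off under discussion is easy to measure. The Python sketch below is an illustration, not something from the thread: `encoded_sizes` is a hypothetical helper, and the little-endian codec variants are used only so that no byte-order mark is counted.

```python
def encoded_sizes(text):
    """Return the byte length of text in three Unicode encoding forms."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),  # -le: no byte-order mark
        "utf-32": len(text.encode("utf-32-le")),
    }
```

For characters in the Basic Multilingual Plane, including common CJK ideographs, UTF-8 as defined today takes three bytes per character against four for UTF-32, so for such text the saving comes from UTF-16 rather than UTF-32. It may be worth running the measurement on representative data before choosing an internal form.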

Re: String representation

2000-12-18 Thread Jarkko Hietaniemi
On Mon, Dec 18, 2000 at 10:30:53AM -0500, Philip Newton wrote: > On Sat, 16 Dec 2000, Jarkko Hietaniemi wrote: > > > On Fri, Dec 15, 2000 at 03:10:16PM -0500, Dan Sugalski wrote: > > > At 11:18 AM 12/15/00 -0600, Jarkko Hietaniemi wrote: > > > > > > > >As painful as it may sound (codingwise) I wo

Re: String representation

2000-12-18 Thread David Mitchell
Nick Ing-Simmons <[EMAIL PROTECTED]> wrote: > What are string functions in your view? > m// > s/// > join() > substr > index > lc, lcfirst, ... > & | ~ > ++ > vec > '.' > '.=' > > It rapidly gets out of hand. Perhaps, but consider that somewhere within the perl internals

Re: String representation

2000-12-18 Thread Philip Newton
On Sun, 17 Dec 2000, Dan Sugalski wrote: > I'm thinking for speed that binary and UTF-32 should be our internal > representations, at least for the data that gets handed to the regex > engine. Or at least we use a constant-width character that's 8 and 32 bits, > if I'm misusing UTF-32. (UTF-8

Re: String representation

2000-12-18 Thread Philip Newton
On Sat, 16 Dec 2000, Jarkko Hietaniemi wrote: > On Fri, Dec 15, 2000 at 03:10:16PM -0500, Dan Sugalski wrote: > > At 11:18 AM 12/15/00 -0600, Jarkko Hietaniemi wrote: > > > > > >As painful as it may sound (codingwise) I would urge to spare some > > >thought to using (internally) UTF-32 for those

Re: String representation

2000-12-18 Thread Nicholas Clark
On Mon, Dec 18, 2000 at 02:43:14PM +, Nick Ing-Simmons wrote: > David Mitchell <[EMAIL PROTECTED]> writes: > > > >Personally I feel that that string part of the SV API should include most > >(if not all) string functions, including regex matching and substitution. [list of potential string op

Re: String representation

2000-12-18 Thread Nick Ing-Simmons
Simon Cozens <[EMAIL PROTECTED]> writes: > >So, before we start even thinking about what we need, it's time to look at the >vexed question of string representation. How do we do Unicode without getting >into the horrendous non-Latin1 cockups we're seeing on p5p right now? Well - my theorist's an

Re: String representation

2000-12-18 Thread Nick Ing-Simmons
David Mitchell <[EMAIL PROTECTED]> writes:
>
>Personally I feel that that string part of the SV API should include most
>(if not all) string functions, including regex matching and substitution.

What are string functions in your view?
 m//
 s///
 join()
 substr
 index
 lc, lcfirst, ...
 &

Re: String representation

2000-12-18 Thread David Mitchell
Simon Cozens <[EMAIL PROTECTED]> > IMHO, the first thing we need to design and code is the API and runtime > library, since everything else builds on top of that, and we can design other > stuff in parallel with coding it. (A lot of it will be grunt work.) Personally I feel that that string part

Re: String representation

2000-12-17 Thread Dan Sugalski
At 11:13 AM 12/16/00 -0600, Jarkko Hietaniemi wrote: >On Fri, Dec 15, 2000 at 03:10:16PM -0500, Dan Sugalski wrote: > > At 11:18 AM 12/15/00 -0600, Jarkko Hietaniemi wrote: > > >On Fri, Dec 15, 2000 at 12:13:01PM +, Simon Cozens wrote: > > > > IMHO, the first thing we need to design and code i

Re: String representation

2000-12-16 Thread Jarkko Hietaniemi
On Fri, Dec 15, 2000 at 03:10:16PM -0500, Dan Sugalski wrote: > At 11:18 AM 12/15/00 -0600, Jarkko Hietaniemi wrote: > >On Fri, Dec 15, 2000 at 12:13:01PM +, Simon Cozens wrote: > > > IMHO, the first thing we need to design and code is the API and runtime > > > library, since everything else b

Re: String representation

2000-12-15 Thread Dan Sugalski
At 11:18 AM 12/15/00 -0600, Jarkko Hietaniemi wrote: >On Fri, Dec 15, 2000 at 12:13:01PM +, Simon Cozens wrote: > > IMHO, the first thing we need to design and code is the API and runtime > > library, since everything else builds on top of that, and we can design > other > > stuff in parallel

Re: String representation

2000-12-15 Thread Jarkko Hietaniemi
On Fri, Dec 15, 2000 at 11:18:00AM -0600, Jarkko Hietaniemi wrote: > On Fri, Dec 15, 2000 at 12:13:01PM +, Simon Cozens wrote: > > IMHO, the first thing we need to design and code is the API and runtime > > library, since everything else builds on top of that, and we can design other > > stuff

Re: String representation

2000-12-15 Thread Jarkko Hietaniemi
On Fri, Dec 15, 2000 at 12:13:01PM +, Simon Cozens wrote: > IMHO, the first thing we need to design and code is the API and runtime > library, since everything else builds on top of that, and we can design other > stuff in parallel with coding it. (A lot of it will be grunt work.) > > So, bef