mixed numeric and string SVs.

2000-12-20 Thread David Mitchell

Has anyone given thought to how an SV can contain both a numeric value
and string value in Perl6?
Given the arbitrary number of numeric and string types that the vtable
scheme of Perl6 supports, it will be unviable to have special types
for all permutations (eg, utf8_nv, unicode32_iv, ascii_bitint, ad nauseam).

It seems to me the following options are possible:

1. We no longer save conversions, so
$i="3"; $j+=$i for (...);
does an aton() or similar each time round the loop

2. Each SV has 2 vtable pointers - one for its numeric representation
(if any), and one for its string representation (if any). Flexible, but
may require an extra 4/8 bytes per SV.

3. We decree that all string to numeric conversions should return
a particular numeric type (eg NV), and that all numeric to string
conversions should similarly convert to a fixed string type (eg utf8).
(Although I'm not sure that really helps.)

4. Err, that's it.

Any opinions?
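
To make the cost of option 1 concrete, here is a minimal C sketch. It is not taken from any Perl source: the `Scalar` struct, the `conversions` counter and all function names are hypothetical, and `strtod` merely stands in for aton(). It compares converting on every read against converting once and reusing the saved value.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical scalar: a string plus an optional saved numeric value. */
typedef struct {
    const char *str;
    double num;
    int num_valid;   /* nonzero once num holds the converted value */
} Scalar;

static int conversions = 0;   /* counts string-to-number conversions */

/* Stand-in for aton(): convert the string form to a double. */
static double convert(const char *s) {
    assert(s != NULL);
    conversions++;
    return strtod(s, NULL);
}

/* Option 1: never save the result, so every read converts again. */
double get_num_uncached(const Scalar *sv) {
    return convert(sv->str);
}

/* Saving the conversion: convert once, then reuse the stored value. */
double get_num_cached(Scalar *sv) {
    if (!sv->num_valid) {
        sv->num = convert(sv->str);
        sv->num_valid = 1;
    }
    return sv->num;
}

/* Add the scalar n times, mirroring "$j += $i for (...)". */
double sum_loop(Scalar *sv, int n, int cached) {
    double j = 0;
    for (int k = 0; k < n; k++)
        j += cached ? get_num_cached(sv) : get_num_uncached(sv);
    return j;
}
```

With a loop of 100 iterations the uncached path performs 100 conversions and the cached path performs one; the open question in this thread is where that saved value should live.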




Re: mixed numeric and string SVs.

2000-12-20 Thread Nicholas Clark

On Wed, Dec 20, 2000 at 02:26:12PM +, David Mitchell wrote:
> Has anyone given thought to how an SV can contain both a numeric value
> and string value in Perl6?
> Given the arbitrary number of numeric and string types that the vtable
> scheme of Perl6 supports, it will be unviable to have special types
> for all permutations (eg, utf8_nv, unicode32_iv, ascii_bitint, ad nauseam).
> 
> It seems to me the following options are possible:
> 
> 1. We no longer save conversions, so
>   $i="3"; $j+=$i for (...);
> does an aton() or similar each time round the loop

I fear this would be a performance hit. I'm told TCL pre version 8 was
like this - everything's a string and converted to a number each and every
time the number is needed.

> 2. Each SV has 2 vtable pointers - one for its numeric representation
> (if any), and one for its string representation (if any). Flexible, but
> may require an extra 4/8 bytes per SV.

It may not be terrible. How big is the average SV already anyway?

> 3. We decree that all string to numeric conversions should return
> a particular numeric type (eg NV), and that all numeric to string
> conversions should similarly convert to a fixed string type (eg utf8).
> (Although I'm not sure that really helps.)

Feels like a bad plan, as it can be that no single intrinsic type
(ie one native to the compiler of the implementation language) is
a superset of all the others
(eg a platform with 64 bit integers, and the longest floating point type
being 64 bit)
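
A small C sketch of that parenthetical case, assuming IEEE 754 doubles with a 53-bit significand (the function names are made up for illustration): a 64-bit integer cannot always be stored in a 64-bit double without loss, so neither type is a superset of the other.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if v survives a round trip through double, 0 otherwise.
   Assumes IEEE 754 binary64 doubles. */
int fits_in_double(int64_t v) {
    double d = (double)v;
    /* 2^63 does not fit in int64_t, so casting it back would be undefined. */
    if (d >= 9223372036854775808.0)
        return 0;
    return (int64_t)d == v;
}

/* Usage: 2^53 + 1 is the first positive integer a double cannot hold. */
void fits_in_double_example(void) {
    assert(fits_in_double(9007199254740992LL));    /* 2^53 */
    assert(!fits_in_double(9007199254740993LL));   /* 2^53 + 1 */
}
```

The reverse direction fails too, since a double such as 0.5 has no integer representation.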

> 4. Err, that's it.

If vtables are held in a common pool (garbage collected?) with the
flexibility to allow every scalar to potentially have its own, then I
think there's possibility 4.

vtables have subsections, so all the numeric operations are in a subsection,
all the string operations in another.

At most you have (number of numeric types) * (number of string types)
vtables, but actually you just create a new table with the appropriate
numeric & string subsections every time you cause a numeric or string
conversion to a pair this-sort-of-string,that-sort-of-number that you've
not seen before. So there's still only 1 vtable pointer per scalar.

This will slow things down if you attempt to add
double-ascii to double-UTF8, as the if (vtable_a == vtable_b) won't be
true, but it could be possible to store a token in each subsection to
say what it was, so then you get
if (vtable_a == vtable_b || vtable_a->numtype == vtable_b->numtype) {
  /* It's the same sort of number in each  */
} else {
  /* bah. generic logic needed */
}

numtype is NULL (or "string"), or some other token different from all
real numeric types, if the scalar isn't a number, which will prompt numeric
conversion to the best sort of number as need be. (So "3.1 + 5i" would do
the right thing: presumably complex floating point.)
And (like perl5) if you alter a numeric scalar as a string, it
becomes just a string
[so {(3.1 + 5i) . ''} is a string]
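
The pooled-vtable idea can be sketched in C roughly as follows. Everything here is an assumption made for illustration (the `NumOps`/`StrOps` subsections, the fixed-size `pool`, and the function names); the subsection pointer plays the role of the numtype token, and real subsections would hold method pointers rather than just a name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subsections: one per numeric type, one per string type. */
typedef struct { const char *name; } NumOps;   /* numeric methods would go here */
typedef struct { const char *name; } StrOps;   /* string methods would go here */

/* A combined vtable is a pair of subsection pointers. */
typedef struct {
    const NumOps *num;   /* NULL if the scalar has no numeric form */
    const StrOps *str;   /* NULL if the scalar has no string form */
} VTable;

#define POOL_MAX 64
static VTable pool[POOL_MAX];
static size_t pool_used = 0;

/* Return the shared vtable for this (num, str) pair, creating it the
   first time the pair is seen. */
const VTable *vtable_for(const NumOps *num, const StrOps *str) {
    for (size_t i = 0; i < pool_used; i++)
        if (pool[i].num == num && pool[i].str == str)
            return &pool[i];
    assert(pool_used < POOL_MAX);   /* a real pool would grow or be collected */
    pool[pool_used].num = num;
    pool[pool_used].str = str;
    return &pool[pool_used++];
}

/* The fast-path test: same vtable, or same sort of number in each. */
int same_vtable_or_numtype(const VTable *a, const VTable *b) {
    return a == b || (a->num != NULL && a->num == b->num);
}

const NumOps num_double = { "double" };
const NumOps num_int    = { "int" };
const StrOps str_ascii  = { "ascii" };
const StrOps str_utf8   = { "utf8" };
```

Here double-ascii and double-UTF8 get different vtables from the pool, yet `same_vtable_or_numtype` still reports a match because both share the `num_double` subsection.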

Nicholas Clark




Re: mixed numeric and string SVs.

2000-12-20 Thread Jarkko Hietaniemi

> > 3. We decree that all string to numeric conversions should return
> > a particular numeric type (eg NV), and that all numeric to string
> > conversions should similarly convert to a fixed string type (eg utf8).
> > (Although I'm not sure that really helps.)
> 
> Feels like a bad plan, as it can be that no single intrinsic type
> (ie one native to the compiler of the implementation language) is

struct {
    IV whatitis;
    union {
        IV iv;
        UV uv;
        NV nv;
    };
}

:-)

Yeah, doesn't help for complex numbers, quaternions, octonions, or
bignums...

> a superset of all the others
> (eg a platform with 64 bit integers, and the longest floating point type
> being 64 bit)

-- 
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen



Re: mixed numeric and string SVs.

2000-12-20 Thread Nicholas Clark

On Wed, Dec 20, 2000 at 09:00:47AM -0600, Jarkko Hietaniemi wrote:

> struct {
> IV whatitis;

more a perl5 question - why IV not int?
int might be smaller and "more natural" (your words)
eg why does looks_like_number return IV not int? and various other bits
of the perl API use IV?

Nicholas Clark



Re: mixed numeric and string SVs.

2000-12-20 Thread Jarkko Hietaniemi

On Wed, Dec 20, 2000 at 03:06:06PM +, Nicholas Clark wrote:
> On Wed, Dec 20, 2000 at 09:00:47AM -0600, Jarkko Hietaniemi wrote:
> 
> > struct {
> > IV whatitis;
> 
> more a perl5 question - why IV not int?

> int might be smaller and "more natural" (your words)

That's K&R's words, not mine... and that's only an ideal, not always
the real truth.  E.g. in Digital UNIX a long of 64 bits is very
natural, an int (32 bits) is a nice backward compatibility concession.

> eg why does looks_like_number return IV not int? and various other bits
> of the perl API use IV?
> 
> Nicholas Clark

-- 
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen



Re: mixed numeric and string SVs.

2000-12-20 Thread Nicholas Clark

On Wed, Dec 20, 2000 at 04:03:39PM +, David Mitchell wrote:
> > >1. We no longer save conversions, so
> > >   $i="3"; $j+=$i for (...);
> > >does an aton() or similar each time round the loop
> > 
> > Well just the 1st time - then it is a number...
> 
> Err, option (1) was explicitly suggesting we *don't* save the result
> of the conversion, so aton() *would* have to be called each time.
> (I didn't think this was sensible, I was just suggesting it
> for completeness...)

I think Nick is suggesting that we convert it and lose the string
as a side effect. I may be wrong.
This would give you

$a="cheese";
printf "%d\n", $a;
print  "$a\n";

0
0

because the %d would trigger a conversion to integer which then replaces
the string. Not what is expected.
The only benefit this would bring is that both TomC and Ilya would agree
on something - that this is not desirable behaviour
(TomC because it's not backwards compatible, Ilya because you can alter
a scalar's value as a side effect of accessing it, so what a scalar
appears to contain becomes a function of its access history, not simply
and solely what you assigned to it)
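
The side effect can be shown with a short C sketch. The `Scalar` struct and both accessor names are hypothetical, and `strtol` stands in for the string-to-integer conversion; the point is only to contrast a read that replaces the string with one that leaves it alone.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical scalar holding only a string form. */
typedef struct {
    char str[32];
} Scalar;

/* Convert-and-replace: the numeric read overwrites the string. */
long get_int_destructive(Scalar *sv) {
    long n = strtol(sv->str, NULL, 10);   /* "cheese" has no digits, so n is 0 */
    snprintf(sv->str, sizeof sv->str, "%ld", n);
    return n;
}

/* Non-destructive read: the string is left as assigned. */
long get_int_preserving(const Scalar *sv) {
    return strtol(sv->str, NULL, 10);
}

/* Usage: mirrors the printf "%d" followed by print. */
void cheese_example(void) {
    Scalar a, b;
    strcpy(a.str, "cheese");
    strcpy(b.str, "cheese");

    assert(get_int_destructive(&a) == 0);
    assert(strcmp(a.str, "0") == 0);        /* the string is gone */

    assert(get_int_preserving(&b) == 0);
    assert(strcmp(b.str, "cheese") == 0);   /* the string survives */
}
```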

Nicholas Clark



Re: mixed numeric and string SVs.

2000-12-20 Thread David Mitchell

> > It seems to me the following options are possible:
> > 
> > 1. We no longer save conversions, so
> > $i="3"; $j+=$i for (...);
> > does an aton() or similar each time round the loop
> 
> I fear this would be a performance hit. I'm told TCL pre version 8 was
> like this - everything's a string and converted to a number each and every
> time the number is needed.

Yes, I fear the same!
>
> > 2. Each SV has 2 vtable pointers - one for its numeric representation
> > (if any), and one for its string representation (if any). Flexible, but
> > may require an extra 4/8 bytes per SV.
> 
> It may not be terrible. How big is the average SV already anyway?

True, but I've just realised a complication with my suggestion. If
there are multiple vtable ptrs per SV, which type 'owns' the SV carcass,
and is responsible for destruction, and has permission to put its
own stuff in the payload area etc? I think madness might this way lie.

So here's a modified suggestion. Rather than having 2 vtable ptrs per scalar,
we allow a string type to contain an optional pointer to another
subsidiary SV containing its numeric value. (And vice versa).

Then for example the getint() method for a utf8 string type might look like:

utf8_getint(SV *sv) {
    if (sv->subsidiary_numeric_sv == NULL) {
        sv->subsidiary_numeric_sv = Numeric->new(aton(sv->value));
    }
    return sv->subsidiary_numeric_sv->getint();
}

(utf8 stringy methods that alter the string value of the SV are then
responsible for either destroying the subsidiary numeric SV, or for making
sure its value gets updated, or for setting a flag warning that its
value needs recalculating.)

Similarly, the stringy methods for numeric types are wrappers that
optionally create a subsidiary string SV, then pass the call onto that
object.

Or to avoid the conditional each time, there could be 2 vtables for each
type, containing 'with subsidiary' and 'without subsidiary' methods;
the role of the latter being to create the subsidiary SV and update the
type of the main SV to the 'with subsidiary' type.
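
A rough C sketch of the two-vtable variant follows. It is a simplification under stated assumptions: a plain `long` field stands in for the subsidiary numeric SV, `strtol` stands in for aton(), and the names `utf8_with_subsidiary`, `utf8_without_subsidiary`, `set_string` and `new_utf8` are invented for the example.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct SV SV;

typedef struct {
    long (*getint)(SV *sv);
} VTable;

struct SV {
    const VTable *vtable;
    const char *value;   /* string payload */
    long cached_int;     /* stands in for the subsidiary numeric SV */
};

static int conversions = 0;   /* counts string-to-number conversions */

static long getint_with(SV *sv);
static long getint_without(SV *sv);

static const VTable utf8_with_subsidiary    = { getint_with };
static const VTable utf8_without_subsidiary = { getint_without };

/* 'with subsidiary': no test needed, the number is already there. */
static long getint_with(SV *sv) {
    return sv->cached_int;
}

/* 'without subsidiary': convert, store, then retype the SV so that
   later calls go straight to getint_with. */
static long getint_without(SV *sv) {
    assert(sv->value != NULL);
    conversions++;
    sv->cached_int = strtol(sv->value, NULL, 10);
    sv->vtable = &utf8_with_subsidiary;
    return sv->cached_int;
}

/* A string-altering method drops the cached number by switching back. */
void set_string(SV *sv, const char *s) {
    sv->value = s;
    sv->vtable = &utf8_without_subsidiary;
}

SV new_utf8(const char *s) {
    SV sv = { &utf8_without_subsidiary, s, 0 };
    return sv;
}
```

Calling `sv.vtable->getint(&sv)` twice converts only once, and `set_string` forces a fresh conversion on the next numeric read, all without a conditional in the common path.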




Re: Garbage collector slowness

2000-12-20 Thread Nick Ing-Simmons

Mark-Jason Dominus <[EMAIL PROTECTED]> writes:
>> "The new version must be better because our gazillion dollar marketing
>> campaign said so.  (We didn't really *fix* anything.)  
>
>The part I found interesting was the part about elimination of the message.

printing messages can be surprisingly slow - if they go to 
unbuffered stderr which is an X window of some kind they can end up 
waiting for an ACK from the X server, which may have to wait for 
blanking and a move of a mega-pixel or two to do a scroll.

>
>Perceived slowness is also important.



-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: mixed numeric and string SVs.

2000-12-20 Thread Nick Ing-Simmons

David Mitchell <[EMAIL PROTECTED]> writes:
>Has anyone given thought to how an SV can contain both a numeric value
>and string value in Perl6?
>Given the arbitrary number of numeric and string types that the vtable
>scheme of Perl6 supports, it will be unviable to have special types
>for all permutations (eg, utf8_nv, unicode32_iv, ascii_bitint, ad nauseam).
>
>It seems to me the following options are possible:
>
>1. We no longer save conversions, so
>   $i="3"; $j+=$i for (...);
>does an aton() or similar each time round the loop

Well just the 1st time - then it is a number...

>
>2. Each SV has 2 vtable pointers - one for its numeric representation
>(if any), and one for its string representation (if any). Flexible, but
>may require an extra 4/8 bytes per SV.

This is my favourite.

>
>3. We decree that all string to numeric conversions should return
>a particular numeric type (eg NV), and that all numeric to string
>conversions should similarly convert to a fixed string type (eg utf8).
>(Although I'm not sure that really helps.)

I can't see how that helps.

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: mixed numeric and string SVs.

2000-12-20 Thread David Mitchell

> >1. We no longer save conversions, so
> > $i="3"; $j+=$i for (...);
> >does an aton() or similar each time round the loop
> 
> Well just the 1st time - then it is a number...

Err, option (1) was explicitly suggesting we *don't* save the result
of the conversion, so aton() *would* have to be called each time.
(I didn't think this was sensible, I was just suggesting it
for completeness...)




Expressions and binding operator

2000-12-20 Thread Peter Scott

perlop:

>Binary ``=~'' binds a scalar expression to a pattern match. [...] The 
>right argument is a search pattern, substitution, or transliteration. [...]
>
>If the right argument is an expression rather than a search pattern, 
>substitution, or
>transliteration, it is interpreted as a search pattern at run time.

Should this second paragraph still be true for Perl 6?  I have at times 
wanted to do something of the form

perl -lwe '$x = "x"; $y = "y"; $y =~ ($x eq "x" ? s/y/z/ : s/y/a/); print $y'

but I have not wanted to make the right argument an expression to be 
interpreted as a search pattern (since I have qr//).

--
Peter Scott
Pacific Systems Design Technologies




Re: String representation

2000-12-20 Thread Nick Ing-Simmons

David Mitchell <[EMAIL PROTECTED]> writes:
>The problem is "what are the (types of) the arguments passed
>
>I don't really see why types of args are (in general) a problem.

Hmm, you may be right at the level of your example, which may indeed 
be typical of pp_(). Perhaps PerlIO is so bothersome because it is 
lower level. If all the args are SV * (or whatever perl6 calls it)
then there is no big deal.

>pp_concat() {
>   SV *sv1, *sv2;
>   sv1 = POP; sv2 = POP;
>   sv1->concat(sv2);
>}
>
>the_type_of_sv1_concat(SV *sv1, SV *sv2) {
>   if (sv1->vtable == sv2->vtable) {
>       // both args are of this type:
>       // dive into the internals and do an efficient concat
>       sv1 = ;
>   } else {
>       generic_concat(sv1,sv2);
>   }
>}

The two "snags" with that are (and they may not be important):
  1. One point of the vtable scheme was to avoid conditionals following 
 memory fetches.
  2. The else branch may be very common, so we just added a function 
 call and a test to the "normal" case.

The snag is that there are common pairs 
 e.g. concat(utf8,ascii) / concat(ascii,utf8) 
or   
 plus(NV,IV) / plus(IV,NV)

where it is possible to get "smart" when one arg is a "special case" of 
the other.

>> True, but the messy details would now occur multiple times,
>> as soon as substr_utf8 exists then _ALL_ the other string ops 
>> _must_ be overridden as well because nothing but string_utf8 "class" 
>> knows what is going on.
>
>perhaps I'm being dim, but I don't really follow this. At the minimum,
>someone writes a generic substr function that works with any string types.
>Perhaps it achieves this by first converting all its args to UNICODE-32.

So we presume that all string types MUST (in standards-ese) support
a ->toUNICODE32() method?

And similarly numbers must be convertible to "complex long double" or
whatever is the top of the built-in tree? (NV I guess - complex is
overkill.)

It is the "how do we do the generic case" that worries me.

>Not very efficient or desirable, but it gets you there.
>Then the implementor of the utf8 code writes a substr_utf8 function that only
>knows how to cope if all its args are utf8. If not, it just
>hands the call on to the generic sub. 

I see that now it is more flexible than what we "sort of have" to-date
which is like:


pp_concat() {
   SV *sv1, *sv2;
   sv1 = POP; sv2 = POP;
   if (sv1->vtable == sv2->vtable)
       sv1->concat(sv2);
   else
       generic_concat(sv1,sv2);
}

Or perhaps:

pp_concat() {
   SV *sv1, *sv2;
   sv1 = POP; sv2 = POP;
   if (sv1->knows_about(sv2->vtable))
       sv1->concat(sv2);
   else
       generic_concat(sv1,sv2);
}

Both of which either pre-judge what is allowed or put cost on the 
front of the simple case.


>> >In fact, I would argue that in general most if not all the operations
>> >currently performed by pp_* should have vtable equivalents, both for numeric
>> >and string types (including unary ops, mutators, binops etc etc).
>> 
>> Hmm - that is indeed a logical position.
>
>logical as in "consistent" or logical as in "sensible" ??? :-)

Consistent. Consistent is usually sensible though.

>
>I was under the impression that it was pretty much agreed for numeric
>types that each SV type would have its own set of binary ops (eg add, sub
>etc), so I wasn't aware I was proposing anything radical!

We have not solved (or I have not spotted) what you do 
when you get IV * NV. Something has to "know" to "upgrade" IV -> NV 
and call NV's '*' - which does not scale well. 


>I can't see why you get a code explosion. In perl5 you get the explosion -
>every part of perl needs to know about every SV type, and introducing a new
>type or subtype involves hacking in just about every nook and cranny within
>perl.
>If there was a bug in the + operator, it would be apparent fairly quickly
>where it lies (eg int+int and num+num gives right result,
>int+num goes wrong; therefore the Int->add[NUM]() function is suspect.)

All true.

The explosion is that you have 

  Int->add(Int)
  Int->add(Num)
  Int->add(Rational)
  Int->add(Monetary)
  Int->add(NumComplex)
  Int->add(IntComplex)
  Int->add(RationalComplex)
  Int->add(InternationalMonetary)
  Int->add(String); # containing any of above
  Int->add(OverloadedObject)

Or if Int doesn't do that then generic_add() has to.

etc. - in perl5 we have IV,(UV),NV so the problem is bounded.
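
One way to picture the trade-off is a two-dimensional dispatch table with a generic fallback, sketched below in C. All names (`TypeId`, `Value`, `add_table`, `generic_add` and so on) are hypothetical, only two types are modelled, and the fallback simply upgrades both operands to a double. A full table needs one entry per ordered pair of types, which is the explosion; leaving entries empty pushes the work into `generic_add`.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { T_INT, T_NUM, T_COUNT } TypeId;

typedef struct {
    TypeId type;
    union { long i; double n; } u;
} Value;

typedef Value (*AddFn)(Value a, Value b);

Value make_int(long i)   { Value v; v.type = T_INT; v.u.i = i; return v; }
Value make_num(double n) { Value v; v.type = T_NUM; v.u.n = n; return v; }

/* Upgrade either supported type to a double. */
static double as_num(Value v) {
    return v.type == T_INT ? (double)v.u.i : v.u.n;
}

/* Specialised case: both operands are Int, so stay in integers. */
static Value add_int_int(Value a, Value b) {
    return make_int(a.u.i + b.u.i);
}

/* Generic case: upgrade both sides and add as Num. */
static Value generic_add(Value a, Value b) {
    return make_num(as_num(a) + as_num(b));
}

/* T_COUNT * T_COUNT slots; unlisted (NULL) slots use generic_add. */
static const AddFn add_table[T_COUNT][T_COUNT] = {
    [T_INT][T_INT] = add_int_int,
};

Value add(Value a, Value b) {
    assert(a.type < T_COUNT && b.type < T_COUNT);
    AddFn fn = add_table[a.type][b.type];
    return fn ? fn(a, b) : generic_add(a, b);
}
```

Adding a new numeric type here means growing the table by a row and a column, or teaching `as_num` about it, so the cost lands somewhere either way.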

>> In other words - string ops on strings of uniform type, math ops on 
>> well understood hierarchies etc. are all easy enough - it is the 
>> combinations that get very messy very very quickly. 
>
>I couldn't agree more - however, I think that issue is mostly orthogonal
>to whether most pp_ functions should have vtable equivalents. If the
>functionality is built directly into pp_XXX, you still have a combinatorial
>mess to cope with - hiving off into vtables *may* reduce the mess, or
>*mi

Re: Expressions and binding operator

2000-12-20 Thread Nicholas Clark

On Wed, Dec 20, 2000 at 03:36:48PM -0800, Peter Scott wrote:
> Should this second paragraph still be true for Perl 6?  I have at times 
> wanted to do something of the form
> 
> perl -lwe '$x = "x"; $y = "y"; $y =~ ($x eq "x" ? s/y/z/ : s/y/a/); print $y'
> 
> but I have not wanted to make the right argument an expression to be 
> interpreted as a search pattern (since I have qr//).

I presume that you don't find

perl -lwe '$x = "x"; $y = "y"; $x eq "x" ? $y =~ s/y/z/ : $y =~ s/y/a/; print $y'

does what you need because you actually want to do something a lot more
complex than simple "$y =~" in your expression.
Or do I guess wrong?

Nicholas Clark



Re: Expressions and binding operator

2000-12-20 Thread Peter Scott

At 11:39 PM 12/20/00 +, Nicholas Clark wrote:
>On Wed, Dec 20, 2000 at 03:36:48PM -0800, Peter Scott wrote:
> > Should this second paragraph still be true for Perl 6?  I have at times
> > wanted to do something of the form
> >
> > perl -lwe '$x = "x"; $y = "y"; $y =~ ($x eq "x" ? s/y/z/ : s/y/a/); print $y'
> >
> > but I have not wanted to make the right argument an expression to be
> > interpreted as a search pattern (since I have qr//).
>
>I presume that you don't find
>
>perl -lwe '$x = "x"; $y = "y"; $x eq "x" ? $y =~ s/y/z/ : $y =~ s/y/a/; print $y'
>
>does what you need because you actually want to do something a lot more
>complex than simple "$y =~" in your expression.
>Or do I guess wrong?

Oh, that certainly works, and I wouldn't have any complaint on grounds of 
brevity alone unless I wanted an lvalue expression like ($x ? $y : $z) 
instead of $y.  But since this is a Perl 6 list, I'm making an inquiry on 
syntactical convenience grounds.  What I wanted to write *feels* Perlish.

--
Peter Scott
Pacific Systems Design Technologies