Re: [perl #33751] [PATCH] Use INTERP in *.pmc files

2005-01-12 Thread Leopold Toetsch
Bernhard Schmalhofer (via RT) wrote:
I noticed that there is an interesting mix of 'interpreter' and 'INTERP'
Thanks, applied except dynclasses. I leave that part up for Sam - dunno 
if he got diffs there.

leo


Re: Dimension of slices; scalars versus 1-element arrays?

2005-01-12 Thread David Green
OK, so at issue is the difference between an element of an array ($p5[1]) 
and a slice (that might contain only one element, @p5[1]), only 
generalised to n dimensions.  (A problem which didn't exist in P5 because 
there were no higher dimensions!)  

And we don't want @B[4; 0..6] to reduce to a 1-D 6-array because then 
dimensions would just be disappearing into some other...dimension.  
(To lose one dimension is a misfortune, to lose two is plane careless)  
On the other hand, nobody wants to have to write @B[gone(4)] every time 
they need an array element.

Given $a=42 and @a=(42), what if @X[$a] returned a scalar and @X[@a] 
returned a slice?  Similarly, @X[$a; 0..6] would return a 6-array, and 
@X[@a; 0..6] a 1x6-array -- scalar subscripts drop a given dimension and 
array subscripts keep it.  (I think this is almost what Craig was 
suggesting, only without giving up ! for factorial. =))

The list-of-lists semicolon can still turn the 42 into [42], so that if 
you have a list-of-listy function that doesn't care whether you started 
with scalars or not, it doesn't have to know.  But really the LoL 
semicolon would turn 42 into C<[42] but used_to_be_scalar>, so that 
something that *does* care (e.g. subscripting an array) can simply check 
for that property.

Using a scalar to get a scalar feels rather appropriate, and not too 
surprising, I think.  Most people would probably expect @X[$a] to return 
a scalar, and use @X[@a] to mean a slice.  (If @a happens to have only a 
single element, you're still probably using an array subscript to get 
possibly many elements -- otherwise why would you be using an array @a 
instead of just a plain scalar in the first place?)

Plus if you do want to force an array subscript instead of a scalar, or 
vice versa, you don't need any new keywords to do it: @X[@a; 0..6] 
or 
@X[[42]; 0..6] (which is the same as @X[list 42; 0..6], right?  Which 
could also be written @X[*42; 0..6], which is kind of nice, because [*42] 
means "give me the 42nd slice" while [*] means "give me an unspecified 
slice, a slice of everything".)
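For what it's worth, the scalar-drops/array-keeps rule is easy to sketch in Python with plain nested lists (the `subscript` helper is made up purely for illustration):

```python
# Sketch of the proposed rule: a scalar subscript drops a dimension,
# an array subscript keeps it. Plain nested lists stand in for a real
# multidimensional array; `subscript` is a made-up helper.

def subscript(table, index):
    if isinstance(index, list):
        # array subscript: keep the dimension, even for one element
        return [table[i] for i in index]
    # scalar subscript: drop the dimension
    return table[index]

B = [[r * 10 + c for c in range(7)] for r in range(5)]

row = subscript(B, 4)     # like @X[$a]: 1-D, 7 elements
kept = subscript(B, [4])  # like @X[@a]: still 2-D, shape 1x7
```

That is, a one-element array subscript gives you a 1xN slice rather than silently collapsing to the row itself.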


Anyway, delving right into the can of wyrms, in P5 there were list, 
scalar, and void contexts (1-D, 0-D, and... uh... -1-D?), but now that we 
have real multidimensional arrays, we could have infinite contexts (ouch).  
Well, there must be some nice way to generalise that, but it raises a 
bunch of questions.  (I can imagine "table context" being reasonably 
popular.)

Various functions in P5 left the dimension of their arg untouched (take a 
list, return a list), or dropped it down one (take a list, return a 
scalar).  (Taking a scalar and returning a list is less common, but I can 
imagine a 2-D version of 'split' that turns a string into a table.)

So in p6, should 'shift'ing an n-D array return a scalar or an array of 
n-1 dimensions?  It depends on whether you see it as a way to criss-cross 
through an array one element at a time, or as a way to take one 'layer' 
off something.  Both would be useful.

'grep' could return a list (1-D) of all matching individual elements, but 
perhaps more usefully it could preserve dimensionality:
my @board is shape(8;8);
# match alternating squares:
my @checkerboard = grep { ($_.index[0] + $_.index[1]) % 2 } @board;

...to end up with a ragged 2-D array containing only the usable half of 
our checkerboard.  (I'm assuming we have something like .index to get the 
x, y co-ords of an element?)
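A shape-preserving grep of this sort can be sketched in Python (the `grep2d` helper, and the (row, col) tuples standing in for the assumed .index method, are inventions for illustration):

```python
# Hypothetical "shape-preserving grep": filter a 2-D board but keep it
# 2-D (possibly ragged). The predicate sees the element's indices,
# standing in for the assumed .index method.

def grep2d(pred, table):
    return [[v for c, v in enumerate(row) if pred(r, c, v)]
            for r, row in enumerate(table)]

# an 8x8 board whose elements are their own coordinates
board = [[(r, c) for c in range(8)] for r in range(8)]

# match alternating squares, as in the Perl sketch
checker = grep2d(lambda r, c, v: (r + c) % 2, board)
# each row keeps 4 of its 8 squares, so the result is still 8 rows
```

The point is that the result keeps its rows even though each row shrinks, i.e. the dimensionality survives the filtering.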

'reverse' would presumably flip around the indices in all dimensions.  Ah, 
the fun of coming up with new multidimensional variations on all the old 
(or new) favourites! 


 - David "a head-scratcher no matter how you slice it" Green


Re: Dimension of slices; scalars versus 1-element arrays?

2005-01-12 Thread David Green
In article <[EMAIL PROTECTED]>, 
[EMAIL PROTECTED] (David Green) wrote:

>I can imagine "table context" being reasonably popular. [...] 
>(Taking a scalar and returning a list is less common, but I can 
>imagine a 2-D version of 'split' that turns a string into a table)

One way to generalise it might be to allow an array (ref) for the thing 
to split on.  Each element of the array could specify the splitter for 
the corresponding dimension:

@table_2D = split [/<tr>/i, /<td>/i], $html_table;
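In Python terms the idea might look like this (a sketch only; the separators and the `split2d` helper are made up):

```python
import re

# Sketch of split taking one separator per dimension: split into rows
# first, then split each row into cells. Empty fields produced by a
# separator at the start of the string are dropped.

def split2d(row_sep, cell_sep, text):
    rows = [r for r in re.split(row_sep, text) if r]
    return [[c for c in re.split(cell_sep, row) if c] for row in rows]

table = split2d(r'<tr>', r'<td>', '<tr><td>a<td>b<tr><td>c<td>d')
# table is now [['a', 'b'], ['c', 'd']]
```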

I guess forcing "table context" on a list would effectively turn it from 
a 1-D n-array into a 2-D 1xn-array.  

Scalar context on a 2-D table should return some sort of count 
(analogous with a list in scalar context), but maybe not the number of 
elements in the table.  I think the number of records would typically be 
more useful. 

And list context on a table... it might return a list of array [refs], 
each containing a record -- in other words, convert the table into a 
p5-style nested data structure that simulates a true 2-D array.  On the 
other hand, maybe list context simply returns a single plain list 
consisting of the table "headers".

Actually, if we have "headings", that's very handy for DB modules, but 
we've gone beyond a plain array in two dimensions.  A table with named 
fields would really be more of a 2-D hash... 
 my %rec is shape(Int; );
 %rec<0;foo>="Silence is";
 %rec<1>=;  #assign whole record at once(?)

Except those Int keys are effectively used as strings that just happen 
to look like ints, right?  That is, I'm not getting all the arrayary 
goodness (like pushing or popping or ordering).  What I really want here 
is a hybrid hash-array.  I suspect that there's no way to do that though 
(other than creating my own class and overloading array stuff to handle 
it for the numeric key(s)).
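Such a hybrid is straightforward to sketch as a class in Python (all names made up; numeric subscripts are treated as positions, string subscripts as keys):

```python
# A made-up hash-array hybrid: hash-style lookup by key, plus
# array-style ordering, positional access, and pop.

class HashArray:
    def __init__(self):
        self._keys = []   # array-ish: remembers insertion order
        self._data = {}   # hash-ish: key -> value

    def __setitem__(self, key, value):
        if key not in self._data:
            self._keys.append(key)
        self._data[key] = value

    def __getitem__(self, key):
        if isinstance(key, int):          # numeric: by position
            return self._data[self._keys[key]]
        return self._data[key]            # string: by name

    def pop(self):
        # array-style pop: remove and return the newest record
        return self._data.pop(self._keys.pop())

rec = HashArray()
rec['foo'] = 'Silence is'
rec['bar'] = 'golden'
first = rec[0]      # positional access
last = rec.pop()    # array goodness
```

One obvious wrinkle: genuinely numeric hash keys would collide with positional access here, which is roughly the int-keys-that-are-really-strings ambiguity described above.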



- David "2-D or not 2-D" Green


taint mode generalization

2005-01-12 Thread Yuval Kogman
Hola...

I think taint mode should be made reusable somehow, by implementing
it in terms of contagious attribution... For example:

my $string : secret = "password"; # the "secret" attr is
# contagious, and causes memory to be overwritten before being
# returned to the OS

$foo = substr($string, $x, $y); # $foo is also secret

system("echo", $foo); # fatal - secret data doesn't want to
# be shared. The role determines how it doesn't want to be used

Another idea is to enforce separation of data sets, a bit like
traditional tainting: data from user A is not allowed to interact
with data from user B. Anything A's input touches is now exclusively
owned by A, and cannot touch anything that is owned by B.

Perhaps a sane way to do this is to make certain roles say they are
contagious, and have them attach themselves in the same way that the
taint bit does, to affected strings, or members of the same
expression, or whatever.
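As a toy model of that attach-and-spread behaviour, here is a Python sketch (the `Secret` class and `system_echo` sink are invented for illustration):

```python
# A Secret string whose derived values stay Secret, and which a sink
# function refuses to accept -- a toy model of a contagious role.

class Secret(str):
    def __getitem__(self, i):
        # a substring of a secret is still secret
        return Secret(str.__getitem__(self, i))

    def __add__(self, other):
        # concatenation infects the result
        return Secret(str.__add__(self, other))

def system_echo(arg):
    if isinstance(arg, Secret):
        raise RuntimeError("secret data doesn't want to be shared")
    return arg

pw = Secret("password")
foo = pw[0:4]          # like substr($string, $x, $y): still Secret
# system_echo(foo)     # would die: the role forbids this use
```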

This could also be useful in debugging. I for one would like to
say

my $var : lexical_data = "blah";

and have data derived from "blah" not be allowed to be used (or even
to exist) outside the lexical scope it was created in.

I think a flexible notation for what is disallowed to certain roles
is also useful. For example, say I have a hash of sensitive data; I
don't ever want tainted data to be usable as its keys/values.

Perhaps this intolerance of tainting is better defined in a
contagious role:

my %hash : pure; # doesn't like data which is tainted

and perhaps it is better defined by facilities the tainted role
provides.

Lastly, maybe more thoughtful people can think up what vaccinating
against contagious roles will look like... Like, say we have a
filehandle that we allow writing secret data to.

-- 
 ()  Yuval Kogman <[EMAIL PROTECTED]> 0xEBD27418  perl hacker &
 /\  kung foo master: /me climbs a brick wall with his fingers: neeyah!





Scope exit and timely destruction

2005-01-12 Thread Leopold Toetsch
Given is a Perl snippet like:
  {
my $fh = IO::File->new;
$fh->open(">test.tmp");
print $fh "a";
  }
The filehandle is closed automatically at scope exit and the file
contains the expected contents.
That's quite easy in the current Perl implementation as it does
reference counting. At the closing bracket the last reference to
C<$fh> has gone and the file gets closed during object destruction.
As parrot isn't using reference counting for GC, we have to run a GC
cycle to detect that the IO object is actually dead. This is
accomplished by the "sweep 0" opcode, which does nothing if no object
needs timely destruction.
The above example could roughly translate to Parrot code like:
  new_pad -1  # new lexical scope
  $P0 = new ParrotIO
  store_lex -1, "$fh", $P0
  open $P0, "test.tmp", ">"
  print $P0, "a"
  pop_pad # get rid of $fh lexical
  sweep 0 # scope exit sequence
Alternatively a scope exithandler could be pushed onto the control
stack.
With such a sequence we have basically two problems:
1) Correctness of "sweep 0"
At scope exit, the ParrotIO object still exists somewhere in the PMC
registers. This means that during marking of the root set, the
ParrotIO object is found to be alive, when it actually isn't. The
usage of registers keeps the object alive until the register frame is
no longer referenced, that is, after the function has been left.
2) Performance
The mark & sweep collector's cost is basically proportional to the
total number of allocated objects (all live objects are marked, and
live + dead objects are visited during the sweep). So imagine the
Perl example is preceded by:
  my @array;
  $array[$_] = $_ for (1..10);
  {
 ...
The GC cycle at scope exit now has to run through all objects to
eventually find that the ParrotIO object is dead (given that 1) is
solved). Performance would really suck for all non-trivial programs,
i.e. for all programs with a considerable amount of live data.
During the discussion WRT continuations I've already proposed a scheme
that would solve 1) too.
While 2) is "just" a performance problem, addressing it early can't
hurt, as a solution might impact the overall interpreter layout. But
before going more into that issue, I'd like to hear some other
opinions.
leo



How to check external library dependencies in 'dynclasses'?

2005-01-12 Thread Bernhard Schmalhofer
Hi,
I'm working on a dynamic PMC that ties into the GNU dbm library, 
http://www.ugcs.caltech.edu/info/gdbm/gdbm_toc.html.

The implementation is fairly straightforward, and simple test cases are 
already working. However, I'm not sure how to check for the 
availability of 'libgdbm.so'. If 'libgdbm.so' is not there, I don't want 
to compile the PMC, and I want to skip the tests.
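One possible probe (a sketch in Python via ctypes; Parrot's build tooling would presumably do the equivalent in its own configure step):

```python
from ctypes.util import find_library

def have_gdbm():
    # True if the system linker can locate the GNU dbm shared library
    return find_library('gdbm') is not None

if not have_gdbm():
    print("libgdbm not found -- skipping the gdbm PMC and its tests")
```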

I could check for 'libgdbm.so' in the core Parrot configure step. But 
this is not nice, as core Parrot shouldn't need to know about funky 
extensions.

So the real question is whether something like 'h2xs', Module::Build, 
or 'extrb' is planned for Parrot.

CU, Bernhard
--
**
Dipl.-Physiker Bernhard Schmalhofer
Senior Developer
Biomax Informatics AG
Lochhamer Str. 11
82152 Martinsried, Germany
Tel: +49 89 895574-839
Fax: +49 89 895574-825
eMail: [EMAIL PROTECTED]
Website: www.biomax.com
**


Re: [perl #33751] [PATCH] Use INTERP in *.pmc files

2005-01-12 Thread Sam Ruby
Leopold Toetsch wrote:
Bernhard Schmalhofer (via RT) wrote:
I noticed that there is an interesting mix of 'interpreter' and 'INTERP'
Thanks, applied except dynclasses. I leave that part up for Sam - dunno 
if he got diffs there.
Applied.
- Sam Ruby


CIA

2005-01-12 Thread Robert Spier

Parrot is now listed on CIA

http://cia.navi.cx/stats/project/parrot

This will track all future commits, making them available as an RSS
feed, etc.

This slows down commits a little because of some work the script does
to try and merge requests.  I can tweak a little if people start
noticing a delay.

-R
(insert-at-point (spook))