Re: [perl #33751] [PATCH] Use INTERP in *.pmc files
Bernhard Schmalhofer (via RT) wrote:
> I noticed that there is an interesting mix of 'interpreter' and 'INTERP'

Thanks, applied except dynclasses. I leave that part up for Sam - dunno if he got diffs there.

leo
Re: Dimension of slices; scalars versus 1-element arrays?
OK, so at issue is the difference between an element of an array ($p5[1]) and a slice that might contain only one element (@p5[1]), only generalised to n dimensions. (A problem which didn't exist in P5 because there were no higher dimensions!) And we don't want @B[4; 0..6] to reduce to a 1-D 6-array, because then dimensions would just be disappearing into some other... dimension. (To lose one dimension is a misfortune, to lose two is plane careless.) On the other hand, nobody wants to have to write @B[gone(4)] every time you need an array element.

Given $a=42 and @a=(42), what if @X[$a] returned a scalar and @X[@a] returned a slice? Similarly, @X[$a; 0..6] would return a 6-array, and @X[@a; 0..6] a 1x6-array -- scalar subscripts drop a given dimension and array subscripts keep it. (I think this is almost what Craig was suggesting, only without giving up ! for factorial. =))

The list-of-lists semicolon can still turn the 42 into [42], so that if you have a list-of-listy function that doesn't care whether you started with scalars or not, it doesn't have to know. But really the LoL semicolon would turn 42 into C<[42] but used_to_be_scalar>, so that something that *does* care (e.g. subscripting an array) can simply check for that property.

Using a scalar to get a scalar feels rather appropriate, and not too surprising, I think. Most people would probably expect @X[$a] to return a scalar, and use @X[@a] to mean a slice. (If @a happened to have only a single element, you're still probably using an array subscript to get possibly many elements -- otherwise why would you be using an array @a instead of just a plain scalar in the first place?)

Plus if you do want to force an array subscript instead of a scalar, or vice versa, you don't need any new keywords to do it: @X[@$a; 0..6] or @X[[42]; 0..6] (which is the same as @X[list 42; 0..6], right?
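The proposed rule -- a scalar subscript drops a dimension, an array subscript keeps it -- can be sketched in Python with plain lists (a hypothetical illustration, not Perl 6; the `subscript` helper is invented for the example, and the same convention is what NumPy-style fancy indexing uses):

```python
def subscript(array, index):
    """Scalar index -> element (dimension dropped);
    list index -> slice (dimension kept)."""
    if isinstance(index, list):
        return [array[i] for i in index]  # array subscript: keep the dimension
    return array[index]                   # scalar subscript: drop the dimension

X = [10, 20, 30, 40, 50]
a = 2     # $a = 2
aa = [2]  # @a = (2)

print(subscript(X, a))   # 30   -- scalar subscript, scalar result
print(subscript(X, aa))  # [30] -- one-element array subscript, 1-D slice
```

So a one-element slice and an element stay distinguishable by the shape of the subscript alone, without any new keyword.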
Which could also be written @X[*42; 0..6], which is kind of nice, because [*42] means "give me the 42nd slice" while [*] means "give me an unspecified slice, a slice of everything".)

Anyway, delving right into the can of wyrms: in P5 there were list, scalar, and void contexts (1-D, 0-D, and... uh... -1-D?), but now that we have real multidimensional arrays, we could have infinite contexts (ouch). Well, there must be some nice way to generalise that, but it raises a bunch of questions. (I can imagine "table context" being reasonably popular.)

Various functions in P5 left the dimension of their arg untouched (take a list, return a list), or dropped it down one (take a list, return a scalar). (Taking a scalar and returning a list is less common, but I can imagine a 2-D version of 'split' that turns a string into a table.) So in P6, should 'shift'ing an n-D array return a scalar or an array of n-1 dimensions? It depends on whether you see it as a way to criss-cross through an array one element at a time, or as a way to take one 'layer' off something. Both would be useful.

'grep' could return a list (1-D) of all matching individual elements, but perhaps more usefully it could preserve dimensionality:

    my @board is shape(8;8);
    # match alternating squares:
    @checkerboard = grep { ($_.index[0] + $_.index[1]) % 2 } @board;

...to end up with a ragged 2-D array containing only the usable half of our checkerboard. (I'm assuming we have something like .index to get the x, y co-ords of an element?)

'reverse' would presumably flip around the indices in all dimensions. Ah, the fun of coming up with new multidimensional variations on all the old (or new) favourites!

- David "a head-scratcher no matter how you slice it" Green
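The dimension-preserving grep has a rough Python analogue (hypothetical sketch; `board` and `checkerboard` mirror the names in the Perl 6 example, and coordinates stand in for the proposed .index method): filter a 2-D board but keep the row structure, producing a ragged 2-D array of only the matching squares.

```python
# An 8x8 board whose elements know their own (row, col) coordinates.
board = [[(r, c) for c in range(8)] for r in range(8)]

# Keep the alternating squares: coordinate sum is odd.  The outer list
# comprehension preserves the row dimension; only elements are dropped.
checkerboard = [[sq for sq in row if (sq[0] + sq[1]) % 2] for row in board]

print(len(checkerboard))     # still 8 rows -- dimensionality preserved
print(len(checkerboard[0]))  # but each row now holds only 4 squares
```

A flat, 1-D grep would instead return all 32 matching squares in one list and lose the board's shape entirely.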
Re: Dimension of slices; scalars versus 1-element arrays?
In article <[EMAIL PROTECTED]>, [EMAIL PROTECTED] (David Green) wrote:

> I can imagine "table context" being reasonably popular.
[...]
> (Taking a scalar and returning a list is less common, but I can
> imagine a 2-D version of 'split' that turns a string into a table)

One way to generalise it might be to allow an array (ref) for the thing to split on. Each element of the array could specify the splitter for the corresponding dimension:

    @table_2D = split [/<tr>/i, /<td>/i], $html_table;

I guess forcing "table context" on a list would effectively turn it from a 1-D n-array into a 2-D 1xn-array. Scalar context on a 2-D table should return some sort of count (analogous with a list in scalar context), but maybe not the number of elements in the table; I think the number of records would typically be more useful. And list context on a table... it might return a list of array [refs], each containing a record -- in other words, convert the table into a P5-style nested data structure that simulates a true 2-D array. On the other hand, maybe list context simply returns a single plain list consisting of the table "headers".

Actually, if we have "headings", that's very handy for DB modules, but we've gone beyond a plain array in two dimensions. A table with named fields would really be more of a 2-D hash...

    my %rec is shape(Int; );
    %rec<0;foo> = "Silence is";
    %rec<1> = ;    # assign whole record at once(?)

Except those Int keys are effectively used as strings that just happen to look like ints, right? That is, I'm not getting all the arrayary goodness (like pushing or popping or ordering). What I really want here is a hybrid hash-array. I suspect that there's no way to do that though (other than creating my own class and overloading array stuff to handle it for the numeric key(s)).

- David "2-D or not 2-D" Green
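The one-splitter-per-dimension idea can be sketched in Python (hypothetical illustration; `split_2d` is an invented helper, and the `<tr>`/`<td>` patterns are toy row/cell separators, not a real HTML parser): the outermost splitter produces rows, the next one splits each row into cells.

```python
import re

def split_2d(splitters, text):
    """Split `text` into a 2-D table: one regex splitter per dimension,
    applied outermost first."""
    row_re, cell_re = splitters
    rows = [r for r in re.split(row_re, text, flags=re.I) if r.strip()]
    return [[c for c in re.split(cell_re, r, flags=re.I) if c.strip()]
            for r in rows]

table = split_2d([r"<tr>", r"<td>"], "<tr><td>a<td>b<tr><td>c<td>d")
print(table)  # [['a', 'b'], ['c', 'd']]
```

An n-dimensional version would just recurse: split on the first pattern, then apply the remaining splitters to each piece.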
taint mode generalization
Hola...

I think taint mode should be made reusable somehow, by implementing it in terms of contagious attribution... For example:

    my $string : secret = "password";  # the "secret" attr is contagious,
                                       # and causes memory to be overwritten
                                       # before being returned to the OS
    $foo = substr($string, $x, $y);    # $foo is also secret
    system("echo", $foo);              # fatal - secret data doesn't want
                                       # to be shared

The role determines how it doesn't want to be used.

Another idea is to enforce separation of data sets, a bit like traditional tainting: data from user A is not allowed to interact with data from user B. Anything A's input touches is now exclusively owned by A, and cannot touch anything that is owned by B.

Perhaps a sane way to do this is to make certain roles say they are contagious, and have them attach themselves in the same way that the taint bit does, to affected strings, or members of the same expression, or whatever.

This could also be useful in debugging. I for one would like to say

    my $var : lexical_data = "blah";

and have data derived from "blah" not be allowed to be used (or even to exist) outside the lexical scope it was created in.

I think a flexible notion of what is disallowed to certain roles is also useful. For example, say I have a hash of sensitive data; I don't ever want tainted data to be usable as keys/values. Perhaps this intolerance of tainting is better defined in a contagious role:

    my %hash : pure;  # doesn't like data which is tainted

and perhaps it is better defined by facilities the tainted role provides.

Lastly, maybe more thoughtful people can think up what vaccinating against contagious roles will look like... Like, say we have a filehandle that we allow writing secret data to.

--
 ()  Yuval Kogman <[EMAIL PROTECTED]> 0xEBD27418  perl hacker &
 /\  kung foo master: /me climbs a brick wall with his fingers: neeyah!
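The contagion mechanics can be sketched in Python (a hypothetical illustration, not a proposal for the Perl 6 implementation; the `Secret` class and the `system` stand-in are invented for the example): derived values inherit the flag, and a sink that shouldn't see secret data refuses it.

```python
class Secret(str):
    """A str whose derived values stay secret -- the 'contagious' part."""
    def __getitem__(self, key):
        return Secret(str.__getitem__(self, key))  # substr-like derivation
    def __add__(self, other):
        return Secret(str.__add__(self, other))    # concatenation

def system(*args):
    """Stand-in for an external-command sink that rejects secret data."""
    if any(isinstance(a, Secret) for a in args):
        raise RuntimeError("secret data doesn't want to be shared")
    # a real implementation would exec the command here

pw = Secret("password")
part = pw[0:4]                    # like $foo = substr($string, ...)
print(isinstance(part, Secret))   # True -- the attribute spread
```

A real implementation would attach the flag at the VM level (the way the taint bit does) rather than via subclassing, so it would also survive operations the wrapper doesn't override.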
Scope exit and timely destruction
Given is a Perl snippet like:

    {
        my $fh = IO::File->new;
        $fh->open(">test.tmp");
        print $fh "a";
    }

The filehandle is closed automatically at scope exit and the file contains the expected contents. That's quite easy in the current Perl implementation, as it does reference counting: at the closing bracket the last reference to C<$fh> has gone and the file gets closed during object destruction.

As Parrot isn't using reference counting for GC, we have to run a GC cycle to detect that the IO object is actually dead. This is accomplished by the "sweep 0" opcode, which does nothing if no object needs timely destruction. The above example could roughly translate to Parrot code like:

    new_pad -1                  # new lexical scope
    $P0 = new ParrotIO
    store_lex -1, "$fh", $P0
    open $P0, "test.tmp", ">"
    print $P0, "a"
    pop_pad                     # get rid of $fh lexical
    sweep 0                     # scope exit sequence

Alternatively a scope exit handler could be pushed onto the control stack. With such a sequence we've got basically two problems:

1) Correctness of "sweep 0"

At scope exit the ParrotIO object still exists somewhere in the PMC registers. This means that during marking of the root set, the ParrotIO object is found to be alive, when it actually isn't. The usage of registers keeps the object alive until the register frame is no longer referenced, that is, after the function has been left.

2) Performance

The mark & sweep collector's performance is basically proportional to the number of all allocated objects (all live objects are marked, and live+dead objects are visited during the sweep). So imagine the Perl example is preceded by:

    my @array;
    $array[$_] = $_ for (1..10);
    {
        ...

The GC cycle at scope exit now has to run through all objects to eventually find the ParrotIO object being dead (given that 1) is solved). Performance would really suck for all non-trivial programs, i.e. for all programs with some considerable amount of live data.

During the discussion WRT continuations I've already proposed a scheme that would solve 1) too.
While 2) is "just" a performance problem, addressing it early can't harm, as a solution might impact the overall interpreter layout. But before going further into that issue, I'd like to hear some other opinions.

leo
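The cost structure behind problem 2) can be shown with a toy mark & sweep in Python (a deliberately simplified sketch, not Parrot's collector; `Obj`, `heap`, and `roots` are invented names): the sweep phase touches every allocated object, so a "sweep 0" at scope exit pays for all the live data just to find one dead filehandle.

```python
class Obj:
    def __init__(self, name):
        self.name, self.marked = name, False

def mark_and_sweep(heap, roots):
    """Mark everything reachable from roots, then sweep the whole heap.
    Returns (dead objects, number of objects visited in the sweep)."""
    for obj in roots:        # mark: proportional to live objects
        obj.marked = True
    dead, visited = [], 0
    for obj in heap:         # sweep: proportional to ALL allocated objects
        visited += 1
        if not obj.marked:
            dead.append(obj)
        obj.marked = False   # reset for the next cycle
    return dead, visited

heap = [Obj(i) for i in range(10_000)]  # lots of live data (the @array case)
fh = Obj("ParrotIO")                    # the one object that just died
heap.append(fh)

dead, visited = mark_and_sweep(heap, heap[:-1])  # everything but $fh is rooted
print(len(dead), visited)  # 1 dead object found, 10001 objects visited
```

One dead object costs a walk over the entire heap, which is why a per-scope "sweep 0" gets more expensive the more live data the program carries.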
How to check external library dependencies in 'dynclasses'?
Hi,

I'm working on a dynamic PMC that ties into the GNU dbm library, http://www.ugcs.caltech.edu/info/gdbm/gdbm_toc.html. The implementation is fairly straightforward, and simple test cases are already working.

However, I'm not sure about how to check for the availability of 'libgdbm.so'. If 'libgdbm.so' is not there, I do not want to compile, and I want to skip the tests. I could check for 'libgdbm.so' in the core Parrot configure step, but this is not nice, as core Parrot shouldn't need to know about funky extensions.

So the real question is whether something like 'h2xs', Module::Build, 'extrb' is planned for Parrot.

CU, Bernhard

--
** Dipl.-Physiker Bernhard Schmalhofer
Senior Developer, Biomax Informatics AG
Lochhamer Str. 11, 82152 Martinsried, Germany
Tel: +49 89 895574-839 Fax: +49 89 895574-825
eMail: [EMAIL PROTECTED] Website: www.biomax.com **
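One possible shape for such a probe (a sketch only, not Parrot's actual configure mechanism; the `have_lib` helper is invented): ask the dynamic linker whether the library can be located at all, and let the build decide from that.

```python
from ctypes.util import find_library

def have_lib(name):
    """True if the shared library (e.g. libgdbm.so) can be located."""
    return find_library(name) is not None

# Build-time decision for an optional extension:
if have_lib("gdbm"):
    print("libgdbm found: building the dbm PMC")
else:
    print("libgdbm not found: skipping build and tests")
```

A Configure-style probe in C would instead try to compile and link a one-line program against -lgdbm, which additionally verifies that the development headers are installed, not just the runtime library.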
Re: [perl #33751] [PATCH] Use INTERP in *.pmc files
Leopold Toetsch wrote:
> Bernhard Schmalhofer (via RT) wrote:
>> I noticed that there is an interesting mix of 'interpreter' and 'INTERP'
> Thanks, applied except dynclasses. I leave that part up for Sam - dunno
> if he got diffs there.

Applied.

- Sam Ruby
CIA
Parrot is now listed on CIA: http://cia.navi.cx/stats/project/parrot

This will track all future commits, making them available as an RSS feed, etc. It slows down commits a little because of some work the script does to try and merge requests. I can tweak it a little if people start noticing a delay.

-R

(insert-at-point (spook))