Parrot Smoke Oct 31 20:00:02 2001 UTC hpux 11.00

2001-11-01 Thread H.M. Brand

Automated smoke report for patch Oct 31 20:00:02 2001 UTC
  v0.02 on hpux using cc version B.11.11.02
O = OK
F = Failure(s), extended report at the bottom
? = still running or test results not (yet) available
Build failures during:   - = unknown
c = Configure, m = make, t = make test-prep

 Configuration
---  
O O  
O O  nv=double
O O  iv=int
O O  iv=int --define nv=double
O O  iv=long
O O  iv=long --define nv=double
| |
| +- --debugging
+--- normal



Re: java vs. parrot mops

2001-11-01 Thread Simon Cozens

On Thu, Nov 01, 2001 at 10:31:07AM -0500, Dan Sugalski wrote:
> So it looks like about a 2.5 speedup with computed goto. Cool.

Looks really good to me, too. Where's the patch? This should probably
go in as an alternate runops core.

-- 
The problem with big-fish-little-pond situations is that you
have to put up with all these fscking minnows everywhere.
-- Rich Lafferty



Re: java vs. parrot mops

2001-11-01 Thread Dan Sugalski

At 03:34 PM 11/1/2001 +, Leon Brocard wrote:
>Dan Sugalski sent the following bits through the ether:
>
> > For the Java-impaired (i.e. me :) what's the -Xint option do?
>
>It turns off the JIT (which is enabled by default).

Ah, thanks. Much as I hate it (because the numbers are so lousy) I think we 
should have a JIT number for benchmark runs too. A pbc2c count as well, so 
we don't have to feel *too* bad. I'd like .Net/Mono numbers too, if we can 
manage.

Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk




Re: String rationale

2001-11-01 Thread Simon Cozens

On Sat, Oct 27, 2001 at 04:23:48PM +0100, Tom Hughes wrote:
> The encoding_lookup() and chartype_lookup() routines will obviously
> need to load the relevant libraries on the fly when we have support
> for that.

Could you try rewriting them using an enum, like the vtable stuff and
the original string encoding stuff does?

-- 
An algorithm must be seen to be believed.
-- D.E. Knuth



Re: [PATCHES] Simple memory implementation

2001-11-01 Thread Dan Sugalski

At 01:12 AM 11/1/2001 -0500, Jeff wrote:
>Added features:
>
>1) 'alloci(i,i|ic)' - Allocates $2 integers, sets $1 to a "reference" to
>the new arena
>2) 'freei(i)' - Frees the arena referenced by $1
>3) 'savei(i,i,i|ic)' - Saves the integer register at $1 into arena $2,
>at index $3
>4) 'loadi(i,i,i|ic)' - Loads the integer from arena $2, index $3 into
>register $1

Interesting. We're going to end up doing it differently, though, I think. 
Less visible to the program running on the interpreter generally.

Keen anyway. And it even had tests! :)

Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk




Re: java vs. parrot mops

2001-11-01 Thread Daniel Grunblatt

OK then, here is the patch. Of course I don't expect this to be committed,
since it's crap, but if you test it (please do) and it's OK for everyone
I will rewrite it more efficiently.

*PLEASE* test it and give me some feedback.

Thanks in advance.

On Thu, 1 Nov 2001, Simon Cozens wrote:

> On Thu, Nov 01, 2001 at 10:31:07AM -0500, Dan Sugalski wrote:
> > So it looks like about a 2.5 speedup with computed goto. Cool.
>
> Looks really good to me, too. Where's the patch? This should probably
> go in as an alternate runops core.
>
> --
> The problem with big-fish-little-pond situations is that you
> have to put up with all these fscking minnows everywhere.
> -- Rich Lafferty
>


39c39
< while (pc) { DO_OP(pc, interpreter); }
---
> just_do_it(interpreter,pc);


107a108,109
> my @op_labels;
> my @op_addr;
116a119
> my $label  = "PC_$index:\n";
117a121
> my $source_goto= $op->source(\&map_ret_abs_goto, \&map_ret_rel_goto, \&map_arg, \&map_res_abs, \&map_res_rel);
118a123
> push @op_addr, "&&PC_$index,\n";
121a127
> push @op_labels, "$label {\n$source_goto}\n\n";
129c135,136
< print SOURCE <<END_C;
---
> if ($ARGV[0] eq "core.ops") {
> print SOURCE < int just_do_it(struct Parrot_Interp *, opcode_t *);
137a146,150
> int
> just_do_it(struct Parrot_Interp *interpreter, opcode_t * cur_opcode)
> {
> 
> static void *ops_l[] = {
140c153,157
< print SOURCE @op_funcs;
---
> print SOURCE @op_addr;
> 
> print SOURCE <<END_C;
> };
> goto *ops_l[*cur_opcode];
141a159,171
> END_C
> 
> print SOURCE @op_labels;
> 
> print SOURCE "}\n";
> } else {
> print SOURCE <<END_C;
>   NULL
> };
> END_C
> }
> 
> print SOURCE @op_funcs;
188a219,242
> 
> #
> # map_ret_abs_goto()
> #
> 
> sub map_ret_abs_goto
> {
>   my ($addr) = @_;
>   if ($addr eq '0') {
>   return "return (0);"
>   } else { 
>   return "goto *ops_l[*(cur_opcode = $addr)]";
>   }
> }
> 
> #
> # map_ret_rel_goto()
> #
> 
> sub map_ret_rel_goto
> {
>   my ($offset) = @_;
>   return "goto *ops_l[*(cur_opcode += $offset)]";
> }
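
For anyone who wants to see the shape of the generated code without running
the build, here is a minimal hand-written sketch of table-driven computed
goto dispatch, the same idea as the ops_l[] table in the patch above. The
opcode numbers, the function name run_with_label_table and the tiny
instruction set are made up for illustration and are not what ops2c emits.
It assumes GCC or Clang, since taking a label's address with && is a GNU C
extension.

```c
/* Hypothetical opcode numbers, for this sketch only. */
enum { OP_END = 0, OP_INC = 1, OP_ADD = 2 };

/* Runs a tiny program and returns the accumulator.
 * Needs GCC or Clang: &&label and goto *expr are GNU C extensions. */
long run_with_label_table(const long *cur_opcode)
{
    /* One label address per opcode number, like ops_l[] in the patch. */
    static void *ops_l[] = { &&PC_0, &&PC_1, &&PC_2 };
    long acc = 0;

    goto *ops_l[*cur_opcode];

PC_0:   /* end: leave the run loop */
    return acc;

PC_1:   /* inc: no operands, one word wide */
    acc += 1;
    cur_opcode += 1;
    goto *ops_l[*cur_opcode];

PC_2:   /* add <constant>: opcode plus one operand word */
    acc += cur_opcode[1];
    cur_opcode += 2;
    goto *ops_l[*cur_opcode];
}
```

Each op ends with its own indirect jump, so there is no central loop to
return to between ops.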



Re: java vs. parrot mops

2001-11-01 Thread Daniel Grunblatt

Look, now I'm rewriting a patch for pbc2c.pl which will use computed goto
*ONLY* in a few ops like jump or ret (or any other that modifies the
program flow); in all other cases it will stay with the current goto model,
which is, in my opinion, the fastest.

# ./mops
Iterations:    100000000
Estimated ops: 300000000
Elapsed time:  2.228982
M op/s:134.590583


On Thu, 1 Nov 2001, Kevin Huber wrote:

> Daniel Grunblatt wrote:
>
> > I have tested times using computed goto in the interpreter and here are
> > the results:
> >
> > # ./test_prog mops.pbc
> > Iterations:    100000000
> > Estimated ops: 300000000
> > Elapsed time:  8.604721
> > M op/s:34.864582
>
>
>
> Yes, I wrote a poor-man's computed goto version just of mops.pasm the
> other day with an inner loop like this:
>
> #define NEXT goto **ip++
>
>   op8:
>  if (ireg[2] == ireg[4]) {
>  *ip+=3;
>  goto **ip;
>  }
>  NEXT;
>   op9:
>  ireg[2] += ireg[3];
>  NEXT;
>   op10:
>  *ip-=3;
>  goto **ip;
>
> It ran at ~58 M op/s, slightly over twice the current dispatcher.
> Obviously my code is not a general machine, but I hypothesized that
> 58 would be near the ideal upper bound on direct-threading
> performance for the current architecture on my computer.
>
> -Kevin
>
>
>




Re: vmem memory manager

2001-11-01 Thread Michael L Maraist


> My initial impression is that a page should be 1024 bytes (though I'm also
> looking at 512 bytes to reduce the shotgun effect of 8-byte allocs).
> I've actually found that "q-caches" were detrimental to things like
> consolidation, and the calling depth (overhead) of a worst-case allocation.
> So I'm currently leaving them out.  Instead I'm using a power-of-two based
> dq-cache (for differential q-cache).

Forgot to include two other points on this topic.  First, if q-caches were 
used, the slab-header would require a whole page (so as to maintain 
alignment).  With a q-max of size 4 at most and a slab object multiple of 4,  
we have 16 pages as the minimum allocation actually fulfilled by vmem.  
However, pages are of 1k-alignment, which you'll note tells vmem that it 
should look in its segment-hashtable.  Thus a q-cache in this system could 
not simply use multiples of a page size.  On the other hand, an alternative 
would be to subdivide pages into two regions instead of one.  If it is assumed 
that accesses within the q-cache region will be rare (relegated mostly to 
spills into vmem by the dq-cache), then the q-cache could be considered a 
second tier subdivision of a page.  For example, if the page-size was raised 
to 16K (must be a power-of-two for efficient page-alignment calculations), 
the dq-caches would still utilize a slab-size of 1k and define dq's of sizes: 
8,16,24,...384, but the q-cache would utilize a slab-size of 16K (our new 
page size) and define q's of sizes: 1,2,3,4,5.  Note that dq's use power-of 
2/3 to efficiently cover a broad range while q's use multiples so as to 
minimize slack space.  Here, however, our q-cache allocations have an 
incredible slack space due to the forced subdivisions of 16 as the following 
table shows:

page-size 16K
q-cache-size : num-objects : header-overhead : slack-space : % overhead (including slack)
1 : 15 : 1 : 0 : 6.66%
2 : 7  : 1 : 1 : 14.2%
3 : 5  : 1 : 0 : 6.66%
4 : 3  : 1 : 3 : 33%
5 : 3  : 1 : 0 : 6.66%
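
The rows of the 16K table can be reproduced with a few lines of arithmetic.
This is only a sketch of how the numbers appear to be derived: the struct and
function names are hypothetical, sizes are counted in 1k pages, and the
percentage is taken relative to the space actually occupied by objects, which
is the reading that matches the five rows above.

```c
/* One row of the q-cache table; all sizes are in 1k pages. */
struct qcache_row {
    int num_objects;      /* objects that fit in one slab             */
    int slack_pages;      /* pages left over after the last object    */
    double overhead_pct;  /* (header + slack) relative to object space */
};

/* Assumes object_pages >= 1 and at least one object fits in the slab. */
struct qcache_row qcache_row_for(int slab_pages, int header_pages,
                                 int object_pages)
{
    struct qcache_row row;
    int usable = slab_pages - header_pages;

    row.num_objects = usable / object_pages;
    row.slack_pages = usable % object_pages;
    row.overhead_pct = 100.0 * (header_pages + row.slack_pages)
                       / (row.num_objects * object_pages);
    return row;
}
```

For example, qcache_row_for(16, 1, 4) gives 3 objects, 3 pages of slack and
roughly 33% overhead, as in the fourth row.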

There are several trade-offs.  The larger the slab-size, the less max 
overhead for any q-cache (here we have 33% max overhead for optimal 
alignments: in the worst case, an allocation of size 3.01k, for example, would 
have 77% slack).  However, an allocation of size 6k would still require a 
full 16K allocation by vmem, since we want to minimize fragmentation.  That's 
an overhead of 166% (wasting 10k of memory).  If we wanted to reduce both the 
percent and max physically wasted memory, we could run the q-cache all the 
way until size == 1 (at which point the slab is of little or no use):

page size = 8K
1 : 7 : 1 : 0 : 12.5%
2 : 3 : 1 : 1 : 25%
3 : 2 : 1 : 1 : 25%

We round sizes 4,5, 6 and 7 up to an allocation of size 8.  Thus the max 
overhead is 3.1k -> 8k => 166% which wastes over 4k of physical memory.
Note also that we incur at least 12.5% overhead.  It's debatable whether 8K 
or 16K is better.  Both have worst cases of 166%, but the 16K wastes more 
total physical memory.  Compare this to the following table of power-2 
allocations for dq-caches:

virtual page size = 1k, slab header size is approx 28 bytes
8  : 124 : 4 : 4 : 3.2%
12 : 83  : 2 : 0 : 2.811%
16 : 62  : 2 : 4 : 3.2%
24 : 41  : 1 : 12 : 4%
32 : 31  : 1 : 4  : 3.2%
48 : 20  : 1 : 36 : 6.6% 
64 : 15  : 1 : 36  : 6.6%
96 : 10  : 0 : 64  : 6.6%
128 : 7  : 1 : 100 : 14.2%
192 : 5  : 0 : 36  : 6.6%
256 : 3  : 1 : 228  : 33%
384 : 2  : 0 : 228 : 33%

BUT, these first tier dq-caches all need 1k allocations from vmem, which 
passes the request off to the q-caches, which incur an additional 12.5% 
overhead for 8K pages or 6.66% overhead for 16K pages.  

Additionally note that the overhead can be reduced somewhat if we could use 
an external header for slabs of object-size >= .5 vpage.  But unfortunately, 
the first object of this slab falls on a page boundary, and thus will be 
inappropriately interpreted by vmem_size and vmem_free.  Note, once again, 
that explicit object caches do not use vmem directly and are thus shielded 
from this page/ vpage limitation.  This only applies to the internal dq-cache 
and the debated q-cache, not the explicitly allocated object-caches.

More important than slack space is the shot-gun effect, which is exacerbated 
since we now have 7 layers in which a freed memory object may reside:

8B dq-cache magazine (1 of 20; assuming max-stack-size = 5, data/prev_data 
are both full, and there are two threads)
8B dq-cache depot ( 1 of 50; assuming max-stacks = 5 ) 
8B slab ( 1 of 124 )
1024B q-cache magazine ( 1 of 8; assuming max-stack-size = 2, data/prev_data 
are both full and there are two threads)
1024B q-cache depot ( 1 of 50 )
1024B slab ( 1 of  7 )
vmem segment ( 1 of 255 ) (segment size = 255 * 8 * 1kvpage = 2 Meg) 
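
Multiplying out the per-layer counts in the list above, leaving out the vmem
segment itself, gives the number of places a freed 8-byte object could be
sitting. A small sketch of that arithmetic (the function name is made up for
illustration; the six counts are the ones listed):

```c
#include <stddef.h>

/* Product of the "1 of N" counts for the six non-vmem layers. */
long long non_vmem_hiding_places(void)
{
    /* dq magazine, dq depot, 8B slab, q magazine, q depot, 1024B slab */
    static const int layer_slots[] = { 20, 50, 124, 8, 50, 7 };
    long long total = 1;
    size_t i;

    for (i = 0; i < sizeof layer_slots / sizeof layer_slots[0]; i++)
        total *= layer_slots[i];
    return total;   /* 347,200,000 */
}
```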

That's 347 million non-vmem hiding places for a freed object.  A reclaim 
currently is only slated to free depots, but reduces us to 3.47 
million hiding places.  If we do away with the second tier (and the vpage 
concept

Re: String rationale

2001-11-01 Thread Tom Hughes

In message <[EMAIL PROTECTED]>
Simon Cozens <[EMAIL PROTECTED]> wrote:

> On Sat, Oct 27, 2001 at 04:23:48PM +0100, Tom Hughes wrote:
> > The encoding_lookup() and chartype_lookup() routines will obviously
> > need to load the relevant libraries on the fly when we have support
> > for that.
> 
> Could you try rewriting them using an enum, like the vtable stuff and
> the original string encoding stuff does?

The intention is that when an encoding or character type is loaded it
will be allocated a unique ID number that can be used internally to
refer to it, but that the number will only be valid for the duration of
that instance of parrot rather than being persistent. That's certainly
the way Dan described it happening in his rationale which is what my
code is based on.

Allocating them globally is not possible if we're going to allow people
to add arbitrary encodings and character sets - as things stand adding
the foo encoding will be as simple as adding foo.so to the encodings
directory.

Tom

-- 
Tom Hughes ([EMAIL PROTECTED])
http://www.compton.nu




Re: String rationale

2001-11-01 Thread Simon Cozens

On Thu, Nov 01, 2001 at 02:18:17PM +, Tom Hughes wrote:
> > Could you try rewriting them using an enum, like the vtable stuff and
> > the original string encoding stuff does?
> 
> Allocating them globally is not possible if we're going to allow people
> to add arbitrary encodings and character sets - as things stand adding
> the foo encoding will be as simple as adding foo.so to the encodings
> directory.

As things stand, that won't work, because you're doing a string lookup in one
of the core functions, and you still need some way of registering incoming
stuff. With an enum, you can keep hold of a fake encoding_max, and hand
encoding_max++ to the initialisation function for each encoding.

-- 
Relf Test Passed.



Re: String rationale

2001-11-01 Thread Tom Hughes

In message <[EMAIL PROTECTED]>
Simon Cozens <[EMAIL PROTECTED]> wrote:

> As things stand, that won't work, because you're doing a string lookup in one
> of the core functions, and you still need some way of registering incoming
> stuff. With an enum, you can keep hold of a fake encoding_max, and hand
> encoding_max++ to the initialisation function for each encoding.

Well there won't be any point in it being an enum rather than an 
integer unless some of them are going to be preallocated. I'm not
sure if the encoding and character types will need to know their
own index numbers but if we do then they can be told at initialisation
time, yes.

I absolutely intend that the current hard coded strings in the core
will go away in due course though. When you look up an encoding or
character type by name it will first check a hash table or something
to see if it is already loaded and if not it will look for it on disk
and load it in, allocate it a number, and add it to the hash table
for future reference.
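
A rough sketch of that lookup-or-register flow, combining it with the
encoding_max++ idea from earlier in the thread. Everything here is an
assumption made for illustration: the name encoding_lookup_id, the fixed
64-slot open-addressing table, and the hash function are not from the Parrot
source, and the step that would load the library from disk is only marked by
a comment.

```c
#include <string.h>

#define REGISTRY_SLOTS 64          /* hypothetical capacity, power of two */

struct registry_entry {
    const char *name;              /* caller keeps this string alive */
    int id;
};

static struct registry_entry registry[REGISTRY_SLOTS];
static int encoding_max = 0;       /* next ID to hand out */

static unsigned long hash_name(const char *s)
{
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Returns the ID for name, allocating one on first use.
 * Returns -1 if the table is full. */
int encoding_lookup_id(const char *name)
{
    unsigned long slot = hash_name(name) & (REGISTRY_SLOTS - 1);
    int probes;

    for (probes = 0; probes < REGISTRY_SLOTS; probes++) {
        struct registry_entry *e = &registry[slot];
        if (e->name == NULL) {
            /* Not seen before: a real version would load the
             * encoding library here before registering it. */
            e->name = name;
            e->id = encoding_max++;
            return e->id;
        }
        if (strcmp(e->name, name) == 0)
            return e->id;          /* already loaded */
        slot = (slot + 1) & (REGISTRY_SLOTS - 1);
    }
    return -1;
}
```

IDs handed out this way are only meaningful within one run, which matches
the point that they are not persistent.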

Hence the current strcmp junk in the lookup functions will go away.

In much the same way the byte code will have some sort of table of
names which it will look up as it is loaded rather than the current
hard coding of name to number mappings in the byte code.

So all I need now to make all this work is hash tables and dynamic
code loading ;-) Any volunteers...

Tom

-- 
Tom Hughes ([EMAIL PROTECTED])
http://www.compton.nu




Re: java vs. parrot mops

2001-11-01 Thread Dan Sugalski

At 09:45 PM 10/31/2001 -0300, Daniel Grunblatt wrote:
>I have tested times using computed goto in the interpreter and here are
>the results:
>
># ./test_prog mops.pbc
>M op/s:34.864582
>
># java -Xint mops
>M op/s:30.950170356876555
>
>Just for the records, this is with the current interpreter:
># ./test_prog mops.pbc
>M op/s:13.260716

So it looks like about a 2.5 speedup with computed goto. Cool.

For the Java-impaired (i.e. me :) what's the -Xint option do?

Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk




Re: java vs. parrot mops

2001-11-01 Thread Kevin Huber

Daniel Grunblatt wrote:

> I have tested times using computed goto in the interpreter and here are
> the results:
> 
> # ./test_prog mops.pbc
> Iterations:    100000000
> Estimated ops: 300000000
> Elapsed time:  8.604721
> M op/s:34.864582



Yes, I wrote a poor-man's computed goto version just of mops.pasm the
other day with an inner loop like this:

#define NEXT goto **ip++

  op8:
 if (ireg[2] == ireg[4]) {
 *ip+=3;
 goto **ip;
 }
 NEXT;
  op9:
 ireg[2] += ireg[3];
 NEXT;
  op10:
 *ip-=3;
 goto **ip;

It ran at ~58 M op/s, slightly over twice the current dispatcher.
Obviously my code is not a general machine, but I hypothesized that
58 would be near the ideal upper bound on direct-threading
performance for the current architecture on my computer.
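
For comparison, a self-contained sketch of the direct-threading idea, where
the "bytecode" is simply an array of handler addresses. It is not Kevin's
program: the function name, the three-op loop and the executed-op counter are
invented for illustration, and it needs GCC or Clang because labels as values
are a GNU C extension.

```c
/* Direct-threaded counting loop: test, add, jump back.
 * iterations must be >= 0. Returns the number of ops executed. */
long run_direct_threaded(long iterations)
{
    /* Each cell holds the address of the handler to run, so dispatch
     * is one indirect jump with no opcode-number table in between. */
    void *code[] = { &&op_test, &&op_add, &&op_back, &&op_end };
    void **ip = code;
    long counter = 0;
    long executed = 0;

    goto **ip;

op_test:                 /* leave the loop once the counter hits the limit */
    executed++;
    if (counter == iterations) {
        ip += 3;
        goto **ip;
    }
    ip += 1;
    goto **ip;

op_add:
    executed++;
    counter += 1;
    ip += 1;
    goto **ip;

op_back:                 /* branch back to op_test */
    executed++;
    ip -= 2;
    goto **ip;

op_end:
    return executed;
}
```

Each pass through the loop runs three ops, plus one final test that exits,
so n iterations execute 3n + 1 ops.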

-Kevin





Re: java vs. parrot mops

2001-11-01 Thread Leon Brocard

Dan Sugalski sent the following bits through the ether:

> For the Java-impaired (i.e. me :) what's the -Xint option do?

It turns off the JIT (which is enabled by default).

Leon
-- 
Leon Brocard.http://www.astray.com/
Nanoware...http://www.nanoware.org/

 DATA - Dedicated Absorbing Technical Absurdities (?)



Re: [PATCH] Simple I/O

2001-11-01 Thread Gregor N. Purdy

Jeff --

Thanks. Applied.

NOTE: flags issue not solved.


Regards,

-- Gregor

On Tue, 2001-10-30 at 22:58, Jeff wrote:
> Implements the following instructions:
> 
> 1) open(i|ic,i|ic,s|sc) - Filehandle in $1, r/w mode in $2 (permissions
> 644), filename in $3
> 2) read(s,i|ic,i|ic) - String register in $1, filehandle in $2, number
> of chars in $3
> 3) write(i,s) - Filehandle in $1, string register in $2
> 4) close(i) - Filehandle in $1
> 
> You'll need to determine constants for O_CREAT &c on a per-platform
> basis. The modes will probably need to be manifest constants. For
> instance, O_CREAT|O_WRONLY on Linux is 65.
> 
> On UNIX, the following program takes input from command line and echoes
> it:
> 
> read S0,0,80 ; take 80 characters from STDIN (fd 0; also will need to be a
> manifest constant)
> print S0 ; print the string from STDIN
> end
> 
> Writing to a file looks like this:
> 
> open I0,65,"test" ; O_CREAT|O_WRONLY, write to file 'test'
> write I0,"Hey, here's some test data" ; Write some sample data in
> close I0
> end
> 
> This may not even vaguely resemble the I/O model we eventually adopt,
> but it's one idea. Albeit a simplistic model.
> 
> --
> Jeff
> <[EMAIL PROTECTED]>
> 
> 
> 

> 7,9d6
> < #include 
> < #include 
> < #include 
> 75,139d71
> < 
> < 
> < 
> < 
> < =item B<open>(i,i|ic,s|sc)
> < 
> < Open file with name (string, parameter 3) with mode (int, parameter 2), and
> < save the filenum into the register in parameter 1.
> < 
> < =cut
> < 
> < AUTO_OP open(i,i|ic,s|sc) {
> <   STRING *s = $3;
> <   $1 = open(s->bufstart,$2,S_IRUSR|S_IWUSR|S_IRGRP|S_IROTH);
> < }
> < 
> < 
> < 
> < 
> < =item read(s,i|ic,i|ic)
> < 
> < Read $3 characters from filehandle $2 into string $1
> < 
> < =cut
> < 
> < AUTO_OP read(s,i|ic,i|ic) {
> <   char *tmp;
> <   STRING *s;
> <   INTVAL len = $3;
> < 
> <   string_destroy($1);
> <   tmp = malloc(len+1);
> <   read($2,tmp,len);
> <   s = string_make(interpreter,tmp,len,0,0,0);
> <   $1 = s;
> <   free(tmp);
> < }
> < 
> < 
> < 
> < 
> < =item write(i,s)
> < 
> < Write the text at parameter 2 into filehandle with parameter 1.
> < 
> < =cut
> < 
> < AUTO_OP write(i,s) {
> <   STRING * s = $2;
> <   INTVAL count = string_length(s);
> <   write($1,s->bufstart,count);
> < }
> < 
> < 
> < 
> < 
> < =item close(i)
> < 
> < Close file reserved on the filehandle in parameter 1.
> < 
> < =cut
> < 
> < AUTO_OP close(i) {
> <   close($1);
> < }
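
The open, write and close ops in the patch above are thin wrappers over the
POSIX calls of the same names. As a sketch of what the "writing to a file"
program boils down to in plain C, assuming a POSIX system (the function name
write_sample is made up, and the error handling is an addition the ops
themselves do not have):

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates or opens path for writing with mode 644, writes text, and
 * closes the file. Returns bytes written, or -1 on any error. */
long write_sample(const char *path, const char *text)
{
    ssize_t written;
    int fd = open(path, O_CREAT | O_WRONLY,
                  S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);

    if (fd < 0)
        return -1;
    written = write(fd, text, strlen(text));
    if (close(fd) != 0)
        return -1;
    return (long)written;
}
```

Using the O_CREAT and O_WRONLY names rather than the literal 65 avoids the
per-platform constant problem mentioned in the quoted message.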
-- 
 _ 
/ perl -e 'srand(-2091643526); print chr rand 90 for (0..4)'  \

   Gregor N. Purdy  [EMAIL PROTECTED]
   Focus Research, Inc.http://www.focusresearch.com/
   8080 Beckett Center Drive #203   513-860-3570 vox
   West Chester, OH 45069   513-860-3579 fax
\_/




Re: Building on Win32

2001-11-01 Thread Gregor N. Purdy

Richard --

Thanks. Applied.

NOTE: I put the code in time.c in a function Parrot_floatval_time().

Some Win32 person please see if that can be made to work. Patches
to make it actually work are hereby solicited.


Regards,

-- Gregor

On Thu, 2001-11-01 at 16:48, Richard J Cox wrote:
> Current get fails to build on Win32[1]
> 
> There are a host of problems (not the least being the object files not 
> going where the linker is expecting them), then aside the first is the 
> lack of a gettimeofday function. This is used in time_n.
> 
> Here's my Win32 version:
> 
> void gettimeofday(struct timeval* pTv, void *pDummy)
> {
>     SYSTEMTIME sysTime;
>     FILETIME fileTime;  /* 100ns == 1 */
>     LARGE_INTEGER i;
> 
>     GetSystemTime(&sysTime);
>     SystemTimeToFileTime(&sysTime, &fileTime);
>     /* Documented as the way to get a 64 bit from a FILETIME. */
>     memcpy(&i, &fileTime, sizeof(LARGE_INTEGER));
> 
>     pTv->tv_sec = i.QuadPart / 10000000; /* 1e7 */
>     pTv->tv_usec = (i.QuadPart / 10) % 1000000; /* 1e6 */
> 
> }
> 
> Given a suitable definition of struct timeval for the prototype (there is 
> a definition in windows.h[2]) but making Windows.h a header across all the 
> builds causes its own problems (with BOOL for a start).
> 
> I'm not sure what a timeval of {0, 0} nominally represents, the above will 
> give seconds since 1 Jan 1601, so a fiddle factor is likely to be needed.
> 
> (I'm not sure what the best way to incorporate this is... for my test I 
> added an extra .c with a special include, but I think some form of common 
> OS abstraction layer is going to be needed rather than assuming one 
> ABI/API and then emulating elsewhere.)
> 
> 
> [1] Windows 2000, Visual Studio 6 SP5 & MS Platform SDK (Feb 01 edition)
> [2] Well, more correctly in winsock2.h which windows.h includes.
> -- 
> [EMAIL PROTECTED]
> 
> 
-- 
 _ 
/ perl -e 'srand(-2091643526); print chr rand 90 for (0..4)'  \

   Gregor N. Purdy  [EMAIL PROTECTED]
   Focus Research, Inc.http://www.focusresearch.com/
   8080 Beckett Center Drive #203   513-860-3570 vox
   West Chester, OH 45069   513-860-3579 fax
\_/




RE: Building on Win32

2001-11-01 Thread Hong Zhang

> void gettimeofday(struct timeval* pTv, void *pDummy)
> {
>     SYSTEMTIME sysTime;
>     FILETIME fileTime;  /* 100ns == 1 */
>     LARGE_INTEGER i;
> 
>     GetSystemTime(&sysTime);
>     SystemTimeToFileTime(&sysTime, &fileTime);
>     /* Documented as the way to get a 64 bit from a FILETIME. */
>     memcpy(&i, &fileTime, sizeof(LARGE_INTEGER));
> 
>     pTv->tv_sec = i.QuadPart / 10000000; /* 1e7 */
>     pTv->tv_usec = (i.QuadPart / 10) % 1000000; /* 1e6 */
> 
> }

For speed, you can use GetSystemTimeAsFileTime(), which is
very efficient. Win32 is a little-endian-only operating system.
You can use the following code.

void gettimeofday(struct timeval* pTv, void *pDummy)
{
    __int64 l;
    GetSystemTimeAsFileTime((LPFILETIME) &l);

    /* Divide in 64 bits first, then cast the result down. */
    pTv->tv_sec = (long) (l / 10000000); /* 1e7 */
    pTv->tv_usec = (unsigned long) ((l / 10) % 1000000); /* 1e6 */
}

You missed the cast.

Hong



cvs tags

2001-11-01 Thread Michael L Maraist

I just noticed that we're not using any tags in the CVS tree.  Do you think 
that we're at a state where we could start applying labels to our code?  I am 
considering two types of labels: one for the periodic snapshots (so as to 
back-trace features in those files), and one for successful smoke-tests.

Thus if something is broken, developers can pull down the last known 
successful smoke-test and build up from there.

I'm assuming, of course, that there isn't a more sophisticated setup behind 
the scenes (like Perforce).

-Michael



Experimental regex support

2001-11-01 Thread Angel Faus

Hi all,

I have developed some additions that give Parrot a limited
amount of support for regular expressions.

It all started as a little experiment to find out what the 
"compile down to low-level ops" thing could mean 
someday.

The patch consists of:

* 6 new opcodes:

   - matchexactly
   - matchanychar
   - initbrstack
   - clearbrstack
   - backtrack
   - savebr

  The first two are the ones that actually implement the 
  matches.

  initbrstack, clearbrstack, backtrack, savebr are for
  managing the stack of pending possible matches. They
  use the integer and destination stacks internally.

* A perl package and script that implement a simple regex
  compiler (using YAPE::Regex by the way).

  The compiler currently outputs a parrot program that
  matches the regexp against a predefined string. It could
  be easily modified to produce something more useful.

Currently, the following features are supported.

* exact matches
* any char (.)
* nested groups (do not capture)
* alternation
* simple quantifiers (*, +, ?)

There is a lot of room for improvement, either by 
implementing features that do not require changes in 
Parrot (non-greedy quantifiers, anchors, capturing
and most regex options can be added right now) 
or by making the necessary changes in Parrot 
(support for locales is required for macros, etc.).

This is not a serious patch, in the sense that there 
are many things missing, the ones that are supposed 
to work are not tested enough and even the ones 
that work are implemented in a way that is just wrong.

I am a rather mediocre programmer, and these are the first 
lines of code I ever sent to a mailing list, so please be 
benevolent with me. :)

Anyway I thought it would be interesting to share my 
little experiment.

Sincerely,

---
Angel Faus
[EMAIL PROTECTED]

1814a1815,1882
> 
> 
> AUTO_OP matchexactly(sc, s, i, ic){
>   
>   STRING* temp;
>  
>   
>   if (string_length($2) <= $3) {
> RETREL($4);
> }   
> 
>   temp = string_substr(interpreter, $2, $3 , string_length($1), NULL);
> 
>   if (string_compare(interpreter, $1, temp) != 0 ) {
> RETREL($4);
>   }
>   else {
> $3 = $3 + string_length($1);
>   }
> }  
> 
> AUTO_OP matchanychar(s, i, ic) {
>if (string_length($1) > $2){   
>   $2++;
>   }
>else {
>   RETREL($3);
>}
> }
>
> MANUAL_OP backtrack(i){
>   opcode_t *dest;
> 
>   pop_generic_entry(interpreter, &interpreter->user_stack_top, &($1), STACK_ENTRY_INT);
>   pop_generic_entry(interpreter, &interpreter->control_stack_top, &dest, STACK_ENTRY_DESTINATION);
> 
>   RETABS(dest);
> }
> 
> 
> AUTO_OP savebr(i, ic){
>  
>   push_generic_entry(interpreter, &interpreter->control_stack_top, cur_opcode + cur_opcode[2],  STACK_ENTRY_DESTINATION, NULL);
> 
>   push_generic_entry(interpreter, &interpreter->user_stack_top, &($1),  STACK_ENTRY_INT, NULL);
> 
> }
> 
> AUTO_OP initbrstack(ic) {
>   INTVAL i;
>   i = -1;
>   
>   push_generic_entry(interpreter, &interpreter->control_stack_top, cur_opcode + cur_opcode[1], STACK_ENTRY_DESTINATION, NULL);
>   push_generic_entry(interpreter, &interpreter->user_stack_top, &i, STACK_ENTRY_INT, NULL); 
> 
> }
> 
> AUTO_OP clearbrstack(i){
>   opcode_t *dest;
>   
>   while ($1 && $1 >= 0) {
> 	pop_generic_entry(interpreter, &interpreter->control_stack_top, &dest, STACK_ENTRY_DESTINATION);
> 	pop_generic_entry(interpreter, &interpreter->user_stack_top, &($1), STACK_ENTRY_INT); 
> 	}
> 	
> }
> 
> 
1826a1895
> 
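
To make the intent of the two matching ops easier to follow outside the
ops-file syntax, here is a plain C sketch of the same logic on NUL-terminated
strings. The function names and the return convention are assumptions for
illustration: where the ops do RETREL to a failure label, these return 0 and
leave the position alone, and on success they advance the position and
return 1.

```c
#include <string.h>

/* Like matchexactly: does subject contain literal at offset *pos? */
int match_exactly(const char *literal, const char *subject, size_t *pos)
{
    size_t lit_len = strlen(literal);
    size_t sub_len = strlen(subject);

    if (*pos > sub_len || sub_len - *pos < lit_len)
        return 0;                       /* not enough input left */
    if (memcmp(subject + *pos, literal, lit_len) != 0)
        return 0;                       /* characters differ */
    *pos += lit_len;
    return 1;
}

/* Like matchanychar: consume one character if any is left. */
int match_any_char(const char *subject, size_t *pos)
{
    if (*pos >= strlen(subject))
        return 0;
    (*pos)++;
    return 1;
}
```

A compiled regex such as /ab./ would then be a match_exactly("ab", ...)
followed by a match_any_char(...), with a failure in either one sending
control to the backtracking code.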




package BabyRegex;

use YAPE::Regex 'BabyRegex';
use strict;
use vars '$VERSION';

$VERSION = '0.01';

my %modes = ( on => '', off => '' );

sub buildtree {
  my $self = shift;

  my $cnt = 0;
  my ($groupscnt, @groups);
  my @tree;

  while (my $node = $self->next) {

    $node->id($cnt++);
    $tree[-1]->next($node) if @tree;

    if ($node->type =~ /capture|group/) {
      push @groups, $node;
      $node->{ALTS} = [];
      $node->{COUNT} = $groupscnt++;
    }

    if ($node->type eq "alt") {
      push (@{$groups[-1]->{ALTS}}, $node);
      my $groupnode = $groups[-1];
      $node->{GROUP} = $groupnode;

      push @{$groupnode->{ALTS}}, $node;
    }

    if ($node->type eq "close") {
      my $groupnode = pop @groups;
      $groupnode->{CLOSED} = $node;
      $node->{GROUP} = $groupnode;
      for my $alt (@{$groupnode->{ALTS}}) {
        # Alt nodes get their ID replaced by the closing node's ID, so
        # that when their predecessors call ->next->id they get the right one.
        # This is probably one of the worst ways to do that.
        $alt->{ID} = $node->{ID};
      }
    }
    push (@tree, $node);
  }

  return @tree;

}

sub cry {
  # Two arguments: label and opcode. One argument: opcode only.
  if ($_[1]) {
    my $label = shift;
    my $opcode = shift;

    my $spc = " " x (4 - length($label));
    print $label . ":" . $spc . $opcode . "\n";
  }
  else {
    my $opcode = shift;
    print " $opcode\n";
  }

}


sub pasm {
  my ($self, $string) = @_;  
  my @tree = 

Re: Experimental regex support

2001-11-01 Thread Dan Sugalski

At 07:58 PM 11/1/2001 +0100, Angel Faus wrote:
>I am a rather mediocre programmer, and these are the first
>lines of code I ever sent to a mailing list, so please be
>benevolent with me. :)

Looks like your mailer wordwrapped the program pretty badly. Could you try 
again as either a context or unified diff (-c or -u) and attached to mail? 
I'm curious to look at it, as I've only partially considered how we'll do 
regexes to date.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Building on Win32

2001-11-01 Thread Richard J Cox

Current get fails to build on Win32[1]

There are a host of problems (not the least being the object files not 
going where the linker is expecting them), but that aside, the first is the 
lack of a gettimeofday function. This is used in time_n.

Here's my Win32 version:

void gettimeofday(struct timeval* pTv, void *pDummy)
{
    SYSTEMTIME sysTime;
    FILETIME fileTime;  /* 100ns == 1 */
    LARGE_INTEGER i;

    GetSystemTime(&sysTime);
    SystemTimeToFileTime(&sysTime, &fileTime);
    /* Documented as the way to get a 64 bit from a FILETIME. */
    memcpy(&i, &fileTime, sizeof(LARGE_INTEGER));

    pTv->tv_sec = i.QuadPart / 10000000; /* 1e7 ticks per second */
    pTv->tv_usec = (i.QuadPart / 10) % 1000000; /* 1e6 microseconds per second */

}

This needs a suitable definition of struct timeval for the prototype (there is 
a definition in windows.h[2]), but making windows.h a header across all the 
builds causes its own problems (with BOOL for a start).

I'm not sure what a timeval of {0, 0} nominally represents, the above will 
give seconds since 1 Jan 1601, so a fiddle factor is likely to be needed.
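
On the fiddle factor: the gap between 1 Jan 1601 and the Unix epoch of
1 Jan 1970 is 11644473600 seconds (134774 days). A portable sketch of the
conversion, working on the raw 64-bit tick count so it can be tried without
any Windows headers. The struct and function names are hypothetical, and
treating {0, 0} as the Unix epoch is an assumption about what the callers
expect.

```c
#include <stdint.h>

#define FILETIME_TICKS_PER_SEC 10000000ULL     /* 100ns ticks per second */
#define SECS_1601_TO_1970      11644473600ULL  /* 134774 days * 86400    */

struct unix_time {
    int64_t sec;    /* seconds since 1 Jan 1970 UTC */
    int32_t usec;   /* microseconds within that second */
};

/* filetime_ticks: 100ns intervals since 1 Jan 1601 UTC. */
struct unix_time filetime_to_unix(uint64_t filetime_ticks)
{
    struct unix_time t;

    t.sec = (int64_t)(filetime_ticks / FILETIME_TICKS_PER_SEC)
            - (int64_t)SECS_1601_TO_1970;
    t.usec = (int32_t)((filetime_ticks / 10) % 1000000ULL);
    return t;
}
```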

(I'm not sure what the best way to incorporate this is... for my test I 
added an extra .c with a special include, but I think some form of common 
OS abstraction layer is going to be needed rather than assuming one 
ABI/API and then emulating elsewhere.)


[1] Windows 2000, Visual Studio 6 SP5 & MS Platform SDK (Feb 01 edition)
[2] Well, more correctly in winsock2.h which windows.h includes.
-- 
[EMAIL PROTECTED]



RE: Building Parrot for Win32.

2001-11-01 Thread Dan Sugalski

On Thu, 1 Nov 2001, Gregor N. Purdy wrote:

> We may end up needing to consolidate these platform-isms into a small
> number of files (one?) rather than have them split by type (like I did for
> Parrot_*_time). I don't know if we can get away with something as simple
> as platform.[hc] with all the yuck in one place or whether we need to have
> a platform/ directory with platform_*.[hc] in it that get copied at config
> time to platform.c and include/parrot/platform.h...

platform.c and platform.h is exactly what we're going to do. We need a
platforms directory as well. In there we'll have a win32.[ch], a
linux.[ch], a vms.[ch], a generic.[ch] and so on. Configure.pl will copy
the appropriate ones up and rename them platform.c & platform.h, and we'll
go from there.

For things like sleep that are not everywhere, I think we need to restrict
our usage to PP_sleep (for example) and #define that as appropriate in the
platform.h file.
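
As a sketch of what such a platform.h entry might look like for sleep: the
macro name PP_sleep comes from the paragraph above, but the layout, the
wrapper function and the choice to keep seconds as the unit are assumptions.
Win32's Sleep() takes milliseconds and POSIX sleep() takes seconds, so the
macro hides the conversion.

```c
#ifdef _WIN32
#  include <windows.h>
   /* Win32 Sleep() takes milliseconds. */
#  define PP_sleep(seconds) Sleep((DWORD)(seconds) * 1000)
#else
#  include <unistd.h>
   /* POSIX sleep() takes seconds. */
#  define PP_sleep(seconds) ((void)sleep(seconds))
#endif

/* Hypothetical caller: core code only ever uses PP_sleep. */
int pp_sleep_for(unsigned int seconds)
{
    PP_sleep(seconds);
    return 0;
}
```

In the real layout the two branches would live in separate per-platform
headers, which would also keep windows.h (and its BOOL) out of the other
builds.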

Dan




RE: Building Parrot for Win32.

2001-11-01 Thread Brent Dax

Jason Diamond:
# I'm having trouble building the latest parrot sources from
# CVS on Windows
# 2000. The Configure.pl script ran fine but after running
# nmake test_prog.exe
# I'm getting an error while compiling core_ops.c:
#
# core_ops.c(370) : error C2065: 'S_IRUSR' : undeclared identifier
# core_ops.c(370) : error C2065: 'S_IWUSR' : undeclared identifier
# core_ops.c(370) : error C2065: 'S_IRGRP' : undeclared identifier
# core_ops.c(370) : error C2065: 'S_IROTH' : undeclared identifier
#
# Are these constants that the Win32 CRT doesn't support? They
# don't seem to
# be defined anywhere in the fcntl.h file (which is where the
# other constants
# on that same line in core_ops.c seem to be defined).
#
# Is there something obvious that I missed?

No, this seems to be a case of Unix-centrism.  (I feel your pain--I'm on
Win32 too.)  I'm CCing perl6-internals on this, since I don't really
have the C experience to know what to do here.

In the meantime, if you really want to build Parrot *right now* and
don't care about the file I/O ops, you can comment out lines 143 and 148
in the 'core.ops' file and rebuild to get rid of this error.  However,
there are several other known Win32 build issues, so you'll likely have
to fiddle with things to get them to work.  (The sleep() op is another
source of such misery.)

--Brent Dax
[EMAIL PROTECTED]
Configure pumpking for Perl 6

When I take action, I’m not going to fire a $2 million missile at a $10
empty tent and hit a camel in the butt.
--Dubya




RE: Building Parrot for Win32.

2001-11-01 Thread Brent Dax

Jason Diamond:
# > Can you give me the output of that time.c compilation?  Someone just
# > asked for feedback on how that was working.
#
# It was complaining about SYSTEMTIME not being defined. You have to
# #include <windows.h> for that. But that caused a redefinition error
# for BOOL. So I renamed the BOOL in parrot.h to P_BOOL just to see if
# I could get it to work and it did.

Yup, we've seen the same problem.  We're still sort of scratching our
heads at what to do about it.

# Once I got past that, it couldn't link to classes/intclass.obj. After
# looking at the output a bit, I found out that intclass.obj is being
# written to the main parrot directory and not into the classes
# directory. The option for CL to specify the output file is -Fo and not
# -o which is what the makefile is using. So now I'm trying to grok
# Configure.pl to see if I can make that change.

Aha!  I think I (may) know how to deal with this (sort of, at least)
now.

# I'm really just messing around at the moment. I'll probably delete
# everything, check it all out again and try to approach this in a more
# systematic way. I don't know how to make patches (any pointers?) so I
# don't know if I'd be able to contribute anything right away but I'd
# like to.

Ironically, you need a Unix toolkit.  ;^)  Specifically, you need the
'diff' program to generate patches; use -u or -c to generate prettier
patches.  Perl Power Tools (available from the CPAN) has many Unix
tools, including an implementation of 'diff'.

BTW, welcome to Perl6-Internals.  Everybody say hi!

--Brent Dax
[EMAIL PROTECTED]
Configure pumpking for Perl 6





RE: Building Parrot for Win32.

2001-11-01 Thread Gregor N. Purdy

Dan --
 
> platform.c and platform.h is exactly what we're going to do. We need a
> platforms directory as well. In there we'll have a win32.[ch], a
> linux.[ch], a vms.[ch], a generic.[ch] and so on. Configure.pl will copy
> the appropriate ones up and rename them platform.c & platform.h, and we'll
> go from there.

OK. I just checked in a first cut at the files. I left time.[hc] as they
are now because I don't know enough about the config stuff to be able to
pull off the switcheroo myself. Hopefully someone who knows it better can
pull that off quickly and we'll have the right stuff in place for the sleep,
time, etc. problems to just get solved once and for all...

 
> For things like sleep that are not everywhere, I think we need to restrict
> our usage to PP_sleep (for example) and #define that as appropriate in the
> platform.h file.


Regards,

-- Gregor




RE: Building Parrot for Win32.

2001-11-01 Thread Gregor N. Purdy

Brent (and Jason) --

Based on Dan's agreement to the approach, I just checked in the starting
point files for doing this "right". Please have a look and send patches
against those files. As soon as we get config wired up to autoselect the
appropriate platform files, we'll be able to make this All Just Work (TM).


Regards,

-- Gregor




Instruction timings for Intel/AMD chips?

2001-11-01 Thread Ken Fox

After downloading a dozen PDF files I've given up.
All I need is the approximate cycle counts for
instructions and address modes.

The particular problem I've got now is deciding
which of these three is the fastest:

   movl (%edi,%eax,4),%eax
   movl (%edi,%eax),%eax
   movl (%edi),%eax

Same with:

   movl $1,(%eax,%edx,4)
   movl $1,(%eax,%edi)

An old 486 book I have claims that complex
addressing modes don't have cycle penalties
for leaving out the scale or the offset.
That seems hard to believe for the RISC-like
P3s and Athlons.

What about other processors? Is it common to
have address modes like:

   base+offset*scale

Most RISC instruction sets only provide base +
constant offset don't they?
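
For reference, here is a C sketch of the kind of source that typically
produces each of the three loads above. The function names are made up, and
which addressing mode a compiler actually picks depends on the compiler and
its flags; the comments assume 32-bit x86 with a 4-byte int.

```c
#include <stddef.h>
#include <string.h>

/* base + index*4: element load from an array of 4-byte ints,
 * the usual source of something like movl (%edi,%eax,4),%eax */
static int load_scaled(const int *base, size_t index)
{
    return base[index];
}

/* base + byte offset, no scale: something like movl (%edi,%eax),%eax.
 * memcpy keeps the unaligned/aliasing case well defined in C. */
static int load_offset(const char *base, size_t byte_offset)
{
    int value;
    memcpy(&value, base + byte_offset, sizeof value);
    return value;
}

/* plain base register: something like movl (%edi),%eax */
static int load_base(const int *base)
{
    return *base;
}
```

Timing these in a loop on the chips in question is probably the quickest
way to get real numbers, since the manuals are so hard to dig through.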

Yeah, yeah, I know. Premature optimization is the
root of all evil. Except this isn't premature.
Getting to 50 mops was pretty easy. Getting to 100
mops is a *lot* harder!

- Ken



My own RE implementation

2001-11-01 Thread Brent Dax

I've been playing around with my own RE ops, and I was wondering if
someone can check my logic:

#this should be equivalent to: "afbarz" =~ /f.o*?bar/i
match "afbarz", "i"
RE_1:
goforward RE_END    #this gives us the behavior of marching
                    # forward through the string until we either match or reach the end
literal "f"         #any single character
anything            #dot
RE_2:
lazyrepeat          #if we hit this, and saveindex has saved anything,
                    # whatever saveindex saved is used as the current index
literal "o"
saveindex           #remember where we are--or if we've failed, remember
                    # that backtracking won't help (except for that 'goforward' at the beginning)
literal "bar"

backtrack RE_2      #if backtracking is hopeless, make sure you tell the next op
                    # that it isn't; if backtracking isn't hopeless, but the match has failed, jump to RE_2
backtrack RE_1
RE_END:
endre I0            #sticks the results into I0

print "Match results: "
print I0

I'm still getting a few problems with my RE ops, so I can't test it yet.
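
As a cross-check that does not depend on the new ops, the same pattern can be
hand-coded in plain C. This is only a sketch: the function names are invented,
the dot is allowed to match any character (newline included), and it
hard-wires this one pattern rather than interpreting ops.

```c
#include <ctype.h>

/* Case-insensitive test that text starts with lit (lit must be lower case). */
static int starts_with_ci(const char *text, const char *lit)
{
    for (; *lit != '\0'; text++, lit++) {
        if (tolower((unsigned char)*text) != *lit)
            return 0;               /* also stops at the end of text */
    }
    return 1;
}

/* Try /f.o*?bar/i anchored at s; returns 1 on a match, 0 otherwise. */
static int match_here(const char *s)
{
    if (tolower((unsigned char)s[0]) != 'f')    /* literal "f" */
        return 0;
    if (s[1] == '\0')                           /* anything (dot) */
        return 0;
    s += 2;
    for (;;) {
        if (starts_with_ci(s, "bar"))           /* lazy: try the tail first */
            return 1;
        if (tolower((unsigned char)*s) != 'o')  /* else consume one more "o" */
            return 0;
        s++;
    }
}

/* goforward: retry at each start position until a match or the end. */
static int re_search(const char *s)
{
    for (; *s != '\0'; s++) {
        if (match_here(s))
            return 1;
    }
    return 0;
}
```

By this reading, "afbarz" itself does not match, because the dot has to
consume one character between the f and the bar; a string such as "afxbarz"
does.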

Thanks,
--Brent Dax
[EMAIL PROTECTED]
Configure pumpking for Perl 6





RE: Building Parrot for Win32.

2001-11-01 Thread Brent Dax

Jason Diamond:
# > # Once I got past that, it couldn't link to classes/intclass.obj.
# > # After looking at the output a bit, I found out that intclass.obj
# > # is being written to the main parrot directory and not into the
# > # classes directory. The option for CL to specify the output file
# > # is -Fo and not -o which is what the makefile is using. So now I'm
# > # trying to grok Configure.pl to see if I can make that change.
# >
# > Aha!  I think I (may) know how to deal with this (sort of, at least)
# > now.
#
# Hi, Brent.
#
# I thought I'd try out the diff program that comes with cygwin. I've
# attached the diffs for the two files that I had to change to get the
# obj files to go to the right directory.
#
# I chose the -ru option because that's what I saw someone else do in
# the archive. Is there some sort of naming convention that you usually
# use for diff files? I just appended .diff to the original name. I'd
# appreciate any tips you might have on the best way to do this so
# that's easy for whoever has to apply the diff (I assume that's done
# with the patch program?).
#
# This is a really simple fix but I just wanted to see if it would work.

Nice.  Thanks, applied.  Your change should show up in CVS in a few
minutes.

--Brent Dax
[EMAIL PROTECTED]
Configure pumpking for Perl 6





Re: vmem memory manager

2001-11-01 Thread Michael L Maraist

On Thursday 01 November 2001 09:08 pm, Ken Fox wrote:
> Michael L Maraist wrote:
> [an incredible amount of detailed information that will
>  take me weeks to digest...]
>
> This looks like a malloc/free style allocator. Since the whole
> GC system for Parrot is on the table, you don't have to constrain
> yourself to malloc/free. IMHO free is not needed at all -- we
> should be scavenging entire arenas all at once. I assume you
> want to use malloc to grab an arena, but carving up the arena is
> the GC's job.

The first couple of paragraphs summarize what's going on, but malloc and free
aren't really even used in this memory manager.  Basically it's an arena-type
scheme (though I don't think SUN used the word arena) that works very well
with multi-threading: it breaks the problem into multiple layers so as to
avoid fragmentation, thread contention, and consolidation, and most
importantly it provides constant access time (even in the worst possible
case).  There is a generic alloc/free interface in the back end (which will
most certainly be necessary when interfacing with C code, and when using
dynamically sized objects such as dynamically resizable arrays), but most
access is via interpreter-specific arenas, which minimize thread slack space
through the involved interaction of layers.  Moreover, the "object caching
scheme" allows complex data types to avoid repeated destruction /
construction cycles.  (Though that is primarily useful within an OS, which
was its original target.)

As for the relationship to the GC, my vision was that the GC would free
memory objects as it found them unused (mark/sweep or whatever), which only
hands the memory regions over to the deallocator, which intelligently makes
decisions about what to do with them and how to buffer them appropriately.
Further, this resource management scheme provides interfaces for reclaiming
memory when necessary; something obviously needed by a GC.  One key aspect,
however, is that this provides so many features otherwise required by a GC
that the GC is greatly simplified, which opens up greater possibilities.

The main focus is on memory efficiency and performance.

-Michael




Re: Building Parrot for Win32

2001-11-01 Thread Gregor N. Purdy

Jason --
 
> I'm trying to build parrot on my Windows 2000 box and am failing when
> compiling core_ops.c with these errors:
> 
> core_ops.c(370) : error C2065: 'S_IRUSR' : undeclared identifier
> core_ops.c(370) : error C2065: 'S_IWUSR' : undeclared identifier
> core_ops.c(370) : error C2065: 'S_IRGRP' : undeclared identifier
> core_ops.c(370) : error C2065: 'S_IROTH' : undeclared identifier
> 
> It looks like you checked those new ops in just a few hours ago. I've
> searched for those symbols but the Microsoft header files don't seem to
> define them anywhere.
> 
> Any suggestions?

Sorry about that. I only have one Windows machine, and it's not set up for
development. These are constants from a couple of *nix headers. On the
system I'm using right now, I have this:

  #define __S_IREAD   0400    /* Read by owner.  */
  #define __S_IWRITE  0200    /* Write by owner.  */
  #define __S_IEXEC   0100    /* Execute by owner.  */

in /usr/include/bits/stat.h, and this:

  #define S_IRUSR __S_IREAD   /* Read by owner.  */
  #define S_IWUSR __S_IWRITE  /* Write by owner.  */
  #define S_IXUSR __S_IEXEC   /* Execute by owner.  */
  /* Read, write, and execute by owner.  */
  #define S_IRWXU (__S_IREAD|__S_IWRITE|__S_IEXEC)

  #define S_IRGRP (S_IRUSR >> 3)  /* Read by group.  */
  #define S_IWGRP (S_IWUSR >> 3)  /* Write by group.  */
  #define S_IXGRP (S_IXUSR >> 3)  /* Execute by group.  */
  /* Read, write, and execute by group.  */
  #define S_IRWXG (S_IRWXU >> 3)

  #define S_IROTH (S_IRGRP >> 3)  /* Read by others.  */
  #define S_IWOTH (S_IWGRP >> 3)  /* Write by others.  */
  #define S_IXOTH (S_IXGRP >> 3)  /* Execute by others.  */
  /* Read, write, and execute by others.  */
  #define S_IRWXO (S_IRWXG >> 3)

in /usr/include/sys/stat.h. As for what to include in Windows to get
these (if anything), or what should be done to get them, I'm unsure.
I suppose for now, you could paste the above into a header file
somewhere with a #ifdef WIN32 around it to get things compiling.
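
Roughly like this, as an untested sketch. Instead of a blanket #ifdef WIN32
it only fills in names the system headers did not provide, using the values
from my headers above; whether the Win32 CRT does anything sensible with
these bit values is an assumption that still needs checking.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

/* Only fill in the names the system headers left undefined. */
#ifndef S_IRUSR
#  define S_IRUSR 0400             /* Read by owner.  */
#endif
#ifndef S_IWUSR
#  define S_IWUSR 0200             /* Write by owner.  */
#endif
#ifndef S_IRGRP
#  define S_IRGRP (S_IRUSR >> 3)   /* Read by group.  */
#endif
#ifndef S_IROTH
#  define S_IROTH (S_IRGRP >> 3)   /* Read by others.  */
#endif
```

On a system that already has the POSIX names this changes nothing, so it
should be safe to include everywhere.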

Let me know if you get it working. Patches encouraged.


Regards,

-- Gregor





Re: Instruction timings for Intel/AMD chips?

2001-11-01 Thread Michael L Maraist


> According to an old 486 book I have, it claims
> that complex addressing modes don't have cycle
> penalties for leaving out the scale or the offset.
> That seems hard to believe for the RISC-like
> P3s and Athlons.

x4 is just a bit offset, so it shouldn't be hard to believe that the pentium+ 
micro-ops can handle this just as efficiently (it's just setup overhead).

>
> What about other processors? Is it common to
> have address modes like:
>
>    base+offset*scale

I'm not sure, but I thought sparc had a special 3-way add somewhere.  In any
case, with pipelines as deep as they are these days, I doubt that your
quest to achieve 100 mops for a 3-instruction loop is going to have much
impact on more practical code (of several hundred or thousand instructions
per loop).  Unless you're willing to write c-code compilers differently for
different architectures, I doubt you're going to find a universal performance
tweak.  And I'd definitely cringe at the idea of complicating code for
special purposes this early in the game.

>
> Most RISC instruction sets only provide base +
> constant offset don't they?

ALPHA and I think SUN provide instructions that are 4, 8 and 16 byte 
multiples of one register.  Again, it's just a simple trick played out in the 
decode phase.

-Michael



Re: vmem memory manager

2001-11-01 Thread Uri Guttman

> "MLM" == Michael L Maraist <[EMAIL PROTECTED]> writes:

  MLM> As with the relationship to the GC, my vision was that the GC
  MLM> would free memory objects as it found them unused (mark/sweep or
  MLM> what-ever), which only hands the memory regions over to the
  MLM> deallocator, which intelligently makes decisions about what to do
  MLM> with it, and how to appropriately buffer.  Further, this resource
  MLM> management scheme provides interfaces for reclaiming memory when
  MLM> necessary; something obviously needed by a GC.  One key aspect,
  MLM> however is that this provides so many features otherwise required
  MLM> by a GC that the GC is greatly simplified, and thus lends to
  MLM> greater possibilities.

dan at his recent talk at boston.pm's tech meeting said he was leaning
towards a copying GC scheme. this would be the split ram in half design
and copy all objects to the other half at GC time. the old half is
reclaimed (not even reclaimed, just ignored!) in one big chunk.
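
a rough C sketch of that split-in-half scheme, to make the idea concrete.
everything in it is an assumption for illustration (the names, the 4096 byte
halves, the 16 byte alignment), and it only follows an explicit root array:
objects that point at other objects are not traced, which a real collector
would have to do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define HALF_SIZE 4096            /* bytes per half (arbitrary) */
#define GC_ALIGN  16              /* assumed to be enough alignment */

typedef struct Obj {
    size_t      size;             /* total bytes, header included */
    struct Obj *forward;          /* new address once copied, else NULL */
} Obj;

static unsigned char *space[2];   /* the two halves */
static int            active;     /* half we currently allocate from */
static size_t         used;       /* bump pointer within that half */

static int gc_init(void)
{
    space[0] = malloc(HALF_SIZE);
    space[1] = malloc(HALF_SIZE);
    active = 0;
    used = 0;
    return (space[0] != NULL && space[1] != NULL) ? 0 : -1;
}

/* Bump allocation: no free list and no per-object free. */
static Obj *gc_alloc(size_t payload)
{
    size_t size;
    Obj *obj;

    if (payload > HALF_SIZE)
        return NULL;
    size = (sizeof(Obj) + payload + GC_ALIGN - 1) / GC_ALIGN * GC_ALIGN;
    if (size > HALF_SIZE - used)
        return NULL;              /* caller should collect and retry */
    obj = (Obj *)(space[active] + used);
    obj->size = size;
    obj->forward = NULL;
    used += size;
    return obj;
}

/* Copy everything named in roots[] to the other half, then flip. */
static void gc_collect(Obj **roots, size_t nroots)
{
    int    to = 1 - active;
    size_t to_used = 0;
    size_t i;

    for (i = 0; i < nroots; i++) {
        Obj *old = roots[i];
        if (old == NULL)
            continue;
        if (old->forward == NULL) {           /* first visit: copy it */
            Obj *copy = (Obj *)(space[to] + to_used);
            memcpy(copy, old, old->size);
            to_used += old->size;
            old->forward = copy;
        }
        roots[i] = old->forward;              /* later visits reuse the copy */
    }
    assert(to_used <= HALF_SIZE);             /* live data always fits */
    active = to;                              /* the old half is just ignored */
    used = to_used;
}
```

note that nothing is ever freed one object at a time: whatever is not copied
simply stays behind in the half that gets ignored.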

maybe this could be integrated with the vmem system as well. instead of
just freeing all GC objects and letting the vmem system collect and
consolidate, have the GC do a copying collection so that the vmem system
would have only freshly allocated chunks (at the appropriate level, hard
to tell here) to manage. this is not a fully thought out idea but since
vmem will consolidate and free when it can, why not have the
consolidation driven by a copying GC?

just musing,

uri

-- 
Uri Guttman  --  [EMAIL PROTECTED]   http://www.stemsystems.com
-- Stem is an Open Source Network Development Toolkit and Application Suite -
- Stem and Perl Development, Systems Architecture, Design and Coding 
Search or Offer Perl Jobs    http://jobs.perl.org



Patch for building on Win32.

2001-11-01 Thread Jason Diamond

Here's a patch that gets the latest CVS sources to build on my Windows box.

Here's what I did:

* Modified Configure.pl, mswin32.pl, and Makefile.in so that the
platform-specific files in the platforms directory get copied to the correct
directories. The Makefile will re-copy the platform.h and .c files if the
originals are modified.

* #include platform.h in parrot.h instead of time.h.

* Removed time.h and time.obj from the Makefile. time.h and time.c aren't
being used anymore since that code was moved to linux.c and win32.c.

* Renamed the BOOL typedef in parrot.h to BOOLVAL (inspired by INTVAL and
FLOATVAL). This was conflicting with the BOOL typedef in windows.h. This
required updating several files besides parrot.h.

* Added a DEFAULT_OPEN_MODE to linux.h and win32.h and used that instead of
the missing identifiers (on Windows) in core.ops in the calls to open.

* Added Parrot_sleep and Parrot_setenv to both the linux and win32 platform
files and called those from core.ops. The Linux code for these functions
came from core.ops so they should hopefully work.
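
To illustrate the shape of those platform wrappers, here is a sketch of a
setenv wrapper. It is not the code in the attached diff: the name
Parrot_setenv_sketch, its signature and return convention, and the use of
_putenv_s on the Win32 side are all assumptions made for the example.

```c
#if !defined(_WIN32) && !defined(_POSIX_C_SOURCE)
#  define _POSIX_C_SOURCE 200112L   /* ask <stdlib.h> for setenv() */
#endif
#include <stdlib.h>

/* Hypothetical wrapper: returns 0 on success, -1 on failure. */
static int Parrot_setenv_sketch(const char *name, const char *value)
{
#ifdef _WIN32
    /* The Microsoft CRT has no setenv(); _putenv_s is its closest call. */
    return _putenv_s(name, value) == 0 ? 0 : -1;
#else
    /* POSIX: the final 1 means "overwrite an existing value". */
    return setenv(name, value, 1) == 0 ? 0 : -1;
#endif
}
```

With one such function per platform file, core.ops only ever sees the
wrapper's name and never the system call behind it.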

After applying these patches, running Configure.pl and then nmake
test_prog.exe works without error (but still a bunch of warnings) for me. I
tried really hard to make sure that it does the same on Linux as well.

I hope you find it useful,
Jason.




diff
Description: Binary data


Re: vmem memory manager

2001-11-01 Thread Michael L Maraist

On Friday 02 November 2001 01:33 am, Uri Guttman wrote:
>
> dan at his recent talk at boston.pm's tech meeting said he was leaning
> towards a copying GC scheme. this would be the split ram in half design
> and copy all objects to the other half at GC time. the old half is
> reclaimed (not even reclaimed, just ignored!) in one big chunk.

Wonder how I missed this talk.  The first thing that occurs to me is
cache starvation due to running through the entire heap periodically.  That's
going to be ugly, especially on multi-CPU systems where most everything will
wind up being flagged as "shared" or even "invalid".  But I'd have to see more
details to comment further.

> maybe this could be integrated with the vmem system as well. instead of
> just freeing all GC objects and letting the vmem system collect and
> consolidate, have the GC do a copying collection so that the vmem system
> would have only freshly allocated chunks (at the appropriate level, hard
> to tell here) to manage. this is not a fully thought out idea but since
> vmem will consolidate and free when it can, why not have the
> consolidation driven by a copying GC?
> uri

vmem and the object caching layer are independent.  vmem is primarily focused
on efficiently carving up a resource space in constant time (via allocated
hash tables, free lists, and segment spans).  This is definitely compatible
with a copying GC, since the left and right regions would be considered
separate segment spans.  The GC and vmem would, however, have to be written
together.  The problem is that vmem wasn't designed for repeated access by
multiple threads (as, say, GNU's malloc is).  Hence the object-cache layer.

The object cache layer provides arenas (SUN calls them slabs), which
cover the remainder of perl's current requirements, but its real
contribution is its magazine, which I believe runs counter to this GC
scheme.  A magazine is designed so that the next allocation by a thread
returns the LAST (same-sized) block freed by that same thread, which all but
guarantees that it is still in the low-level cache.  Moreover, you are almost
guaranteed that a block freed by one thread will not be handed out by an alloc
in another thread (which would cause a "shared" or "invalid" cache state in
multi-CPU systems).  From this, temporary allocs / frees (of random-sized
memory or arena-based objects) can be very quick, since caching is addressed.
Obviously SUN's focus is on scalable memory architectures (since they have
64+ CPU machines that do resource allocation in the OS).
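
A toy version of the magazine idea, to show the last-freed-first-reused
behaviour. This is only a sketch and not SUN's implementation: the names and
the capacity of 8 are made up, malloc/free stand in for the slab layer
underneath, and the lack of locking assumes each thread owns its own
magazine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAG_ROUNDS 8              /* capacity of one magazine (arbitrary) */

typedef struct Magazine {
    size_t obj_size;              /* every object in here has this size */
    int    count;                 /* objects currently held */
    void  *rounds[MAG_ROUNDS];    /* stack of recently freed objects */
} Magazine;

static void mag_init(Magazine *m, size_t obj_size)
{
    assert(obj_size > 0);
    m->obj_size = obj_size;
    m->count = 0;
}

static void *mag_alloc(Magazine *m)
{
    if (m->count > 0)
        return m->rounds[--m->count];   /* pop: the most recently freed */
    return malloc(m->obj_size);         /* empty: ask the layer below */
}

static void mag_free(Magazine *m, void *obj)
{
    if (m->count < MAG_ROUNDS)
        m->rounds[m->count++] = obj;    /* push: next alloc gets this one */
    else
        free(obj);                      /* full: hand back to the layer below */
}
```

Because alloc pops what free just pushed, a tight alloc/free loop keeps
touching the same few blocks, which is where the cache friendliness comes
from.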

If a GC "never" frees, and instead periodically copies valid memory to
another segment, then there's no consideration of cache locality.
EVERY allocation is guaranteed not to be present within the cache (since all
of memory will have been spanned before it's revisited by an allocation).

What does this mean for performance?  I don't know, because I've never
benchmarked GCs before.  But I can speculate that for applications that do a
lot of memory allocation, this GC scheme is going to hurt, since all of
(heap) memory must be traversed when memory is starved.

In general, I'm not seeing a whole lot of benefit to using the vmem scheme
with this style of GC, since the overhead seems to go to waste.  Might as
well do a stack-carving allocation scheme.  Blindingly fast allocations are
guaranteed so long as there's more memory, and there aren't any frees.
Of course any given app is going to use 4 times as much memory as other
schemes (2 x whatever extra is wasted).  Interesting approach though.  Not to
mention that unless a separate thread is required for parrot operation, there
will be massive hiccups when the GC is invoked to run through all memory
(potentially several meg worth every time it doubles in size).

Perhaps an adaptive approach can be used.  When heap memory is <= 1Meg this
stack-carving technique is used (since copying a meg won't be noticeable).
When > 1Meg, more sophisticated methods with higher overhead but better
memory efficiency are used.  I'm not terribly excited about vmem (though it's
kind of cool) if the whole SUN system isn't used together.  And unless it can
be found to be compatible with SOME GC for larger-scale memory managers, I'm
inclined to scrap it.

Question though: are we leaning towards requiring a separate thread for GC,
as with Java?

-Michael




RE: Building Parrot for Win32.

2001-11-01 Thread Gregor N. Purdy

Brent --

[ snip Jason Diamond's question ]
 
> No, this seems to be a case of Unix-centrism.  (I feel your pain--I'm on
> Win32 too.)  I'm CCing perl6-internals on this, since I don't really
> have the C experience to know what to do here.

I just posted a reply to someone else on the matter. If someone takes that
and gets things working, I'll accept a patch to make it work for you guys.
 
> In the meantime, if you really want to build Parrot *right now* and
> don't care about the file I/O ops, you can comment out lines 143 and 148
> in the 'core.ops' file and rebuild to get rid of this error.  However,
> there are several other known Win32 build issues, so you'll likely have
> to fiddle with things to get them to work.  (The sleep() op is another
> source of such misery.)

We need to make a Parrot_sleep wrapper that for *nix does what we do now
and for Win32 does (a) nothing or (b) waves the entrails correctly to get
Win32 to sleep for N seconds. Patches encouraged, as always.

We may end up needing to consolidate these platform-isms into a small
number of files (one?) rather than have them split by type (like I did for
Parrot_*_time). I don't know if we can get away with something as simple
as platform.[hc] with all the yuck in one place or whether we need to have
a platform/ directory with platform_*.[hc] in it that get copied at config
time to platform.c and include/parrot/platform.h...


Regards,

-- Gregor





Re: Building Parrot for Win32

2001-11-01 Thread Jason Diamond

Hi.



> in /usr/include/sys/stat.h. As for what to include in Windows to get
> these (if anything), or what should be done to get them, I'm unsure.
> I suppose for now, you could paste the above into a header file
> somewhere with a #ifdef WIN32 around it to get things compiling.

I'd suspect that those constants are implementation specific so I'd be wary
of using them on platforms where they aren't defined. I'll take a look and
see what cygwin did with these.

Should we start working on platform-specific implementations of these types
of functions? Parrot_open?

> Let me know if you get it working. Patches encouraged.

I got around it by commenting out the lines in core.ops for now. But then
there were other issues to get past. I've either fixed or hacked around
them. Once I get the tools to create patches I'll post them if Brent hasn't
already taken care of them.

Thanks,
Jason.





Re: vmem memory manager

2001-11-01 Thread Ken Fox

Michael L Maraist wrote:
[an incredible amount of detailed information that will
 take me weeks to digest...]

This looks like a malloc/free style allocator. Since the whole
GC system for Parrot is on the table, you don't have to constrain
yourself to malloc/free. IMHO free is not needed at all -- we
should be scavenging entire arenas all at once. I assume you
want to use malloc to grab an arena, but carving up the arena is
the GC's job.

- Ken