# New Ticket Created by  Mike Mattie 
# Please include the string:  [perl #41908]
# in the subject line of all future correspondence about this issue. 
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=41908 >


Hello,

This patch begins the feature-enhancement phase of
Parrot_locate_runtime_str.

based on: rev 17631

two new static functions are introduced.

* try_load_path:

this helper combines the path_finalize call with the stat check for file
existence. It is also a good debug point, so I have left the bread-crumbs
in, disabled with #if 0.

* try_bytecode_extensions:

this new function uses an array of extensions, trying each one with the
try_load_path call. Compatibility is preserved by trying the path as given
before attempting to guess an extension.

* Parrot_locate_runtime_str:

updated to use the new try_bytecode_extensions whenever the requested type
is not PARROT_RUNTIME_FT_DYNEXT.

Also, some tabs that leaked into the file have been removed. I will be more
vigilant about this in the future.

status: compile tested, full test-suite ran w/o new failures.

At this point the extension guessing code is functional. I should in all
likelihood write a test file for the extension guessing as it exists. If that
is necessary, simply hold this ticket.

[phase 2]

At this point a real discussion is needed with the parrot developer community.

The ultimate purpose of these patches is to enable developers to write code that
will work the same way in both their working-copies and the install tree. To do
this the parrot development process will need to be modified. I have thought up
a solution that I believe is extremely unobtrusive, and possibly a unique
feature.

background:

If the developers begin removing extensions from their requested files, the
work-cycle will change to a full-compilation cycle, as the loader will prefer
a .pbc file over a .pasm or a .pir.

In discussions on IRC there was a reluctance to switch to this cycle. It does
take some of the dynamic flavor out of the development process.

There are two solutions.

1. Do a clean during development. Only the source files will remain and there
   should not be any problems. There could be collisions with an installed
   tree; however, the second option is a step towards a complete solution.

2. Introduce an environment variable, for example: PARROT_LOAD_PREFER

   this variable when set would have two valid values: source|compile.

   if no variable was set, or it was incorrectly set it would default
   to the compile value, giving typical (perl5 for example) behavior
   where a compiled version is always loaded over the source version.

   when the variable is set to "source" the reverse would
   happen, and the ".pir" files would be loaded over a .pbc.

   This allows developers to simply export PARROT_LOAD_PREFER="source"
   when developing to guarantee that the loaded files will be
   the source files with their most recent changes.

   by looking at the diff I think you will see the code changes subsequent
   to this patch to implement this will be nearly trivial at this point.
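
   As a rough sketch of option 2 (not part of the attached patch), the
   selection could look something like the following. The variable name and
   its two values come from the proposal above; the helper name
   load_ext_order and the two arrays are hypothetical, and the "source" order
   is assumed to be the reverse of the load_ext_code array in the diff.

```c
#include <stdlib.h>
#include <string.h>

/* Extension orders for the two proposed values of PARROT_LOAD_PREFER.
   Both arrays are hypothetical; "source" is assumed to be the patch's
   load_ext_code array walked in reverse. */
static const char *const prefer_compile[] = { ".pbc", ".pasm", ".past", ".pir", NULL };
static const char *const prefer_source[]  = { ".pir", ".past", ".pasm", ".pbc", NULL };

/* Return the NULL-terminated list of extensions to try, in order.
   An unset or unrecognized value falls back to the "compile" order. */
static const char *const *
load_ext_order(void)
{
    const char *prefer = getenv("PARROT_LOAD_PREFER");

    if (prefer != NULL && strcmp(prefer, "source") == 0)
        return prefer_source;

    return prefer_compile;
}
```

   The guessing loop would then walk whichever list is returned instead of a
   fixed array, which is why the change should stay small.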

[phase 3] the big win

   assuming that environment variables are a suitable way for parrot
   developers to maintain their current behavior a big step forward
   is now possible.

   the last remaining issue is the difference between the layout
   in the working-copy and the install tree. 

   ie: parrot/library vs. runtime/parrot/library

   I have been developing a perl5 program in my spare time meant to
   address this very problem for my own purposes. but either the idea
   or an adapted version of the code can solve this neatly.

   The idea is that markers are placed in the working-copy tree marking
   installation paths. So in runtime/ some sort of file (MANIFEST?) would
   indicate that the paths below runtime/ are to be preserved in the install
   tree.

   [if those files listed what needed to be installed writing the install
   target would be a snap, with language-hackers having control over
   their installed files.]

   With this information a perl program can recursively traverse the tree
   and add each directory that contains a MANIFEST file to the load path
   (PARROT_INC? I am not sure).

   so if you have your source code in ${HOME}/parrot/ , and there is a
   MANIFEST file in runtime/ , the code would add an environment variable
   like this:

   PARROT_INC="${HOME}/parrot/runtime"

   the perl program can simply dump out this list of environment variables
   on stdout.
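
   The traversal described above is meant to be a perl5 program; the sketch
   below shows the same idea in C only so that it sits next to the rest of
   the code in this ticket. The function name report_markers is hypothetical,
   PARROT_INC is the tentative variable name from above, and how several
   marker directories should be combined into one value is left open, so
   each one is simply printed on its own line.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Recursively walk `dir`. For every directory that directly contains a
   regular file named MANIFEST, print that directory on `out` as a candidate
   load-path entry. Returns the number of marker directories found.
   Symlinks are not followed (lstat), so link loops cannot recurse forever. */
static int
report_markers(const char *dir, FILE *out)
{
    char path[4096];
    struct stat sb;
    struct dirent *entry;
    int found = 0;
    DIR *d = opendir(dir);

    if (d == NULL)
        return 0;

    while ((entry = readdir(d)) != NULL) {
        int n;

        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;

        n = snprintf(path, sizeof path, "%s/%s", dir, entry->d_name);
        if (n < 0 || (size_t)n >= sizeof path)
            continue;               /* too long for this sketch; skip it */

        if (lstat(path, &sb) != 0)
            continue;

        if (S_ISDIR(sb.st_mode)) {
            found += report_markers(path, out);
        }
        else if (S_ISREG(sb.st_mode) && strcmp(entry->d_name, "MANIFEST") == 0) {
            fprintf(out, "PARROT_INC=\"%s\"\n", dir);
            found++;
        }
    }

    closedir(d);
    return found;
}
```

   Calling report_markers("/home/me/parrot", stdout) on a tree with a
   MANIFEST in runtime/ would print one PARROT_INC line for that directory.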

   when a perl developer is ready to hack in the tree, he runs the program
   and sources the output, updating his environment for that session.

   now his path "Digest/MD5" automatically gets a ".pir" tried first because
   of PARROT_LOAD_PREFER="source", and the prefix is the exact same in both
   the working copy and the install tree.

   when the perl developer installs his code, and the environment variables
   are cleared, .pbc files are loaded as they should be and the path prefixes
   are the same.

   Extensive release testing is not necessary to maintain the install target
   because things break in the working-copy in the same way they would break
   in the install-tree. This is extremely powerful for development.
  
[leftovers]

   There is an insignificant race, but still a race, in the code. This needs
   to be addressed later as a part of fully insulating the src/library.c API.

   Ultimately a struct like this is needed.

   struct load_file {
      FILE* handle;
      parrot_runtime_ft type;
   };

   1. Parrot_locate_runtime_str would be renamed to parrot_locate_runtime_file.
      It would return the struct with the handle; the type value would be a
      null/unknown value.

      This closes the race by getting rid of a pointless stat() introduced by
      the string return value, a result of the poor API insulation. dynext.c
      replicates much of Parrot_locate_runtime_str in its own fashion.

      Also, on unix this should be opened with O_NOCTTY | O_NOFOLLOW, btw.
      Maybe even fstat for block/char devices too.

    2. A heuristic routine for detecting parrot_runtime_ft.

       It would take the load_file struct, do the magic number/heuristic
       checks on the first chunk of the file to determine which
       parrot_runtime_ft type it is, and set the value in the load_file struct.

       At this point the appropriate loader/infrastructure can be chosen to
       load the file.

   Also, the handling of shared object extensions can be done in the same way
   as try_bytecode_extensions does now.
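
   For the heuristic routine in item 2, something along these lines would
   do as a starting point. Everything here is a stand-in: the load_ft enum
   replaces parrot_runtime_ft, guess_file_type is a hypothetical name, and
   bytecode_magic is a PLACEHOLDER value, not the real bytecode header.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the type field of the proposed load_file struct. */
typedef enum {
    LOAD_FT_UNKNOWN,
    LOAD_FT_BYTECODE,
    LOAD_FT_SOURCE
} load_ft;

/* PLACEHOLDER magic number, used only to show the shape of the check. */
static const unsigned char bytecode_magic[] = { 0x00, 'b', 'c', 0x01 };

/* Classify the first chunk of a file: a leading magic number means
   bytecode; a non-empty chunk made up entirely of printable or whitespace
   characters is treated as source; anything else is unknown. */
static load_ft
guess_file_type(const unsigned char *chunk, size_t len)
{
    size_t i;

    if (len >= sizeof bytecode_magic
        && memcmp(chunk, bytecode_magic, sizeof bytecode_magic) == 0)
        return LOAD_FT_BYTECODE;

    if (len == 0)
        return LOAD_FT_UNKNOWN;

    for (i = 0; i < len; i++) {
        if (!isprint(chunk[i]) && !isspace(chunk[i]))
            return LOAD_FT_UNKNOWN;
    }

    return LOAD_FT_SOURCE;
}
```

   Telling .pasm, .past and .pir apart would need further checks on the
   text itself; this sketch stops at bytecode versus source.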

That's all for now. Thanks again for answering all my questions, and I look
forward to any comments or suggestions.

Cheers,
Mike Mattie ([EMAIL PROTECTED])
--- HEAD/src/library.c	2007-03-19 02:35:50.000000000 -0700
+++ parrot-0.4.9.test/src/library.c	2007-03-19 02:32:26.000000000 -0700
@@ -256,6 +256,78 @@
     return join;
 }
 
+#define LOAD_EXT_CODE_LAST 3
+
+static const char* load_ext_code[ LOAD_EXT_CODE_LAST + 1 ] = {
+    ".pbc",
+
+    /* source level files */
+
+    ".pasm",
+    ".past",
+    ".pir",
+};
+
+static STRING*
+try_load_path(Interp *interp, STRING* path) {
+    STRING *final;
+    
+    final = string_copy(interp, path);
+
+#if 0
+    printf("path is \"%s\"\n", 
+           string_to_cstring(interp, final ));
+#endif
+
+    final = path_finalize(interp, final );
+
+    if (Parrot_stat_info_intval(interp, final , STAT_EXISTS)) {
+        return final;
+    }
+    
+    return NULL;
+}
+
+/*
+  guess extensions, so that the user can drop the extensions
+  leaving it up to the build process/install wether or not
+  a .pbc or a .pir file is used.
+ */
+
+static STRING* 
+try_bytecode_extensions (Interp *interp, STRING* path )
+{
+    STRING *with_ext, *result;
+    
+    int guess;
+
+    /*
+      first try the path without guessing ensure compatability with existing code.
+     */
+
+    with_ext = string_copy(interp, path);
+
+    if ( (result = try_load_path(interp, with_ext) ) ) 
+	return result;
+    
+    /*
+      start guessing now. this version tries to find the lowest form of the
+      code, starting with bytecode and working up to PIR. note the atypical
+      loop control. This is so the array can easily be processed in reverse.
+     */
+
+    for( guess = 0 ; guess <= LOAD_EXT_CODE_LAST ; guess++ ) {
+	with_ext = string_copy(interp, path);
+	with_ext = string_append(interp, 
+				 with_ext, const_string(interp, load_ext_code[guess]));
+
+	if ( (result = try_load_path(interp, with_ext)) ) 
+	    return result;
+    }
+    
+    return NULL;
+}
+
 /*
 
 =item C<char* Parrot_locate_runtime_file(Interp *, const char *file_name,
@@ -288,6 +360,11 @@
     INTVAL i, n;
     PMC *paths;
 
+#if 0
+    printf("requesting path: \"%s\"\n", 
+	   string_to_cstring(interp, file ));
+#endif
+
     /* if this is an absolute path return it as is */
     if (is_abs_path(interp, file))
         return file;
@@ -305,23 +382,27 @@
         path = VTABLE_get_string_keyed_int(interp, paths, i);
         if (string_length(interp, prefix) &&
            !is_abs_path(interp,path)) {
- 	    full_name = path_concat(interp, prefix , path );
+	    full_name = path_concat(interp, prefix , path );
         }
         else
             full_name = string_copy(interp, path);
 
- 	full_name = path_append(interp, full_name , file );
+	full_name = path_append(interp, full_name , file );
+
+	full_name = ( type & PARROT_RUNTIME_FT_DYNEXT )
+	    ? try_load_path(interp, full_name )
+	    : try_bytecode_extensions(interp, full_name );
 
-	full_name = path_finalize(interp, full_name );
-        if (Parrot_stat_info_intval(interp, full_name, STAT_EXISTS)) {
+        if ( full_name )
             return full_name;
-        }
     }
 
-    full_name = path_finalize( interp, file );
-    if (Parrot_stat_info_intval(interp, full_name, STAT_EXISTS)) {
-        return full_name;
-    }
+    full_name = ( type & PARROT_RUNTIME_FT_DYNEXT )
+	? try_load_path(interp, file )
+	: try_bytecode_extensions(interp, file );
+
+    if ( full_name )
+	return full_name;
 
     return NULL;
 }
