Re: being smart about script structure

Philip Potter Sat, 12 Dec 2009 01:29:33 -0800

2009/12/11 Bryan R Harris <bryan_r_har...@raytheon.com>:
>>> Seems like a waste to do step 2 in a subroutine since we only do it once,
>>> but it does fill the main body of the script with code-noise that makes it
>>> harder to debug overall logic problems...  Not much logic here, but
>>> certainly in more complex scripts.
>>
>> A waste of what exactly? You don't have a limited budget of "sub" keywords.
>
> I guess I figured you had to build in structure (variables?) to be able to
> pass things back and forth from subroutines that normally you wouldn't have
> to set up.
>
> For example, if I'm populating a complex variable @d with lots of pointers,
> hashes, arrays, etc. within, if I populate that within a subroutine, how do
> I get it back out conveniently without it making a whole nother copy of it
> outside?  If it's 500 MB, isn't that horribly inefficient?  Plus, I have to
> keep track of it.


Perl doesn't have pointers; it has references. They are similar but
not the same. Anyway populate_x looks like this:

sub populate_x {
  return [
    # Big complex data structure
    # ...
    # ...
    # ...
    # ...
  ];
}

There is no large scale copying; the [] construct creates a new array
and returns a reference to it. You just return a single scalar value
from the function to the calling context. It's kind of like this in C:

int *populate_x (void) {
    int *foo = malloc(X_SIZE * sizeof *foo);
    if (!foo) return NULL;
    /* Do something to populate *foo */
    return foo;
}

but in C this is dangerous and hairy because you're mucking around
with pointers and raw blocks of memory while in Perl you're dealing
just with builtin datatypes.
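
If you want to convince yourself that nothing gets copied, here is a
minimal sketch. The 100_000-element payload and the $addr_inside
variable are just stand-ins I made up for illustration. It compares the
address of the array built inside the sub with the address the caller
sees, using refaddr from the core Scalar::Util module:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my $addr_inside;    # records where the array lived inside the sub

sub populate_x {
    my $data = [ 1 .. 100_000 ];    # stand-in for the big structure
    $addr_inside = refaddr($data);
    return $data;                   # returns one scalar: the reference
}

my $x = populate_x();

print "same array\n" if refaddr($x) == $addr_inside;
print scalar(@$x), " elements\n";
```

The caller ends up holding a reference to the very same array, so the
cost of the return does not depend on how big the structure is.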

If you want to populate an array instead:

my @z = populate_z();

# ...and later...

sub populate_z {
  return ( # note parens, not square brackets
    # Big complex data structure
    # ...
    # ...
    # ...
    # ...
  );
}

then this will theoretically copy the entire list; but if the list
contains references it will only copy the references, and not the
things referred to. So it's still not a "deep" copy, but it's deeper
than the reference-based populate_x.
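
A small sketch of what that shallow copy means in practice (the
@source array and its contents are made up for the example): the
caller's array is a separate array, but the hashes inside it are shared
with the original.

```perl
use strict;
use warnings;

my @source = ( { name => 'a' }, { name => 'b' } );

sub populate_z {
    return @source;    # returns a list; the assignment copies the elements
}

my @z = populate_z();

push @z, { name => 'c' };    # only @z grows: the two arrays are separate
$z[0]{name} = 'changed';     # but element 0 is the same hash in both

print scalar(@source), "\n";     # 2
print $source[0]{name}, "\n";    # changed
```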

>> Subroutines are not just about code reuse. Which is more readable:
>>
>> my $x = [
>>   # Big complex data structure
>>   # ...
>>   # ...
>>   # ...
>>   # ...
>> ];
>>
<snip>
>>
>> my $x = populate_x();

<snip>
>
> I guess my only struggle there is that any perl person can read the perl
> code.  At first glance, I have no clue what "populate_x()" does because you
> gave it a name that's not in the camel book.

You can't write any meaningful program with just builtins; sooner or
later you have to use a subroutine. And a reader won't know exactly
what a function does just by looking at a call to it; but the reader
*shouldn't have to*. If the subroutine is hiding information the
reader needs, perhaps it was a poor choice of code to put in a
subroutine. But consider this:

use CGI;
my $cgi = CGI->new();

I don't know exactly what this function call did. I know that it
created a CGI object and populated it with initial data. I don't know
what that object is -- it could be a blessed hash, a blessed array,
even a blessed scalar ref. I don't know what initial data I got. But I
*don't have to care*, because all I care about is that I have an
object which will let me do things like this:

print $cgi->header,
         $cgi->start_html('hello world'),
         $cgi->h1('hello world'),
         $cgi->end_html;

And as long as these subroutines understand the low-level details of
what the $cgi object contains, I don't have to; because I'm dealing
only at the higher level, and the subroutines take care of all the
messy details for me. [At a deeper level, because I am isolated from
these details, the designer of CGI is free to totally change the
detailed implementation; and because he controls the only subroutines
which ever deal with a CGI object's implementation, he knows that
nobody's code will break.]
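
To make that concrete, here is a toy class of my own (Counter is
hypothetical, nothing to do with CGI's real internals). The object
happens to be a blessed hash, but the calling code only ever uses the
methods, so the author could switch to a blessed array or scalar ref
without the caller noticing:

```perl
use strict;
use warnings;

package Counter;    # hypothetical example class

# Today the object is a blessed hash; it could become a blessed array
# or scalar ref later, as long as the methods below keep working.
sub new {
    my ($class) = @_;
    return bless { count => 0 }, $class;
}

sub increment {
    my ($self) = @_;
    $self->{count}++;
    return $self;
}

sub value {
    my ($self) = @_;
    return $self->{count};
}

package main;

my $c = Counter->new;
$c->increment for 1 .. 3;
print $c->value, "\n";    # 3
```

A caller who reached in and wrote $c->{count} directly would lose that
protection, which is the reason to stick to the documented methods.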

I know what these subroutines do through two means: the name and the
documentation. The name is very important; it needs to speak the
caller's language and be a short, clear summary of what the subroutine
does. The documentation is even more important: it is the subroutine
author's guarantee to you of what it does.
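
As a rough sketch of what such a guarantee can look like, here is a
subroutine with a short POD block above it. The name create_x, the
wording of the documentation, and the contents of the array are all
invented for illustration:

```perl
use strict;
use warnings;

=head2 create_x

    my $x = create_x();

Returns a reference to an array of items. Callers may iterate over
the array, but should treat each item as opaque and hand it to
process_p() rather than look inside it.

=cut

sub create_x {
    return [ map { +{ id => $_ } } 1 .. 3 ];    # contents are illustrative
}

my $x = create_x();
print ref($x), "\n";    # ARRAY
```

The POD tells the caller what they may rely on (an arrayref) and what
they may not (the shape of each item).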

As a result, it might well be a mistake to write this:

my $x = populate_x();

for (@$x) {
    # Big
    # Complex
    # Operation
    # On
    # $_
}

because in this case, you've hidden how $x was initialized, and then
you've gone and written code which the reader can't understand without
knowing the details of what $x contains. But if the reader sees this:

my $x = create_x();

for my $p (@$x) {
   # The fact that $x is an arrayref is noted in create_x's documentation,
   # but we don't have to care about the exact structure of the contents
   # of the array; they might be further arrayrefs or hashrefs or simple
   # scalar values; but we know that we can call process_p on them:
   process_p($p);
}

then the reader doesn't know what $x contains exactly, but they *don't
have to*. We make the details irrelevant, and hide them, so that the
reader can see the high-level ideas being thrown around.

In this case, it's a good bet that create_x and process_p might become
a module. If we define $p to be a Widget and $x to be an array of
Widgets, we might end up with:

package Widget;
sub create_widget_array {
   my $num = shift || 10; # allow user to specify number of Widgets
   my @x;
   for (1..$num) {
      push @x, create_widget(); # subroutine defined elsewhere
   }
   return \@x;
}

sub process_widget {
   my $widget = shift;
   # process $widget in some way
   # we can choose whether to modify in place or to make a copy
   # and return that
   # this decision will have to be documented
   return $widget;
}
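
For completeness, here is one way the Widget code above could be filled
in and called. This is only a sketch: the create_widget here is a
stand-in I invented for the one "defined elsewhere", the id and
processed fields are made up, and I have arbitrarily chosen the
modify-in-place option for process_widget.

```perl
use strict;
use warnings;

package Widget;

my $next_id = 1;

# Hypothetical stand-in for the create_widget "defined elsewhere".
sub create_widget {
    return { id => $next_id++, processed => 0 };
}

sub create_widget_array {
    my $num = shift || 10;    # allow user to specify number of Widgets
    return [ map { create_widget() } 1 .. $num ];
}

# This sketch modifies the Widget in place and returns the same ref.
sub process_widget {
    my $widget = shift;
    $widget->{processed} = 1;
    return $widget;
}

package main;

my $widgets = Widget::create_widget_array(3);
Widget::process_widget($_) for @$widgets;

print scalar(@$widgets), "\n";                             # 3
print scalar( grep { $_->{processed} } @$widgets ), "\n";  # 3
```

One thing to be aware of with "shift || 10": asking for 0 Widgets gives
you 10, because 0 is false. Whether that matters is another thing the
documentation would have to say.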

>>> Any suggestions?  Where can I read more on this stuff?  What questions
>>> should I be asking that I'm not smart enough to ask?
>>
>> The best that I can suggest is to read other people's code and ask
>> other people to read your code. Think of a specific example and come
>> up with a plan of how you would code it, and ask for criticism.
>
> No other perl programmers here, unfortunately.  Good advice, though.

Why don't you post your ideas here for criticism then? I wouldn't post
an entire several-hundred-line script, but you could post your
specification and your plan for writing code which meets that
specification. If you're following a tutorial, you could do the
exercises and post your answers here for criticism.

Phil
