Following the request for ideas on IO, this is my wish list for working with files. I am not a perl guru and so I do not claim to be able to write specifications. But I do know what I would like.
The organisation of the IO as roles seems to be a great idea. I think 
that what is suggested here would fall in naturally with that idea.
Suggestions:

a) I am fed up with writing something like

open(FP, ">${fname}_out.txt") or die "Can't open ${fname}_out.txt for writing\n";
The complex definition of the filename is only to show that it has to be 
restated identically twice.
Since the error code I write (die "blaa") is always the same, surely it 
can be made into a default that reports what caused the failure. It could 
be hidden away behind a pointer to code that the programmer can override 
if they want to.
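To make this concrete, here is a rough Perl 5 sketch of the kind of default I have in mind. The names open_or_die and $on_open_error are made up for illustration and do not come from any existing module; the temporary directory is only there so the example can be run without leaving files behind.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical default handler: reports the file, the mode and the OS error.
our $on_open_error = sub {
    my ($path, $mode, $oserr) = @_;
    die "Can't open $path (mode '$mode'): $oserr\n";
};

# Hypothetical helper: the filename is stated once, and the error handling
# comes from the default above unless the caller passes a handler.
sub open_or_die {
    my ($mode, $path, $handler) = @_;
    $handler ||= $on_open_error;
    open(my $fh, $mode, $path)
        or return $handler->($path, $mode, "$!");
    return $fh;
}

my $dir   = tempdir(CLEANUP => 1);
my $fname = "$dir/example";

# The complex filename appears only once.
my $out = open_or_die('>', "${fname}_out.txt");
print {$out} "some results\n";
close $out;

# Overriding the default: warn and carry on instead of dying.
my $missing = open_or_die('<', "$dir/no_such_file.txt",
    sub { warn "Skipping $_[0]: $_[2]\n"; return });
```

With the override, $missing ends up undefined and the script continues; without it, the default handler dies with a message naming the file.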
b) Why do I have to 'open' anything? Surely when software first 
identifies a File object (eg., names it) that should be sufficient 
signal to do all the IO things. So, I would love to write
my File $file .= new(:name<mydatafile.txt>);

my File $output .= new(:name<myresults.txt>, :mode<write>);

and then:

while $file.read {…};

or:

say "Hello world" :to<$output>;

The defaults would include error routines that die if errors are encountered, read as the default mode, and a text file with EndOfLine markers as the file type. Obviously, other behaviours, such as not dying, but handling the lack of a file with a request to choose another file, could be accommodated by overriding the appropriate role attribute.
The suggestion here is that the method "say" on a File object is 
provided in a role and has some attributes, eg., $.error_code, that can 
be assigned to provide a different behaviour.
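Something close to this can already be approximated in Perl 5, which may help show what I mean. The sketch below is only an illustration: the class name My::File and its keys (name, mode, error_code) are my own invention, not an existing module and not proposed Perl 6 syntax. The file is opened lazily the first time the object is used.

```perl
use strict;
use warnings;

package My::File;    # hypothetical class name, not an existing module

sub new {
    my ($class, %arg) = @_;
    return bless {
        name       => $arg{name},
        mode       => $arg{mode} || 'read',    # read is the default mode
        error_code => $arg{error_code}         # dying is the default reaction
            || sub { die "Can't open $_[0] for $_[1]: $_[2]\n" },
    }, $class;
}

# Open the file lazily, the first time the object is actually used.
sub _handle {
    my ($self) = @_;
    return $self->{fh} if $self->{fh};
    my $op = $self->{mode} eq 'write' ? '>' : '<';
    open(my $fh, $op, $self->{name})
        or return $self->{error_code}->($self->{name}, $self->{mode}, "$!");
    return $self->{fh} = $fh;
}

# Return the next line without its end-of-line marker, or undef at end of file.
sub read {
    my ($self) = @_;
    my $fh = $self->_handle or return;
    my $line = readline($fh);
    chomp $line if defined $line;
    return $line;
}

# Write the arguments followed by a newline.
sub say {
    my ($self, @text) = @_;
    my $fh = $self->_handle or return;
    return print {$fh} @text, "\n";
}

sub close {
    my ($self) = @_;
    my $fh = delete $self->{fh};
    return $fh ? CORE::close($fh) : 1;
}

package main;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

my $output = My::File->new(name => "$dir/myresults.txt", mode => 'write');
$output->say('Hello world');
$output->close;

my $file = My::File->new(name => "$dir/myresults.txt");
while (defined(my $line = $file->read)) {
    print "read: $line\n";
}
$file->close;
```

Passing a different error_code sub to new is the Perl 5 equivalent of overriding the role attribute, for example to ask the user for another file instead of dying.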
c) I want the simplest file names for simple scripts. As Damian Conway 
has pointed out, naming a resource is a can of worms. I work with 
Cyrillic texts and filenames and still have problems with the varieties 
of char sets. Unicode has done a lot, but humans just keep pushing the 
envelope of what is possible. I don't think there will ever be a 
resolution until humanity has a single language and single script.
It seems far better to me for standard resource names to be constrained 
to the simplest possible for 'vanilla' perl scripts, but also to let the 
programmer access the underlying bit/byte string so they can do what 
they want if they understand the environment.
The idea of 'stringification', that is providing to the programmer for 
use inside the program a predictable representation of a complex object, 
also seems to me to be something to exploit. In the case of a resource 
name, the one most easily available to the programmer would be a 
'stringified' version of the underlying stream of bytes used by the 
operating system.
E.g., a File object located in some directory under some OS would have 
both $file.name as a Unicode representation and a $file.underlying_name 
holding some arbitrary sequence of bits whose semantics are known only to 
the OS (and the perl implementation).
d) It would be nice to specify filters on the incoming and outgoing 
data. I find I do the following all the time in perl5:
while (<FN>) {chomp; …};

So my example above, viz.,

while $file.read { … };

would automatically provide $_ with a line of text with the EOL chopped off.

Note that the reverse (adding an EOL on output) is so common that perl6 now has 'say', which does this.
Could this behaviour (filtering off and on the EOL) be made a part of 
the standard “read” and “say” functions?
Allowing access to the filter function (allowing a programmer the 
ability to override an attribute) could be quite useful. For example, 
suppose the role providing getline includes an attribute with default
$.infilter = { s/\n// }; # a good implementation would have different rules for different OS's
and this can be overridden with

$.infilter = { .trans ( /\s+/ => ' ' ) }; # squash all white space to a single space
or
$.infilter = { s/\n//; split /\t/ };

then a call to $file.read would assign an array to $_ ( or would it be @_ ?)
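The input filter idea can be tried out in Perl 5 with a closure. This is a sketch under my own assumptions: make_reader is a made-up name, and I have sidestepped the $_ versus @_ question by having the filter return an array reference. An in-memory file stands in for a real one.

```perl
use strict;
use warnings;

# Hypothetical reader factory: returns a sub that yields one filtered
# record per call, and an empty return at end of file.
sub make_reader {
    my ($fh, $infilter) = @_;
    # Default filter: strip the end-of-line marker.
    $infilter ||= sub { my ($line) = @_; chomp $line; return $line };
    return sub {
        my $line = readline($fh);
        return unless defined $line;    # end of file
        return $infilter->($line);
    };
}

# An in-memory file keeps the example self-contained.
my $data = "a\tb\tc\nd\te\n";
open(my $fh, '<', \$data) or die "Can't open in-memory file: $!\n";

# Overridden filter: strip the EOL, then split on tabs.
my $read = make_reader($fh, sub {
    my ($line) = @_;
    chomp $line;
    return [ split /\t/, $line ];
});

my @records;
while (my $fields = $read->()) {
    push @records, $fields;
    print join('|', @$fields), "\n";
}
```

Each call to $read->() hands back one record that has already been through the filter, so the loop body never sees an end-of-line marker.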
Filtering the outgoing data would be similar to using a format string 
with printf, but associating it with the IO object rather than with a 
specific printf statement. Thus suppose instead of a file, the IO object 
is a stream associated with the internet and the role that provides 
“say” as a method on a stream object has $.outfilter as an attribute, 
then overriding
$.outfilter = { s[(.*)] = "$1\n" };

with

$.outfilter = { s[(.*)] = "<html><body>$1</body></html>" }

would mean (I think) that

say "hello world" :to<$stream>;

would generate the HTML stream
<html><body>hello world</body></html>
(Yes I know, the space should be coded, but hopefully the idea is clear.)
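
The output side can be sketched the same way in Perl 5. Again, make_say is a name I have invented for illustration, and an in-memory string stands in for the internet stream.

```perl
use strict;
use warnings;

# Hypothetical writer factory: every string passes through $outfilter
# before it reaches the handle.
sub make_say {
    my ($fh, $outfilter) = @_;
    $outfilter ||= sub { return "$_[0]\n" };    # default: append an EOL
    return sub { return print {$fh} $outfilter->(join '', @_) };
}

# An in-memory "stream" keeps the example self-contained.
my $sent = '';
open(my $stream, '>', \$sent) or die "Can't open in-memory stream: $!\n";

# Overridden filter: wrap the text in HTML instead of appending an EOL.
my $say = make_say($stream, sub { return "<html><body>$_[0]</body></html>" });
$say->('hello world');
close $stream;

print "$sent\n";
```

The filter belongs to the writer rather than to any one output statement, which is the difference from a printf format string.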

e) When dealing with files in directories in perl5 under linux, I need

opendir(DIR, './path/') or die "Can't open ./path/\n";

my @filelist = grep { /^.+\.txt$/ } readdir(DIR);

I would prefer something like

my Location $dir .= new(:OSpath<'./data'>);

and without any further code $dir contains an Array ($dir.@elems) or Hash ($dir.%elems) (I don't know which, maybe both?) of File objects. If a Hash, then the keys would be the stringified .name attribute of the files.
No need to opendir or readdir. Lazy evaluation could handle most 
situations, but where the Location could be constantly changing its 
contents, a $dir.refresh() method might suffice.
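As a rough illustration in Perl 5, a Location could look something like the sketch below. The class name My::Location and its methods are hypothetical, and to keep it short elems returns plain file names rather than File objects. The directory is read only when elems is first called, and refresh re-reads it.

```perl
use strict;
use warnings;

package My::Location;    # hypothetical class name, not an existing module

sub new {
    my ($class, %arg) = @_;
    return bless { path => $arg{OSpath} }, $class;
}

# Re-read the directory; called lazily by elems() and explicitly by the user.
sub refresh {
    my ($self) = @_;
    opendir(my $dh, $self->{path})
        or die "Can't open directory $self->{path}: $!\n";
    my @names = grep { -f "$self->{path}/$_" } readdir($dh);
    closedir($dh);
    $self->{elems} = [ sort @names ];
    return $self;
}

# Names of the plain files, read from disk only on first use.
sub elems {
    my ($self) = @_;
    $self->refresh unless $self->{elems};
    return @{ $self->{elems} };
}

package main;
use File::Temp qw(tempdir);

# Build a small directory to look at.
my $path = tempdir(CLEANUP => 1);
for my $name (qw(a.txt b.txt notes.log)) {
    open(my $fh, '>', "$path/$name") or die "Can't create $path/$name: $!\n";
    close $fh;
}

my $dir = My::Location->new(OSpath => $path);
my @filelist = grep { /\.txt$/ } $dir->elems;
print "@filelist\n";
```

If another process adds or removes files, $dir->refresh picks up the change; otherwise the cached list is reused.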
f) In general on directories, I am sure a variety of solutions could be 
conceived. It seems to me that abstractly any form of directory could be 
thought of as a Location, which has some path defined for it (though the 
syntax of the path is OS dependent), and which might have children 
locations. At a minimum, a Location would need to provide information 
about whether new resources could be created or accessed (eg., read / 
write permissions).
There are various paradigms for defining how to traverse networks. At 
some point, our language legislators will need to define one for perl6.
If the name of the location node, which can be exposed to the user, eg., 
by printing it or showing it in a GUI to be clicked on, is separated 
from the OS/locale-dependent underlying_name (which may not be easily 
displayed on a standard GUI – suppose it is in ancient Buriyat), then 
identifying nodes and traversing a network of nodes could be made 
abstract enough to handle all sorts of environments.
Perhaps, too, a module for a specific environment, e.g., Windows, would 
provide the syntactic sugar that makes specifying a location look like 
specifying a directory natively, e.g.
use IO::Windows;
my Location $x .= new(:OSpath<C:\\Documents\perldata\>);
whilst for linux it would be
use IO::Linux;
my Location $x .= new(:OSpath</home/perldata/>);

This started as a short wish list and got far too long. Sorry.
