On Tue, 18 Aug 2009, David Green wrote:

On 2009-Aug-17, at 8:36 am, Jon Lang wrote:
Timothy S. Nelson wrote:
      Well, my main thought in this context is that the stuff that can be
done to the inside of a file can also be done to other streams -- TCP
sockets for example (I know, there are differences, but the two are a lot
the same), whereas metadata makes less sense in the context of TCP sockets;

But any IO object might have metadata; some different from the metadata you traditionally get with files, and some the same, e.g. $io.size, $io.times{modified}, $io.charset, $io.type.

        Ok, now you're giving me ideas :).

[snipped a bit and moved it further down the e-mail]
      I guess what I'm saying here is that I think we can do the things
without people having to worry about the objects being separate unless they
care.  So, separate objects, but hide it as much as possible.  Is that
something you're fine with?

Yes -- to me that means some class/role that wraps up all the pieces together, but all the separate components are still there underneath. But I'm not too bothered about how it's implemented as long as it's transparent for casual use.

  my $file = io p[/some/file];
  my $contents = $file.data;
  my $mod-date = $file.times{modified};
  my $size = $file.size;

        That sounds like the kind of thing I'm heading for.

Pathnames still are strings, so that's fine. In fact, there are different
      As for pathnames being strings, you may be right FSVO string.  But
I'd say that, while they may be strings, they're not Str, but they do Str

Agreed, pathnames are "almost" strings, but worth distinguishing conceptually. There should be a URL type that does Str.

Actually, there are other differences, like case-insensitivity and illegal chars. Unfortunately, those depend on the given filesystem. As long as you're dealing with one FS at a time, that's OK; it probably means we have IO::Name::ext3, IO::Name::NTFS, IO::Name::HFS, etc. But what happens when you cross FS-barriers? Does a case-sensitive name match a case-insensitive one? Is filename-equality not commutative or not transitive? If you're looking for a filename "foo" on Mac/Win, then a file actually called "FOO" matches; but on Unix it wouldn't.

(Actually, Macs can do both IO::Name::HFS::case-insensitive and IO::Name::HFS::case-sensitive. Eek.)

        I think it should depend on the set of constraints involved.

I'd like Perl 6's treatment of filenames to be smart enough that smart-matching any of these pairs of "alternative spellings" would result in a successful match. So while I'll agree that filenames are string-like, I really don't want them to _be_ strings.

Well, the *files* are the same, but the pathnames are different. I'm not sure whether some differences in "spelling" should be ignored by default or not. There are actually several different kinds; S32 has a method "realpath", but I think "canonical" is a better name, because aliases can be just as "real" as the canonical path, e.g. a web page with multiple addresses. Or hard links rather than soft links -- though in that case, there is no one "canonical" path. It may not even be possible to easily tell if there is one or not.

Some ways in which different paths can be considered equivalent:
  Spelling: C:\PROGRA~1, case-insensitivity
  Simplification: foo/../bar/ to bar/
  Resolution: of symlinks/shortcuts
  Content-wise: hard links/multiple addresses

Depending on the circumstances, you might want any of those to count as the "same" file; or none of them. We'll need methods for each sort of transformation, $path.canonical, $path.normalize, $path.simplify, etc. Two high-level IO objects are "the same", regardless of path, if $file1 =:= $file2 (which might compare inodes, etc.). There should be a way to set what level of sameness applies in a given lexical scope; perhaps the first two listed above are a reasonable default to start with.
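To make the four levels of "sameness" concrete, here is a minimal sketch in Python (not Perl 6, since the Perl 6 API under discussion was still unsettled). The function names are hypothetical and only illustrate the distinction; case folding is a crude stand-in for whatever a particular filesystem actually does.

```python
import os
import posixpath


def same_spelling(a: str, b: str) -> bool:
    # Spelling level: compare names the way a case-insensitive
    # filesystem might. Real filesystems have their own folding rules.
    return a.casefold() == b.casefold()


def same_simplified(a: str, b: str) -> bool:
    # Simplification level: purely textual cleanup (foo/../bar/ -> bar),
    # with no look at the filesystem.
    return posixpath.normpath(a) == posixpath.normpath(b)


def same_resolved(a: str, b: str) -> bool:
    # Resolution level: follow symlinks and make the path absolute.
    return os.path.realpath(a) == os.path.realpath(b)


def same_content(a: str, b: str) -> bool:
    # Content level: both names must exist and refer to the same
    # underlying file (device and inode), which also covers hard links.
    return os.path.samefile(a, b)


def equivalent(a: str, b: str) -> bool:
    # Assumed default: apply only the first two levels, as suggested above.
    return same_spelling(posixpath.normpath(a), posixpath.normpath(b))
```

Only the last two functions touch the filesystem, which is one reason to keep the levels separate: the first two can be answered for paths that do not exist yet.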

Ok, my next commit will have "canonpath" (stolen directly from p5's File::Spec documentation), which will do "No physical check on the filesystem, but a logical cleanup of a path", and "realpath" (idea taken from p5's Cwd documentation), which will resolve symlinks, etc, and provide an absolute path. Oh, and "resolvepath", which does both. I'm not quite sure I followed all your discussion above -- have I left something out?

Anyway, my assumption is that there should be a number of comparison options. Since we do Str, we should get string comparison for free. I'm expecting other options at other levels, but have no idea how or what at this point.

There's something that slightly jars me here... I don't like the quotation returning an IO object.
But doesn't normal quoting return a Str object? And regex quoting return an object (Regex? Match? Something, anyway).

Certainly, but a regex doesn't produce a Signature object, say. I don't object to objects, just to creating objects, then doing something with them, then returning another kind of object, and calling that "parsing". If we're parsing the characters, we should end up with an IO::Name. If we end up with an IO::actual-file/stream-whatever, then we should call it something else (like an "io constructor").

According to my last commit, p{} will return a Path object that just stores the path, but has methods attached for accessing all the metadata. But it doesn't do file opening or things like that (unless you use the :T and :B thingies, which read the first block and try to guess whether it's text or binary -- these are in Perl 5 too).
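As a rough illustration of that design, the following Python sketch (Python rather than Perl 6, and with a hypothetical class name `PathName`) shows an object that behaves like a string, stores only the path, and consults the filesystem when a metadata method is called. It is not the S32 interface, only one way the idea could look.

```python
import os


class PathName(str):
    """Hypothetical stand-in for the Path object p{} would return.

    It "does Str" by subclassing str, so ordinary string comparison and
    concatenation work, but it never opens the file it names.
    """

    def exists(self) -> bool:
        # Metadata lookups happen only when asked for.
        return os.path.exists(self)

    def size(self) -> int:
        return os.stat(self).st_size

    def modified(self) -> float:
        # Modification time as seconds since the epoch.
        return os.stat(self).st_mtime
```

Constructing `PathName("/some/file")` does no I/O at all; the path may name something that does not exist, and only `exists()`, `size()` or `modified()` go to the filesystem.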

[This bit was further up the e-mail, but I moved it here]
if (path{/path/to/file}.e) {
      @lines = slurp(path{/path/to/file});
}
      (I'm using one of David's suggested syntaxes above, but I'm not
closely attached to it).

I suggested variations along the lines of: io "/path/to/file". It amounts to much the same thing, but it's important conceptually to distinguish a pathname from the thing it names. (A path doesn't have a modification date, a file does.) Also, special quoting/escaping could apply to other things, not limited to "filenames". That said, I don't think it's unreasonable to want to combine both operations for brevity, but the io-constructor should have built-in path parsing, not the other way around.

        Did my answer above answer the concerns here?

The difference in our approaches is that you seem keen to integrate
closely the data and the metadata, whereas I'm trying to integrate the paths
and the metadata.

Well, paths are just metadata too, although typically the most important kind. (You could even have an IO without a path or name.) I want a view that integrates all of them, because that's how people ordinarily think about files, unless they have a specific reason not to.

I think we want many of the same things, I'm just expressing them slightly differently. Let's keep working on this, and hopefully we end up with something great.

I was wanting to replace the "glob" language with something more like XPath, but that idea was vetoed by people who didn't want Tree-related objects to be part of the core, so I'm doing that as a library.

I'm all for some tree-related fun(ctions). A tree is basically a hash of hashes, so I'm surprised we don't have a few functions for traversing them and other very basic hashy concepts. But I would like to see XPath-type stuff hashed out [pun intended] anyway -- whether it ends up in a third-party module or not isn't such a big deal when it comes to P6, and somebody will have to figure out how to do it in a perlish way eventually.

That would be me. I have some code, but I'm waiting on improvements in Rakudo (and btw, thanks to the Rakudo guys for doing a wonderful job).

if $file.type ~~ MIME("text/plain") {...}
Cool idea.  How would the type be determined?  Are you thinking of
the algorithms in the unix "file" utility?  Please tell me you're not
planning to use filename extensions -- that's bad :).
Wouldn't $file.type be metadata?

Yes; and yes, filename extensions are evil, but of course thanks to primitive filesystems, we're stuck with them to a large extent. And there's no perfect solution, but it would be useful for Perl to stick as closely to the FS/OS's idea of types as it can. Sometimes that would mean looking up an extension; it might mean using (or emulating) "file" magic; it might mean querying the FS for a MIME-type or a UTI. After all, the filename extension may not actually match the correct type of the file.
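The two cheapest strategies mentioned here, extension lookup and "file"-style magic, can be sketched as follows. This is Python rather than Perl 6, the function names are hypothetical, and the magic table is a tiny hand-picked sample, nowhere near what the real "file" utility knows.

```python
import mimetypes

# A few well-known leading byte signatures; a real table is far larger.
MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"%PDF-", "application/pdf"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def type_from_name(name):
    # Extension lookup: cheap, but trusts whatever the file was named.
    guessed, _encoding = mimetypes.guess_type(name)
    return guessed


def type_from_content(head):
    # Magic lookup: inspect the first bytes of the data itself.
    for signature, mime in MAGIC:
        if head.startswith(signature):
            return mime
    return None


def file_type(name, head):
    # Assumed policy: prefer what the content says, fall back to the name.
    return type_from_content(head) or type_from_name(name)
```

With this policy a PNG image saved as "photo.txt" is reported as image/png, while a plain text file with no recognisable signature falls back to whatever its extension suggests. Querying the filesystem for a stored MIME type or UTI would be a third source, which this sketch leaves out.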

My suggestion would be that it's an interesting idea, but should maybe be left to a module, since it's not a small problem. Of course, I'm happy to be overruled by a higher power :). I'd like the feature, I'm just unsure it deserves core status.

        Anyway, HTH,


---------------------------------------------------------------------
| Name: Tim Nelson                 | Because the Creator is,        |
| E-mail: wayl...@wayland.id.au    | I am                           |
---------------------------------------------------------------------

----BEGIN GEEK CODE BLOCK----
Version 3.12
GCS d+++ s+: a- C++$ U+++$ P+++$ L+++ E- W+ N+ w--- V- PE(+) Y+>++ PGP->+++ R(+) !tv b++ DI++++ D G+ e++>++++ h! y-
-----END GEEK CODE BLOCK-----
