Re: Filename literals

Timothy S. Nelson Tue, 18 Aug 2009 06:17:33 -0700

On Tue, 18 Aug 2009, David Green wrote:

On 2009-Aug-17, at 8:36 am, Jon Lang wrote:
Timothy S. Nelson wrote:
      Well, my main thought in this context is that the stuff that can be
done to the inside of a file can also be done to other streams -- TCP
sockets for example (I know, there are differences, but the two are a lot
the same), whereas metadata makes less sense in the context of TCP sockets;
But any IO object might have metadata; some different from the metadata you traditionally get with files, and some the same, e.g. $io.size, $io.times{modified}, $io.charset, $io.type.


        Ok, now you're giving me ideas :).

[snipped a bit and moved it further down the e-mail]

      I guess what I'm saying here is that I think we can do the things
without people having to worry about the objects being separate unless they
care.  So, separate objects, but hide it as much as possible.  Is that
something you're fine with?
Yes -- to me that means some class/role that wraps up all the pieces together, but all the separate components are still there underneath. But I'm not too bothered about how it's implemented as long as it's transparent for casual use.
  my $file = io p[/some/file];
  my $contents = $file.data;
  my $mod-date = $file.times{modified};
  my $size = $file.size;


        That sounds like the kind of thing I'm heading for.

Pathnames still are strings, so that's fine. In fact, there are different
      As for pathnames being strings, you may be right FSVO string.  But
I'd say that, while they may be strings, they're not Str, but they do Str
Agreed, pathnames are "almost" strings, but worth distinguishing conceptually. There should be a URL type that does Str.
Actually, there are other differences, like case-insensitivity and illegal chars. Unfortunately, those depend on the given filesystem. As long as you're dealing with one FS at a time, that's OK; it probably means we have IO::Name::ext3, IO::Name::NTFS, IO::Name::HFS, etc. But what happens when you cross FS-barriers? Does a case-sensitive name match a case-insensitive one? Is filename-equality not commutative or not transitive? If you're looking for a filename "foo" on Mac/Win, then a file actually called "FOO" matches; but on Unix it wouldn't.
(Actually, Macs can do both IO::Name::HFS::case-insensitive and IO::Name::HFS::case-sensitive. Eek.)


        I think it should depend on the set of constraints involved.

I'd like Perl 6's treatment of filenames to be smart enough that smart-matching any of these pairs of "alternative spellings" would result in a successful match. So while I'll agree that filenames are string-like, I really don't want them to _be_ strings.
Well, the *files* are the same, but the pathnames are different. I'm not sure whether some differences in "spelling" should be ignored by default or not. There are actually several different kinds; S32 has a method "realpath", but I think "canonical" is a better name, because aliases can be just as "real" as the canonical path, e.g. a web page with multiple addresses. Or hard links rather than soft links -- though in that case, there is no one "canonical" path. It may not even be possible to easily tell if there is one or not.
Some ways in which different paths can be considered equivalent:
  Spelling: C:\PROGRA~1, case-insensitivity
  Simplification: foo/../bar/ to bar/
  Resolution: of symlinks/shortcuts
  Content-wise: hard links/multiple addresses
Depending on the circumstances, you might want any of those to count as the "same" file; or none of them. We'll need methods for each sort of transformation, $path.canonical, $path.normalize, $path.simplify, etc. Two high-level IO objects are "the same", regardless of path, if $file1 =:= $file2 (which might compare inodes, etc.). There should be a way to set what level of sameness applies in a given lexical scope; perhaps the first two listed above are a reasonable default to start with.

Ok, my next commit will have "canonpath" (stolen directly from p5's File::Spec documentation), which will do "No physical check on the filesystem, but a logical cleanup of a path", and "realpath" (idea taken from p5's Cwd documentation), which will resolve symlinks, etc, and provide an absolute path. Oh, and "resolvepath", which does both. I'm not quite sure I followed all your discussion above -- have I left something out?
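
As a rough illustration of the logical/physical split described above, here is a minimal Python sketch. Python is used only because the Perl 6 API is still under discussion; the names canonpath and realpath mirror the proposal, while the demo helper and its directory layout are assumptions made up for the example. It assumes a POSIX system where symlinks can be created.

```python
import os
import tempfile


def canonpath(path: str) -> str:
    """Logical cleanup only: the filesystem is never consulted."""
    # normpath collapses "//", "." and "name/.." purely as text.
    return os.path.normpath(path)


def realpath(path: str) -> str:
    """Physical resolution: follows symlinks and returns an absolute path."""
    return os.path.realpath(path)


def demo() -> tuple:
    """Show a path where the two cleanups disagree (hypothetical layout)."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = os.path.realpath(tmp)
        os.makedirs(os.path.join(tmp, "real", "sub"))
        # "link" is a symlink pointing at real/sub.
        os.symlink(os.path.join(tmp, "real", "sub"), os.path.join(tmp, "link"))
        tricky = os.path.join(tmp, "link", "..", "x")
        logical = os.path.relpath(canonpath(tricky), tmp)
        physical = os.path.relpath(realpath(tricky), tmp)
        return logical, physical


if __name__ == "__main__":
    print(canonpath("foo/../bar/"))
    print(demo())
```

The demo path goes through a symlink and then "..", so the textual cleanup and the symlink-aware resolution name different locations. That is one reason a combined operation probably cannot just be "clean up, then resolve".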

Anyway, my assumption is that there should be a number of comparison options. Since we do Str, we should get string comparison for free. I'm expecting other options at other levels, but have no idea how or what at this point.
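
To make the idea of comparison at several levels concrete, here is a hedged Python sketch of three of the levels listed earlier (spelling, simplification, content). All function names are hypothetical, and casefold() only approximates what a real case-insensitive filesystem does, since each filesystem has its own folding rules.

```python
import os
import tempfile


def same_spelling(a: str, b: str, case_sensitive: bool = True) -> bool:
    """Name-level comparison under one (assumed) filesystem case rule."""
    if case_sensitive:
        return a == b
    # Approximation: real filesystems define their own case folding.
    return a.casefold() == b.casefold()


def same_simplified(a: str, b: str) -> bool:
    """Compare after a purely textual cleanup such as foo/../bar/ -> bar."""
    return os.path.normpath(a) == os.path.normpath(b)


def same_file(a: str, b: str) -> bool:
    """Content-level identity: compares device and inode of existing paths."""
    return os.path.samefile(a, b)


if __name__ == "__main__":
    print(same_spelling("foo", "FOO"))
    print(same_spelling("foo", "FOO", case_sensitive=False))
    print(same_simplified("foo/../bar/", "bar"))
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "a.txt")
        second = os.path.join(d, "b.txt")
        open(first, "w").close()
        os.link(first, second)  # hard link: two names, one file
        print(same_file(first, second))
```

Each level answers a different question, so a caller (or a lexically scoped default, as suggested above) would have to pick which one "equality" means.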

There's something that slightly jars me here... I don't like the quotation returning an IO object.
But doesn't normal quoting return a Str object? And regex quoting return an object (Regex? Match? Something, anyway).
Certainly, but a regex doesn't produce a Signature object, say. I don't object to objects, just to creating objects, then doing something with them, then returning another kind of object, and calling that "parsing". If we're parsing the characters, we should end up with an IO::Name. If we end up with an IO::actual-file/stream-whatever, then we should call it something else (like an "io constructor").

According to my last commit, p{} will return a Path object that just stores the path, but has methods attached for accessing all the metadata. But it doesn't do file opening or things like that (unless you use the :T and :B thingies, which read the first block and try to guess whether it's text or binary -- these are in Perl 5 too).


[This bit was further up the e-mail, but I moved it here]

if (path{/path/to/file}.e) {
      @lines = slurp(path{/path/to/file});
}
      (I'm using one of David's suggested syntaxes above, but I'm not
closely attached to it).
I suggested variations along the line of: io "/path/to/file". It amounts to much the same thing, but it's important conceptually to distinguish a pathname from the thing it names. (A path doesn't have a modification date, a file does.) Also, special quoting/escaping could apply to other things, not limited to "filenames". That said, I don't think it's unreasonable to want to combine both operations for brevity, but the io-constructor should have built-in path parsing, not the other way around.


        Did my answer above answer the concerns here?

The difference in our approaches is that you seem keen to integrate
closely the data and the metadata, whereas I'm trying to integrate the paths
and the metadata.
Well, paths are just metadata too, although typically the most important kind. (You could even have an IO without a path or name.) I want a view that integrates all of them, because that's how people ordinarily think about files, unless they have a specific reason not to.

I think we want many of the same things, I'm just expressing them slightly differently. Let's keep working on this, and hopefully we end up with something great.

I was wanting to replace the "glob" language with something more like XPath, but that idea was vetoed by people who didn't want Tree-related objects to be part of the core, so I'm doing that as a library.
I'm all for some tree-related fun(ctions). A tree is basically a hash of hashes, so I'm surprised we don't have a few functions for traversing them and other very basic hashy concepts. But I would like to see XPath-type stuff hashed out [pun intended] anyway -- whether it ends up in a third-party module or not isn't such a big deal when it comes to P6, and somebody will have to figure how to do it in a perlish way eventually.

That would be me. I have some code, but I'm waiting on improvements in Rakudo (and btw, thanks to the Rakudo guys for doing a wonderful job).

if $file.type ~~ MIME("text/plain") {...}
Cool idea.  How would the type be determined?  Are you thinking of
the algorithms in the unix "file" utility?  Please tell me you're not
planning to use filename extensions -- that's bad :).
Wouldn't $file.type be metadata?
Yes; and yes, filename extensions are evil, but of course thanks to primitive filesystems, we're stuck with them to a large extent. And there's no perfect solution, but it would be useful for Perl to stick as closely to the FS/OS's idea of types as it can. Sometimes that would mean looking up an extension; it might mean using (or emulating) "file" magic; it might mean querying the FS for a MIME-type or a UTI. After all, the filename extension may not actually match the correct type of the file.

My suggestion would be that it's an interesting idea, but should maybe be left to a module, since it's not a small problem. Of course, I'm happy to be overruled by a higher power :). I'd like the feature, I'm just unsure it deserves core status.
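
For a sense of why type detection is "not a small problem", here is a small Python sketch combining two of the strategies mentioned above: an extension lookup and a content check in the spirit of "file" magic. The function names and the tiny signature table are assumptions for illustration; the real "file" utility knows far more formats, and querying the filesystem for a stored type is not shown.

```python
import mimetypes
from typing import Optional

# Illustrative signature table only; deliberately tiny.
MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"%PDF-", "application/pdf"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def type_from_extension(name: str) -> Optional[str]:
    """Look the extension up in the standard mimetypes table."""
    mime, _encoding = mimetypes.guess_type(name)
    return mime


def type_from_content(head: bytes) -> Optional[str]:
    """Match the first bytes of the file against known signatures."""
    for signature, mime in MAGIC:
        if head.startswith(signature):
            return mime
    return None


def guess_type(name: str, head: bytes) -> Optional[str]:
    """Prefer content evidence; fall back to the extension."""
    return type_from_content(head) or type_from_extension(name)


if __name__ == "__main__":
    print(guess_type("notes.txt", b"hello world"))
    print(guess_type("report.txt", b"%PDF-1.7 ..."))
```

Preferring content over the extension is one possible policy among several; it reflects the point that an extension may not match the actual type of the file.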


        Anyway, HTH,


---------------------------------------------------------------------
| Name: Tim Nelson                 | Because the Creator is,        |
| E-mail: [email protected]    | I am                           |
---------------------------------------------------------------------

-----BEGIN GEEK CODE BLOCK-----
Version 3.12

GCS d+++ s+: a- C++$ U+++$ P+++$ L+++ E- W+ N+ w--- V-PE(+) Y+>++ PGP->+++ R(+) !tv b++ DI++++ D G+ e++>++++ h! y-

-----END GEEK CODE BLOCK-----
