r24088 - docs/Perl6/Spec
Author: ruoso
Date: 2008-11-27 14:20:22 +0100 (Thu, 27 Nov 2008)
New Revision: 24088

Added:
   docs/Perl6/Spec/S07-iterators.pod
Log:
[spec] Adding the first sketches on S07, thanks to wayland76++

Added: docs/Perl6/Spec/S07-iterators.pod
===
--- docs/Perl6/Spec/S07-iterators.pod (rev 0)
+++ docs/Perl6/Spec/S07-iterators.pod 2008-11-27 13:20:22 UTC (rev 24088)
@@ -0,0 +1,256 @@
+=encoding utf8
+
+=head1 Title
+
+Synopsis 7: Iterators and Laziness
+
+=head1 Version
+
+ Maintainer:???
+ Contributions: Tim Nelson <[EMAIL PROTECTED]>
+Daniel Ruoso <[EMAIL PROTECTED]>
+ Date: 27 Nov 2008
+ Last Modified: 27 Nov 2008
+ Version: 1
+
+=head1 Laziness and Eagerness
+
+As we all know, one of the primary virtues of the Perl programmer is
+laziness. This is also one of the virtues of Perl itself. However,
+Perl knows better than to succumb to false laziness, and so is eager
+sometimes, and lazy others. Perl defines 4 levels of laziness for
+Iterators:
+
+=over
+
+=item Strictly Lazy
+
+Does not evaluate anything unless explictly required by the user,
+including not traversing non-lazy objects.
+
+=item Mostly Lazy
+
+Try to obtain available items without causing eager evaluation of
+other lazy objects.
+
+=item Mostly Eager
+
+Obtain all items, but does not try to eagerly evaluate when known to
+be infinite.
+
+=item Strictly Eager
+
+Obtain all items, fail in data structures known to be infinite.
+
+=back
+
+It's important to realize that the responsability of determining the
+level of lazyness/eagerness in each operation is external to each lazy
+object, the runtime, depending on which operation is being performed
+is going to assume the level of lazyness and perform the needed
+operations to apply that level.
+
+=head2 The lazyness level of some common operations
+
+=over
+
+=item List Assignment: my @a = @something;
+
+In order to provide p5-like behavior in list assignment, this
+operation is performed in the Mostly Eager level, meaning that if you do
+
+ my @a = grep { ... }, @b;
+
+the grep will be evaluated as long as @b is not infinite.
+
+ my @a = grep { ... }, 1, 2, 3, 4..*
+
+will give grep an infinite list (even if the first elements are
+known), therefore it will also be lazy. On the other hand
+
+ my @a = grep { ... }, 1, 2, 3, 4;
+
+will be eagerly evaluated.
+
+=item Feed operators: my @a <== @something;
+
+The feed operator is strictly lazy, meaning that no operation should
+be performed before the user requests any element from @a. That's how
+
+ my @a <== grep { ... } <== map { ... } <== grep { ... }, 1, 2, 3
+
+is completely lazy, even if 1,2,3 is a fairly small known compact
+list.
+
+=back
+
+But it's important to notice that eagerness takes precedence over
+lazyness, meaning that
+
+ my @a = grep { ... } <== map { ... } <== grep { ... }, 1, 2, 3
+
+Will be eagerly evaluated, but that is still different from
+
+ my @d = 1,2,3;
+ my @c = grep { ... }, @d;
+ my @b = map { ... }, @c;
+ my @a = grep { ... }, @b;
+
+Because in the first, the processing would be made as a flow, avoiding
+the creation of the intermediary eager lists that the second example
+creates. On the other hand
+
+ my @d <== 1,2,3;
+ my @c <== grep { ... }, @d;
+ my @b <== map { ... }, @c;
+ my @a = grep { ... }, @b;
+
+provides the same lazyness level of the first example.
+
+=head1 The Iterator Role
+
+The iterator role represents the lazy access to a list, walking
+through a data structure (list, tree whatever), feeds (map, grep etc)
+or and, each time it is called, will return one (or more) of the nodes
+from the data structure.
+
+When an iterator runs out of items, it will throw an exception.
+
+The iterator role has the following methods:
+
+=head2 method prefix:<=> {...}
+
+Returns something appropriate depending on the context:
+
+=head2 method new(Laziness => $laziness, Context => $context) {...}
+
+Creates the iterator, with appropriate laziness defaults.
+
+=head2 method SetLaziness($laziness) {...}
+
+Set the Laziness
+
+=head2 method SetContext($context) {...}
+
+Set the Context (intended only for coercions, not user use)
+
+=head2 Iterator Summary
+
+=begin code
+
+role Iterator {
+has $!laziness;
+has $!context;
+
+# enforces item context
+method FETCH() {...}
+# returns a list
+method List() {...}
+# returns a slice
+method Slice() {...}
+# returns the capture of the next iteration
+method prefix:<=> {
+given $self.context {
+when 'Item' { return self.FETCH() } # called in item context
+when 'Slice' { return self.Slice() } # called in slice context
+when * { return self.List() } # called in list context (or any other context)
+}
+}
+# Creates a new iterator; can be called with no parameters, and chooses sensible defaults
+method new(Laziness => $laz
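
For readers who want to see the eager/lazy distinction from the draft in something runnable today, here is a minimal sketch in Python rather than Perl 6, since the draft's syntax was still in flux. It only mirrors the idea: a feed-style pipeline does no work until an element is requested, while list assignment from a finite source consumes everything at once. The names `is_even`, `calls`, `lazy` and `eager` are invented for the illustration and are not part of the spec.

```python
from itertools import count, islice

calls = []  # records every value the predicate is asked about


def is_even(n):
    calls.append(n)
    return n % 2 == 0


# Feed-style pipeline: building it performs no evaluation at all,
# even though the source (1, 2, 3, ...) is infinite.
lazy = filter(is_even, count(1))
assert calls == []

# Requesting three elements pulls just enough input to produce them.
first_three = list(islice(lazy, 3))

# List-assignment style: a finite source is consumed in full right away.
calls_before = len(calls)
eager = list(filter(is_even, [1, 2, 3, 4]))
```

After this runs, the predicate has been called on 1 through 6 for the lazy pipeline, and on all four inputs for the eager one. Note that Python has no notion of "known to be infinite", so the Mostly Eager level from the draft has no direct counterpart here.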
Re: Files, Directories, Resources, Operating Systems
In-Reply-To: Message from Mark Overmeer <[EMAIL PROTECTED]>
   of "Thu, 27 Nov 2008 08:23:50 +0100." <[EMAIL PROTECTED]>

>* Tom Christiansen ([EMAIL PROTECTED]) [081126 23:55]:
>> On "Wed, 26 Nov 2008 11:18:01 PST."--or, for backwards compatibility,
>> at 7:18:01 p.m. hora Romae on a.d. VI Kal. Dec. MMDCCLXI AUC,
>> Larry Wall <[EMAIL PROTECTED]> wrote:

>> SUMMARY: I've been looking into this sort of thing lately (see p5p),
>> and there may not even *be* **a** "right" answer. The reasons
>> why take us into an area we've traditionally avoided.

> What a long message...

It *was*?

That was approaching a medium in my epistolary (and RFC) world, the one
unrelated to PostIt notes.

I can therefore see you've never been FMTEYEWTK'd, and thus also to all
outward appearances, we've not made each other's acquaintance.

I'm tchrist; pleased to meet you.

Read the //www.unicode.org/reports/tr10/ treatise, as I have repeatedly
done, and you will quickly reassess your length calls.

This is not necessarily a good thing.

Neal Stephenson can do the same, and of far lesser utility.

--tom
Re: Files, Directories, Resources, Operating Systems
Just as a variable name in perl6 must conform to a standard and abide by
a set of constraints, why should file or other resource names be an
exception?

The constraints on variable names in perl6 are very flexible, but there
are some rules that must be enforced for a program to work. It seems to
me that resource (e.g. file) names should also be constrained so that
software portability can be ensured.

A reasonably constructed set of constraints for the perl6 core should
deal with most locale/OS/character set considerations, and where a
particular environment cannot cope, then a module will be needed to
"eigenmunge" the names appropriately.

Suppose for the sake of argument we state that resource names in perl6
shall comply with the rules for variable names; and the sort sequence of
such names is the one defined for unicode strings.

Where software in perl6 is written for a specific domain, e.g. Catalan or
Russian, the programmer will know more about the domain and how to deal
with resource names in that locale. This would include sort sequences
and the complexities Tom outlined. Such things would be relegated to
OS / domain specific modules.

Would this help?

Tom Christiansen wrote:
> In-Reply-To: Message from Darren Duncan <[EMAIL PROTECTED]>
>    of "Wed, 26 Nov 2008 19:34:09 PST." <[EMAIL PROTECTED]>
>
>> Tom Christiansen wrote:
>>> I believe database folks have been doing the same with character
>>> data, but I'm not up-to-date on the DB world, so maybe we have some
>>> metainfo about the locale to draw on there. Tim?
>
>> AFAIK, modern databases are all strongly typed at least to the point
>> that the values you store in and fetch from them are each explicitly
>> character data or binary data or numbers or what-have-you; and so,
>> when you are dealing with a DBMS in terms of character data, it is
>> explicitly specified somewhere (either locally for the data or
>> globally/hardcoded for the DBMS) that each value of character data
>> belongs to a particular character repertoire and text encoding, and so
>> the DBMS knows what encoding etc the character data is in, or at least
>> it treats it consistently based on what the user said it was when it
>> input the data.
>
> Oh, good then. That's what I'd heard was happening, but wasn't sure
> since I've steered clear of such beasties since before it was true.
>
> I wish our filesystems worked that way. But Andrew said something to me
> last week about Ken and Dennis writing quite pointedly that while you
> *could* use the f/s as a database, that you *shouldn't*. I didn't know
> the reference he was thinking of, so just nodded pensively
> (=thoughtfully).
>
>>> There is ABSOLUTELY NO WAY I've found to tell whether these utf-8
>>> strings should test equal, and when, nor how to order them, without
>>> knowing the locale:
>>>
>>>     "RESUME",
>>>     "Resume"
>>>     "resume"
>>>     "Resum\x{e9}"
>>>     "r\x{E9}sum\x{E9}"
>>>     "r\x{E9}sume\x{301}"
>>>     "Re\x{301}sume\x{301}"
>>>
>>> Case insensitively, in Spanish they should be identical in all
>>> regards. In French, they should be identical but for ties, in which
>>> case you work your way right to left on the diacriticals.
>
>> This leads me to talk about my main point about sensitivity etc.
>> I believe that the most important issues here, those having to do with
>> identity, can be discussed and solved without unduly worrying about
>> matters of collation;
>
> It's funny you should say that, as I could nearly swear that I just
> showed that identity cannot be determined in the examples above without
> knowing about locales.
>
> To wit, while all of those sort somewhat differently, even
> case-insensitively, no matter whether you're thinking of a French or a
> Spanish ordering (and what is English's, anyway?), you have a more
> fundamental = vs != scenario which is entirely locale-dependent.
>
> If I can make a "RESUME" file, ought I be able to make a distinct
> "r\x{E9}sum\x{E9}" or "re\x{301}sume\x{301}" file in a case-ignorant
> filesystem?
>
> There is no good answer, because we might think it reasonable to
>
>     lc(strip_marks($old_fn)) eq lc(strip_marks($new_fn))
>
> The problem is that what is or is not a "mark" varies by locale:
>
>  * Castilian doesn't think ~ is a mark; Portuguese does,
>    and so if you strip marks, you in Castilian count as
>    the same two letters that it deems distinct, but in
>    Portuguese, you incur no lasting harm.
>
>  * Catalan doesn't think ¸ is a mark; French does,
>    and so if you strip marks, you in Catalan count as
>    the same two letters that it deems distinct, but in
>    French or Portuguese, you incur no lasting harm.
>
>  * Modern English (usually) decomposes æ into a+e, but
>    OE/AS and Icelandic do not.
>
>  * Moreover, Icelandic deems é and e to be completely
>    different letters altogether. If you strip marks,
>    you count as the same letters that that language does
>    not. Similarly with ö, which is at the end of their
>    alphabet, (like ø in some), and
Re: Files, Directories, Resources, Operating Systems
In-Reply-To: Message from Darren Duncan <[EMAIL PROTECTED]>
   of "Wed, 26 Nov 2008 19:34:09 PST." <[EMAIL PROTECTED]>

> Tom Christiansen wrote:
>> I believe database folks have been doing the same with character data, but
>> I'm not up-to-date on the DB world, so maybe we have some metainfo about
>> the locale to draw on there. Tim?

> AFAIK, modern databases are all strongly typed at least to the point
> that the values you store in and fetch from them are each explicitly
> character data or binary data or numbers or what-have-you; and so,
> when you are dealing with a DBMS in terms of character data, it is
> explicitly specified somewhere (either locally for the data or
> globally/hardcoded for the DBMS) that each value of character data
> belongs to a particular character repertoire and text encoding, and so
> the DBMS knows what encoding etc the character data is in, or at least
> it treats it consistently based on what the user said it was when it
> input the data.

Oh, good then. That's what I'd heard was happening, but wasn't sure since
I've steered clear of such beasties since before it was true.

I wish our filesystems worked that way. But Andrew said something to me
last week about Ken and Dennis writing quite pointedly that while you
*could* use the f/s as a database, that you *shouldn't*. I didn't know
the reference he was thinking of, so just nodded pensively (=thoughtfully).

>> There is ABSOLUTELY NO WAY I've found to tell whether these utf-8
>> strings should test equal, and when, nor how to order them, without
>> knowing the locale:
>>
>>     "RESUME",
>>     "Resume"
>>     "resume"
>>     "Resum\x{e9}"
>>     "r\x{E9}sum\x{E9}"
>>     "r\x{E9}sume\x{301}"
>>     "Re\x{301}sume\x{301}"

>> Case insensitively, in Spanish they should be identical in all regards.
>> In French, they should be identical but for ties, in which case you
>> work your way right to left on the diacriticals.

> This leads me to talk about my main point about sensitivity etc.
> I believe that the most important issues here, those having to do with
> identity, can be discussed and solved without unduly worrying about
> matters of collation;

It's funny you should say that, as I could nearly swear that I just showed
that identity cannot be determined in the examples above without knowing
about locales.

To wit, while all of those sort somewhat differently, even
case-insensitively, no matter whether you're thinking of a French or a
Spanish ordering (and what is English's, anyway?), you have a more
fundamental = vs != scenario which is entirely locale-dependent.

If I can make a "RESUME" file, ought I be able to make a distinct
"r\x{E9}sum\x{E9}" or "re\x{301}sume\x{301}" file in a case-ignorant
filesystem?

There is no good answer, because we might think it reasonable to

    lc(strip_marks($old_fn)) eq lc(strip_marks($new_fn))

The problem is that what is or is not a "mark" varies by locale:

 * Castilian doesn't think ~ is a mark; Portuguese does,
   and so if you strip marks, you in Castilian count as
   the same two letters that it deems distinct, but in
   Portuguese, you incur no lasting harm.

 * Catalan doesn't think ¸ is a mark; French does,
   and so if you strip marks, you in Catalan count as
   the same two letters that it deems distinct, but in
   French or Portuguese, you incur no lasting harm.

 * Modern English (usually) decomposes æ into a+e, but
   OE/AS and Icelandic do not.

 * Moreover, Icelandic deems é and e to be completely
   different letters altogether. If you strip marks,
   you count as the same letters that that language does
   not. Similarly with ö, which is at the end of their
   alphabet, (like ø in some), and nowhere near o or ó.
   BTW, those are three separate letters, not variants.

 * And in OE/AS you could have a long mark on an asc (say
   "ash" for the atomic *letter* æ). If split into a and e
   and stripped of marks, it wouldn't make any sense at all.

Case in point: Ælene Frisch, whom many of you doubtless know, insists her
name be spelt as I have written it. She does not want Aelene Frisch, for
she considers her forename to have 5 letters in it, not 6. But Unicode
doesn't give us a title case version of that (did AS?), suggesting it a
ligature not a digraph.

But if we have a file called "ÆLENE", may we assume it the same in a
case-insensitive sense to both "aelene" and "ælene"? I can only go on
code-points, because I don't want to deal with ß and SS and Ss.
Case-folding file systems are just begging for trouble, and I just don't
know what to do. Think of the 3 Greek sigmata.

> identity is a lot more important than collation, as well as a
> precondition for collation, and collation is a lot more difficult and can
> be put off.

I agree with everything save "and can be put off".

I would like you to be right.

I should truly wish to be mistaken.

And I don't kno
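
The seven strings from the message above can be checked mechanically. The sketch below is in Python and uses only the standard `unicodedata` module; `strip_marks` and `naive_key` are hypothetical helpers that imitate the locale-blind `lc(strip_marks(...))` comparison quoted in the message, under the assumption that a "mark" means a code point of general category Mn. It is meant to show why that folding is too blunt, not to recommend it.

```python
import unicodedata

names = [
    "RESUME",
    "Resume",
    "resume",
    "Resum\u00e9",
    "r\u00e9sum\u00e9",
    "r\u00e9sume\u0301",
    "Re\u0301sume\u0301",
]


def strip_marks(s):
    """Decompose to NFD, then drop nonspacing combining marks (category Mn)."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def naive_key(s):
    """Locale-blind folding: strip marks, then apply Unicode case folding."""
    return strip_marks(s).casefold()


# Compared code point by code point, all seven names are different.
distinct_by_codepoint = len(set(names))

# After locale-blind folding they all collapse into one name.
distinct_after_folding = len({naive_key(n) for n in names})

# Case folding alone already merges sharp s with "SS" ...
sharp_s_collides = "Stra\u00dfe".casefold() == "STRASSE".casefold()

# ... while the letter ash folds to lowercase ash, never to "ae".
ash_stays_atomic = "\u00c6LENE".casefold() == "\u00e6lene"
```

The same `strip_marks` also turns ö into o, which is exactly the merge that the message says is wrong for Icelandic. Nothing in the standard library call knows which language the file name was written in.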
r24089 - docs/Perl6/Spec
Author: ruoso
Date: 2008-11-27 14:49:24 +0100 (Thu, 27 Nov 2008)
New Revision: 24089

Modified:
   docs/Perl6/Spec/S07-iterators.pod
Log:
[spec] general S07 cleanup, I think that can be considered the first version. Most of the cleanup is related to accepting that it is runtime's responsability to instantiate the GenericLazyList when needed, and not the Iterator itself

Modified: docs/Perl6/Spec/S07-iterators.pod
===
--- docs/Perl6/Spec/S07-iterators.pod 2008-11-27 13:20:22 UTC (rev 24088)
+++ docs/Perl6/Spec/S07-iterators.pod 2008-11-27 13:49:24 UTC (rev 24089)
@@ -77,7 +77,7 @@
 The feed operator is strictly lazy, meaning that no operation should
 be performed before the user requests any element from @a. That's how

- my @a <== grep { ... } <== map { ... } <== grep { ... }, 1, 2, 3
+ my @a <== grep { ... } <== map { ... } <== grep { ... } <== 1, 2, 3

 is completely lazy, even if 1,2,3 is a fairly small known compact
 list.
@@ -87,7 +87,7 @@
 But it's important to notice that eagerness takes precedence over
 lazyness, meaning that

- my @a = grep { ... } <== map { ... } <== grep { ... }, 1, 2, 3
+ my @a = grep { ... } <== map { ... } <== grep { ... } <== 1, 2, 3

 Will be eagerly evaluated, but that is still different from

@@ -111,145 +111,52 @@

 The iterator role represents the lazy access to a list, walking
 through a data structure (list, tree whatever), feeds (map, grep etc)
-or and, each time it is called, will return one (or more) of the nodes
-from the data structure.
+or a stream (mostly for IO). Each time it is called, will return the
+elements produced at that iteration.

-When an iterator runs out of items, it will throw an exception.
+It's important to realize that the iterator of a list can be accessed
+by the .Iterator() method (but only the runtime will be calling that
+most of the time), and the implemenation of each iterator is private
+to the list and implementation specific.

-The iterator role has the following methods:
+This is a minimal API that should allow custom iterator
+implemenations, but this spec should be expanded in the future to
+provide additional API for batch-aware iterators.

 =head2 method prefix:<=> {...}

-Returns something appropriate depending on the context:
+Returns the items for that iteration. The grouping of elements
+returned in each iteration is visible if this iterator is being used
+to build a slice. While building a List, the items will be flattened.

-=head2 method new(Laziness => $laziness, Context => $context) {...}
+When it runs out of items, it will throw an exception.

-Creates the iterator, with appropriate laziness defaults.
+=head1 Auxiliary Implementations

-=head2 method SetLaziness($laziness) {...}
-
-Set the Laziness
-
-=head2 method SetContext($context) {...}
-
-Set the Context (intended only for coercions, not user use)
-
-=head2 Iterator Summary
-
-=begin code
-
-role Iterator {
-has $!laziness;
-has $!context;
-
-# enforces item context
-method FETCH() {...}
-# returns a list
-method List() {...}
-# returns a slice
-method Slice() {...}
-# returns the capture of the next iteration
-method prefix:<=> {
-given $self.context {
-when 'Item' { return self.FETCH() } # called in item context
-when 'Slice' { return self.Slice() } # called in slice context
-when * { return self.List() } # called in list context (or any other context)
-}
-}
-# Creates a new iterator; can be called with no parameters, and chooses sensible defaults
-method new(Laziness => $laziness, Context => $context) {
-if(! $context) {
-given want {
-when :($) { $context = 'Item'; } # called in item context
-when :(@@) { $context = 'Slice'; } # called in slice context
-when * { $context = 'List'; } # called in list context (or any other context)
-}
-}
-$self.context = $context;
-self.SetLaziness($laziness);
-}
-method SetContext($context) {
-$context ~~ /^(Item|List|Slice)$/) or die "Invalid context $context\n";
-$self.context = $context;
- self.SetLaziness();
-}
-method SetLaziness($laziness) {
-if($laziness) {
-$self.laziness = $laziness;
-} else {
-given $self.context {
-when 'Item' { self.laziness = 'Mostly Lazy'; }
-when * { self.laziness = 'Mostly Eager'; }
-}
-}
-}
-}
-
-=end code
-
-=head1 Default Implementations
-
 Perl's built-ins require that a number of default iterators exist.

-=head2 GenericIterator implementation
+=head2 Generic Item Iterator

-This is what is returned by default when an iterator is asked for, but no iterator type is known.
+Operators like map requires one item at a time as
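
The revised description of prefix:<=> says that each iteration may return a group of items, that the grouping stays visible when building a slice, and that it is flattened when building a List. A rough Python sketch of that distinction follows. `BatchIterator`, `next_batch`, `build_slice` and `build_list` are hypothetical names made up for this illustration; the exception raised on exhaustion is Python's own `StopIteration`, standing in for whatever exception the spec eventually settles on.

```python
class BatchIterator:
    """Hypothetical iterator: every call returns the items of one iteration."""

    def __init__(self, batches):
        self._batches = iter(batches)

    def next_batch(self):
        # Raises StopIteration once the underlying source is exhausted.
        return next(self._batches)


def build_slice(iterator):
    """Keep the grouping: one inner list per iteration."""
    rows = []
    while True:
        try:
            rows.append(list(iterator.next_batch()))
        except StopIteration:
            return rows


def build_list(iterator):
    """Flatten: the per-iteration grouping is discarded."""
    return [item for row in build_slice(iterator) for item in row]


# The middle iteration produces nothing, as a grep-like step might.
batches = [(1, 2), (), (3,)]
as_slice = build_slice(BatchIterator(batches))
as_list = build_list(BatchIterator(batches))
```

Here `as_slice` keeps three rows, including the empty one, while `as_list` contains just the three items in order.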
r24090 - docs/Perl6/Spec
Author: ruoso
Date: 2008-11-27 14:57:34 +0100 (Thu, 27 Nov 2008)
New Revision: 24090

Modified:
   docs/Perl6/Spec/S07-iterators.pod
Log:
[spec] Small text revisions on S07

Modified: docs/Perl6/Spec/S07-iterators.pod
===
--- docs/Perl6/Spec/S07-iterators.pod 2008-11-27 13:49:24 UTC (rev 24089)
+++ docs/Perl6/Spec/S07-iterators.pod 2008-11-27 13:57:34 UTC (rev 24090)
@@ -11,14 +11,14 @@
 Daniel Ruoso <[EMAIL PROTECTED]>
  Date: 27 Nov 2008
  Last Modified: 27 Nov 2008
- Version: 1
+ Version: 2

 =head1 Laziness and Eagerness

 As we all know, one of the primary virtues of the Perl programmer is
 laziness. This is also one of the virtues of Perl itself. However,
-Perl knows better than to succumb to false laziness, and so is eager
-sometimes, and lazy others. Perl defines 4 levels of laziness for
+Perl 6 knows better than to succumb to false laziness, and so is eager
+sometimes, and lazy others. Perl 6 defines 4 levels of laziness for
 Iterators:

 =over
@@ -46,7 +46,7 @@

 It's important to realize that the responsability of determining the
 level of lazyness/eagerness in each operation is external to each lazy
-object, the runtime, depending on which operation is being performed
+object, the runtime, depending on which operation is being performed,
 is going to assume the level of lazyness and perform the needed
 operations to apply that level.

@@ -75,7 +75,7 @@
 =item Feed operators: my @a <== @something;

 The feed operator is strictly lazy, meaning that no operation should
-be performed before the user requests any element from @a. That's how
+be performed before the user requests any element. That's how

  my @a <== grep { ... } <== map { ... } <== grep { ... } <== 1, 2, 3

@@ -111,8 +111,7 @@

 The iterator role represents the lazy access to a list, walking
 through a data structure (list, tree whatever), feeds (map, grep etc)
-or a stream (mostly for IO). Each time it is called, will return the
-elements produced at that iteration.
+or a stream (mostly for IO).

 It's important to realize that the iterator of a list can be accessed
 by the .Iterator() method (but only the runtime will be calling that
@@ -133,7 +132,7 @@

 =head1 Auxiliary Implementations

-Perl's built-ins require that a number of default iterators exist.
+Perl's built-ins require that a number of auxiliary types.

 =head2 Generic Item Iterator
r24091 - docs/Perl6/Spec
Author: ruoso
Date: 2008-11-27 15:07:52 +0100 (Thu, 27 Nov 2008)
New Revision: 24091

Modified:
   docs/Perl6/Spec/S07-iterators.pod
Log:
[spec] small code examples on how to get the generic item iterator, lazy slice and lazy list

Modified: docs/Perl6/Spec/S07-iterators.pod
===
--- docs/Perl6/Spec/S07-iterators.pod 2008-11-27 13:57:34 UTC (rev 24090)
+++ docs/Perl6/Spec/S07-iterators.pod 2008-11-27 14:07:52 UTC (rev 24091)
@@ -140,22 +140,34 @@
 they can use a generic item iterator to consolidate the access to the
 input iterator, doing additional iterators when an empty capture is
 returned and holding additional values if more than one item is
-returned.
+returned. The following code will result in getting a generic item
+iterator:

+ my $foo <== @a;
+
+You can later do:
+
+ my $item = =$foo;
+
 =head2 Generic Lazy List

 The generic lazy list accepts an iterator as input, and consumes the
 iterator as the elements of the list are accessed but flattening and
-storing the already-evaluated elements.
+storing the already-evaluated elements. To obtain a generic lazy list,
+just do:

+ my @a <== @b;
+
 =head2 Generic Lazy Slice

 The generic lazy slice accepts an iterator as input, and consumes the
 iterator as the elements of the list are accessed but storing the
 already-evaluated elements as a bi-dimensional list, where the first
 dimension holds each iteration, and the second contains the return of
-each iteration.
+each iteration. To obtain a generic lazy slice, do:

+ my @@a <== map { ... }, 1,2,3;
+
 =head1 Additions

 Please post errors and feedback to perl6-language. If you are making
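
The "Generic Lazy List" described in this revision consumes its input iterator only as elements are accessed and stores what it has already evaluated. A small Python sketch of that behaviour is given below. The class name `LazyList` and its `evaluated` method are assumptions made for the illustration; the draft does not define such an interface, and only non-negative indexes are handled.

```python
from itertools import count


class LazyList:
    """Hypothetical stand-in for the draft's generic lazy list."""

    def __init__(self, source):
        self._source = iter(source)
        self._seen = []  # elements that have already been evaluated

    def __getitem__(self, index):
        # Pull from the source only as far as needed to reach `index`.
        while len(self._seen) <= index:
            try:
                self._seen.append(next(self._source))
            except StopIteration:
                raise IndexError(index) from None
        return self._seen[index]

    def evaluated(self):
        """How many elements have been pulled from the source so far."""
        return len(self._seen)


# An infinite source is fine, because nothing is consumed up front.
squares = LazyList(n * n for n in count(1))
third = squares[2]   # forces the first three elements
again = squares[0]   # answered from the stored elements, no new work
```

Asking for index 2 evaluates exactly three elements; asking for index 0 afterwards evaluates none. A lazy slice in the draft's sense would store one inner list per iteration instead of flattening.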
r24092 - docs/Perl6/Spec
Author: ruoso
Date: 2008-11-27 15:58:55 +0100 (Thu, 27 Nov 2008)
New Revision: 24092

Modified:
   docs/Perl6/Spec/S07-iterators.pod
Log:
[spec] lazyness applies to every object

Modified: docs/Perl6/Spec/S07-iterators.pod
===
--- docs/Perl6/Spec/S07-iterators.pod 2008-11-27 14:07:52 UTC (rev 24091)
+++ docs/Perl6/Spec/S07-iterators.pod 2008-11-27 14:58:55 UTC (rev 24092)
@@ -18,8 +18,7 @@
 As we all know, one of the primary virtues of the Perl programmer is
 laziness. This is also one of the virtues of Perl itself. However,
 Perl 6 knows better than to succumb to false laziness, and so is eager
-sometimes, and lazy others. Perl 6 defines 4 levels of laziness for
-Iterators:
+sometimes, and lazy others. Perl 6 defines 4 levels of laziness:

 =over
Re: Files, Directories, Resources, Operating Systems
Hi,

First of all, sorry for breaking the thread, but I had some trouble with
my mail provider, and couldn't hit the "reply" button.

To the point...

I think there are some things that are simply not solved by abstraction.
Some problems are concrete problems that need concrete solutions;
filesystem access is one of them, IMNSHO.

I pretty much think

    if ($*OS ~~ POSIX) {
        ...
    } elsif ($*OS ~~ Win32) {
        ...
    }

is much saner than trying to deal with an enormous API that would be the
result of the attempt to get a sane abstraction of all the different
possible scenarios, and that would end up having backward-incompatible
changes after a while because of some use case scenario that wasn't
addressed.

On the other hand, we really could think about having chmod, chown etc
in the POSIX module, and have the POSIX module imported (where chmod
would be in the default exports) by the prelude when on a posix machine,
and the same for the Win32 or whatever counterpart.

Of course it would be very interesting to have the "open" implemented by
the POSIX module with the same API as the "open" implemented by the
Win32 module. But I'm pretty much sure that's not the case for chown and
chmod, and I don't think an abstract API is worth the trouble for 99% of
the cases.

But note that this doesn't stop the people in the 1% case from writing
the abstraction API; I just think it doesn't need to be the only way to
access the features, and it certainly doesn't need to be loaded in the
prelude.

daniel
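
As a loose analogy for the per-platform dispatch suggested above, here is how the same choice looks in Python, where `os.name` plays roughly the role of `$*OS`. The function names are invented for the example, and the Windows branch relies on the documented fact that `os.chmod` there can only toggle the read-only flag. It is a sketch of the design idea, not a proposal for the Perl 6 API.

```python
import os
import stat


def make_read_only_posix(path):
    """POSIX: clear every write bit and keep the rest of the mode."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))


def make_read_only_windows(path):
    """Windows: os.chmod can only set or clear the read-only flag."""
    os.chmod(path, stat.S_IREAD)


# Choose the concrete implementation once, the way a prelude might import
# a POSIX or a Win32 module depending on the machine it runs on.
if os.name == "posix":
    make_read_only = make_read_only_posix
else:
    make_read_only = make_read_only_windows
```

Callers use `make_read_only(path)` without caring which branch was chosen, while anything that has no sensible counterpart on the other platform (chown, say) simply stays in its own platform module.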
Re: Files, Directories, Resources, Operating Systems
Tom Christiansen wrote:
> In-Reply-To: Message from Darren Duncan <[EMAIL PROTECTED]>
>
>>> There is ABSOLUTELY NO WAY I've found to tell whether these utf-8
>>> strings should test equal, and when, nor how to order them, without
>>> knowing the locale:
>>>
>>>     "RESUME",
>>>     "Resume"
>>>     "resume"
>>>     "Resum\x{e9}"
>>>     "r\x{E9}sum\x{E9}"
>>>     "r\x{E9}sume\x{301}"
>>>     "Re\x{301}sume\x{301}"
>
>> I believe that the most important issues here, those having to do with
>> identity, can be discussed and solved without unduly worrying about
>> matters of collation;
>
> It's funny you should say that, as I could nearly swear that I just
> showed that identity cannot be determined in the examples above without
> knowing about locales.
>
> To wit, while all of those sort somewhat differently, even
> case-insensitively, no matter whether you're thinking of a French or a
> Spanish ordering (and what is English's, anyway?), you have a more
> fundamental = vs != scenario which is entirely locale-dependent.

If your current abstraction level is the Unicode codepoint level, then
no knowledge of locale is needed at all in an everything-sensitive
filesystem. Those 7 examples are all distinct for you, end of story.

So you can see why I advocate everything-sensitive as being the "normal"
case, same as with Perl identifiers.

Rather than thinking of locales in terms of something special, AFAIK any
locale can be reduced to a simple (though possibly verbose but
predefinable in a library) normalized portable definition built from
everything-sensitive components where the components are enumerations
and functions describing a character repertoire (what characters can
exist) plus representation normalization rules plus where applicable
collation (ordering) rules plus where applicable mutual exclusion rules.

When your core toolkit just works with everything-sensitive components
and insensitive or locale issues are just defined as formulae over that,
then we have indeed separated the locale issues into a connected but
non-core problem.

>> So collation doesn't need to be considered in Perl's file-system
>> interface, while identity does; collation can be a layer on top of the
>> core interface that just cares about identity.
>
> That seems a simplified version of reality. Identity isn't what
> monoglots think it is.

I'm wondering if we're talking about the same meaning of the word
"collation".

The way I have been using it, or meaning to, "collation" simply talks
about how you put a set of values in order such that each 2 distinct
values has a before|after relationship. Whereas identity is testing
whether 2 things you hold are just the same value or not. You don't need
to have ordering rules defined in order to have known equality rules.

>> If you *know* that the 7 strings are all UTF-8, then locale doesn't
>> have to be considered for equality; just your unicode abstraction
>> level matters, such as if you're defining the values in terms of
>> graphemes vs codepoints vs bytes.
>
> That's not true. é is not the same letter as e in Icelandic.

I don't consider those to be the same character, period. Mind you,
everywhere I've said "graphemes" I meant language-independent graphemes.

I grant you that if you get into a further abstraction level of
language-dependent graphemes, then some may see those 2 characters as
being identical, and if that's your point then I can better understand
now where you're coming from with the problems you raise.

Practically speaking, I think that portability and other concerns would
require us to just not go higher than the language-independent grapheme
abstraction level when dealing with either Perl identifiers or file
names or other urls with non-platform-specific APIs, and simply treat
every language-independent grapheme as being distinct/non-identical from
every other one, even if some locales might do different. Users should
be able to deal with this gracefully enough, much as people can easily
enough treat "E" and "e" as being distinct.

-- Darren Duncan
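
The abstraction levels mentioned in this message (bytes, codepoints, language-independent graphemes) can be told apart with a two-string experiment. The Python sketch below uses NFC normalization from the standard `unicodedata` module as an approximation of the "language-independent grapheme" level; that mapping is an assumption made for the example, since canonical equivalence and grapheme clusters are related but not the same thing.

```python
import unicodedata

precomposed = "r\u00e9sum\u00e9"   # both accented letters are single code points
combining = "r\u00e9sume\u0301"    # the last one is e followed by U+0301

# Byte level: the UTF-8 encodings differ.
same_bytes = precomposed.encode("utf-8") == combining.encode("utf-8")

# Codepoint level: still two different strings.
same_codepoints = precomposed == combining

# Canonical normalization (NFC): now they compare equal.
same_after_nfc = (
    unicodedata.normalize("NFC", precomposed)
    == unicodedata.normalize("NFC", combining)
)

# NFC never merges e with e-acute, whatever a given locale may think.
e_equals_e_acute = (
    unicodedata.normalize("NFC", "e") == unicodedata.normalize("NFC", "\u00e9")
)
```

Only the normalized comparison treats the two spellings as one name, and no level below language-dependent rules ever equates e with é.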
r24098 - docs/Perl6/Spec
Author: wayland
Date: 2008-11-27 23:40:00 +0100 (Thu, 27 Nov 2008)
New Revision: 24098

Modified:
   docs/Perl6/Spec/S07-iterators.pod
Log:
Cleaned up text a little bit, hopefully clarified things.

Modified: docs/Perl6/Spec/S07-iterators.pod
===
--- docs/Perl6/Spec/S07-iterators.pod 2008-11-27 18:34:05 UTC (rev 24097)
+++ docs/Perl6/Spec/S07-iterators.pod 2008-11-27 22:40:00 UTC (rev 24098)
@@ -18,8 +18,15 @@
 As we all know, one of the primary virtues of the Perl programmer is
 laziness. This is also one of the virtues of Perl itself. However,
 Perl 6 knows better than to succumb to false laziness, and so is eager
-sometimes, and lazy others. Perl 6 defines 4 levels of laziness:
+sometimes, and lazy others.

+One thing that Perl understands is the difference between Laziness and
+Eagerness. When something is Lazy, it says "just give me what you've
+got; I'll get the rest later", whereas when it's eager, it says "More!
+More! Give me everything you can get!".
+
+Perl 6 defines 4 levels of laziness:
+
 =over

 =item Strictly Lazy
@@ -108,12 +115,21 @@

 =head1 The Iterator Role

-The iterator role represents the lazy access to a list, walking
-through a data structure (list, tree whatever), feeds (map, grep etc)
-or a stream (mostly for IO).
+The iterator role represents the lazy access to a list, walking through
+one of:

+=over
+
+=item Data structure (list, tree, table, etc)
+
+=item Feed (map, grep, etc)
+
+=item Stream (mostly for IO)
+
+=back
+
 It's important to realize that the iterator of a list can be accessed
-by the .Iterator() method (but only the runtime will be calling that
+by the .iterator() method (but only the runtime will be calling that
 most of the time), and the implemenation of each iterator is private
 to the list and implementation specific.

@@ -121,6 +137,8 @@
 implemenations, but this spec should be expanded in the future to
 provide additional API for batch-aware iterators.

+The methods in this role are:
+
 =head2 method prefix:<=> {...}

 Returns the items for that iteration. The grouping of elements
Iterators and Laziness
No doubt some of you have seen the Draft S07-iterators, but for those
who haven't:

http://svn.pugscode.org/pugs/docs/Perl6/Spec/S07-iterators.pod

I have some questions here (mostly directed to Daniel Ruoso, but others
can feel free to chip in if they have thoughts).

Should laziness/eagerness be a property of the operator?

You write "The iterator role represents the lazy access to a list". Why
only lazy access?

You write:

    my $item = =$foo;

Does that get one item from the iterator object?

You specify that the .Iterator() object on something that does List will
return the appropriate iterator. I changed it to be .iterator() (so that
it's lowercase like all the other List.whatever() functions); what's the
function signature?

    method iterator() {...}

...or are there parameters?

:)

-
| Name: Tim Nelson | Because the Creator is,|
| E-mail: [EMAIL PROTECTED]| I am |
-

BEGIN GEEK CODE BLOCK
Version 3.12
GCS d+++ s+: a- C++$ U+++$ P+++$ L+++ E- W+ N+ w--- V- PE(+) Y+>++ PGP->+++ R(+) !tv b++ DI D G+ e++> h! y-
-END GEEK CODE BLOCK-