Dear Richard,
sorry for the delayed reply and thanks for your thoughts. I wrapped my head around your suggestions and tried a few things to see how they play out in code. I tend towards the following names.

- Reducing function (a role) is a binary block / object that responds to #value:value: and #numArgs; commonly used with #inject:into:
- Completing function (a role) is a unary block / object that responds to #value:.
- Reducer is a pair (rfn, cfn) of a reducing function and a completing function.
- Reduction is a pair (red, val) of a reducer and an initial value.

I like your idea to invert the message flow and implement #reduce: in Reduction. It reads well in my eyes in actual code. I do not think that this is a misnomer, since it indeed reduces a sequence of elements to a single object. A reducer knows how to apply first its rfn to a sequence and then cfn to the result.

    negatedSum := (#+ completing: #negated) init: 0.
    result := negatedSum reduce: (1 to: 10).

    asSet := (#add completing: #trim) initializer: [Set new].
    distinct := asSet reduce: aCollection

I understand your concerns that we usually either do not need cfn or we apply it manually to the result of the reduction. Yes, most often cfn = identity and I took this into account. However, in some cases I actually need to pass them around as a single object. I also attempted to find names closer to #inject:into: but did not come up with any good ideas yet.

Does this make sense so far? I deliberately did not mention transducers above to focus on the naming first.

I heard about "obviously synchronizable series expressions" before and had a look at the paper just now. In short, transducers are in the same spirit and offer similar nice properties. But maybe we save the details for another thread?

Regarding the examples. Yes, they are simple, stupid and the reimplementation of #reduce: is far from optimal. The goal was to show how the objects/messages are used rather than coming up with real world examples or showing off.
;-)

Kind regards,
Steffen


Richard O'Keefe wrote on Saturday, 15 April 2023 16:28:09 (+02:00):

Let
    initial :: a
    combine :: a -> b -> a
    finish  :: a -> c
then
    (finish . foldl combine initial) :: Foldable t => t b -> c

This appears to be analogous to your 'reduce with completion'. What *can* be done with composition generally *should* be done with composition; I really don't see any advantage in defining

    reduce finish combine initial = finish . foldl combine initial

As for growing a collection and then finally trimming it to its desired size, I've never known anything like that to be a good idea. There's a reason why my library includes

    Set          BoundedSet
    IdentitySet  BoundedIdentitySet
    Deque        BoundedDeque
    Heap         BoundedHeap

I use BoundedHeap a lot. If I want to find the k best things out of n, using BoundedHeap lets me do it in O(n.log k) time and O(k) space instead of O(n.log n) time and O(n) space.

Your example showing how much better #reduceLeft: is when expressed using transducers is not well chosen, because the implementation of #reduceLeft: in Pharo is (how to say this politely?) not up to the standards we expect from Pharo. Here is how it stands in my compatibility library:

    Enumerable
      methods for: 'summarising'
        reduceLeft: aBlock
          <compatibility: #pharo>
          |n|
          ^(n := aBlock argumentCount) > 2
             ifTrue: [
               |i a|
               a := Array new: n.
               i := 0.
               self do: [:each |
                 a at: (i := i + 1) put: each.
                 i = n ifTrue: [
                   a at: (i := 1) put: (aBlock valueWithArguments: a)]].
               i = 1
                 ifTrue:  [a at: i]
                 ifFalse: [self error: 'collection/block size mismatch']]
             ifFalse: [
               "If n = 0 or 1 and the receiver has two or more elements,
                this will raise an exception passing two arguments to aBlock.
                That's a good way to complain about it."
               |r f|
               r := f := nil.
               self do: [:each |
                 r := f ifNil: [f := self. each]
                        ifNotNil: [aBlock value: r value: each]].
               "f stays nil only if the receiver was empty."
               f ifNil: [CollectionTooSmall collection: self]
                 ifNotNil: [r]]

No OrderedCollection. And exactly one traversal of the collection.
(Enumerables can only be traversed once. Collections can be traversed multiple times.)

The only method the receiver needs to provide is #do:, so this method doesn't really need to be in Enumerable. It could, for example, be in Block.

#reduceRight: I've placed in AbstractSequence because it needs #reverseDo:, but it equally makes sense as a method of Block (as it knows more about what Blocks can do than it does about what sequences can do). For example, I have SortedSet and SortedBag which can be traversed forwards and backwards but only #reduceLeft: is available to them; #reduceRight: is not, despite them sensibly supporting #reverseDo:. For that matter, Deques support #reverseDo: but are not sequences...

To the limited extent that I understand the Transducers package, this view that (s reduce{Left,Right}: b) should really be (b reduce{Left,Right}) applyTo: s seems 100% in the spirit of Transducers.

In fact, now that I understand a bit about Transducers,
    <collection> reduce: <transducer>
(a) seems back to front compared with
    <reducer> applyTo: <collection>
(b) seems like a misnomer because in general much may be generated, transformed, and retained, with nothing getting smaller, so why 'reduce'?
It seems like a special case of applying a function-like object (the transducer) to an argument (the collection), so why not
    transducer applyTo: dataSource
    transducer applyTo: dataSource initially: initial

Did you look at Richard Waters' "obviously synchronizable series expressions"? (MIT AI Memo 958A and 959A) Or his later SERIES package? (Appendix A of Common Lisp the Language, 2nd edition)


On Fri, 14 Apr 2023 at 22:32, Steffen Märcker <merk...@web.de> wrote:

Hi Richard!

Thanks for sharing your thoughts.

> There's a reason why #inject:into: puts the block argument last. It works
> better to have "heavy" constituents on the right in an English sentence,
> and it's easier to indent blocks when they come last.

Nice, I never thought of it this way.
I always appreciate a historical background.

Let me try to respond to the rest with a focus on the ideas. First of all, the point of the transducers framework is to cleanly separate iteration over sequences, processing of the elements and accumulation of a result. It enables easy reuse of the concepts common to different data structures.

1. What do I mean by completion? If we iterate over a sequence of objects, we sometimes want to do a final step after all elements have been seen to complete the computation. For instance, after copying elements to a new collection, we may want to trim it to its actual size:

    distinct := col inject: Set new into: #add.
    distinct trim.

For some cases it turns out to be useful to have an object that knows how to do both:

    distinct := col reduce: (#add completing: #trim) init: Set new.

#reduce:init: knows how to deal with both these new objects and ordinary blocks. For now I will call both variants a "reducing function". Note, completion is completely optional and the implementation is literally #inject:into: plus completion if required.

2. What is a reduction? In some cases, it turns out to be useful to pair up a reducing function with an (initial) value. You called it a magma and often it's indeed the neutral element of a mathematical operator, e.g., + and 0. But we can use a block that yields the initial value, too. For instance:

    sum := #+ init: 0.
    result := numbers reduce: sum.

    toSet := (#add completing: #trim) initializer: [Set new].
    distinct := col reduce: toSet.

#reduce: is just a shorthand that passes the function and the value to #reduce:init:. Maybe #reduceMagma: is a reasonable name?

3. The reason I implemented #reduce:init: directly on collections, streams, etc. is that these objects know best how to efficiently iterate over their elements. And if a data structure knows how to #reduce:init: we can use it with all the transducers functions, e.g., for partitioning, filtering etc.
Other useful methods could then be added to the behaviour with a trait, e.g., #transduce:reduce:init: which first applies a transducer and then reduces. As traits are not available in plain VW 8.3, I did not try this approach, though.

4. Let's take #reduceLeft: as an example and reimplement the method using transducers, shall we? The following code works for each sequence/collection/stream that supports #transduce:reduce:init:

    reduceLeft: aBlock
        | head rest arity |
        head := self transduce: (Take number: 1) reduce: [:r :e | e] init: nil.
        rest := Drop number: 1.
        arity := aBlock arity.
        ^arity = 2
            ifTrue: [self transduce: rest reduce: aBlock init: head]
            ifFalse: [
                | size arguments |
                size := arity - 1.
                rest := rest * (Partition length: size) * (Remove predicate: [:part | part size < size]).
                arguments := Array new: arity.
                arguments at: 1 put: head.
                self
                    transduce: rest
                    reduce: ([:args :part |
                                args
                                    replaceFrom: 2 to: arity with: part;
                                    at: 1 put: (aBlock valueWithArguments: args);
                                    yourself]
                            completing: [:args | args first])
                    init: arguments]

This code is both more general and faster: it does not create an intermediate OrderedCollection and it treats the common case of binary blocks efficiently. Note the implementation can be more compact and optimized if it is specialized in a certain class. For instance, SequenceableCollection allows accessing elements by index, which turns the first line into a simple "self first".

Thanks for staying with me for this long reply. I hope I did not miss a point. I do not insist on the existing names but will appreciate any ideas.

Best,
Steffen


Richard O'Keefe wrote on Friday, 14 April 2023 09:43:32 (+02:00):

    #reduce: aReduction

Are you saying that aReduction is an object from which a dyadic block and an initial value can be derived? That's going to confuse the heck out of Dolphin and Pharo users (like me, for example). And in my copy of Pharo, #reduce: calls #reduceLeft:, not #foldLeft:.
The sad thing about #reduceLeft: in Pharo is that in order to provide extra generality I have no use for, it fails to provide a fast path for the common case of a dyadic block.

    reduceLeft: aBlock
        aBlock argumentCount = 2 ifTrue: [
            |r|
            r := self first.
            "Iterate over the indices 2 .. size; the first element seeds r."
            self from: 2 to: self size do: [:each |
                r := aBlock value: r value: each].
            ^r].
        ... everything else as before ...

Adding up a million floats takes half the time using the fast path (67 msec vs 137 msec).

Does your #reduce: also perform "a completion action"? If so, it definitely should not be named after #inject:into:. At any rate, if it does something different, it should have a different name, so #reduce: is no good.

    #reduce:init:

There's a reason why #inject:into: puts the block argument last. It works better to have "heavy" constituents on the right in an English sentence, and it's easier to indent blocks when they come last.

Which of the arguments here specifies the 'completion action'? What does the 'completion action' do? (I can't tell from the name.)

I think the answer is clear:
* choose new intention-revealing names that do not clash.

If I have understood your reduce: aReduction correctly, a Reduction specifies
- a binary operation (not necessarily associative)
- a value which can be passed to that binary operation
which suggests that it represents a magma with identity. By the way, it is not clear whether
    {x} reduce: <<ident. binop>>
answers x or binop value: ident value: x. It's only when ident is an identity for binop that you can say 'it doesn't matter'.

I don't suppose you could bring yourself to call aReduction aMagmaWithIdentity?

Had you considered
    aMagmaWithIdentity reduce: aCollection
where the #reduce: method is now in your class so can't *technically* clash with anything else? All you really need from aCollection is #do: so it could even be a stream.

    MagmaWithIdentity
      >> identity
      >> combine:with:
      >> reduce: anEnumerable
           |r|
           r := self identity.
           anEnumerable do: [:each | r := self combine: r with: each].
           ^r

    MagmaSansIdentity
      >> combine:with:
      >> reduce: anEnumerable
           |r f|
           f := r := nil.
           anEnumerable do: [:each |
             r := f ifNil: [f := self. each]
                    ifNotNil: [self combine: r with: each]].
           f ifNil: [anEnumerable error: 'is empty'].
           ^r


On Fri, 14 Apr 2023 at 05:02, Steffen Märcker <merk...@web.de> wrote:

The reason I came up with the naming question in the first place is that I (finally!) finished my port of Transducers to Pharo. But currently, I am running into a name clash. Maybe you have some good ideas how to resolve the following situation in a pleasant way.

- #fold: exists in Pharo and is an alias of #reduce:
- #reduce: exists in Pharo and calls #foldLeft: which also deals with more than two block arguments

Neither of these is present in VW. Hence, I used the following messages in VW with no name clash:

- #reduce: aReduction "= block + initial value"
- #reduce:init: is similar to #inject:into: but executes an additional completion action

Some obvious ways to avoid a clash in Pharo are:

1) Make #reduce: distinguish between a reduction and a simple block (e.g. by double dispatch)
2) Rename the transducers #reduce: to #injectInto: and adapt #inject:into: to optionally do the completion
3) Find another selector that is not too counter-intuitive

All three approaches have some downsides in my opinion:

1) Though straightforward to implement, both flavors behave quite differently, especially with respect to the number of block arguments. The existing one creates a SequenceableCollection and partitions it according to the required number of args. Transducers' #reduce: considers binary blocks as the binary fold case but ternary blocks as fold with indexed elements.
2) This is a real extension of #inject:into: but requires touching multiple implementations of that message, something I consider undesirable.
3) Currently, I cannot think of a good name that is not too far away from what we're familiar with.
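
As a rough playground sketch of what "#inject:into: plus an additional completion action" amounts to, assuming plain blocks stand in for the reducing and the completing function: the variable names below are made up for illustration and are not part of the Transducers package, and only the standard #inject:into: and #value: messages are used.

```smalltalk
"Hypothetical names; plain blocks play the roles described above."
| reducingFunction completingFunction initialValue accumulated result |
reducingFunction := [:sum :each | sum + each].   "binary: accumulator, element"
completingFunction := [:sum | sum negated].      "unary: applied once at the end"
initialValue := 0.

"Step 1: an ordinary fold with an explicit initial value."
accumulated := (1 to: 10) inject: initialValue into: reducingFunction.

"Step 2: the completion action, applied to the folded result."
result := completingFunction value: accumulated.
result   "-55"
```

With an identity completion such as [:x | x] the second step changes nothing, which is the common case mentioned in the thread.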
Do you have some constructive comments and ideas?

Kind regards,
Steffen


Steffen Märcker wrote on Thursday, 13 April 2023 17:11:15 (+02:00):

:-D I don't know how compress made it onto that site. There is not even an example in the list of language examples where fold/reduce is named compress.


Richard O'Keefe wrote on Thursday, 13 April 2023 16:34:29 (+02:00):

OUCH. Wikipedia is as reliable as ever, I see. compress and reduce aren't even close to the same thing. Since the rank of the result of compression is the same as the rank of the right operand, and the rank of the result of reducing is one lower, they are really quite different. compress is Fortran's PACK.
https://gcc.gnu.org/onlinedocs/gfortran/PACK.html


On Fri, 14 Apr 2023 at 01:34, Steffen Märcker <merk...@web.de> wrote:

Hi Richard and Sebastian!

Interesting read. I obviously was not aware of the variety of meanings for fold/reduce. Thanks for pointing this out. Also, in some languages it seems the same name is used for both reductions with and without an initial value. There's even a list on WP on the matter:
https://en.wikipedia.org/wiki/Fold_%28higher-order_function%29#In_various_languages

Kind regards,
Steffen


Richard O'Keefe wrote on Thursday, 13 April 2023 13:16:28 (+02:00):

The standard prelude in Haskell does not define anything called "fold". It defines fold{l,r}{,1} which can be applied to any Foldable data (see Data.Foldable). For technical reasons having to do with Haskell's non-strict evaluation, foldl' and foldr' also exist. But NOT "fold".
https://hackage.haskell.org/package/base-4.18.0.0/docs/Data-Foldable.html#laws


On Thu, 13 Apr 2023 at 21:17, Sebastian Jordan Montano <sebastian.jor...@inria.fr> wrote:

Hello Steffen,

Let's take the Kotlin documentation (https://kotlinlang.org/docs/collection-aggregate.html#fold-and-reduce)

> The difference between the two functions is that fold() takes an initial
> value and uses it as the accumulated value on the first step, whereas the
> first step of reduce() uses the first and the second elements as operation
> arguments on the first step.

Naming is not so consistent across programming languages; they mix up the names "reduce" and "fold". For example in Haskell "fold" does not take an initial value, so it is like a "reduce" in Kotlin. In Kotlin, Java, Scala and other OO languages "reduce" does not take an initial value while "fold" does. Pharo aligns with those languages (except that our fold is called #inject:into:).

So for me the Pharo methods #reduce: and #inject:into: represent well what they are doing and they are well named.

Cheers,
Sebastian

----- Original message -----
> From: "Steffen Märcker" <merk...@web.de>
> To: "Any question about pharo is welcome" <pharo-users@lists.pharo.org>
> Sent: Wednesday, 12 April 2023 19:03:01
> Subject: [Pharo-users] Collection>>reduce naming

> Hi!
>
> I wonder whether there was a specific reason to name this method #reduce:?
> I would have expected #fold: as this is the more common term for what it
> does. And in fact, even the comment reads "Fold the result of the receiver
> into aBlock." Whereas #reduce: is the common term for what we call with
> #inject:into: .
>
> I am asking not to annoy anyone but out of curiosity. I figured this out
> only by some weird behaviour after porting some code that (re)defines
> #reduce .
>
> Ciao!
> Steffen
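
To make the fold/reduce distinction from the Kotlin documentation concrete in Pharo terms, here is a minimal playground sketch. It uses only the #inject:into: and #reduce: messages named in the thread; the variable names are illustrative.

```smalltalk
| numbers withInitial withoutInitial onEmpty |
numbers := #(1 2 3 4).

"Kotlin-style fold: the accumulator starts at the explicit initial value 0."
withInitial := numbers inject: 0 into: [:acc :each | acc + each].      "10"

"Kotlin-style reduce: no initial value; the first element seeds the accumulator."
withoutInitial := numbers reduce: [:acc :each | acc + each].           "10"

"With an initial value, an empty collection simply answers that value."
onEmpty := #() inject: 0 into: [:acc :each | acc + each].              "0"
```

The two only differ observably when the initial value is not an identity for the block, or when the collection is empty.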