Hi Steffen
> The short answer is that the compact notation turned out to work much
> better for me in my code, especially if multiple transducers are
> involved. But that's my personal taste. You can choose which suits you
> better. In fact,
>
>     1000 take.
>
> just sits on top and simply calls
>
>     Take number: 1000.

To me this is much, much better.

> If the need arises, we could of course factor the compact notation out
> into a separate package.

Good idea.

> Btw, would you prefer (Take n: 1000) over (Take number: 1000)?

I tend to prefer the explicit selector :)

> Damien, you're right, I experimented with additional styles. Right now,
> we already have in the basic Transducer package:
>
>     collection transduce: #squared map * 1000 take.
>     "which is equal to"
>     (collection transduce: #squared map) transduce: 1000 take.
>
> Basically, one can split #transduce:reduce:init: into single calls of
> #transduce:, #reduce:, and #init:, depending on the needs.
> I also have an (unfinished) extension that allows one to write:
>
>     (collection transduce map: #squared) take: 1000.

To me this is much more readable. I cannot and do not want to use the
other forms.

> This feels familiar, but becomes a bit hard to read if more than two
> steps are needed.
>
>     collection transduce
>         map: #squared;
>         take: 1000.

Why would this be hard to read? We do that all the time, everywhere.

> I think this alternative would read nicely. But as the message chain
> has to modify the underlying object (an eduction), very sneaky side
> effects may occur. E.g., consider
>
>     eduction := collection transduce.
>     squared := eduction map: #squared.
>     take := squared take: 1000.
>
> Now, all three variables hold onto the same object, which first squares
> all elements and then takes the first 1000.

This is because the programmer did not understand what he did. No?
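For what it's worth, the sharing is easy to check directly (a sketch
based on your description of the cascade extension; collection stands
for any reducible source):

    eduction := collection transduce.
    squared := eduction map: #squared.
    take := squared take: 1000.

    eduction == squared.    "true - #map: answers the mutated receiver"
    squared == take.        "true - so does #take:"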
Stef

PS: I played with infinite streams and iteration back in 1993 in CLOS.
Now I do not like to mix things because it breaks my flow of thinking.

> Best,
> Steffen
>
> On .06.2017 at 21:28, Damien Pollet
> <damien.pollet+ph...@gmail.com> wrote:
>
>> If I recall correctly, there is an alternate protocol that looks more
>> like xtreams or the traditional select/collect iterations.
>>
>> On 2 June 2017 at 21:12, Stephane Ducasse <stepharo.s...@gmail.com>
>> wrote:
>>
>>> I have a design question.
>>>
>>> Why is the library implemented in functional style vs. messages?
>>> I do not see why this is needed. To my eyes the compact notation
>>> goes against readability of code and it feels ad hoc in Smalltalk.
>>>
>>> I really prefer
>>>
>>>     square := Map function: #squared.
>>>     take := Take number: 1000.
>>>
>>> because I know that I can read it and understand it.
>>> From that perspective I prefer Xtreams.
>>>
>>> Stef
>>>
>>> On Wed, May 31, 2017 at 2:23 PM, Steffen Märcker <merk...@web.de>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am the developer of the library 'Transducers' for VisualWorks. It
>>>> was formerly known as 'Reducers', but this name was a poor choice.
>>>> I'd like to port it to Pharo, if there is any interest on your
>>>> side. I hope to learn more about Pharo in this process, since I am
>>>> mainly a VW guy. And most likely, I will come up with a bunch of
>>>> questions. :-)
>>>>
>>>> Meanwhile, I'll cross-post the introduction from VWnc below. I'd be
>>>> very happy to hear your opinions and questions, and I hope we can
>>>> start a fruitful discussion - even if there is no Pharo port yet.
>>>>
>>>> Best, Steffen
>>>>
>>>>
>>>> Transducers are building blocks that encapsulate how to process
>>>> elements of a data sequence independently of the underlying input
>>>> and output source.
>>>>
>>>>
>>>> # Overview
>>>>
>>>> ## Encapsulate
>>>> Implementations of enumeration methods, such as #collect:, have the
>>>> logic of how to process a single element in common.
>>>> However, that logic is reimplemented each and every time.
>>>> Transducers make it explicit and facilitate re-use and coherent
>>>> behavior. For example:
>>>> - #collect: requires mapping: (aBlock1 map)
>>>> - #select: requires filtering: (aBlock2 filter)
>>>>
>>>> ## Compose
>>>> In practice, algorithms often require multiple processing steps,
>>>> e.g., mapping only a filtered set of elements.
>>>> Transducers are inherently composable and thereby allow making the
>>>> combination of steps explicit.
>>>> Since transducers do not build intermediate collections, their
>>>> composition is memory-efficient. For example:
>>>> - (aBlock1 filter) * (aBlock2 map)  "(1.) filter and (2.) map elements"
>>>>
>>>> ## Re-Use
>>>> Transducers are decoupled from the input and output sources, and
>>>> hence they can be reused in different contexts, for example:
>>>> - enumeration of collections
>>>> - processing of streams
>>>> - communicating via channels
>>>>
>>>>
>>>> # Usage by Example
>>>>
>>>> We build a coin-flipping experiment and count the occurrences of
>>>> heads and tails.
>>>>
>>>> First, we associate random numbers with the sides of a coin.
>>>>
>>>>     scale := [:x | (x * 2 + 1) floor] map.
>>>>     sides := #(heads tails) replace.
>>>>
>>>> Scale is a transducer that maps numbers x between 0 and 1 to 1
>>>> and 2.
>>>> Sides is a transducer that replaces the numbers with heads and
>>>> tails by lookup in an array.
>>>> Next, we choose a number of samples.
>>>>
>>>>     count := 1000 take.
>>>>
>>>> Count is a transducer that takes 1000 elements from a source.
>>>> We keep track of the occurrences of heads and tails using a bag.
>>>>
>>>>     collect := [:bag :c | bag add: c; yourself].
>>>>
>>>> Collect is a binary block (reducing function) that collects events
>>>> in a bag.
>>>> We assemble the experiment by transforming the block using the
>>>> transducers.
>>>>
>>>>     experiment := (scale * sides * count) transform: collect.
>>>>
>>>> From left to right we see the steps involved: scale, sides, count,
>>>> and collect.
>>>> Transforming assembles these steps into a binary block (reducing
>>>> function) we can use to run the experiment.
>>>>
>>>>     samples := Random new
>>>>                   reduce: experiment
>>>>                   init: Bag new.
>>>>
>>>> Here, we use #reduce:init:, which is mostly similar to
>>>> #inject:into:.
>>>> To execute a transformation and a reduction together, we can use
>>>> #transduce:reduce:init:.
>>>>
>>>>     samples := Random new
>>>>                   transduce: scale * sides * count
>>>>                   reduce: collect
>>>>                   init: Bag new.
>>>>
>>>> We can also express the experiment as a data flow using #<~.
>>>> This enables us to build objects that can be re-used in other
>>>> experiments.
>>>>
>>>>     coin := sides <~ scale <~ Random new.
>>>>     flip := Bag <~ count.
>>>>
>>>> Coin is an eduction, i.e., it binds transducers to a source and
>>>> understands #reduce:init:, among others.
>>>> Flip is a transformed reduction, i.e., it binds transducers to a
>>>> reducing function and an initial value.
>>>> By sending #<~, we draw further samples from flipping the coin.
>>>>
>>>>     samples := flip <~ coin.
>>>>
>>>> This yields a new Bag with another 1000 samples.
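As a quick sanity check, the resulting bag can be queried with the
standard Bag protocol (a sketch; the exact counts vary per run):

    samples occurrencesOf: #heads.  "roughly 500"
    samples occurrencesOf: #tails.  "roughly 500"
    samples size.                   "1000"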
>>>>
>>>> # Basic Concepts
>>>>
>>>> ## Reducing Functions
>>>>
>>>> A reducing function represents a single step in processing a data
>>>> sequence.
>>>> It takes an accumulated result and a value, and returns a new
>>>> accumulated result. For example:
>>>>
>>>>     collect := [:col :e | col add: e; yourself].
>>>>     sum := #+.
>>>>
>>>> A reducing function can also be ternary, i.e., it takes an
>>>> accumulated result, a key, and a value. For example:
>>>>
>>>>     collect := [:dict :k :v | dict at: k put: v; yourself].
>>>>
>>>> Reducing functions may be equipped with an optional completing
>>>> action. After processing finishes, it is invoked exactly once,
>>>> e.g., to free resources.
>>>>
>>>>     stream := [:str :e | str nextPut: e; yourself] completing: #close.
>>>>     absSum := #+ completing: #abs.
>>>>
>>>> A reducing function can end processing early by signaling Reduced
>>>> with a result.
>>>> This mechanism also enables the treatment of infinite sources.
>>>>
>>>>     nonNil := [:res :e |
>>>>         e ifNil: [Reduced signalWith: res] ifNotNil: [res]].
>>>>
>>>> The primary approach to processing a data sequence is the reducing
>>>> protocol, with the messages #reduce:init: and, if transducers are
>>>> involved, #transduce:reduce:init:.
>>>> The behavior is similar to #inject:into:, but in addition it takes
>>>> care of:
>>>> - handling binary and ternary reducing functions,
>>>> - invoking the completing action after finishing, and
>>>> - stopping the reduction if Reduced is signaled.
>>>> The message #transduce:reduce:init: just combines the
>>>> transformation and the reducing step.
>>>>
>>>> However, as reducing functions are step-wise in nature, an
>>>> application may choose other means to process its data.
>>>>
>>>>
>>>> ## Reducibles
>>>>
>>>> A data source is called reducible if it implements the reducing
>>>> protocol.
>>>> Default implementations are provided for collections and streams.
>>>> Additionally, blocks without an argument are reducible, too.
>>>> This makes it possible to adapt custom data sources without
>>>> additional effort. For example:
>>>>
>>>>     "XStreams adaptor"
>>>>     xstream := filename reading.
>>>>     reducible := [[xstream get] on: Incomplete do: [Reduced signal]].
>>>>
>>>>     "natural numbers"
>>>>     n := 0.
>>>>     reducible := [n := n + 1].
>>>>
>>>>
>>>> ## Transducers
>>>>
>>>> A transducer is an object that transforms a reducing function into
>>>> another.
>>>> Transducers encapsulate common steps in processing data sequences,
>>>> such as map, filter, concatenate, and flatten.
>>>> A transducer transforms a reducing function into another via
>>>> #transform: in order to add those steps.
>>>> They can be composed using #*, which yields a new transducer that
>>>> does both transformations.
>>>> Most transducers require an argument, typically blocks, symbols, or
>>>> numbers:
>>>>
>>>>     square := Map function: #squared.
>>>>     take := Take number: 1000.
>>>>
>>>> To facilitate compact notation, the argument types implement
>>>> corresponding methods:
>>>>
>>>>     squareAndTake := #squared map * 1000 take.
>>>>
>>>> Transducers requiring no argument are singletons and can be
>>>> accessed by their class name.
>>>>
>>>>     flattenAndDedupe := Flatten * Dedupe.
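Putting these pieces together: a composed transducer transforms a plain
reducing function via #transform:, and the result runs with
#reduce:init: just like the experiment above (a sketch; the variable
name is made up):

    sumOfSquaredOdds := (#odd filter * #squared map) transform: #+.
    #(1 2 3 4 5) reduce: sumOfSquaredOdds init: 0.  "1 + 9 + 25 = 35"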
>>>>
>>>> # Advanced Concepts
>>>>
>>>> ## Data Flows
>>>>
>>>> Processing a sequence of data can often be regarded as a data flow.
>>>> The operator #<~ allows defining a flow from a data source through
>>>> processing steps to a drain. For example:
>>>>
>>>>     squares := Set <~ 1000 take <~ #squared map <~ (1 to: 1000).
>>>>     fileOut writeStream <~ #isSeparator filter <~ fileIn readStream.
>>>>
>>>> In both examples #<~ is only used to set up the data flow using
>>>> reducing functions and transducers.
>>>> In contrast to streams, transducers are completely independent of
>>>> input and output sources.
>>>> Hence, we have a clear separation of reading data, writing data,
>>>> and processing elements:
>>>> - Sources know how to iterate over data with a reducing function,
>>>>   e.g., via #reduce:init:.
>>>> - Drains know how to collect data using a reducing function.
>>>> - Transducers know how to process single elements.
>>>>
>>>>
>>>> ## Reductions
>>>>
>>>> A reduction binds an initial value, or a block yielding an initial
>>>> value, to a reducing function.
>>>> The idea is to define a ready-to-use process that can be applied in
>>>> different contexts.
>>>> Reducibles handle reductions via #reduce: and #transduce:reduce:.
>>>> For example:
>>>>
>>>>     sum := #+ init: 0.
>>>>     sum1 := #(1 1 1) reduce: sum.
>>>>     sum2 := (1 to: 1000) transduce: #odd filter reduce: sum.
>>>>
>>>>     asSet := [:set :e | set add: e; yourself] initializer: [Set new].
>>>>     set1 := #(1 1 1) reduce: asSet.
>>>>     set2 := (1 to: 1000) transduce: #odd filter reduce: asSet.
>>>>
>>>> By combining a transducer with a reduction, a process can be
>>>> further modified.
>>>>
>>>>     sumOdds := sum <~ #odd filter.
>>>>     setOdds := asSet <~ #odd filter.
>>>>
>>>>
>>>> ## Eductions
>>>>
>>>> An eduction combines a reducible data source with transducers.
>>>> The idea is to define a transformed (virtual) data source that need
>>>> not be stored in memory.
>>>>
>>>>     odds1 := #odd filter <~ #(1 2 3) readStream.
>>>>     odds2 := #odd filter <~ (1 to: 1000).
>>>>
>>>> Depending on the underlying source, eductions can be processed once
>>>> (streams, e.g., odds1) or multiple times (collections, e.g.,
>>>> odds2).
>>>> Since no intermediate data is stored, the transducers' actions are
>>>> lazy, i.e., they are invoked each time the eduction is processed.
>>>>
>>>>
>>>> # Origins
>>>>
>>>> Transducers is based on the same-named Clojure library and its
>>>> ideas. Please see:
>>>> http://clojure.org/transducers