I'm interested in helping with such new "containers". Maybe we should proceed this way:
On Tue, May 16, 2017 at 7:44 PM, p...@highoctane.be <p...@highoctane.be> wrote:
> We may also use Discord and do something "somewhat live"
>
> Phil
>
> On Tue, May 16, 2017 at 7:23 PM, <serge.stinckw...@gmail.com> wrote:
>
>> I was asking Philippe but hope to see you also at ESUG!
>>
>> Sent from my iPhone
>>
>> On 16 May 2017 at 19:02, Oleksandr Zaytsev <olk.zayt...@gmail.com> wrote:
>>
>> I would love to, but to go to Lille from my country I would need a visa,
>> which is not that easy to acquire.
>> So maybe I will come to PharoDays 2018.
>> And I will definitely try to come to the ESUG Conference in September.
>>
>> Oleks
>>
>> On Tue, May 16, 2017 at 7:26 PM, <serge.stinckw...@gmail.com> wrote:
>>
>>> Sent from my iPhone
>>>
>>> On 11 May 2017 at 11:43, "p...@highoctane.be" <p...@highoctane.be> wrote:
>>>
>>> ---------- Forwarded message ----------
>>> From: "p...@highoctane.be" <p...@highoctane.be>
>>> Date: 11 May 2017 10:54
>>> Subject: Re: 11/05/17 - Tabular Data Structures for Data Analysis - Oleksandr Zaytsev
>>> To: "Nick Papoylias" <npapoyl...@gmail.com>
>>> Cc:
>>>
>>> On Thu, May 11, 2017 at 10:20 AM, Nick Papoylias <npapoyl...@gmail.com> wrote:
>>>
>>>> On Thu, May 11, 2017 at 5:24 AM, Oleksandr Zaytsev <olk.zayt...@gmail.com> wrote:
>>>>
>>>>> *A. Work done*
>>>>>
>>>>> - Downloaded the threaded VM as suggested by Esteban Lorenzano to
>>>>>   make Iceberg work. And it does! I have successfully pushed my
>>>>>   NeuralNetwork code to GitHub: https://github.com/olekscode/MLNeuralNetwork
>>>>> - Joined the PolyMath organization on GitHub
>>>>> - Created a repository for the TabularDataset project,
>>>>>   https://github.com/PolyMathOrg/TabularDataset, as a part of the
>>>>>   PolyMath organization on GitHub
>>>>> - Fixed PolyMath issue #25 and made a PR
>>>>> - Read an article from the Wolfram Mathematica documentation regarding Dataset.
>>>>>   It was one of the reading suggestions sent to me by Nick Papoylias.
>>>>>
>>>>> *B. Next steps*
>>>>>
>>>>> - Fix more PolyMath issues, using Iceberg. I have to get used
>>>>>   to it by the time the coding phase starts
>>>>> - Read the rest of Nick Papoylias's suggestions
>>>>>
>>>>> *C. Help needed*
>>>>>
>>>>> - The Dataset in Wolfram, as well as Pandas in Python, has a very
>>>>>   advanced indexing system. Smalltalk has its own special conventions for
>>>>>   indexing, so I think it would be great if I got familiar with them.
>>>>>   Could you suggest some reading on this topic (what are the indexing
>>>>>   conventions in Smalltalk?).
>>>>>   For example, in Wolfram, I can write *dataset[[-1]]* to extract
>>>>>   the last row. But in Pharo indexes cannot be negative. In Pharo I would
>>>>>   say *dataset last*. But how about *dataset[[-5]]*?
>>>>>
>>>> This would be a good exercise for you ;) In Pharo you can easily add
>>>> negative indexing yourself.
>>>>
>>>> *Hint:* You know the index of the last element, since this is the size
>>>> of the collection, so... ;)
>>>>
>>> No need for changes, this exists already.
>>>
>>> Use atWrap: index put: value and atWrap: with negative indexes.
>>>
>>> 'hello' atWrap: -2
>>>
>>> There is a specific version for Array using a primitive.
>>>
>>> #( 10 20 30 40 ) atWrap: -1
>>>
>>> atWrap: 0 gives you the last item.
>>> atWrap: -1 gives 30.
>>>
>>> This is different from 0-based index languages.
>>>
>>> The interesting thing about atWrap: is that it uses modulo internally, so
>>> you do not need to care about that.
>>>
>>> ($/ split: 'abc/def/ghi/jkl') atWrap: -1
>>> --> 'ghi'
>>>
>>> The Matrix class has a bunch of things API-wise, but the class is highly
>>> inefficient, doing copies all the time etc. It would be nice to have some
>>> kind of futures/copy-on-write style things in there.
>>>
>>> I miss cbind and rbind. These are useful.
>>> I have some half-baked, super-inefficient implementations of these things
>>> for Matrix.
>>>
>>> https://stat.ethz.ch/R-manual/R-devel/library/base/html/cbind.html
>>>
>>> The ability to name columns is also nice to have.
>>>
>>> In R one does:
>>>
>>> df <- data.frame(c(1,2,3))
>>> df <- cbind(df, c(4,5,6))
>>> df <- cbind(df, c(7,8,9))
>>> names(df) <- c("C1", "C2", "C3")
>>>
>>> The names can be retrieved with:
>>>
>>> names(df)
>>>
>>> A Smalltalkish style would be welcome.
>>>
>>> Interesting! Are you coming to PharoDays? We can talk about that if we
>>> find time.
>>>
>>> Maybe looking at the Voyage queries can be helpful.
>>>
>>> Phil
>>>
>>>> Try adding an extension method to OrderedCollection or SequenceableCollection.
>>>>
>>>> If the Pharo by Example chapter or the MOOC is not enough, read the source
>>>> itself in the core to see how basic methods are implemented (it is less
>>>> scary than it sounds).
>>>>
>>>> You can also try Chapters 9, 10, 11 of the Blue Book (some API changes
>>>> may apply):
>>>>
>>>> http://sdmeta.gforge.inria.fr/FreeBooks/BlueBook/Bluebook.pdf
>>>>
>>>>> - Or what is the best way of implementing this index:
>>>>>   *dataset[["name"]]* (extracts a named row), *dataset[[1]]*
>>>>>   (extracts the first row)? Should I create two separate messages:
>>>>>   *dataset rowNamed: 'name'* and *dataset rowAt: 1*?
>>>>>
>>> rowNamed:
>>> rowAt:
>>>
>>> Yes, looks like it.
>>>
>>> But if we want to model things like R data frames, for example, this has
>>> to be seen as a vectorized operation, so you can use row slices, column
>>> slices, and logical indexes.
>>>
>>> Check this out:
>>>
>>> http://www.r-tutor.com/r-introduction/data-frame/data-frame-row-slice
>>> https://www.r-bloggers.com/working-with-data-frames/
>>>
>>>> The internal representation of your data structure can be anything at
>>>> the moment, *as long as you encapsulate it.*
>>>>
>>>> (i.e. it can be nested OrderedCollections with metadata mapping column names
>>>> to indexes, or a dictionary of collections, etc.)
>>>>
>>>> *If you don't expose it to the user* (i.e. return it from the public
>>>> API, or expect knowledge of it in argument passing),
>>>> we can easily change it later. So *first make it work, and we optimize
>>>> later ;)*
>>>>
>>>> For your case it will be a little bit trickier because *you also have
>>>> the notions of a) rows and b) columns*, which
>>>> are exposed to the user. So *you would need to create abstractions*
>>>> for these too.
>>>>
>>>> Cheers,
>>>>
>>>> Nick
>>>>
>>>>> If someone else is having problems with Iceberg on Linux, try
>>>>> downloading the threaded VM:
>>>>>
>>>>> wget -O- get.pharo.org/vmT60 | bash
>>>>>
>>>>> And use the SSH (not HTTPS) remote URL.
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "Pharo Google Summer of Code" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to pharo-gsoc+unsubscr...@googlegroups.com.
>>>>> To post to this group, send email to pharo-g...@googlegroups.com.
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/pharo-gsoc/CAEp0Uzu-8fw3dA6ezVoj-QptvLcB8cWPHvZ1tfLg1Ce8qkTqfQ%40mail.gmail.com.
>>>>> For more options, visit https://groups.google.com/d/optout.
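
For reference, here is a minimal sketch of the negative-indexing exercise Nick hints at in the quoted thread, for cases where an out-of-range index should fail rather than wrap around as atWrap: does. The selector atSigned: is hypothetical (it is not part of Pharo), and putting it on SequenceableCollection as an extension method is an assumption:

```smalltalk
SequenceableCollection >> atSigned: anInteger
	"Hypothetical extension method, not part of Pharo.
	 Positive indexes behave exactly like at:.
	 Negative indexes count back from the end, so -1 is the last
	 element: the real index is size + 1 + anInteger."
	^ anInteger < 0
		ifTrue: [ self at: self size + 1 + anInteger ]
		ifFalse: [ self at: anInteger ]
```

With this, #(10 20 30 40) atSigned: -1 answers 40, whereas atWrap: -1 answers 30. Asking for atSigned: -5 on the same array ends up in at: 0 and signals an out-of-bounds error instead of wrapping, which may be the safer behaviour for dataset rows.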