Thank you Alexandre! I have to admit it is a *ton* of work. I think there are lots of good pathways in literally every direction, such as simplifying the numerics layer (tech.datatype), potentially getting a subset working on graalvm-native, zero-copy conversion for parquet and arrow where possible (totally possible in lots of cases), etc.; it just depends on what seems to provide the most value to everyone.
Plus, learning exactly how to use this system is a thing; it is complex, as are numpy, pandas, data.table, .... Bridging between Clojure, APL, and C puts this in a unique position. That being said, Thomaz has released tablecloth <https://github.com/scicloj/tablecloth>, which has a more advanced dataset api based on the primitives in tech.ml.dataset, with some great documentation <https://scicloj.github.io/tablecloth/index.html>.

On Mon, Jun 15, 2020 at 1:48 PM Alexandre Almosni <alexandre.almo...@gmail.com> wrote:

> Congratulations. This is really a great effort and something we really
> needed. I hope the community takes this as the base layer for data science
> and we can build on your efforts, expand the documentation, etc.
>
> On Monday, June 15, 2020 at 5:50:52 PM UTC+1, Chris Nuernberger wrote:
>>
>> Good morning Clojurians :-)
>>
>> It is with much pride that I announce version 2.0 of tech.ml.dataset
>> <https://github.com/techascent/tech.ml.dataset>, our library that maps
>> powerful concepts from libraries like Pandas and data.table into Clojure
>> using functional paradigms. This data frame
>> <https://github.com/mobileink/data.frame/wiki/What-is-a-Data-Frame%3F>
>> library has unified loading from csv, tsv, xlsx, xls, Apache parquet,
>> Apache arrow (.feather), sql, json and sequences of maps as well as
>> efficient cpu and memory
>> <https://gist.github.com/cnuernber/26b88ed259dd1d0dc6ac2aa138eecf37>
>> performance. Finally, because the dataset knows the datatype of each
>> column, you can interoperate with schema-ful things like SQL
>> <https://github.com/techascent/tech.ml.dataset.sql> without writing down
>> the schema.
>>
>> user> (require '[tech.ml.dataset :as ds])
>> nil
>> user> (-> (ds/->dataset "https://vega.github.io/vega/data/stocks.csv")
>>           (ds/descriptive-stats))
>> https://vega.github.io/vega/data/stocks.csv: descriptive-stats [3 10]:
>>
>> | :col-name | :datatype          | :n-valid | :n-missing | :min       | :mean      | :mode | :max       | :standard-deviation | :skew |
>> |-----------|--------------------|----------|------------|------------|------------|-------|------------|---------------------|-------|
>> | date      | :packed-local-date | 560      | 0          | 2000-01-01 | 2005-05-12 |       | 2010-03-01 |                     |       |
>> | price     | :float32           | 560      | 0          | 5.970      | 100.7      |       | 707.0      | 132.6               | 2.413 |
>> | symbol    | :string            | 560      | 0          |            |            | MSFT  |            |                     |       |
>>
>> Data science is (still) alive and well in Clojure and the JVM. Stepping
>> back and considering python bindings
>> <https://github.com/clj-python/libpython-clj>, R bindings
>> <https://github.com/scicloj/clojisr>, smile <https://haifengl.github.io/>,
>> the next-gen blas/numerics library Neanderthal
>> <https://github.com/uncomplicate/neanderthal> and the exceptionally
>> powerful saite science platform <https://github.com/jsa-aerial/saite>,
>> we have really come a long way in the last year!
>>
>> Thanks and enjoy :-)
>>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to clojure+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/clojure/d2063089-7985-4de7-8c40-fd178667dcbbo%40googlegroups.com.