[ANN] tech.ml.dataset - 2.0

2020-06-15 Thread Chris Nuernberger
Good morning Clojurians :-)

It is with much pride that I announce version 2.0 of tech.ml.dataset,
our library that maps powerful concepts from libraries like Pandas and
data.table into Clojure using functional paradigms. This data frame
library has unified loading from csv, tsv, xlsx, xls, Apache Parquet,
Apache Arrow (.feather), sql, json and sequences of maps as well as
efficient cpu and memory performance. Finally, because the dataset knows
the datatype of each column, you can interoperate with schema-ful things
like SQL without writing down the schema.


user> (require '[tech.ml.dataset :as ds])
nil
user> (-> (ds/->dataset "https://vega.github.io/vega/data/stocks.csv")
          (ds/descriptive-stats))
https://vega.github.io/vega/data/stocks.csv: descriptive-stats [3 10]:

| :col-name |          :datatype | :n-valid | :n-missing |       :min |      :mean | :mode |       :max | :standard-deviation | :skew |
|-----------|--------------------|----------|------------|------------|------------|-------|------------|---------------------|-------|
|      date | :packed-local-date |      560 |          0 | 2000-01-01 | 2005-05-12 |       | 2010-03-01 |                     |       |
|     price |           :float32 |      560 |          0 |      5.970 |      100.7 |       |      707.0 |               132.6 | 2.413 |
|    symbol |            :string |      560 |          0 |            |            |  MSFT |            |                     |       |
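
To illustrate why per-column datatypes make a hand-written schema unnecessary, here is a minimal sketch in plain Clojure. It does not use tech.ml.dataset and is not how the library does it; `column-sql-type`, `create-table-sql`, the type mapping and the sample rows are all assumptions made up for this example.

```clojure
(require '[clojure.string :as str])
(import 'java.time.LocalDate)

(defn column-sql-type
  "Maps the values seen in one column to a SQL type name.
  The mapping is illustrative only."
  [values]
  (let [vs (remove nil? values)]
    (cond
      (empty? vs)                          "TEXT"
      (every? integer? vs)                 "BIGINT"
      (every? number? vs)                  "DOUBLE PRECISION"
      (every? #(instance? LocalDate %) vs) "DATE"
      :else                                "TEXT")))

(defn create-table-sql
  "Builds a CREATE TABLE statement from a sequence of maps.
  Columns are sorted by name so the output is deterministic."
  [table-name rows]
  (let [columns  (sort (distinct (mapcat keys rows)))
        col-defs (for [c columns]
                   (str (name c) " " (column-sql-type (map c rows))))]
    (str "CREATE TABLE " table-name " (" (str/join ", " col-defs) ")")))

;; Made-up sample rows shaped like the stocks data above.
(def rows
  [{:symbol "MSFT" :date (LocalDate/of 2000 1 1) :price 39.81}
   {:symbol "MSFT" :date (LocalDate/of 2000 2 1) :price 36.35}])

(create-table-sql "stocks" rows)
;; => "CREATE TABLE stocks (date DATE, price DOUBLE PRECISION, symbol TEXT)"
```

The schema falls out of the data because every column already carries enough type information to pick a SQL type.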

Data science is (still) alive and well in Clojure and the JVM. Stepping
back and considering python bindings, R bindings, smile, the next-gen
blas/numerics library Neanderthal and the exceptionally powerful saite
science platform, we have really come a long way in the last year!

Thanks and enjoy :-)

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/clojure/CADbpEJuDaq1719%3DXfXvPUP3gU4SdGLRNSECqXE4zOk3Vi%3D9qfA%40mail.gmail.com.


Re: first time without state - and I'm lost

2020-06-15 Thread Ernesto Garcia
Hi, it has been a long time since this question was posted, but I found it 
interesting in the context of implementing token refreshes.

First of all, for service invocation, given a `revise-oauth-token` function, 
I think this is good client code:

(http/request
  {:method :get
   :url "https://example.com/"
   :oauth-token (revise-oauth-token token-store)})

If you find it too repetitive or fragile in your client code, you can make 
a local function, but I wouldn't abstract the service invocation at a 
higher layer.

Regarding the implementation of the token store, we could initially think 
of a synchronized store, like an atom, and `revise-oauth-token` would swap 
its content when a refresh is required. This is inconvenient for 
multithreaded clients, because there could be several refresh invocations 
going on concurrently.

In order to avoid concurrent refreshes, I propose to implement the token 
store as an atom holding a delay (a lazily computed promise of a token). 
Implementation of `revise-oauth-token` would be:

(import 'java.time.Instant)

(defn revise-oauth-token [token-store]
  (:access_token
   @(swap! token-store
           (fn [token-promise]
             ;; Replace the stored delay only when the current token is stale.
             (if (token-needs-refresh? @token-promise (Instant/now))
               (delay (refresh-oauth-token (:refresh_token @token-promise)))
               token-promise)))))

Note that using a delay avoids running `refresh-oauth-token` within the 
`swap!` operation, as the function passed to `swap!` may be run multiple 
times when the atom is contended.
Also note that `token-needs-refresh?` takes an argument with the present 
time. This keeps the function pure, which could help for unit testing, for 
example.
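
As a small illustration of the testing point, here is one possible shape for `token-needs-refresh?`. This is only a sketch: the `:expires_at` key and the expiry rule are assumptions, since the post does not show the function's body.

```clojure
(import 'java.time.Instant)

(defn token-needs-refresh?
  "True when `token` expires at or before `now`. Pure: the clock is passed in.
  The :expires_at key is an assumption made for this sketch."
  [token ^Instant now]
  (not (.isAfter ^Instant (:expires_at token) now)))

(def sample-token
  {:access_token "abc"
   :expires_at   (Instant/parse "2020-06-15T12:00:00Z")})

;; With fixed instants the result never depends on the wall clock.
(token-needs-refresh? sample-token (Instant/parse "2020-06-15T11:59:59Z")) ; => false
(token-needs-refresh? sample-token (Instant/parse "2020-06-15T12:00:00Z")) ; => true
```

Because the current time is an argument, a test can cover the "just before" and "exactly at expiry" cases without sleeping or mocking the clock.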

There is an alternative implementation using `compare-and-set!` that avoids 
checking `token-needs-refresh?` several times, but it is more complicated. 
I have posted full sample code in a gist: 
https://gist.github.com/titogarcia/4f09bcc5fa38fbdc1076954b9a99a8fc
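
For readers who do not follow the link, a `compare-and-set!` variant could look roughly like the sketch below. This is not the code from the gist. The helper bodies, the `:expires_at` key and the name `revise-oauth-token-cas` are assumptions, and the sketch assumes that whichever thread wins the race installs a fresh token.

```clojure
(import 'java.time.Instant)

;; Hypothetical stand-ins for the helpers named in the post.
(defn token-needs-refresh? [token ^Instant now]
  (not (.isAfter ^Instant (:expires_at token) now)))

(defn refresh-oauth-token [refresh-token]
  {:access_token  "fresh"
   :refresh_token refresh-token
   :expires_at    (.plusSeconds (Instant/now) 3600)})

(defn revise-oauth-token-cas
  "Checks staleness once, then tries to install a delay for the refresh.
  If another thread installed one first, uses that thread's delay instead."
  [token-store]
  (let [current @token-store            ; the delay currently in the atom
        token   (force current)]
    (:access_token
     (if (token-needs-refresh? token (Instant/now))
       (let [fresh (delay (refresh-oauth-token (:refresh_token token)))]
         (if (compare-and-set! token-store current fresh)
           (force fresh)                ; we won: our delay runs the refresh
           (force @token-store)))       ; we lost: wait on the installed delay
       token))))

(def token-store
  (atom (delay {:access_token  "stale"
                :refresh_token "r1"
                :expires_at    Instant/EPOCH})))

(revise-oauth-token-cas token-store) ; => "fresh"
```

The staleness check runs once per call here, at the cost of handling the lost-race case by hand.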

Remark: None of this refers to "functional programming" per se. Dealing 
with state in a purely functional way involves using different constructs 
(like possibly monads, for which you can find Clojure libraries if you are 
interested), and best practices are still a topic of research. Clojure has 
taken the pragmatic approach of making purely functional code easy to 
write, but it doesn't reject the use of state; rather, it provides 
well-behaved primitives like vars, atoms, agents, etc.

Ernesto

To view this discussion on the web visit 
https://groups.google.com/d/msgid/clojure/ac79058b-2c31-4b9c-9cf3-e2de998eb8deo%40googlegroups.com.


Re: first time without state - and I'm lost

2020-06-15 Thread Justin Smith
The usage of delay here is clever. I suggest as an addition, using
`force` instead of `deref` to disambiguate delay vs. atom (of course
if you take a few moments to think about it, swap! shouldn't return an
atom etc., but I think it becomes clearer with force).


To view this discussion on the web visit 
https://groups.google.com/d/msgid/clojure/CAGokn9L_Od2ZN2LJAsYUfJ2G_hbLKkamkUxgFX2vTKySxpHQWg%40mail.gmail.com.


Re: first time without state - and I'm lost

2020-06-15 Thread Ernesto
That's nice. We could do something like:

(-> (swap! ...)
  force
  :access_token)
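
Spelled out in full, the `force` variant could be exercised as in the sketch below. The helper bodies, the `:expires_at` key and the `refresh-count` counter are stand-ins invented for this example; only `revise-oauth-token` follows the code discussed in this thread.

```clojure
(import 'java.time.Instant)

;; Hypothetical stand-ins for the helpers named in the thread.
(def refresh-count (atom 0))

(defn token-needs-refresh? [token ^Instant now]
  (not (.isAfter ^Instant (:expires_at token) now)))

(defn refresh-oauth-token
  "Stub refresh: counts its calls and returns a token valid for one hour."
  [refresh-token]
  (let [n (swap! refresh-count inc)]
    (Thread/sleep 50)                   ; simulate network latency
    {:access_token  (str "access-" n)
     :refresh_token refresh-token
     :expires_at    (.plusSeconds (Instant/now) 3600)}))

(defn revise-oauth-token [token-store]
  (-> (swap! token-store
             (fn [token-delay]
               (if (token-needs-refresh? @token-delay (Instant/now))
                 (delay (refresh-oauth-token (:refresh_token @token-delay)))
                 token-delay)))
      force                             ; realize the delay held by the atom
      :access_token))

(def token-store
  (atom (delay {:access_token  "stale"
                :refresh_token "r1"
                :expires_at    Instant/EPOCH})))

;; Eight concurrent callers; the delay body runs at most once.
(def results
  (mapv deref (doall (repeatedly 8 #(future (revise-oauth-token token-store))))))

@refresh-count ; => 1
```

Every caller ends up with the same access token, because only the delay that was actually stored in the atom is ever forced.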



Re: [ANN] tech.ml.dataset - 2.0

2020-06-15 Thread Alexandre Almosni
Congratulations. This is really a great effort and something we really 
needed. I hope the community takes this as the base layer for data science 
and we can build on your efforts, expand the documentation, etc.




To view this discussion on the web visit 
https://groups.google.com/d/msgid/clojure/d2063089-7985-4de7-8c40-fd178667dcbbo%40googlegroups.com.


Re: [ANN] tech.ml.dataset - 2.0

2020-06-15 Thread Chris Nuernberger
Thank you Alexandre!  I have to admit it is a *ton* of work.  I think there
are lots of good pathways in literally every direction, such as simplifying
the numerics layer (tech.datatype), potentially getting a subset working on
graalvm-native, zero-copy conversion when possible for parquet and arrow
(totally possible in lots of cases), etc.; it just depends on what
seems like it provides the most value to everyone.

Plus, learning exactly how to use this system is a thing; it is complex,
as are numpy, pandas, and data.table. Bridging between Clojure, APL,
and C puts this in a unique position.

That being said, Thomaz has released tablecloth, which has a more advanced
dataset api based on the primitives in tech.ml.dataset, with some great
documentation.

