One of the challenges with random number generation is that there are quite 
a few specialised requirements. I don't believe a generic approach can meet 
all needs. I think we actually need a few things:

1. A better implementation for clojure.core/rand etc. I think conditional 
usage of j.u.c.ThreadLocalRandom on Java 1.7 or later would be great if we 
can make it work - there are plenty of concurrent workloads where a single 
shared java.util.Random becomes a point of contention.
2. A library of generic random number generation tools (e.g. "data.random" - 
should be general purpose, able to generate a wide range of useful 
distributions, allow arbitrary java.util.Random instances to be passed in as 
the source of randomness etc.)
3. More specialised solutions can live in specific libraries (e.g. 
core.matrix will be getting support for generation of random matrices 
etc.). Often specialised implementations will offer much better performance 
for specific use cases, so we need to keep this option open. An example 
would be generating large random boolean matrices - generating and storing 
individual bits in bulk is *much* more efficient than going via generic 
random number functions for each bit.
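
To make points 1 and 3 a bit more concrete, here is a rough sketch. The 
function names (thread-local-double, random-bits, bit-at) are made up for 
illustration and are not part of any existing library; only the 
java.util.Random and ThreadLocalRandom calls are real.

```clojure
(import '(java.util Random)
        '(java.util.concurrent ThreadLocalRandom))

;; Hypothetical helper: each thread uses its own generator instead of
;; contending on one shared java.util.Random.
(defn thread-local-double
  "Uniform double in [0, 1) from the calling thread's own generator."
  []
  (.nextDouble (ThreadLocalRandom/current)))

;; Hypothetical helper: produce random bits 64 at a time rather than
;; calling a generic random function once per bit.
(defn random-bits
  "Returns a long-array holding at least n random bits, 64 per element."
  ^longs [^Random rng n]
  (let [words (quot (+ (long n) 63) 64)
        arr   (long-array words)]
    (dotimes [i words]
      (aset arr i (.nextLong rng)))
    arr))

(defn bit-at
  "True if bit i of the packed array is set."
  [^longs bits i]
  (bit-test (aget bits (quot i 64)) (rem i 64)))
```

With this layout a 1000 x 1000 boolean matrix needs 15,625 calls to 
nextLong instead of a million per-bit calls, and passing the same seeded 
Random to random-bits gives the same bits back.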

I think we should clearly separate random number generation from sample 
data construction. The latter certainly depends upon the former, but random 
numbers have a lot of other independent use cases. Hence I'm in favour of 
something like "data.random" being separate from "data.generators".
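
As a sketch of what that separation could look like (all names here are 
hypothetical, and this is not how data.generators is currently written): 
the lower layer only knows about random numbers, and the sample-data layer 
is built on top of it and takes the generator as an argument.

```clojure
(import '(java.util Random))

;; Layer 1 (a hypothetical "data.random"): plain random numbers only.
(defn uniform-int
  "Uniform integer in [0, n) from the supplied RNG."
  [^Random rng n]
  (.nextInt rng (int n)))

;; Layer 2 (a hypothetical sample-data library): builds test data using
;; layer 1, without knowing how the numbers are produced.
(defn random-string
  "String of len characters drawn from alphabet, using the supplied RNG."
  [^Random rng alphabet len]
  (apply str
         (repeatedly len
                     #(nth alphabet (uniform-int rng (count alphabet))))))
```

Because the generator is passed in explicitly, a simulation can use 
uniform-int without pulling in any of the sample-data code, and a test can 
call random-string with a seeded Random to get repeatable output.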

On Thursday, 5 June 2014 05:53:10 UTC+1, Mars0i wrote:
>
> clojure.core provides a minimal set of functions for random effects: rand, 
> rand-int, and rand-nth, currently with no simple ability to base these on a 
> resettable random number generator or on different RNGs in different 
> threads.  (But see this ticket 
> <http://dev.clojure.org/jira/browse/CLJ-1420> pointed out by Andy 
> Fingerhut in another thread.)
>
> data.generators includes additional useful general-purpose functions 
> involving random numbers and random choices, but this is not at all 
> obvious when you read the docstrings.  (Some of the docstrings are pretty 
> mysterious.)  It's also not necessarily what one would guess from the name 
> of the library.  (None of this is a criticism of anyone or anything about 
> the project.  Data.generators is at an 0.n.m release stage.  I'm very 
> grateful for the work that people have put in on it.)
>
> As I understand it, data.generators was split off from test.generative, 
> which sounds like a good idea.  So data.generators was intended to provide 
> functions that generate random data for testing.  (I imagine that the 
> existing documentation makes more sense in the context of test.generative, 
> too.)
>
> However, what's in data.generators has more general applications, for 
> people who want random numbers, samples, etc. outside of software testing.  
> (In my case, that would be for random effects in scientific simulations.)  
> Off the top of my head, it seems to me that these other applications might 
> have slightly different needs from the use of data.generators by 
> test.generative.  
>
> For one thing, efficiency might matter a lot in some simulations, but not 
> in software testing.  (At least, *I* wouldn't care if my test functions 
> were slow.)  I'm not saying that functions in data.generators are slow, but 
> I don't think there's a good reason to worry about making them efficient if 
> they're only intended for software testing.
>
> Further, there are needs beyond those currently met by 
> data.generators.  See the sampling functions in bigml/sampling 
> <https://github.com/bigmlcom/sampling> or Incanter <http://incanter.org/>, 
> for example, and lots of other random functions that Incanter provides.  
> Some of those should remain in Incanter, of course, but I wonder whether 
> Clojure would benefit from a contributed library that satisfied a set of 
> core needs for random effects.  (Incanter partly builds on clojure.core's 
> rand at this point.)
>
> Maybe data.generators is/will be that library.  Or maybe parts of 
> data.generators would make more sense as part of a separate library 
> (math.random? data.random? math.probability?) that could be split out of 
> data.generators.  (If it doesn't make sense to split data.generators, then 
> would a new name for the library be more appropriate?)
>
> Just some things I was wondering about.  Curious to see what others say.
>
> (Fun tip: Check out data.generators' anything function, which is like 
> Emacs' Zippy the Pinhead functions for people who prefer industrial atonal 
> music composed by randomly filtered Jackson Pollock paintings, to speech.  
> Or: People who want to thoroughly test their functions by throwing random 
> randomly-typed data at them.)
>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en