Am 06.02.2016 um 12:31 schrieb James Carman: > Okay, folks, this is definitely getting out of hand. Let's put a moratorium > on this thread for the weekend or something and try to come back together > next week and try to move forward. I would urge folks to watch this while > we wait: > > https://m.youtube.com/watch?v=rOWmrlft2FI > > p.s. Phil, I do hope you'll reconsider.
Big +1! Please take some time to rethink. This would be a huge loss for this community. Oliver > > On Fri, Feb 5, 2016 at 10:47 PM Phil Steitz <phil.ste...@gmail.com> wrote: > >> OK, I give up. I am withdrawing as volunteer chair or member of the >> new TLP. >> >> Phil >> >> On 2/5/16 7:23 PM, Gilles wrote: >>> Phil, >>> >>> You talk again about me trying to push forward changes that >>> serve no purpose besides "trash performance and correctness". >>> This is again baseless FUD to which I've already answered >>> (with detailed list of facts which you chose to ignore). >>> You declare anything for which you don't have an answer as >>> "bogus argument". Why is the reference to multi-threaded >>> implementations bogus? You contradict yourself in pretending >>> that CM RNGs could be so good as to make people want to use >>> them while refusing to consider whether another design might >>> be better suited to such high(er)-performance extensions. >>> This particular case is a long shot but if any and all >>> discussions are stopped dead, how do you imagine that we can >>> go anywhere? >>> As you could read from experts, micro-benchmarks are deceiving; >>> but you refuse to even consider alternative designs if there >>> might be a slight suspicion of degradation. >>> How can we ever set up a constructive discussion on how to >>> make everybody enjoy this project if the purported chair is >>> so bent to protecting existing code rather than nurture a good >>> relationship with developers who may sometimes have other ideas? >>> I'm trying to improve the code (in a dimension which you can't >>> seem to understand unfortunately) but respectfully request >>> data points from those users of said code, in order to be >>> able to prove that no harm will be done. >>> But you seem to prefer to not disclose anything that would >>> get us closer to agreement (better design with similar >>> performance and room for improvement, to be discussed >>> together as a real development team -- Not you requiring, >>> as a bad boss, that I bow to your standards for judging >>> usefulness). >>> This 1% which you throw at me, where does it come from? >>> What does 1% mean when the benchmark shows standard deviations >>> that vary from 4 to 26% in the "nextInt" case and from 3 to >>> 7% in the "nextGaussian" case? >>> This 1% looks meaningless without context; context is what I'm >>> asking in order to try and establish objectively whether >>> another design will have a measurable impact on actual tasks. >>> I'm not going to show any "damaged" benchmark because of how >>> unwelcome you make me feel every time I wish to talk about >>> other aspects of the code. >>> There is no development community here. Only solitary >>> coders who share a repository. >>> >>> Not sorry for the top-post, >>> Gilles >>> >>> >>> On Fri, 5 Feb 2016 17:07:16 -0700, Phil Steitz wrote: >>>> On 2/5/16 12:59 PM, Gilles wrote: >>>>> On Fri, 5 Feb 2016 06:50:10 -0700, Phil Steitz wrote: >>>>>> On 2/4/16 3:59 PM, Gilles wrote: >>>>>>> Hi. >>>>>>> >>>>>>> Here is a micro-benchmark report (performed with >>>>>>> "PerfTestUtils"): >>>>>>> ----- >>>>>>> nextInt() (calls per timed block: 2000000, timed blocks: 100, >>>>>>> time >>>>>>> unit: ms) >>>>>>> name time/call std dev total time ratio >>>>>>> cv difference >>>>>>> o.a.c.m.r.JDKRandomGenerator 1.088e-05 2.8e-06 2.1761e+03 1.000 >>>>>>> 0.26 0.0000e+00 >>>>>>> o.a.c.m.r.MersenneTwister 1.024e-05 1.5e-06 2.0471e+03 0.941 >>>>>>> 0.15 -1.2900e+02 >>>>>>> o.a.c.m.r.Well512a 1.193e-05 4.4e-07 2.3864e+03 1.097 >>>>>>> 0.04 2.1032e+02 >>>>>>> o.a.c.m.r.Well1024a 1.348e-05 1.9e-06 2.6955e+03 1.239 >>>>>>> 0.14 5.1945e+02 >>>>>>> o.a.c.m.r.Well19937a 1.495e-05 2.1e-06 2.9906e+03 1.374 >>>>>>> 0.14 8.1451e+02 >>>>>>> o.a.c.m.r.Well19937c 1.577e-05 8.8e-07 3.1542e+03 1.450 >>>>>>> 0.06 9.7816e+02 >>>>>>> o.a.c.m.r.Well44497a 1.918e-05 1.4e-06 3.8363e+03 1.763 >>>>>>> 0.08 1.6602e+03 >>>>>>> o.a.c.m.r.Well44497b 1.953e-05 2.8e-06 3.9062e+03 1.795 >>>>>>> 0.14 1.7301e+03 >>>>>>> o.a.c.m.r.ISAACRandom 1.169e-05 1.9e-06 2.3375e+03 1.074 >>>>>>> 0.16 1.6139e+02 >>>>>>> ----- >>>>>>> where "cv" is the ratio of the 3rd to the 2nd column. >>>>>>> >>>>>>> Questions are: >>>>>>> * How meaningful are micro-benchmarks when the timed operation >>>>>>> has >>>>>>> a very >>>>>>> small duration (wrt e.g. the duration of other machine >>>>>>> instructions that >>>>>>> are required to perform them)? >>>>>> >>>>>> It is harder to get good benchmarks for shorter duration >>>>>> activities, >>>>>> but not impossible. One thing that it would be good to do is to >>>>>> compare these results with JMH [1]. >>>>> >>>>> I was expecting insights based on the benchmark which I did run. >>>> >>>> You asked whether or not benchmarks are meaningful when the task >>>> being benchmarked is short duration. I answered that question. >>>>> >>>>> We have a tool in CM; if it's wrong, we should remove it. >>>>> How its results compare with JMH is an interesting question, >>>> >>>> I will look into this. >>>>> I >>>>> agree, but I don't have time to make an analysis of benchmarking >>>>> tools (on top of what I've been doing since December because >>>>> totally innocuous changes in the RNG classes were frowned upon >>>>> out of baseless fear). >>>> >>>> Please cut the hypberbole. >>>>> >>>>>>> * In a given environment (HW, OS, JVM), is there a lower limit >>>>>>> (absolute >>>>>>> duration) below which anything will be deemed good enough? >>>>>> >>>>>> That depends completely on the application. >>>>> >>>>> Sorry, I thought that it was obvious: I don't speak of applications >>>>> that don't care about performance. :-) >>>>> >>>>> For those that do, I do not agree with the statement: the question >>>>> relates to finding a point below which it is the environment that >>>>> overwhelms the other conditions. >>>>> A point where there will be _unavoidable_ overhead (transferring >>>>> data >>>>> from/to memory, JVM book-keeping, ...) and perturbations (context >>>>> switches, ...) such that their duration adds a constant time (on >>>>> average) that may render most enhancements to an already efficient >>>>> algorithm barely noticeable in practice. >>>>> Similarly, but in the opposite direction, some language constructs >>>>> or design choices might slow down things a bit, but without >>>>> endangering any user. >>>>> >>>>> A problem arises when any enhancement to the design is deemed >>>>> harmful because it degrades a micro-benchmark, even though that >>>>> benchmark may not reflect any real use-cases. >>>>> Then, the real harm is against development. >>>>> >>>>>>> * Can a library like CM admit a trade-off between ultimate >>>>>>> performance and >>>>>>> good design? IOW, is there an acceptable overhead in exchange >>>>>>> for other qualities >>>>>>> (clarity, non-redundancy, extensibility, etc.)? >>>>>> >>>>>> That is too general a question to be meaningful. We need to look >>>>>> at specific cases. What exactly are you proposing? >>>>> >>>>> <rant> >>>>> It is quite meaningful even if it refers to general principles. >>>>> Those could (should, IMO) be taken into account when managing a >>>>> project like CM, on a par with "performance" (whose intrinsic value >>>>> is never questioned). >>>>> </rant> >>>> >>>> Rant all you want. Vague generalities and hyperbole have no value. >>>>> >>>>> Two specific cases are: >>>>> * inheritance vs delegation (a.k.a. composition) >>>>> * generics (that could require runtime casts) >>>> >>>> This is getting closer to meaningful. Where exactly in the code are >>>> you wanting to use something and seeing benchmark damage? >>>>> >>>>>>> * Does ultimate performance for the base functionality >>>>>>> (generation >>>>>>> of a >>>>>>> random number) trump any consideration of use-cases that would >>>>>>> need an >>>>>>> extension (of the base functionality, such as computation to >>>>>>> match another >>>>>>> distribution) that will unavoidably degrades the performance >>>>>>> (hence the >>>>>>> micro-benchmark will be completely misleading for those users)? >>>>>> >>>>>> Again, this is vague and the answer depends on what exactly you >>>>>> are >>>>>> talking about. Significantly damaging performance of PRNG >>>>>> implementations is a bad idea, >>>>> >>>>> Now, *this* is vague: what do you mean by "significantly"? >>>>> That was actually my question in the first place. >>>> If you are talking about PRNG performance, I would say a 1% hit is >>>> significant. >>>>> Referring to the >>>>> benchmark above, people who'd know why they require ultimate >>>>> performance >>>>> should be able to tell what range of numbers they'd find >>>>> acceptable in >>>>> that table. >>>>> >>>>> <rant> >>>>> Actually my questions are very precise, but the answers would >>>>> require >>>>> some decent analysis, rather than the usual "bad idea" dismissal. >>>>> </rant> >>>>> >>>>> In the Javadoc of the "random" package, there is information about >>>>> performance but no reference as to the benchmarking procedure. >>>> >>>> It would be great to repeat these using JMH, which is emerging as a >>>> de facto standard for java benchmarking. I will look into this. >>>>> >>>>> I can consistently observe a totally different behaviour (using >>>>> "PerfTestUtils"): >>>>> 1. "MersenneTwister" is *always* faster than all of the WELL RNGs; >>>>> 2. moreover, the ratio *grows* with each of the longer periods >>>>> members of the WELL family (see the above table). >>>>> >>>>> This makes me wonder how someone who purports to need "ultimate" >>>>> performance can have any objective basis to determine what is good >>>>> or bad for his own applications. >>>>> >>>>>> unless there are actual practical use >>>>>> cases you can point to that whatever changes you are proposing >>>>>> enable. >>>>> >>>>> As I've explained in very much details in another thread, I've >>>>> reviewed (from a design POV) the RNG code in "random" and IMHO, >>>>> there >>>>> is room for improvement (cf. above for what I mean by that term). >>>>> <rant> >>>>> I have some code ready for review but I had to resort to what I >>>>> considered sub-optimal design (preemptively renouncing to propose a >>>>> "delegation"-based design) solely because of the destructive >>>>> community >>>>> process that takes place here.[1] >>>>> </rant> >>>> >>>> More vague hyperbole that serves no purpose. Please focus on actual >>>> code or design issues. >>>>> >>>>> The practical use-cases is anything that needs further >>>>> processing of >>>>> the numbers produced according to a uniform distribution: >>>> >>>> Isn't that what the samplers in the distributions package do? What >>>> we need from the PRNG implementations is just blocks of bits. Since >>>> we wanted a pluggable replacement for j.u.Random, we added uniform >>>> ints, longs and floats and gaussian floats. The samplers just need >>>> uniform doubles. The practical use case we need is well-supported >>>> in the code we have. What is missing, exactly? >>>>> I agree that >>>>> there would be little sense to code that latter part in a "pure" OO >>>>> way[2]. And Luc made it indeed quite efficient, I think, in the >>>>> various >>>>> concrete classes. >>>>> What I want to reconsider is how those concrete low-level >>>>> algorithms can >>>>> be plugged in a higher-level function that just requires a >>>>> "source of >>>>> randomness", as I'd call a provider of "int" (or "long") values, >>>>> where >>>>> the high level functionality does not care at all about the >>>>> provider's >>>>> inner working (a.o. how it's seeded!). >>>> >>>> This is why many higher-level samplers and other things that require >>>> random data inside [math] have a pluggable RandomGenerator. >>>>> >>>>> A case in point is the sampling of other distributions (namely the >>>>> Normal distribution). >>>> >>>> Or any of the others. We have a default, inversion-based method >>>> that the abstract distribution classes provide and some pretty good >>>> specialized implementations within individual distributions. Most >>>> of these just require uniform random doubles as source. >>>> >>>>> >>>>> Here is the benchmark report: >>>>> ----- >>>>> nextGaussian() (calls per timed block: 2000000, timed blocks: 100, >>>>> time unit: ms) >>>>> name time/call std dev total time ratio >>>>> cv difference >>>>> o.a.c.m.r.JDKRandomGenerator 1.200e-05 1.7e-06 2.4001e+03 1.000 >>>>> 0.14 0.0000e+00 >>>>> o.a.c.m.r.JDKRandomGenerator 7.646e-05 5.1e-06 1.5292e+04 6.371 >>>>> 0.07 1.2892e+04 >>>>> o.a.c.m.r.MersenneTwister 6.396e-05 3.6e-06 1.2793e+04 5.330 >>>>> 0.06 1.0393e+04 >>>>> o.a.c.m.r.Well512a 6.880e-05 5.0e-06 1.3760e+04 5.733 >>>>> 0.07 1.1360e+04 >>>>> o.a.c.m.r.Well1024a 6.956e-05 3.0e-06 1.3913e+04 5.797 >>>>> 0.04 1.1513e+04 >>>>> o.a.c.m.r.Well19937a 7.262e-05 2.0e-06 1.4525e+04 6.052 >>>>> 0.03 1.2125e+04 >>>>> o.a.c.m.r.Well19937c 7.164e-05 4.3e-06 1.4329e+04 5.970 >>>>> 0.06 1.1928e+04 >>>>> o.a.c.m.r.Well44497a 8.166e-05 3.2e-06 1.6332e+04 6.804 >>>>> 0.04 1.3931e+04 >>>>> o.a.c.m.r.Well44497b 8.259e-05 4.6e-06 1.6518e+04 6.882 >>>>> 0.06 1.4118e+04 >>>>> o.a.c.m.r.ISAACRandom 6.724e-05 5.4e-06 1.3449e+04 5.603 >>>>> 0.08 1.1049e+04 >>>>> ----- >>>>> where the first line is JDK's "nextInt()" and the remaining are >>>>> "nextGaussian()". >>>>> >>>>> The generation time is thus about 4-fold that of "nextInt()". >>>>> Thus, degrading the performance of "nextInt()" by 10% would >>>>> degrade the >>>>> performance of "nextGaussian()" by half that. >>>>> >>>>> For a performance discussion to be meaningful, I think that we'd >>>>> need >>>>> to know how that fact would affect, even modestly, any moderately >>>>> complex >>>>> post-processing of the generated values. >>>>> >>>>> Another case, for modularity, would be to consider that other >>>>> algorithms could >>>>> be implemented to provide the required distribution.[3] >>>>> In the current design (inheritance-based), that can only be done >>>>> by creating >>>>> a subclass, even though the core functionality ("nextDouble()") is >>>>> not >>>>> overridden. >>>>> >>>>>>> * What are usages of the CM RNGs? >>>>>>> Do those use-cases strictly forbid "loosing" a dozen >>>>>>> milliseconds per >>>>>>> million calls? >>>>>> >>>>>> There are many different use cases. My own applications use >>>>>> them in >>>>>> simulations to generate random deviates, to generate random hex >>>>>> strings as identifiers and in stochastic algorithms like some >>>>>> of our >>>>>> internal uses. The last case is definitely sensitive to PRNG >>>>>> performance. >>>>> >>>>> Thanks for giving examples, but since we talk about performance, I >>>>> was hoping for some real flesh, like the relative duration of >>>>> numbers >>>>> generation (e.g. the total duration of calls to the >>>>> "RandomGenerator" >>>>> instances wrt to the total duration of the application). >>>>> >>>>> I don't know if by "last case", you are referring to code that is >>>>> inside CM. I didn't spot anything that makes "heavy" usage of a >>>>> RNG (in the sense that generation would count as a sizable part of >>>>> the whole processing). >>>> monteCarloP in KolmogorovSmirnovTest is one to check. >>>>> >>>>> As I pointed out many times: if an application is severely >>>>> dependent >>>>> on the performance of RNG, the user probably will turn to specific >>>>> tools (e.g. GPUs? [4]) rather than use CM. >>>> >>>> That is a bogus argument. We should make our PRNGs simple and fast >>>> so their use can extend to performance-sensitive applications. >>>>> >>>>> Conversely, using Java might be preferred for its flexibility, >>>>> which >>>>> is destroyed by a search for ultimate performance (which nobody >>>>> seems >>>>> able to define reasonably). >>>>> Performance is not a goal in itself; it should not be a trophy >>>>> which >>>>> sits uselessly on a shelf. >>>> >>>> Nor should "beautiful design" in the eyes of one person. >>>>> >>>>> My goal is not to deliberately slow things down; it is to allow >>>>> some >>>>> leeway so that designs which are deemed better (on all counts >>>>> except, >>>>> perhaps, performance) are given a chance to show their >>>>> strengths, in >>>>> particular in areas where performance in absolute terms is "good >>>>> enough" for all use-cases which CM should care about (hence the >>>>> need >>>>> of actual data points[5]). >>>> >>>> I see no reason that we can't have it both ways - good design and >>>> good performance. What we have now, modulo maybe some small changes >>>> to reduce code duplication, works fine. If you want to play with >>>> 64-bit generators and can find reference implementations and verify >>>> that they do in fact perform better, great. If not, I don't see the >>>> point. You can rant and complain all you want; but I am not going >>>> to let us trash performance or correctness of code in the random >>>> class or anywhere else just because you think it is somehow "better >>>> designed" unless you can show specific, practical use cases >>>> demonstrating the value of the changes. >>>> >>>> Phil >>>>> >>>>> >>>>> Gilles >>>>> >>>>> [1] "Is it faster?" >>>>> "No." >>>>> "Then, no." >>>>> [2] Although that is in some sense what you indirectly defend by >>>>> wanting >>>>> to stick with a meaningless "next(int bits)" method. >>>>> [3] http://www.doornik.com/research/ziggurat.pdf >>>>> [4] http://http.developer.nvidia.com/GPUGems3/gpugems3_ch37.html >>>>> [5] Hence the need to agree on a methodology/policy for >>>>> benchmarking. >>>>> >>>>>> >>>>>> Phil >>>>>> >>>>>> [1] http://openjdk.java.net/projects/code-tools/jmh/ >>>>>>> IOW, would those users for which such a difference matters use >>>>>>> CM at all? >>>>>> >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Gilles >>> >>> >>> --------------------------------------------------------------------- >>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org >>> For additional commands, e-mail: dev-h...@commons.apache.org >>> >>> >> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org >> For additional commands, e-mail: dev-h...@commons.apache.org >> >> > --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org