2nd issue: Benchmarks

I use both criterium and a simple 'run repeatedly and divide the clock time' approach.
I've had trouble getting consistent results from run to run with either. Most recently (yesterday) I added many more warmup runs, giving HotSpot lots of time to do its stuff, which seems to be stabilizing the results. This might be unfair, because in practice a lot of important code might never run that many times. At least it gives the limiting performance of code that runs a lot.

Unfortunately, this benchmark is split over 3 github projects, so it might be a little hard to follow.

The task I'm timing is computing the sum of squares of the elements in a moderately large (4M elements) double[] array (aka L2Norm). I've compared 8 implementations, all of which accumulate the sum in Java.

  6.37ms  inline         --- naive sum of squares in Java.
  6.37ms  invokestatic   --- calls a static method in Java to square the elements.
  6.38ms  primitive      --- calls square.invokePrim(x[i]) to square the elements.
  6.37ms  boxprimitive   --- calls square.invoke(x[i]) to square the elements.
  6.37ms  funxprimitive  --- calls funxSquare.invokePrim(), where funxSquare is square wrapped with MetaFn, an experimental metadata wrapper.
  6.38ms  funxboxed      --- calls funxSquare.invoke().
 43.55ms  boxed          --- calls boxedSquare.invoke(), where boxedSquare is a version of square without type hints.
151.61ms  cljmeta        --- calls metaSquare.invoke(), where metaSquare is the result of (with-meta square {...}).

Clojure 1.8.0, Oracle JDK 1.8, Win10, Lenovo X1 i5-7300U.
Sum of squares for each of 2 double[4194304], in 2 threads, concurrently.

boxed and cljmeta create a lot of garbage, causing their clock times to be more variable, depending on exactly what GC does. It's possible they appear relatively faster with a small number of total calls, if they don't trigger a GC and HotSpot doesn't fully optimize the others. I think a small number of calls is a valid usage pattern, but very many calls to small functions is the case that I'm interested in at present.
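
A minimal sketch of the 'warm up, run repeatedly, divide the clock time' approach described above, for readers who don't want to chase the three repositories. It is not the code from benchtools or function-experiments: L2NormSketch, viaFunction, WithMeta and meanMillis are made-up names, java.util.function.DoubleUnaryOperator stands in for clojure.lang.IFn$DD so that it runs without Clojure on the classpath, and the warmup and repetition counts are arbitrary. Timings from it should not be expected to match the numbers listed above.

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.DoubleUnaryOperator;

// Illustrative sketch only; none of these names come from the linked repositories.
final class L2NormSketch {

  // Analogue of the "inline" case: naive sum of squares.
  static double inline(final double[] x) {
    double sum = 0.0;
    for (int i = 0; i < x.length; i++) { sum += x[i] * x[i]; }
    return sum;
  }

  // Analogue of the "primitive" case: square each element through a
  // primitive-typed interface call (DoubleUnaryOperator stands in for
  // clojure.lang.IFn$DD; that substitution is an assumption).
  static double viaFunction(final DoubleUnaryOperator square, final double[] x) {
    double sum = 0.0;
    for (int i = 0; i < x.length; i++) { sum += square.applyAsDouble(x[i]); }
    return sum;
  }

  // Hypothetical analogue of a metadata wrapper: holds a metadata map and
  // delegates the primitive call to the wrapped function.
  static final class WithMeta implements DoubleUnaryOperator {
    final DoubleUnaryOperator f;
    final Map<String, Object> meta;
    WithMeta(final DoubleUnaryOperator f, final Map<String, Object> meta) {
      this.f = f;
      this.meta = meta;
    }
    @Override
    public double applyAsDouble(final double v) { return f.applyAsDouble(v); }
  }

  // Run `warmup` untimed passes, then `reps` timed passes;
  // return the mean wall-clock milliseconds per timed pass.
  static double meanMillis(final DoubleUnaryOperator square, final double[] x,
                           final int warmup, final int reps) {
    double sink = 0.0; // accumulate results so the computed sums stay live
    for (int i = 0; i < warmup; i++) { sink += viaFunction(square, x); }
    final long start = System.nanoTime();
    for (int i = 0; i < reps; i++) { sink += viaFunction(square, x); }
    final long elapsed = System.nanoTime() - start;
    if (Double.isNaN(sink)) { System.out.println("unexpected NaN"); }
    return (elapsed / 1.0e6) / reps;
  }

  public static void main(final String[] args) {
    final double[] x = new double[1 << 22]; // 4194304 elements
    for (int i = 0; i < x.length; i++) { x[i] = i % 7; }
    final DoubleUnaryOperator square = v -> v * v;
    final DoubleUnaryOperator wrapped =
      new WithMeta(square, Collections.<String, Object>singletonMap("domain", "double"));
    System.out.println("plain   msec/pass: " + meanMillis(square, x, 200, 200));
    System.out.println("wrapped msec/pass: " + meanMillis(wrapped, x, 200, 200));
  }
}
```

Raising the warmup count is the knob that corresponds to 'giving HotSpot lots of time'; with too few warmup passes the timed loop may still include interpreted or partially compiled code.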
Main scripts:

Using criterium:
https://github.com/palisades-lakes/function-experiments/blob/dynesty/src/scripts/clojure/palisades/lakes/funx/l2norm/bench.clj

Running the benchmark 4k times and dividing the clock time:
https://github.com/palisades-lakes/function-experiments/blob/dynesty/src/scripts/clojure/palisades/lakes/funx/l2norm/msec.clj

These both use general benchmarking code from
https://github.com/palisades-lakes/benchtools

The experimental metadata function wrapper is in:
https://github.com/palisades-lakes/dynamic-functions/blob/dynesty/src/main/java/palisades/lakes/dynafun/java/MetaFn.java

On Wed, Sep 20, 2017 at 11:17 AM, John McDonald <palisades.la...@gmail.com> wrote:

> Thanks for the quick response.
>
> One issue at a time:
>
> (A) Putting metadata on Vars instead of on the functions themselves:
>
> I need to be able to associate facts with the function instances. I can't
> rely on every function being bound to a Var.
> For example, I'm constructing cost functions for machine learning, and
> other applications, by summing, composing, etc. other functions.
>
> On Tue, Sep 19, 2017 at 11:34 PM, Alex Miller <a...@puredanger.com> wrote:
>
>> On Tuesday, September 19, 2017 at 8:01:07 PM UTC-5, John Alan McDonald
>> wrote:
>>>
>>> I'd like to be able to do something like:
>>>
>>> (defn square ^double [^double x] (* x x))
>>> (def meta-square (with-meta square {:domain Double/TYPE :codomain Double/TYPE
>>>   :range {:from 0.0 :to Double/POSITIVE_INFINITY :also Double/NaN}}))
>>>
>>> https://clojure.org/reference/metadata
>>> says "Symbols and collections support metadata...". Nothing about
>>> whether any other types do or do not support metadata.
>>>
>>
>> Functions are probably the big missing thing in that list of types that
>> have metadata that you can modify.
>>
>> A few other things also have metadata that can be set at construction and
>> read but not modified after construction: namespaces and the reference
>> types (vars, atoms, refs, agents).
>>
>>> The code above works, at least in the sense that it doesn't throw
>>> exceptions, and meta-square is a function that returns the right values,
>>> and has the right metadata.
>>> That's because square is an instance of a class that extends AFunction,
>>> which implements IObj
>>> (https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/AFunction.java#L18).
>>>
>>> It doesn't work, in the sense that it violates "Two objects that differ
>>> only in metadata are equal." from https://clojure.org/reference/metadata.
>>> That is,
>>> (= square meta-square)
>>> returns false.
>>>
>>
>> I think the tricky thing here is that functions are only equal if they
>> are identical. Perhaps it would make sense to implement equals in the
>> function hierarchy to somehow "unwrap" the meta wrappers. I'm not sure if
>> that even makes sense.
>>
>> Another option here is this, which I think would be more typical:
>>
>> (def ^{:domain Double/TYPE :codomain Double/TYPE :range {:from 0.0 :to
>> Double/POSITIVE_INFINITY :also Double/NaN}} meta-square square)
>>
>> that puts the meta on the var (meta-square), not on the function instance
>> itself. In this case equality works and the invocation timing should be
>> about the same since in both cases you're going through the var to invoke
>> the identical function. You can retrieve the meta with (meta #'meta-square)
>> since it's on the var.
>>
>>> For my purposes, what really matters is that calling meta-square has
>>> roughly 30 times the cost of square itself (and about 3 times the cost of a
>>> version without type hints).
>>>
>>
>> I see the overhead of your meta-square as more like 2 times the cost in a
>> quick test, not sure how you're testing.
>> I'm using 1.9.0-beta1 and Java 8
>> and timing 100,000 invocations over a series of runs. There's a lot of
>> variability - using something like Criterium would yield better data.
>>
>>> The reason is that meta-square is an instance of a class that extends
>>> RestFn
>>> (https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/AFunction.java#L26),
>>> whose invoke() methods are expensive.
>>>
>>> Also, for my purposes, it would actually be better if "Two objects
>>> that differ only in metadata are NOT equal." So perhaps I shouldn't be
>>> using metadata at all. It just seems
>>>
>>> Options:
>>>
>>> (1) Add a meta field to clojure.lang.AFunction (and fix equals and
>>> hashcode). I presume the reason there isn't already a meta field is to keep
>>> functions as light weight as possible. Are there good benchmarks that I
>>> could use to measure the cost of adding an almost always empty field?
>>>
>>
>> I think it would add the cost of an object ref (prob 32 or 64 bits) to
>> every function object and I don't think it matters if it's empty or not. I
>> don't really know the reason for the current design, would require some
>> research. There are a LOT of potential considerations here with respect to
>> backwards compatibility, etc. Any change like this would be treated very
>> carefully. I do not think the need is necessarily worth such a change, but
>> it's hard to weigh that.
>>
>>> (2) Experiments with a mechanical wrapper class
>>> (https://github.com/palisades-lakes/dynamic-functions/blob/dynesty/src/main/java/palisades/lakes/dynafun/java/MetaFn.java)
>>> show almost no overhead, but extending that to cover every possible
>>> combination of clojure.lang.IFn$DD, clojure.lang.IFn$DLD, ..., is
>>> impractical.
>>>
>>
>> That's what code gen is for.
>>
>>> (3) Use asm to create a new class that extends the original function's
>>> class and implements IObj in the obvious way.
>>>
>>
>> The asm included inside Clojure should be considered an internal
>> implementation detail, subject to version and API changes without warning.
>>
>>> My short term plan is (2), ignoring the equals violation, and
>>> implementing primitive interface wrappers as needed.
>>>
>>
>> Will the var version above satisfy?
>>
>>> Are there problems with (3) asm, as a long term solution?
>>>
>>
>> As mentioned above, you should not rely on this being available or free
>> from breakage.
>>
>> --
>> You received this message because you are subscribed to the Google
>> Groups "Clojure" group.
>> To post to this group, send email to clojure@googlegroups.com
>> Note that posts from new members are moderated - please be patient with
>> your first post.
>> To unsubscribe from this group, send email to
>> clojure+unsubscr...@googlegroups.com
>> For more options, visit this group at
>> http://groups.google.com/group/clojure?hl=en
>> ---
>> You received this message because you are subscribed to a topic in the
>> Google Groups "Clojure" group.
>> To unsubscribe from this topic, visit
>> https://groups.google.com/d/topic/clojure/D8mksieuUPI/unsubscribe.
>> To unsubscribe from this group and all its topics, send an email to
>> clojure+unsubscr...@googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.