On Tue, Aug 11, 2009 at 12:17 AM, fft1976 <fft1...@gmail.com> wrote:

> I don't know JVM too well, but I think no efficient user-level
> solution is possible. Why? To take care of substructure sharing, you
> need to remember a set of shareable values that have already been
> serialized, and do "reference equality" comparisons when new new
> substructures are serialized.
>
> This comparison and a set implementation can easily be done with
> pointers (because you have "<"), but there are no pointers in the JVM,
> and no "reference inequality", so you must use linear seeks, making
> the time complexity of serialization quadratic, where in C/C++ it
> could be O(N log N)


Reference equality is available in the JVM (instructions if_acmpeq and
if_acmpne), in Java (operators == and !=), and in Clojure (predicate
identical?). Furthermore, though < on pointers isn't, so a tree-map of
already serialized structures to themselves also isn't, Java provides
System.identityHashCode() and IdentityHashMap. These use a hash that
respects reference equality. So one in fact can implement one's own
serialization that is O(n) using O(1) hashmap lookups (and using reflection,
and not working if SecurityManager won't let you setAccessible private
fields and the like, so not in an unsigned applet).

(Another use for reference equality is to see if Double.valueOf() is
caching, something that arose as an issue in another thread. On my system,
Sun JVM 1.6.0_13 -server and Clojure 1.0.0, it apparently is not:

user=> (identical? 2.0 2.0)
false

If this comes out to true then it's caching. Integer.valueOf() is caching on
my system, but only for small integers:

user=> (identical? 1 1)
true
user=> (identical? 5 5)
true
user=> (identical? 50 50)
true
user=> (identical? 500 500)
false
user=> (identical? 255 255)
false
user=> (identical? 127 127)
true
user=> (identical? 128 128)
false

The threshold seems to be at values that will fit in one byte.

[Remember the literals get boxed when passed to a function like identical?
that isn't inlined. And identical? isn't inlined:

user=> (meta (var identical?))
{:ns #<Namespace clojure.core>, :name identical?, :doc "Tests if 2 arguments
are the same object", :arglists ([x y])}

whereas two-argument + is:

user=> (meta (var +))
{:ns #<Namespace clojure.core>, :name +, :file "clojure/core.clj", :line
549, :arglists ([] [x] [x y] [x y & more]), :inline-arities #{2}, :inline
#<core$fn__3329 clojure.core$fn__3...@d337d3>, :doc "Returns the sum of
nums. (+) returns 0."}

You can also use ^#'identical? and ^#'+ but I like my Clojure looking like
Lisp, not like perl. :)])

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to