On Saturday, 27 July 2013 17:30:08 UTC+2, Mikera wrote:

> On Saturday, 27 July 2013 03:59:55 UTC+1, Jeremy Heiler wrote:
>
>> On July 26, 2013 at 10:39:47 AM, Jürgen Hötzel (jue...@hoetzel.info) 
>> wrote:
>>
>> I did some memory profiling on a Clojure Application.
>>
>> I wondered why 361000 clojure.lang.Symbols exist.
>>
>> So I did some object browsing on the memory dump and found duplicate 
>> symbols. After checking the source:
>>
>> static public Symbol intern(String nsname){
>>         int i = nsname.indexOf('/');
>>         if(i == -1 || nsname.equals("/"))
>>                 return new Symbol(null, nsname.intern());
>>         else
>>                 return new Symbol(nsname.substring(0, i).intern(),
>>                                   nsname.substring(i + 1).intern());
>> }
>>
>>
>> I realized that interning of a symbol always returns a "new" Symbol 
>> object.
>>
>> If a symbol X is interned twice, shouldn't the second Symbol.intern(X) 
>> return the previously interned symbol object?
>>
>> It's not the symbol that's being interned, it's the string that 
>> represents the symbol. If you look at the equals() method for Symbol, 
>> you'll notice that it is using object equality (the double equals operator, 
>> ==) to compare the name of the symbol. So, if you have two symbols with the 
>> same name, they will be equivalent because they have the same string object 
>> internally. However, since each symbol can have its own metadata, the 
>> symbol objects cannot be interned as you are describing. Keywords fill 
>> that role, as Marshall mentioned.
>>
> You could plausibly intern all the symbols that are created with nil 
> metadata - a fairly common case?
>
>
Yes, I think this is the common case when reading Clojure code. I was 
surprised that 49% of CPU time is spent in Symbol.intern here (using 
bultitude):

(namespaces-in-jar (io/file 
"/home/juergen/.lein/self-installs/leiningen-2.2.1-SNAPSHOT-standalone.jar"))
 

> Not sure if it would be worth it though. Symbols are already very 
> lightweight (assuming the underlying Strings are interned). You would win a 
> bit in terms of memory usage and quicker equality testing, at the expense 
> of having to maintain an interned symbol cache and perform lookups in it 
> all the time.
>
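
To make the trade-off above concrete, here is a minimal Java sketch of an interned symbol cache for metadata-free symbols. It is not how clojure.lang.Symbol works: the names SymbolCacheSketch, Sym and CACHE are hypothetical, and the cache never evicts entries, which a real implementation would probably have to address (for example with weak references).

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; this is NOT clojure.lang.Symbol.
class SymbolCacheSketch {
    // Simplified stand-in for a symbol that carries no metadata.
    static final class Sym {
        final String ns;   // null when the symbol has no namespace part
        final String name;
        Sym(String ns, String name) { this.ns = ns; this.name = name; }
    }

    // Cache keyed by the full "ns/name" string. Entries are never evicted,
    // so this sketch trades memory held by the cache for object identity.
    static final ConcurrentHashMap<String, Sym> CACHE = new ConcurrentHashMap<>();

    // Same parsing rule as the quoted Symbol.intern, but each distinct
    // string yields exactly one Sym object.
    static Sym intern(String nsname) {
        return CACHE.computeIfAbsent(nsname, key -> {
            int i = key.indexOf('/');
            if (i == -1 || key.equals("/"))
                return new Sym(null, key);
            return new Sym(key.substring(0, i), key.substring(i + 1));
        });
    }

    public static void main(String[] args) {
        Sym a = intern("clojure.core/map");
        Sym b = intern("clojure.core/map");
        System.out.println(a == b);            // true: the second call hits the cache
        System.out.println(intern("map") == a); // false: a different symbol
    }
}
```

Every call now pays for a hash lookup, which is the cost mentioned above; whether that beats allocating a fresh object each time would have to be measured.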

I was also surprised that interned Symbols are not identical (like interned 
Strings):

user> (identical? "aa" "aa")
true
user> (identical? 'aa 'aa)
false

whereas in Common Lisp (CL):

CL-USER> (eq 'a 'a)
T

because Symbols are not really interned, as mentioned above.
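
The behaviour can be reproduced with a small Java sketch, assuming a simplified stand-in class (Sym here is hypothetical, with metadata left out and hashing simplified) that follows the quoted intern() and the reference-comparing equals() described earlier in the thread:

```java
import java.util.Objects;

// Hypothetical stand-in for clojure.lang.Symbol; metadata is omitted.
class SymbolIdentitySketch {
    static final class Sym {
        final String ns;   // null when there is no namespace part
        final String name;
        private Sym(String ns, String name) { this.ns = ns; this.name = name; }

        // Follows the quoted intern(): the strings are interned,
        // but the Sym object itself is always freshly allocated.
        static Sym intern(String nsname) {
            int i = nsname.indexOf('/');
            if (i == -1 || nsname.equals("/"))
                return new Sym(null, nsname.intern());
            return new Sym(nsname.substring(0, i).intern(),
                           nsname.substring(i + 1).intern());
        }

        @Override public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Sym)) return false;
            Sym s = (Sym) o;
            // Reference comparison suffices because both strings are interned.
            return ns == s.ns && name == s.name;
        }

        @Override public int hashCode() { return Objects.hash(ns, name); }
    }

    public static void main(String[] args) {
        Sym a = Sym.intern("aa");
        Sym b = Sym.intern(new String("aa")); // a distinct String object on purpose
        System.out.println(a.equals(b));      // true: same interned name
        System.out.println(a == b);           // false: two separate Sym objects
        System.out.println(a.name == b.name); // true: one shared String
    }
}
```

So the two symbols are equal but not identical, which matches the (identical? 'aa 'aa) result above.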

Jürgen


-- 
-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en