On how expensive Strings are: I work with this kind of object a lot, and 
the overhead is significant. How much it matters depends on your data: if 
you have a lot of small strings, the memory used can become substantial.
For instance, the word "Bazinga" takes almost 60 bytes in memory on a 
32-bit JVM, of which 24 bytes are used by the String object itself. The 
footprint is larger on a 64-bit JVM. You can find more information online 
(for example 
http://blog.nirav.name/2011/11/optimizing-string-memory-footprint-in.html).
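
A rough sketch of where a figure like that can come from. It assumes a 
32-bit HotSpot JVM with the older String layout: an 8-byte object header, 
a 12-byte array header, 8-byte alignment, and String fields for the char[] 
reference, offset, count and hash. These sizes are assumptions and vary 
with the JVM version and flags; align8 and estimated-string-bytes are 
made-up names for illustration:

```clojure
;; Assumed sizes for a 32-bit HotSpot JVM with the older String layout.
(def string-object-bytes 24) ; 8-byte header + 4 fields of 4 bytes each
(def array-header-bytes 12)  ; 8-byte header + 4-byte length

(defn align8
  "Rounds n up to the next multiple of 8 (assumed object alignment)."
  [n]
  (* 8 (quot (+ n 7) 8)))

(defn estimated-string-bytes
  "Estimates the heap bytes used by a String and its backing char[]."
  [s]
  (+ string-object-bytes
     (align8 (+ array-header-bytes (* 2 (count s))))))

(estimated-string-bytes "Bazinga") ;; => 56
```

Under these assumptions the 7 characters themselves account for only 14 of 
the 56 bytes, which is why many small strings add up quickly.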

Thus, when I need a lot of string chunks, I often prefer to use a character 
array. It reduces the memory footprint.
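
One possible way to do that in Clojure, sketched under the assumption that 
the chunks can be packed into a single char[] alongside a vector of 
offsets. pack-chunks and chunk-at are hypothetical helper names, not part 
of any library:

```clojure
(defn pack-chunks
  "Concatenates the chunks into one char[] and records where each one ends."
  [chunks]
  (let [sb (StringBuilder.)
        offsets (reduce (fn [acc ^String chunk]
                          (.append sb chunk)
                          (conj acc (.length sb)))
                        [0]
                        chunks)]
    {:data (.toCharArray (.toString sb))
     :offsets offsets}))

(defn chunk-at
  "Rebuilds chunk i as a String only when it is actually needed."
  [{:keys [data offsets]} i]
  (let [start (offsets i)
        end   (offsets (inc i))]
    (String. ^chars data (int start) (int (- end start)))))

(chunk-at (pack-chunks ["foo" "bar" "baz"]) 1) ;; => "bar"
```

Each call to chunk-at allocates a new String, so this only pays off when 
most chunks stay packed most of the time.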



On Saturday, January 5, 2013 00:06:48 UTC+1, larry google groups wrote:
>
>
> I am still somewhat new to Clojure and the JVM. I am querying a database 
> and trying to output some XML. For the XML, I am using this library:
>
> https://github.com/clojure/data.xml
>
> Apparently the database queries worked, and I was able to get the data 
> into an XML structure, but at some point the data gets lost and nothing is 
> being output. I decided, for the sake of debugging, I would add in some 
> println statements and output it to the terminal. 
>
> I was storing the xml in an atom called recent-activity. I attempted to 
> store this as clojure.data.xml elements in a vector. I got no output so I 
> decided to switch to emit-str and store it as a string. Then suddenly I 
> got an OutOfMemory error. I found this very surprising. Are strings that 
> expensive on memory?
>
> The code looks like this. The first function does the query against the 
> database:
>
> (defn update-recent-discourse-from-this-site [db k]
>   (jdbc/with-connection db
>     (jdbc/with-query-results database-results
>       [(str 
>         " SELECT
>               d.id, d.description, d.created_at, 
>               p.id as profile_id, p.first_name, p.last_name, 
>               u.username "
>         " FROM discourse as d, sf_guard_user as u, sf_guard_user_profile as p "
>         " WHERE d.user_id=p.user_id "
>         " AND d.user_id=u.id "
>         " AND p.user_id=u.id "
>         " AND d.question_id = 0 "
>         " AND d.answer_id = 0 "
>         " AND d.discourse_id = 0 "
>         " AND d.created_at > ? "
>         " ORDER BY d.created_at DESC "
>         " LIMIT 100 ")
>        (das/one-month-ago-as-a-string-for-the-database)]
>       (let [feed (transform-posts-into-a-feed database-results k "discourse")]
>         (swap! recent-activity concat feed @recent-activity)))))
>
> and this function was supposed to put the content into an atom. At first I 
> did not use emit-str, but I added that on my last attempt to figure out 
> what is going on. 
>
> I was using "conj", but then switched to "concat" when I switched to using 
> emit-str. 
>
> This is the function that formed the XML:
>
> (defn transform-posts-into-a-feed
>   [database-results k what-type-of-item-is-this]
>   (let [site-url (make-site-url k)
>         map-of-xml-elements
>         (reduce
>          (fn [feed db-row]
>            (conj feed
>                  (xml/emit-str
>                   (xml/element :item {}
>                     (xml/element :what-type-of-item-is-this {}
>                       (str what-type-of-item-is-this))
>                     (xml/element :username {}
>                       (str (make-user-nice-name db-row)))
>                     (xml/element :user-profile-url {}
>                       (str (make-profile-url site-url (:profile_id db-row))))
>                     (xml/element :in-response-to-url {}
>                       (str (make-in-response-to-url
>                             site-url
>                             what-type-of-item-is-this
>                             (:in_response_to_id db-row))))
>                     (xml/element :site-url {} (str site-url))
>                     (xml/element :title {} (str (:title db-row)))
>                     (xml/element :item-url {}
>                       (str (make-item-url site-url
>                                           what-type-of-item-is-this
>                                           (:id db-row))))
>                     (xml/element :description {}
>                       (str (:description db-row)))
>                     (xml/element :date {} (str (:created_at db-row)))))))
>          [] database-results)]
>     map-of-xml-elements))
>
> For a while (before I used emit-str), at the end of this function, I had 
> this:
>
> (println (apply str   map-of-xml-elements))
>
> and I could see that the data in map-of-xml-elements was what I expected. 
> And yet the atom "recent-activity" seemed to remain empty, which I found 
> very confusing. 
>
> The function that basically drives this app (called from main) is the one 
> that throws an error:
>
> (defn iterate-through-sites-and-output-files []
>   "2012-11-10 - The TMA server might have 10 or 20 or more websites, each 
> with their own database config. We need to update every site. 
> fg/database-connections holds a map where the key is the name of the site 
> and the value is another map that has all of the info needed to connect to 
> that site's database."
>   (println "Now we will iterate over the sites again. The time is: "
>            (das/current-time-as-string))
>   (doseq [[k db] @fg/database-connections]
>     (println (apply str " We will now update " (str k)))
>     (ura/update-recent-activity-from-this-site db k)
>     (println (mem/show-stats-regarding-resources-used-by-this-app))
>     (println (apply str (debug/thread-top)))
>     (println " We are processing " (str k))
>     (println "This is what we currently have stored up as recent activity: "))
>   (println "We will now wait 1 hour, then iterate over all of the sites again. The time is: " (das/current-time-as-string))
>   (. java.lang.Thread sleep 3600000)
>   (iterate-through-sites-and-output-files))
>
> The error I get: 
>
> Exception in thread "Thread-0" java.lang.OutOfMemoryError: Java heap space
>         at java.util.Arrays.copyOf(Unknown Source)
>         at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source)
>         at java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
>         at java.lang.AbstractStringBuilder.append(Unknown Source)
>         at java.lang.StringBuilder.append(Unknown Source)
>         at clojure.core$str$fn__3501.invoke(core.clj:500)
>         at clojure.core$str.doInvoke(core.clj:502)
>         at clojure.lang.RestFn.applyTo(RestFn.java:139)
>         at clojure.core$apply.invoke(core.clj:600)
>         at recent_activity.core$iterate_through_sites_and_output_files.invoke(core.clj:33)
>         at clojure.lang.AFn.run(AFn.java:24)
>         at java.lang.Thread.run(Unknown Source)
>
> line 33 is:
>
>   (println "We will now wait 1 hour, then iterate over all of the sites again. The time is: " (das/current-time-as-string))
>
> That line looks innocent and it did not cause a problem before. It just 
> started causing a problem when I started using emit-str to store the XML as 
> a string in the atom recent-activity. 
>
> That leads to 2 questions: 
>
> 1.) are strings expensive on memory? 
>
> 2.) what are the simplest profiling tools I can use to compare the memory 
> use of emit-str versus what I was doing previously? 
>
> I am giving up on the use of emit-str and I'm going to try a different 
> approach. But I would be grateful for any insights about why I might have 
> gotten the OutOfMemory error. 
>