About how expensive Strings are: having worked a lot with this kind of object, I can say the overhead is significant. How much it matters depends on your data: if you have a lot of small strings, the memory they use can become substantial. For instance, the word "Bazinga" takes almost 60 bytes in memory on a 32-bit JVM, of which 24 bytes are used by the String object itself. The footprint is larger on a 64-bit JVM. You can find more information online (for example http://blog.nirav.name/2011/11/optimizing-string-memory-footprint-in.html).
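A rough sketch of the arithmetic behind that figure. Everything here is an assumption for illustration, not a measurement: a 32-bit HotSpot JVM, the older String layout with value/offset/count/hash fields (JDKs before 7u6), an 8-byte object header, a 12-byte array header and 8-byte alignment. The function names are made up for this example.

```clojure
;; Rough estimate only. Assumed layout (32-bit HotSpot, String with
;; value/offset/count/hash fields, as in JDKs before 7u6):
;;   String object: 8-byte header + 4 fields * 4 bytes = 24 bytes
;;   char[]:        12-byte header + 2 bytes per char, padded to 8 bytes

(defn align8
  "Rounds n up to the next multiple of 8 (assumed object alignment)."
  [n]
  (* 8 (quot (+ n 7) 8)))

(defn char-array-bytes
  "Estimated size of a char[] holding len characters."
  [len]
  (align8 (+ 12 (* 2 len))))

(defn string-bytes
  "Estimated size of a String of len characters, including its char[]."
  [len]
  (+ 24 (char-array-bytes len)))

(string-bytes (count "Bazinga"))     ;=> 56
(char-array-bytes (count "Bazinga")) ;=> 32
```

Under these assumptions the String object adds 24 bytes on top of the 32-byte character array, so the cost per string is dominated by fixed overhead when the strings are short.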
So when I need a lot of string chunks, I often prefer to use a character array, which reduces the memory footprint.

On Saturday, 5 January 2013 00:06:48 UTC+1, larry google groups wrote:
>
> I am still somewhat new to Clojure and the JVM. I am querying a database
> and trying to output some XML. For the XML, I am using this library:
>
> https://github.com/clojure/data.xml
>
> Apparently the database queries worked, and I was able to get the data
> into an XML structure, but at some point the data gets lost and nothing is
> being output. I decided, for the sake of debugging, I would add in some
> println statements and output it to the terminal.
>
> I was storing the xml in an atom called recent-activity. I attempted to
> store this as clojure.data.xml elements in a vector. I got no output so I
> decided to switch to emit-str and store it as a string. Then suddenly I
> got an OutOfMemory error. I found this very surprising. Are strings that
> expensive on memory?
>
> The code looks like this. The first function does the query against the
> database:
>
> (defn update-recent-discourse-from-this-site [db k]
>   (jdbc/with-connection db
>     (jdbc/with-query-results database-results
>       [(str
>         " SELECT
>           d.id, d.description, d.created_at,
>           p.id as profile_id, p.first_name, p.last_name,
>           u.username "
>         " FROM discourse as d, sf_guard_user as u, sf_guard_user_profile as p "
>         " WHERE d.user_id=p.user_id "
>         " AND d.user_id=u.id "
>         " AND p.user_id=u.id "
>         " AND d.question_id = 0 "
>         " AND d.answer_id = 0 "
>         " AND d.discourse_id = 0 "
>         " AND d.created_at > ? "
>         " ORDER BY d.created_at DESC "
>         " LIMIT 100 ")
>        (das/one-month-ago-as-a-string-for-the-database)]
>       (let [feed (transform-posts-into-a-feed database-results k "discourse")]
>         (swap! recent-activity concat feed @recent-activity)))))
>
> and this function was supposed to put the content into an atom. At first I
> did not use emit-str, but I added that on my last attempt to figure out
> what is going on.
>
> I was using "conj", but then switched to "concat" when I switched to using
> emit-str.
>
> This is the function that formed the XML:
>
> (defn transform-posts-into-a-feed [database-results k what-type-of-item-is-this]
>   (let [site-url (make-site-url k)
>         map-of-xml-elements
>         (reduce
>           (fn [feed db-row]
>             (conj feed
>               (xml/emit-str
>                 (xml/element :item {}
>                   (xml/element :what-type-of-item-is-this {} (str what-type-of-item-is-this))
>                   (xml/element :username {} (str (make-user-nice-name db-row)))
>                   (xml/element :user-profile-url {} (str (make-profile-url site-url (:profile_id db-row))))
>                   (xml/element :in-response-to-url {} (str (make-in-response-to-url site-url what-type-of-item-is-this (:in_response_to_id db-row))))
>                   (xml/element :site-url {} (str site-url))
>                   (xml/element :title {} (str (:title db-row)))
>                   (xml/element :item-url {} (str (make-item-url site-url what-type-of-item-is-this (:id db-row))))
>                   (xml/element :description {} (str (:description db-row)))
>                   (xml/element :date {} (str (:created_at db-row)))
>                   ))))
>           [] database-results)]
>     map-of-xml-elements))
>
> For a while (before I used emit-str), at the end of this function, I had
> this:
>
> (println (apply str map-of-xml-elements))
>
> and I could see that the data in map-of-xml-elements was what I expected.
> And yet the atom "recent-activity" seemed to remain empty, which I found
> very confusing.
>
> The function that basically drives this app (called from main) is the one
> that throws an error:
>
> (defn iterate-through-sites-and-output-files []
>   "2012-11-10 - The TMA server might have 10 or 20 or more websites, each
>   with their own database config. We need to update every site.
>   fg/database-connections holds a map where the key is the name of the site
>   and the value is another map that has all of the info needed to connect to
>   that sites database."
>   (println "Now we will iterate over the sites again. The time is: "
>            (das/current-time-as-string))
>   (doseq [[k db] @fg/database-connections]
>     (println (apply str " We will now update " (str k)))
>     (ura/update-recent-activity-from-this-site db k)
>     (println (mem/show-stats-regarding-resources-used-by-this-app))
>     (println (apply str (debug/thread-top)))
>     (println " We are processing " (str k))
>     (println "This is what we currently have stored up as recent activity: ")
>   (println "We will now wait 1 hour, then iterate over all of the sites again. The time is: " (das/current-time-as-string))
>   (. java.lang.Thread sleep 3600000)
>   (iterate-through-sites-and-output-files))
>
> The error I get:
>
> Exception in thread "Thread-0" java.lang.OutOfMemoryError: Java heap space
>     at java.util.Arrays.copyOf(Unknown Source)
>     at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source)
>     at java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
>     at java.lang.AbstractStringBuilder.append(Unknown Source)
>     at java.lang.StringBuilder.append(Unknown Source)
>     at clojure.core$str$fn__3501.invoke(core.clj:500)
>     at clojure.core$str.doInvoke(core.clj:502)
>     at clojure.lang.RestFn.applyTo(RestFn.java:139)
>     at clojure.core$apply.invoke(core.clj:600)
>     at recent_activity.core$iterate_through_sites_and_output_files.invoke(core.clj:33)
>     at clojure.lang.AFn.run(AFn.java:24)
>     at java.lang.Thread.run(Unknown Source)
>
> line 33 is:
>
> (println "We will now wait 1 hour, then iterate over all of the sites again. The time is: " (das/current-time-as-string))
>
> That line looks innocent and it did not cause a problem before. It just
> started causing a problem when I started using emit-str to store the XML as
> a string in the atom recent-activity.
>
> That leads to 2 questions:
>
> 1.) are strings expensive on memory?
>
> 2.) what are the simplest profiling tools I can use to compare the memory
> use of emit-str versus what I was doing previously?
>
> I am giving up on the use of emit-str and I'm going to try a different
> approach. But I would be grateful for any insights about why I might have
> gotten the OutOfMemory error.