I am still somewhat new to Clojure and the JVM. I am querying a database and trying to output some XML. For the XML, I am using this library:

https://github.com/clojure/data.xml

Apparently the database queries worked, and I was able to get the data into an XML structure, but at some point the data gets lost and nothing is output. For the sake of debugging, I added some println statements to write it to the terminal.

I was storing the XML in an atom called recent-activity. At first I tried to store it as clojure.data.xml elements in a vector. I got no output, so I decided to switch to emit-str and store it as strings. Then suddenly I got an OutOfMemory error. I found this very surprising. Are strings that expensive on memory?

The code looks like this. The first function does the query against the database:

    (defn update-recent-discourse-from-this-site [db k]
      (jdbc/with-connection db
        (jdbc/with-query-results database-results
          [(str " SELECT d.id, d.description, d.created_at, p.id as profile_id, p.first_name, p.last_name, u.username "
                " FROM discourse as d, sf_guard_user as u, sf_guard_user_profile as p "
                " WHERE d.user_id=p.user_id "
                " AND d.user_id=u.id "
                " AND p.user_id=u.id "
                " AND d.question_id = 0 "
                " AND d.answer_id = 0 "
                " AND d.discourse_id = 0 "
                " AND d.created_at > ? "
                " ORDER BY d.created_at DESC "
                " LIMIT 100 ")
           (das/one-month-ago-as-a-string-for-the-database)]
          (let [feed (transform-posts-into-a-feed database-results k "discourse")]
            (swap! recent-activity concat feed @recent-activity)))))

This function was also supposed to put the content into the atom. At first I did not use emit-str, but I added that on my last attempt to figure out what is going on. I was using "conj", but then switched to "concat" when I switched to using emit-str.
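To make the swap! call above concrete, here is a minimal sketch of how swap! passes its arguments. swap! calls the function with the atom's current value as the first argument, followed by the extra arguments, so a call of the shape (swap! a concat feed @a) evaluates (concat current feed current). The atom `activity`, the two helper functions, and the sample strings are hypothetical stand-ins, not the code from the application, and this may or may not be the whole cause of the error.

```clojure
;; Hypothetical stand-in for the recent-activity atom.
(def activity (atom []))

(defn add-feed-as-posted!
  "Same shape as the call in the post: (swap! a concat feed @a).
  swap! supplies the current value first, so this evaluates
  (concat current feed current) and the old contents appear twice."
  [a feed]
  (swap! a concat feed @a))

(defn add-feed!
  "Alternative shape (an assumption about the intent): append the feed
  once and keep the stored value a vector."
  [a feed]
  (swap! a into feed))

;; Three updates with one item each.
(add-feed-as-posted! activity ["<item>1</item>"])
(add-feed-as-posted! activity ["<item>2</item>"])
(add-feed-as-posted! activity ["<item>3</item>"])
(count @activity) ;; => 7, not 3: the size roughly doubles on every call
```

With up to 100 rows per site and 10 or 20 sites per pass, a collection that doubles on every update would hold many millions of entries before the first pass finishes, regardless of whether the entries are strings or XML elements.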
This is the function that formed the XML:

    (defn transform-posts-into-a-feed [database-results k what-type-of-item-is-this]
      (let [site-url (make-site-url k)
            map-of-xml-elements
            (reduce
             (fn [feed db-row]
               (conj feed
                     (xml/emit-str
                      (xml/element :item {}
                        (xml/element :what-type-of-item-is-this {} (str what-type-of-item-is-this))
                        (xml/element :username {} (str (make-user-nice-name db-row)))
                        (xml/element :user-profile-url {} (str (make-profile-url site-url (:profile_id db-row))))
                        (xml/element :in-response-to-url {} (str (make-in-response-to-url site-url what-type-of-item-is-this (:in_response_to_id db-row))))
                        (xml/element :site-url {} (str site-url))
                        (xml/element :title {} (str (:title db-row)))
                        (xml/element :item-url {} (str (make-item-url site-url what-type-of-item-is-this (:id db-row))))
                        (xml/element :description {} (str (:description db-row)))
                        (xml/element :date {} (str (:created_at db-row)))))))
             []
             database-results)]
        map-of-xml-elements))

For a while (before I used emit-str), at the end of this function, I had this:

    (println (apply str map-of-xml-elements))

and I could see that the data in map-of-xml-elements was what I expected. And yet the atom "recent-activity" seemed to remain empty, which I found very confusing.

The function that basically drives this app (called from main) is the one that throws an error:

    (defn iterate-through-sites-and-output-files
      "2012-11-10 - The TMA server might have 10 or 20 or more websites, each
      with their own database config. We need to update every site.
      fg/database-connections holds a map where the key is the name of the
      site and the value is another map that has all of the info needed to
      connect to that site's database."
      []
      (println "Now we will iterate over the sites again. The time is: " (das/current-time-as-string))
      (doseq [[k db] @fg/database-connections]
        (println (apply str " We will now update " (str k)))
        (ura/update-recent-activity-from-this-site db k)
        (println (mem/show-stats-regarding-resources-used-by-this-app))
        (println (apply str (debug/thread-top)))
        (println " We are processing " (str k))
        (println "This is what we currently have stored up as recent activity: "))
      (println "We will now wait 1 hour, then iterate over all of the sites again. The time is: " (das/current-time-as-string))
      (. java.lang.Thread sleep 3600000)
      (iterate-through-sites-and-output-files))

The error I get:

    Exception in thread "Thread-0" java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Unknown Source)
        at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source)
        at java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
        at java.lang.AbstractStringBuilder.append(Unknown Source)
        at java.lang.StringBuilder.append(Unknown Source)
        at clojure.core$str$fn__3501.invoke(core.clj:500)
        at clojure.core$str.doInvoke(core.clj:502)
        at clojure.lang.RestFn.applyTo(RestFn.java:139)
        at clojure.core$apply.invoke(core.clj:600)
        at recent_activity.core$iterate_through_sites_and_output_files.invoke(core.clj:33)
        at clojure.lang.AFn.run(AFn.java:24)
        at java.lang.Thread.run(Unknown Source)

Line 33 is:

    (println "We will now wait 1 hour, then iterate over all of the sites again. The time is: " (das/current-time-as-string))

That line looks innocent and it did not cause a problem before. It only started causing a problem when I started using emit-str to store the XML as a string in the atom recent-activity.

That leads to 2 questions:

1.) Are strings expensive on memory?

2.) What are the simplest profiling tools I can use to compare the memory use of emit-str versus what I was doing previously?

I am giving up on the use of emit-str and I'm going to try a different approach.
But I would be grateful for any insights about why I might have gotten the OutOfMemory error.
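On the profiling question, one very simple way to compare two approaches is to read the heap figures that java.lang.Runtime exposes before and after each update. The sketch below is only an illustration: `heap-stats` and `used-heap-bytes` are hypothetical helper names (they are not the mem/show-stats-regarding-resources-used-by-this-app function from the post), and the numbers are approximate because the garbage collector runs on its own schedule.

```clojure
(defn used-heap-bytes
  "Approximate number of heap bytes currently in use by this JVM."
  []
  (let [rt (Runtime/getRuntime)]
    (- (.totalMemory rt) (.freeMemory rt))))

(defn heap-stats
  "Returns a map of heap figures in megabytes, read from java.lang.Runtime."
  []
  (let [rt (Runtime/getRuntime)
        mb (fn [n] (quot n (* 1024 1024)))]
    {:used-mb  (mb (used-heap-bytes))
     :total-mb (mb (.totalMemory rt))
     :max-mb   (mb (.maxMemory rt))}))

;; Usage: print the figures around the step under suspicion.
(println "before:" (heap-stats))
;; ... run one update here ...
(println "after: " (heap-stats))
```

For a closer look, the JDK tools can help: jmap -histo with the process id prints a per-class histogram of live objects, and starting the JVM with -XX:+HeapDumpOnOutOfMemoryError writes a heap dump when the error occurs, which can then be opened in a heap analyzer.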