If the JVM does have enough memory, you may want to try building up the
map using a transient.

I'm not sure whether this is faster (it may well be slower), but you can
write the function you pass to reduce more succinctly:

(fn [G [v1 v2]] (update-in G [v1] (fnil conj []) v2))
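
For what it's worth, a minimal sketch of the transient approach might look
like the following. It assumes the same io (clojure.java.io) and str
(clojure.string) aliases as the original code; the names add-edge! and
load-graph-transient are made up for illustration, and it is untested on
the 800k-line input, so treat it as a starting point rather than a fix.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

(defn add-edge!
  "Adds the edge v1 -> v2 to the transient map G and returns the
  (possibly new) transient. Hypothetical helper name."
  [G [v1 v2]]
  ;; get works on transient maps; default to [] for an unseen vertex
  (assoc! G v1 (conj (get G v1 []) v2)))

(defn load-graph-transient
  "Reads one 'v1 v2' pair per line and builds {v1 [v2 ...]}."
  [input-f]
  (with-open [rdr (io/reader input-f)]
    (->> (line-seq rdr)
         (map (fn [row]
                (let [[v1str v2str] (str/split row #"\s")]
                  [(Integer/parseInt v1str) (Integer/parseInt v2str)])))
         ;; always thread the value returned by assoc! through reduce
         (reduce add-edge! (transient {}))
         persistent!)))
```

Only the outer map is transient here; the per-vertex vectors stay
persistent, which keeps the sketch simple.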

Robert

On Thu, Apr 12, 2012 at 06:22:34PM -0400, David Nolen wrote:
> How much memory do Python & Go consume when you do this? Are you giving the
> JVM enough memory?
> 
> On Thu, Apr 12, 2012 at 6:17 PM, László Török <ltoro...@gmail.com> wrote:
> 
> > Hi,
> >
> > I'm trying to figure out how to load a huge file that contains some 800k
> > pairs of integers (two integers per line), which represent the edges of a
> > directed graph.
> >
> > So if the ith line has x and y, there is an edge from vertex x to
> > vertex y in the graph.
> >
> > The goal is to load it into an array-of-arrays representation, where the
> > kth array contains all the nodes that the kth node has a directed edge
> > to.
> >
> > I've attempted multiple variants of with-open, reader, line-seq etc., but
> > almost always ended up with an OutOfMemoryError or something VERY slow.
> >
> > My latest attempt, which also does not work on the large input:
> >
> > (defn load-graph [input-f]
> >   (with-open [rdr (io/reader input-f)]
> >     (->> (line-seq rdr)
> >          (map (fn [row]
> >                 (let [[v1str v2str] (str/split row #"\s")]
> >                   [(Integer/parseInt v1str) (Integer/parseInt v2str)])))
> >          (reduce (fn [G [v1 v2]]
> >                    (if-let [vs (get G v1)]
> >                      (update-in G [v1] #(conj % v2))
> >                      (assoc G v1 [v2])))
> >                  {}))))
> >
> > I'm getting a bit frustrated, as there are Python and Go implementations
> > that load the graph in less than 5 seconds.
> >
> > What am I doing wrong?
> >
> > Thanks
> >
> > --
> > László Török
> >
> >  --
> > You received this message because you are subscribed to the Google
> > Groups "Clojure" group.
> > To post to this group, send email to clojure@googlegroups.com
> > Note that posts from new members are moderated - please be patient with
> > your first post.
> > To unsubscribe from this group, send email to
> > clojure+unsubscr...@googlegroups.com
> > For more options, visit this group at
> > http://groups.google.com/group/clojure?hl=en
> 
