In fact, it is enough to replace (drop-last sibs) with (remove seq?
sibs).
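For reference, this is how the two predicates that have come up in this
thread differ; switching from vector? to seq? only matters if the trailing
element can be a seq rather than a literal vector, which is an inference
from the messages below, not something stated outright:

(vector? [1 2])        ;=> true
(vector? '(1 2))       ;=> false
(seq? '(1 2))          ;=> true
(seq? (lazy-seq nil))  ;=> true  (a LazySeq is a seq)
(seq? [1 2])           ;=> false (a vector is seqable, but not itself a seq)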
On Feb 12, 9:54 pm, Marko Topolnik wrote:
> On Feb 12, 7:55 pm, Marko Topolnik wrote:
> How about replacing
> (drop-last sibs)
> with
> (remove vector? sibs)
> ?
This was slightly naive. We also need these changes:
In siblings:
:end-element [[(rest s)]]
In mktree:
(cons
(struct element (:name elem) (:attrs elem) (remove
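To picture the structure these two changes deal with, here is a made-up
miniature (not the lazy-xml source): the sibling seq of one nesting level
ends with a single vector wrapping the unconsumed remainder of the event
seq, which appears to be what the [[(rest s)]] line above returns, so a
predicate can split the real children from that continuation:

;; hypothetical values, for illustration only
(def sibs (list :child-1 :child-2 :child-3 [:remaining-events]))

(remove vector? sibs)  ;=> (:child-1 :child-2 :child-3)   the children
(last sibs)            ;=> [:remaining-events]            the continuation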
Also, the xpp-based parser is almost an order of magnitude slower than
the sax-based one. The only thing it lacks is a couple of type hints:
(defn- attrs [^XmlPullParser xpp]
(defn- ns-decs [^XmlPullParser xpp]
(let [step (fn [^XmlPullParser xpp]
These hints increase the performance from 400%
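For anyone following along, the win from a hint like ^XmlPullParser comes
from eliminating reflection on every parser call. A generic sketch (the
function names below are made up; only the ^XmlPullParser hint mirrors the
real change, and it assumes the XPP jar is on the classpath):

(import '(org.xmlpull.v1 XmlPullParser))
(set! *warn-on-reflection* true)

;; unhinted: emits a reflection warning, the method is looked up at runtime
(defn- attr-count-unhinted [xpp]
  (.getAttributeCount xpp))

;; hinted: compiles to a direct interface call
(defn- attr-count-hinted [^XmlPullParser xpp]
  (.getAttributeCount xpp))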
How about replacing
(drop-last sibs)
with
(remove vector? sibs)
?
remove will not access the next seq member in advance and the only
vector in sibs is the last element. I tried this change and it works
for the test code from the original post.
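The look-ahead difference is easy to see with a toy seq that reports each
element as it gets realized (a stand-alone illustration, not the lazy-xml
data):

;; an unchunked lazy seq that prints every element it realizes
(defn trace-seq [xs]
  (lazy-seq
    (when-let [s (seq xs)]
      (println "realizing" (first s))
      (cons (first s) (trace-seq (rest s))))))

;; drop-last is built as (map (fn [x _] x) s (drop 1 s)), so taking the
;; first element forces the second one as well:
(first (drop-last (trace-seq [:a :b :c [:tail]])))
;; realizing :a
;; realizing :b

;; remove only has to look at the element it is about to emit:
(first (remove vector? (trace-seq [:a :b :c [:tail]])))
;; realizing :a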
On Sat, Feb 12, 2011 at 4:16 AM, Marko Topolnik wrote:
> > Just guessing, but is it something to do with this (from the docstring
> > of parse-seq)?
>
> > "it will be run in a separate thread and be allowed to get
> > ahead by queue-size items, which defaults to maxint".
As I've figured it out, when there's XPP on the classpath, and I'm
using it, the
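Related to the docstring quoted above: the read-ahead can presumably be
bounded by passing an explicit queue-size instead of leaving it at maxint.
Assuming parse-seq's three-argument arity is (s startparse queue-size),
which the docstring implies but is an assumption here, that would look
like:

(require 'clojure.xml)
(use '[clojure.contrib.lazy-xml :only [parse-seq]])
(use '[clojure.java.io :only [reader]])

;; clojure.xml/startparse-sax is the stock SAX hook; 64 bounds how far the
;; producer thread may run ahead of the consumer.
(def events
  (parse-seq (reader "huge.xml") clojure.xml/startparse-sax 64))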
On Fri, Feb 11, 2011 at 2:35 PM, Chris Perkins wrote:
On Feb 11, 5:07 am, Marko Topolnik wrote:
> http://db.tt/iqTo1Q4
>
> This is a sample XML file with 1000 records -- enough to notice a
> significant delay when evaluating the code from the original post.
>
> Chouser, could you spare a second here? I've been looking and looking
> at mktree and sibl
I can confirm that the same thing is happening on my end as well. The
XML is parsed lazily:
user=> (time (let [root (parse-trim (reader "huge.xml"))]
               (-> root :content type)))
"Elapsed time: 45.57367 msecs"
clojure.lang.LazySeq
...but as soon as I try to do anything with the struct map for the
D
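The counterpart measurement, sketched here rather than copied from the
post, is the expression that actually walks the records; forcing the
:content seq is where the time goes, and with the unpatched mktree and
siblings it also pins every record in memory, hence the OutOfMemory
errors mentioned elsewhere in the thread:

(use '[clojure.contrib.lazy-xml :only [parse-trim]])
(use '[clojure.java.io :only [reader]])

;; counting the children forces the whole lazy tree
(time (let [root (parse-trim (reader "huge.xml"))]
        (count (:content root))))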
http://db.tt/iqTo1Q4
This is a sample XML file with 1000 records -- enough to notice a
significant delay when evaluating the code from the original post.
Chouser, could you spare a second here? I've been looking and looking
at mktree and siblings for two days now and can't for the life of me
find
Can you post a link to a (sanitized, if need be) sample file?
On Feb 11, 1:21 am, Marko Topolnik wrote:
Right now I'm working with a 300k-record file, but the code must scale
into the millions, and, as I mentioned, it is already spewing
OutOfMemory errors. Also, on a more abstract level, it's just not right
to thrash the memory of a concurrent server-side component for
absolutely no good reason.
--
On Thu, 10 Feb 2011 07:22:55 -0800 (PST)
Marko Topolnik wrote:
I am required to process a huge XML file with 300,000 records. The
structure is like this:
[XML sample stripped by the archive: a root element containing the 300,000 record elements]
Obviously, it is of key importance not to allocate memory for all the
records at once. If I do this:
(use ['clojure.contrib.lazy-xml :only ['pars
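A sketch of the setup that last line appears to be starting; the tag
names and the completion of the truncated use form are guesses, not the
original post, and as the rest of the thread shows this still exhausted
memory until siblings and mktree were patched:

;; guessed shape of the input (tag names are not from the original post):
;;   <records>
;;     <record> ... </record>
;;     <record> ... </record>
;;     ... 299,998 more ...
;;   </records>
(use '[clojure.contrib.lazy-xml :only [parse-trim]])
(use '[clojure.java.io :only [reader]])

(defn process-record
  "Stand-in for whatever per-record work is actually needed."
  [record]
  (println (:tag record)))

;; stream the records one at a time instead of holding them all
(with-open [r (reader "huge.xml")]
  (doseq [record (:content (parse-trim r))]
    (process-record record)))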