Hi Boris,

It seems to me that you have one global ref, 'year', with 8 agents
competing to update it, and you are also applying functions that
dereference it. That's a lot of contention, and I don't think it
models what you want. You might do better to keep a year per
agent and let them run asynchronously or, if chronology is
important, to send the agents their tasks one year increment at a time.
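
To make the two options concrete, here is a minimal sketch. It is not
taken from eden.clj: the names make-worker, step-year, simulate-year,
run-async and run-lockstep are all hypothetical, and the real per-year
work of the simulation is reduced to updating a counter.

```clojure
;; Hypothetical sketch, not from eden.clj: no shared 'year' ref at all.

(defn make-worker
  "Create an agent whose state carries its own year counter."
  [id]
  (agent {:id id :year 0}))

;; Option 1: a year per agent, advanced asynchronously.
(defn step-year
  "Agent action: advance this agent's private year by one.
   The real per-year simulation work would go here."
  [state]
  (update-in state [:year] inc))

(defn run-async
  "Queue n year-steps on every worker, then wait for all of them."
  [workers n]
  (doseq [w workers
          _ (range n)]
    (send w step-year))
  (apply await workers))

;; Option 2: chronology matters, so a single driver owns the year
;; and hands it to the agents as an argument, one year at a time.
(defn simulate-year
  "Agent action: do the work for the given year (here it is only recorded)."
  [state year]
  (assoc state :year year))

(defn run-lockstep
  "Dispatch years 1..n; all workers finish a year before the next starts."
  [workers n]
  (doseq [y (range 1 (inc n))]
    (doseq [w workers]
      (send w simulate-year y))
    (apply await workers)))
```

With something like (def workers (doall (map make-worker (range 8)))),
either (run-async workers 1000) or (run-lockstep workers 1000) advances
all 8 agents without any transaction on a shared ref, so there is
nothing for them to retry on.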


Regards,
Tim.


On Dec 23, 6:46 pm, bOR_ <boris.sch...@gmail.com> wrote:
> Seems fairly reproducible.
>
> With a population of 1,000 people:
>
> 126288   living:    939   infected:  933
>  ave VL:  5.5225080385852285
> pro alleles in population:  6    3.9313139960273147    (621 467 404
> 360 24 2)
> tap alleles in population:  4    2.7024232960074537    (830 718 317
> 13)
> mhc alleles in population:  5    3.8634863880746124    (706 490 288
> 280 114)
> "Elapsed time: 28014.93 msecs"
>
> With a population of 10,000 people:
> 12626   living:    9068   infected:  8980
>  ave VL:  5.234532293986656
> pro alleles in population:  22    4.068465156597464    (8164 2554 1903
> 1143 1022 717 712 483 422 391 274 160 84 74 9 9 5 3 3 2 1 1)
> tap alleles in population:  21    3.2913665666051255    (9429 2163
> 1495 1461 891 596 487 443 423 361 289 49 23 8 6 4 2 2 2 1 1)
> mhc alleles in population:  17    11.51415636214343    (2636 2010 1914
> 1834 1400 1322 1260 1234 1162 998 894 506 454 292 198 20 2)
> "Elapsed time: 984358.517 msecs"
>
> So it actually happens 10 times earlier with 10,000 people than with
> 1,000. Puzzling.
>
> On Dec 22, 3:33 pm, bOR_ <boris.sch...@gmail.com> wrote:
>
> > * So far it has happened both times that I ran the simulation for
> > more than 100k simulated years, so while this is reproducible, it does
> > take a number of hours to get there. I can see whether I can get the
> > effect faster with a smaller population or something.
>
> > * When I start the simulation, the memory usage is 2.4% of the
> > available memory (16 GB), and it is happily running on 8 Intel(R)
> > Xeon(R) X5482 CPUs @ 3.20GHz (from 'top').
>
> > * inc-year:
>
> > (defn inc-year
> >   [_]
> >   (dosync (commute year inc)))
>
> > * Whole source is here:
> > http://clojure.googlegroups.com/web/eden.clj?gsc=rQ4WoRYAAAB68Q78LH5o...
>
> > * gather does indeed scan all refs, but it is only called once every
> > 1000 years, and right after an 'await', so I figured everything should
> > have been free by then.
>
> > On Dec 22, 2:56 pm, Rich Hickey <richhic...@gmail.com> wrote:
>
> > > On Dec 22, 7:41 am, bOR_ <boris.sch...@gmail.com> wrote:
>
> > > > Hi all,
>
> > > > Long post, but it boils down to this: I'm running into a
> > > > "transaction failed after retry limit" error after running my
> > > > simulation for a couple of hours. I chatted briefly with fyuryu in
> > > > #clojure, and am now pasting some of the hopefully relevant
> > > > information into this post. I hope someone can shed some light.
> > > > fyuryu's recommendation was to use 'await-for' rather than 'await',
> > > > but I'm a bit worried that that is just a way to ignore some
> > > > underlying problem.
>
> > > > I still have the simulation online and in limbo (long live
> > > > emacs --daemon), so I can answer additional questions.
>
> > > > I'll paste part of the program, the output, the agent-errors and some
> > > > additional things I tried below.
>
> > > Generally, you can get retry limit failures when a long-running
> > > transaction contends for the same refs as short-running transactions.
> > > It is hard to see what is going on with your sim without all the
> > > source.
>
> > > How many cores?
> > > What is the memory utilization?
> > > Do you have any blocking calls anywhere?
> > > What does inc-year do?
>
> > > Calls like 'gather' in a dosync can cause congestion, as I presume it
> > > does a scan of all refs?
>
> > > > I started mucking with it a bit more and found that I can't change a
> > > > single ref. Everything seems to be locked. If I make 'death' do a
> > > > println each time it is tried, I see that it is indeed trying to apply
> > > > itself to ref 1 several thousand times.
>
> > > I don't like the sound of that. If you could create a reproducible
> > > test case I'll chase it down.
>
> > > Rich
