Re: transaction failed after retry limit

Timothy Pratley Tue, 23 Dec 2008 16:47:18 -0800

Hi Boris,

> * the global time ref might indeed be incorrect. I just half-
> understood how to make a global incrementing counter and implemented
> it in a way that seemed to work.


It is a fine globally incrementing counter. However...
You describe every agent as calculating a different year, but a global
year can only have one value. If you do want to calculate separate
years, you would pass each agent the year it should use for its
calculation (and it would have to pass that year on to its function
calls).
Having the counter updated within each agent creates a race between
the update and the deref.
The counter should either be fully updated before the calculation
starts, or different years should be passed to different agents. It
sounds like you wanted to do the latter.
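
A rough sketch of the "pass the year in" option, in case it helps. The
names (death-chance, step-host, run-year) and the death formula are
made up for illustration and are not taken from your code:

```clojure
;; Hypothetical names and rates throughout; only the shape matters.
(defn death-chance
  "Placeholder yearly death probability, not the formula from the literature."
  [age]
  (min 1.0 (* 0.01 age)))

(defn step-host
  "Pure function of host and year: the year arrives as an argument and is
  never read from shared state inside the agent action."
  [host year]
  (let [age (- year (:born host))]
    (if (and (:alive host) (< (rand) (death-chance age)))
      (assoc host :alive false :died year)
      host)))

(defn run-year
  "Send the same fixed year to every agent, then wait for all of them."
  [agents year]
  (doseq [a agents] (send a step-host year))
  (apply await agents))

(def hosts (mapv #(agent {:id % :born 0 :alive true}) (range 10)))
(run-year hosts 1)
```

Because every action receives the year as a plain value, no agent can
observe the counter half way through an update.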

In any case I maintain that doing multiple years in parallel is
incorrect. Consider a machine with 100 CPUs. Every time the 100th CPU
is the first to evaluate a location, it will kill the host there due
to age. Meanwhile births are occurring, but the population is not
representative of the year in which each birth occurs, so the birth
rate will be wrong (leading to more errors when the ages are evaluated
for death). The effect will be smaller for 8 CPUs, but it is still
there.


> * I wanted to avoid that 8 cpu's were all starting with trying to do a
> birth event at position 0, as that seemed to me would cause an
> unneccesary amount of collisions between different cpu's. So I shuffle
> the order of the positions to check once per year per cpu.

Fair enough, but that is essentially binning them randomly; explicit
binning would have less interaction.


> * I also wanted to avoid a bias in the number of living people
> reported. If birth always happened before death, then report (coming
> after 'birth' and 'death') would show a lower number of living hosts
> than that were actually on average alive. Therefore I shuffle the
> order of the events once per year per cpu.

If I were modeling it I would always do death then birth, and I would
have birth depend on the population. Having death before birth does
mean that the chance of mating just before death is ignored, but that
is a tiny consideration next to the mammoth issue of how good a model
birth-by-population is. In reality birth is much more complex: age...
sex... food... health... my simple model doesn't capture any of it. So
it's false accuracy to worry about birth before death. But leaving
that aside, you seem to be evaluating birth as a 50% probability at
empty locations, and it is not obvious to me why, unless you are
dealing with a saturated population. I would propose:
* an age-based death check on every living entity,
* a birth rate check based on total population,
* random infections/evolutions based on probabilities, applied only
  as needed.
[You need to keep the population from exploding by having some
downward factor in either birth or death relative to the population.
Is infection increasing the death rate enough to satisfy this?]
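
The ordering I am proposing could be sketched like this. Every rate
here is an invented placeholder, and the carrying-capacity damping is
just one assumed way of providing the downward factor; infection
raising the death rate might do the job instead:

```clojure
;; All constants and names below are assumptions for illustration only.
(def birth-rate 0.02)         ; births per living host per year
(def carrying-capacity 1000)  ; assumed cap that keeps the population bounded
(def infection-chance 0.05)   ; yearly chance for an uninfected host

(defn death-chance [age] (min 1.0 (* 0.01 age)))

(defn deaths
  "Step 1: every living host gets exactly one death check for `year`."
  [hosts year]
  (vec (remove #(< (rand) (death-chance (- year (:born %)))) hosts)))

(defn births
  "Step 2: the number of births depends on the total population, damped
  as the population approaches `carrying-capacity`."
  [hosts year]
  (let [n       (count hosts)
        damping (max 0.0 (- 1.0 (/ n (double carrying-capacity))))
        newborn (Math/round (* n birth-rate damping))]
    (into hosts (repeat newborn {:born year :infected false}))))

(defn infections
  "Step 3: each uninfected host is infected with a fixed assumed chance."
  [hosts]
  (mapv #(if (and (not (:infected %)) (< (rand) infection-chance))
           (assoc % :infected true)
           %)
        hosts))

(defn step-year
  "One simulated year: death, then birth, then infection."
  [hosts year]
  (-> hosts (deaths year) (births year) infections))
```

Note that the population here is just a vector of living hosts, so
there are no empty locations to evaluate.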

> Why don't I use full randomization (dotimes [i (* popsize num_events)]
> (random event, random location):
> * I need to keep track of time somehow, and thus that I need to know
> the number of events that happened per year. The doseq gives me that.
> * Normally, if you do a 1000x (random event, random location) type
> scheme, only ~666 positions get visited. 333 people therefore escape
> their yearly death-chance check. Next year, that is 111 people, next
> year that is 33 people.. it basically makes some people live longer
> than they are supposed to, and then the model is no longer following
> the death-chance formula that came from the literature, but a variant
> on it.

I agree that death checks should be applied to everyone, but birth,
infection and evolution are a different story.
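
As a side note on the numbers quoted above: with uniform random draws
with replacement, the expected number of distinct positions visited
works out to n * (1 - (1 - 1/n)^draws), which for 1000 draws over 1000
positions is closer to 632 than 666. The conclusion is the same either
way: a sizeable share of hosts skips its yearly check. A small sketch
(expected-visited is a name I made up):

```clojure
(defn expected-visited
  "Expected number of distinct positions hit by `draws` uniform random
  draws (with replacement) over `n` positions."
  [n draws]
  (* n (- 1.0 (Math/pow (- 1.0 (/ 1.0 n)) draws))))

(expected-visited 1000 1000) ; about 632, so roughly 368 positions are missed
```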


> Why don't I separate the population into 8 groups, rather than in the
> weird-for-humans separation of 8 years happening simultaneously.
> * I couldn't think of a bias in what would go wrong if I calculated 8
> years simultaneously. As long as I don't need information of what
> happened when within one of these 8-year bouts, I think I am fine. It
> seems to be equivalent to a single-processor simulation where my doseq
> [birth death infect ..etc] is just 8 times bigger,
> and somewhat randomized.

As your death and birth are year dependent, I would argue there is
definitely a bias (albeit randomly applied, so you probably wouldn't
notice with only 8 CPUs).


> * I think (but I am not sure) that an 8 years can handle it better if
> someone else decides to log onto this computer I'm running on and use
> up 2-3 processors.

Explicit partitioning of the work is always going to fail if someone
else uses some CPU, no matter how you dice it. That's one really
strong argument for using work queues that can be dynamically taken by
different threads (implicit balancing). Even though in theory you can
avoid overhead by explicitly partitioning the work, in reality it is
more likely that you end up losing out to something like this: i.e.
one CPU is not available and so the whole sim has to wait.
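
To illustrate what I mean by a work queue, here is a minimal sketch
using a plain java.util.concurrent queue and futures. run-queue is a
hypothetical helper, not something in Clojure itself, and items must
not be nil because the queue does not accept nulls:

```clojure
(import '(java.util.concurrent ConcurrentLinkedQueue))

(defn run-queue
  "Run `work-fn` on every item using `n-workers` threads that each pull
  the next item from a shared queue until it is empty. Results come back
  in completion order, not input order."
  [work-fn items n-workers]
  (let [queue   (ConcurrentLinkedQueue. (vec items))
        results (atom [])
        worker  (fn []
                  (loop []
                    ;; .poll returns nil once the queue is drained
                    (when-some [item (.poll queue)]
                      (swap! results conj (work-fn item))
                      (recur))))
        workers (doall (repeatedly n-workers #(future (worker))))]
    (doseq [w workers] @w)   ; block until every worker has finished
    @results))

(run-queue inc (range 100) 4)
```

A worker that gets starved of CPU simply takes fewer items, and the
others pick up the slack instead of waiting for it.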

So not only would modeling each location as an agent simplify the
model, it would probably perform better in a realistic computer.


> Why don't I generate a to-do list for every position and let god sort
> the mess out ;).
> * I saw that there were a number of options in how to change the
> single-threaded model I had to a multithreaded one 
> (see http://groups.google.com/group/clojure/browse_thread/thread/a4395b433...
> ), and the one I am using now happened to be
> an easy conversion.
> * This variant seems to be light on the number of threads. My
> population size is now a 1000 hosts, but I will probably run sims with
> 10,000 and 100,000 hosts as well. Not knowing that much about how
> threads work inside a machine, I worried about sending off a 100,000
> agents.

Given that the number of agents isn't an issue, as Chris described, I
really think this is the best approach.
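
On the worry about threads: an agent is not a thread. Actions
dispatched with send run on a fixed-size thread pool, so 100,000
agents just means 100,000 small queued actions. A toy sketch (the host
map and its :age key are invented for illustration):

```clojure
;; Hypothetical host representation: one agent per host.
(def n-hosts 100000)

(def host-agents
  (mapv #(agent {:id % :age 0}) (range n-hosts)))

;; Age every host by one year; the actions share the agent thread pool.
(doseq [a host-agents]
  (send a update :age inc))

;; Block until every dispatched action has run.
(apply await host-agents)
```

In a standalone script you would call (shutdown-agents) at the very
end so the JVM can exit.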


> Why do I :gather the way I do.
> * Rich also commented on this one. I'm not sure how to do it better.
> I'll look into it.

Assuming you were to make every location an agent, it would make sense
to drop the world of locations and instead keep only a collection of
living hosts, which can grow and shrink.
Then the only gathering you would actually be doing is in the
reporting (to look at infections). Another advantage of many agents?

Apologies for going so far off the original question, it's an
interesting simulation :)
Thanks for showing the code.


Regards,
Tim.


