On 6 oct, 23:18, puzzler <[EMAIL PROTECTED]> wrote:
> > I'm afraid you failed to give enough information about your
> > "intensive computation" (i.e.: what kind of computation, what input
> > data does it require, what does it output, is it mostly IO-bound or
> > CPU-bound or else, etc.), nor about the expected load. Depending on
> > this information, answers can range from "just call the function
> > directly" to "you'll obviously need a cluster" - which of course
> > makes a big difference.
>
> The computationally intensive function takes a string as an input, and
> generates a string of XML as the output.
You may want to consider JSON or YAML instead - a less verbose format
eats less memory (and/or requires less IO if you have to stream to
save on memory).
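To illustrate the verbosity difference, here's a quick sketch using only the stdlib (the solution payload is a made-up example, not your actual format):

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical solution payload - only here to compare relative sizes.
solution = {"steps": ["a", "b", "c"], "cost": 42}

as_json = json.dumps(solution)

# The same data as XML, built with ElementTree.
root = ET.Element("solution")
ET.SubElement(root, "cost").text = str(solution["cost"])
steps = ET.SubElement(root, "steps")
for s in solution["steps"]:
    ET.SubElement(steps, "step").text = s
as_xml = ET.tostring(root, encoding="unicode")

# The JSON form is noticeably shorter than the tag-pair-heavy XML.
print(len(as_json), len(as_xml))
```

The gap grows with the number of elements, since every XML node pays for an opening and a closing tag.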
> It does not rely on any
> additional data.
Ok, so you don't need access to the database to do your computation,
and you can easily split this computation from the server process.
> The input string represents a problem space to be
> searched, and the output is the solution, or an indication that no
> solution is possible. It's not really a "number-crunching" problem,
> but more like a breadth-first search of a large tree.
For which definition of "large"? And where does this tree live?
> So no I/O.
> Just CPU and memory intensive.
Memory-intensive can easily become "IO bound" - when your server
starts to swap. Believe me, that's the worst possible thing for a web
server.
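Since you mentioned a breadth-first search: the memory cost usually comes from the frontier queue, which can grow exponentially with depth. A minimal sketch (the successor function and toy problem are illustrative, not your actual search):

```python
from collections import deque

def bfs(start, is_goal, successors):
    """Breadth-first search. The frontier (queue) plus the visited set
    is what eats memory - it can grow exponentially with depth."""
    queue = deque([start])
    seen = {start}
    while queue:
        node = queue.popleft()
        if is_goal(node):
            return node
        for nxt in successors(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None  # search space exhausted: no solution

# Toy example: reach 10 from 1 using +1 and *2 moves.
found = bfs(1, lambda n: n == 10, lambda n: [n + 1, n * 2] if n < 10 else [])
print(found)  # 10
```

If the frontier turns out to be the problem, iterative-deepening search trades that memory for extra CPU time, which is often the better deal on a shared host.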
> I understand how picking the right data structures and algorithms
> impacts performance. I am predicting it will still take about 4-5
> minutes to generate the solution.
I don't know much about your background, so please forgive me if I
give you obvious advice, but perhaps you'd be better off *first*
coding the solution to your problem - possibly looking for help on
comp.lang.python. Then you'll have real facts, not estimations.
Especially if your computation happens to be memory-hungry, since the
best way to bring a server to its knees is to make it go swapping. So
in your case, you may want to look for space optimisation instead of
time optimisation.
> Once the solution is generated, it
> will be permanently stored in the database. Future queries to solve
> the same problem will first consult the database to see if it has
> already been solved.
You may want to store only metadata in the database, and keep the XML
(or JSON or YAML or whatever) on the file system. This would let you
serve the XML (etc.) as a 'static' file.
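A minimal sketch of that split - hash the problem string to a filename, check the file system before computing. All names and paths here are illustrative, not a Django API:

```python
import hashlib
import os

CACHE_DIR = "/tmp/solutions"  # illustrative; use a real data directory

def solution_path(problem):
    """Derive a stable, filesystem-safe filename from the problem string."""
    digest = hashlib.sha1(problem.encode("utf-8")).hexdigest()
    return os.path.join(CACHE_DIR, digest + ".xml")

def get_or_compute(problem, solve):
    """Return a cached result if present; otherwise compute and store it."""
    path = solution_path(problem)
    if os.path.exists(path):
        with open(path) as f:
            return f.read()
    result = solve(problem)  # the expensive part runs only on a cache miss
    os.makedirs(CACHE_DIR, exist_ok=True)
    with open(path, "w") as f:
        f.write(result)
    return result
```

The database row would then hold just the problem string (or its hash) and the path, and the web server can serve the file directly without touching Python at all.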
> I'd like to allow for the possiblity of, say,
> 100 users using this service at a time.
On shared hosting? 100 concurrent executions of a long (relative to an
average HTTP request's processing time) CPU- and memory-hungry
computation? Well, I'm not a sysadmin, but I'm a bit skeptical.
> I'd be using a Django webhosting service, specifically webfaction, so
> I'm somewhat restricted in terms of resources, and what can be
> installed. I have no idea if they have psyco as an option.
Psyco only works on x86 - if they use some other processor
architecture, then you're out of luck. But anyway, Psyco trades space
for speed, which might not actually solve your problem.
Seriously: first implement a module doing the computation, and only
the computation. Write it with a main entry point so you can use it
either as a module or as a script. Then start benchmarking, so you
really know how much resources (CPU, RAM, disk space, etc.) you'll
need for a single instance of the whole computation. Until this is
done, everything else is just blah.
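The usual pattern for that module-or-script dual use looks like this (`solve` is a placeholder for your real computation, not an existing API):

```python
#!/usr/bin/env python
"""solver.py -- the computation, and nothing but the computation."""
import sys
import time

def solve(problem):
    """Placeholder for the real search; returns the solution as a string."""
    return "<no-solution/>"

def main(argv):
    # Default input so the script also runs with no arguments.
    problem = argv[1] if len(argv) > 1 else "example-problem"
    start = time.time()
    result = solve(problem)
    print(result)
    # Crude timing on stderr - a first benchmarking data point.
    print("took %.2fs" % (time.time() - start), file=sys.stderr)
    return 0

if __name__ == "__main__":
    main(sys.argv)
```

Run it from the shell to benchmark (`time python solver.py "some problem"`), or `import solver` and call `solver.solve(...)` from the Django view later - same code, no duplication.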
My 2 cents...
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups
"Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at
http://groups.google.com/group/django-users?hl=en
-~----------~----~----~----~------~----~------~--~---