On Fri, 2007-06-08 at 14:01 +0000, David the Dude wrote:
> Hi,
> 
> I want to build a web-application using Django and before I get
> started I would like some opinions on the feasibility of it all.
> 
> Here is the general layout:
> 
> I have a huge remote Oracle database containing time series, huge =
> several billion entries and for that reason a local copy will be
> created on the server where the final application should run.
> 
> What the authorized user is presented with on the website will be the
> result of some serious preprocessing on the database entries followed
> by equally serious number crunching using algorithms implemented in
> Numpy and the like - time series analysis/machine learning essentially
> - say it could take up to 2 hours on a fast machine depending on the
> exact task.
> 
> Since the computation of said results is so expensive the user will
> only be allowed to fetch the pre-calculated results - as opposed to
> being able to initialize the task himself. This pre-calculation will
> take place whenever the remote (and thus the local) database is
> updated - for now, once a day.
> 
> My question is really: how much of the workload should be handled by
> the Django app and how much should be out-sourced? Would it be wise to
> have the huge database containing the actual raw data + the Django
> tables serve as the app's database? With the user interaction set-up I
> had in mind, i.e. "user only gets to see pre-calculated results", this
> is not really a requirement but I may have to extend the app to being
> able to handle custom tasks specified/initialized by the user. Also,
> the update-frequency of the remote/local dbs may increase.

One possible answer...

Any long-running or periodic processes are best done outside of Django.
They don't really fit into the request/response sort of model and it
doesn't sound like you are going to get much benefit at all from using
Django's ORM over the raw data. If it's already taking hours and using
billions of rows, adding in the overhead to convert things to model
instances might not be worth it. Of course, this is all "wild ass guess"
territory based on a healthy dose of intuition and your frequent use of
the word "huge". However, my WAGs and intuition aren't completely
unproven in this sort of area.
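
To make the "outside of Django" idea concrete, here is a minimal sketch of
a standalone batch script that a scheduler such as cron could run after
each database update. Everything named here is an assumption for
illustration: the raw_point and daily_result tables, the run_batch
function, SQLite standing in for the local database, and a trailing
moving average standing in for the real number crunching.

```python
import sqlite3
from datetime import date


def moving_average(values, window):
    """Stand-in for the real analysis: trailing mean over `window` points."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def run_batch(conn, run_date, window=3):
    """Read raw points, crunch them, and store one result row per series.

    Table and column names (raw_point, daily_result) are hypothetical.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_result ("
        "series TEXT, run_date TEXT, score REAL, "
        "PRIMARY KEY (series, run_date))"
    )
    # Materialise the series list first so the cursor is not reused
    # while the inner queries run.
    series_ids = [
        row[0]
        for row in conn.execute(
            "SELECT DISTINCT series FROM raw_point ORDER BY series"
        )
    ]
    for series in series_ids:
        values = [
            row[0]
            for row in conn.execute(
                "SELECT value FROM raw_point WHERE series = ? ORDER BY ts",
                (series,),
            )
        ]
        smoothed = moving_average(values, window)
        if not smoothed:
            continue  # not enough points for this series yet
        # Re-running the job for the same day overwrites the old result.
        conn.execute(
            "INSERT OR REPLACE INTO daily_result VALUES (?, ?, ?)",
            (series, run_date.isoformat(), smoothed[-1]),
        )
    conn.commit()
```

The web side then only ever reads daily_result, so a request never has
to wait for the computation.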

So I would be tempted to do the raw processing however you are doing it
now and then use Django as the way to present the pre-calculated
results. Note that you can build models that talk to an existing
database table (because you can customise column and table names) and
you don't have to extract every column from the database in the model.
So using Django's ORM as a way of putting a Python interface over *some*
(not necessarily even all) of the columns of an existing table is quite
possible and sometimes makes the information presentation easier.

Regards,
Malcolm


