> I'm developing a large webapp using django. One of my requirements is
> that it needs to be able to handle 10,000+ different entities or
> models that need to be associated with a user. A single user needs to
> be able to associate himself with any of the existing models and have
> one record per model. Each model will have an average of 30 fields
> (each needs to be searchable). There will be several hundred thousand
> and on some occasions millions of records per model, and we expect to
> have millions of users using the webapp.
>
> My questions are:
>
> 1) Do you recommend using a django model per entity or should I try a
> different approach?

Looking at things from a different perspective may help reframe the
problem in a more manageable context:

    from django.db.models import (Model, CharField, ForeignKey,
                                  ManyToManyField, Q)

    class Entity(Model):
        name = CharField(...)

    class Values(Model):
        entity = ForeignKey(Entity)
        value = CharField(...)

    class Person(Model):
        name = CharField(...)
        entity_values = ManyToManyField(Values)
        # other stuff

This allows you to set up an arbitrary number of entities, each with
its own number of allowed values. Thus, you might have Entities such
as "Manager" (with values "John Smith", "Jane Miller"), "Pet's Name"
(with values "Spot", "Fluffy", and "Rex"), "Favorite Breakfast Cereal"
(with values "Cheerios", "Oatmeal", and "Chocolate Frosted Sugar
Bombs"), etc. Entities and their values can be added arbitrarily, and
users can be associated with as many of them as you need.

The only caveat comes with searching. Until the query-set refactor
hits the trunk, you have to do some spiffy SQL extra() calls to do
things that would ordinarily be written something like

    p = Person.objects.filter(
        Q(entity_values__entity__name='Manager',
          entity_values__value='John Smith'),
        Q(entity_values__entity__name='Favorite Breakfast Cereal',
          entity_values__value='Chocolate Frosted Sugar Bombs')
        )
    # yes, the Q()'s are redundant, but it makes the
    # problem's intent clearer

to find people that have John Smith as their manager and Chocolate
Frosted Sugar Bombs as their favorite breakfast cereal. However,
because of the way the SQL is currently generated, this produces an
empty set: all the conditions are applied to the same joined row, so
it asks for an impossible condition in the join (that a single field,
"entity__name" or "value", hold multiple values at the same time).

I've posted my interim solution several times here on the ML (to use
an extra() call and an IN/EXISTS clause) if you want an example of how
to work around the problem in such a context[1].

I've worked on some large-scale "enterprise" applications[2] in my
life, and having 1000+ tables all associated with a given entity
generally indicates a design flaw.
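
To make the join problem and the EXISTS idea concrete, here is a minimal sketch in plain SQL, run through Python's sqlite3 module rather than the ORM. It is not the code from the threads linked below. The table names (entity, entity_value, person, person_value), the sample people, and both helper functions are hypothetical, chosen only to mirror the models above.

```python
import sqlite3

# Hypothetical schema mirroring the Entity / Values / Person models.
SCHEMA = """
CREATE TABLE entity (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE entity_value (
    id INTEGER PRIMARY KEY,
    entity_id INTEGER NOT NULL REFERENCES entity(id),
    value TEXT NOT NULL
);
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE person_value (
    person_id INTEGER NOT NULL REFERENCES person(id),
    value_id INTEGER NOT NULL REFERENCES entity_value(id)
);
"""


def build_db():
    """Create an in-memory database with made-up sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO entity VALUES (?, ?)", [
        (1, "Manager"), (2, "Favorite Breakfast Cereal")])
    conn.executemany("INSERT INTO entity_value VALUES (?, ?, ?)", [
        (1, 1, "John Smith"), (2, 1, "Jane Miller"),
        (3, 2, "Cheerios"), (4, 2, "Chocolate Frosted Sugar Bombs")])
    conn.executemany("INSERT INTO person VALUES (?, ?)", [
        (1, "Alice"), (2, "Bob"), (3, "Carol")])
    conn.executemany("INSERT INTO person_value VALUES (?, ?)", [
        (1, 1), (1, 4),   # Alice: John Smith, Sugar Bombs
        (2, 1), (2, 3),   # Bob: John Smith, Cheerios
        (3, 2), (3, 4),   # Carol: Jane Miller, Sugar Bombs
    ])
    return conn


def single_join(conn, criteria):
    """AND every (entity, value) pair against the SAME joined row.

    With more than one pair this is the impossible condition: one row
    cannot carry two different entity names at once.
    """
    where = " AND ".join(
        "(e.name = ? AND ev.value = ?)" for _ in criteria) or "1 = 1"
    params = [x for pair in criteria for x in pair]
    sql = f"""
        SELECT DISTINCT p.name FROM person p
        JOIN person_value pv ON pv.person_id = p.id
        JOIN entity_value ev ON ev.id = pv.value_id
        JOIN entity e ON e.id = ev.entity_id
        WHERE {where}
        ORDER BY p.name"""
    return [row[0] for row in conn.execute(sql, params)]


def exists_per_criterion(conn, criteria):
    """Give each (entity, value) pair its own correlated EXISTS subquery."""
    clause = """EXISTS (
        SELECT 1 FROM person_value pv
        JOIN entity_value ev ON ev.id = pv.value_id
        JOIN entity e ON e.id = ev.entity_id
        WHERE pv.person_id = p.id AND e.name = ? AND ev.value = ?)"""
    where = " AND ".join(clause for _ in criteria) or "1 = 1"
    params = [x for pair in criteria for x in pair]
    sql = f"SELECT p.name FROM person p WHERE {where} ORDER BY p.name"
    return [row[0] for row in conn.execute(sql, params)]
```

Calling single_join() with both the "Manager" and "Favorite Breakfast Cereal" criteria returns an empty list, while exists_per_criterion() with the same criteria returns only the person who has both values, because each subquery is free to match a different row. In Django, the same WHERE fragments are roughly what you would hand to extra(where=[...]).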

-tim

[1] http://groups.google.com/group/django-users/browse_thread/thread/dbf9068482849d7/d0de78597fa6b9f7#d0de78597fa6b9f7
    http://groups.google.com/group/django-users/browse_thread/thread/8e265aeb33f3ec32/5c169c88eef79409?#5c169c88eef79409
    http://groups.google.com/group/django-users/browse_thread/thread/9517fe61d1e8e20f/aab62f3a3e1f5ba0#aab62f3a3e1f5ba0
    http://groups.google.com/groups/search?q=exists&qt_s=Search&enc_author=f_5GNh4AAADtSWJxSFR4zrj8u9Z0bQb_U9DkLoOxide0N_XCIlgvOQ

[2] the system used by the PA Dept of Corrections, with several hundred
    tables for everything from inmate intake to tracking litigation to
    keeping tabs on the cable-TV privileges allotted to various inmates.