Hello Tim and Richard, sorry for the late reply, but I was offline for the last few days.
I asked the same question of the Brazilian Django users and got the same answer, "focus on your DBMS, Django is not going to be your problem," so I decided to take a better look at Postgres and do some tests BEFORE writing my Django project. I'll paste the first sketches I plan to test at the bottom of this message. Thanks a lot!

On 9/25/07, Tim Chase <[EMAIL PROTECTED]> wrote:
>
> > I'm developing a Django project that's going to handle big
> > sets of data and want to ask your advice. I have 10 internal
> > bureaus, each of them has a database of 1.5 million records,
> > and it really looks like it will keep growing in size on and
> > on. I intend to use Postgres.
> >
> > The question: what's the best way to handle and store this
> > data? I thought about breaking the app model into 10 smaller
> > ones (Bureau_1, Bureau_2, Bureau_3, etc.) because the main
> > reports are split by Bureau. Response time matters. What do
> > you think?
>
> I deal with fairly large datasets (my employer does cell-phone
> management, tending tens of thousands of phones for hundreds of
> companies, with historical statement detail for each phone, and
> about 2.8e6 records of call detail for those clients that
> require the three months' worth that we keep...and it's only
> growing).
>
> I can't say that splitting across multiple databases makes for
> very useful partitioning, and it forces you to design your
> application around performance. It also becomes a maintenance
> headache, as you have to touch each DB (or script it) when
> performing changes. Rather than just adding a column to a
> table, you have to spew your ALTER TABLE statement across each
> DB. It would also not be able to aggressively cache common
> tables (unless each DB is on its own machine, where that
> doesn't matter).
>
> Learning the ins and outs of PostgreSQL's EXPLAIN command can
> help you find bottlenecks (such as missing indexes). I'm afraid
> I haven't become adroit at this.
>
> Running VACUUM ANALYZE reclaims dead rows and refreshes the
> statistics the planner uses to optimize queries.
>
> I have had some performance problems with that call-detail
> table (with its 2.8e6 rows or so), but find that since it's
> indexed, as long as I pull from a joined table and only pull in
> the records I care about, it can be pretty snappy. It's mostly
> sluggish when I try to do operations across the whole table
> rather than a subset of it, but even then, it's not too bad.
>
> Fast disks (a RAID configuration helps) and loads of memory (as
> much as your server will hold, or at least a couple of gigs if
> you've got a super-server) will go a long way towards easing
> your data pains. Multiple processors can help too, but most
> notably after you've eased the I/O and memory bottlenecks.
>
> Just my observations from the field,
>
> -tim
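P.S. As promised, here is the first test I sketched out from Tim's advice above: a single model for all ten bureaus instead of Bureau_1 through Bureau_10, with an index on the bureau column so the per-bureau reports only touch the rows they need. This is just a rough sketch; "Register" and its field names are placeholders I made up, not a real schema.

    from django.db import models

    class Register(models.Model):
        # One table for all ten bureaus; the index on "bureau" lets
        # Postgres narrow a per-bureau report down to its own rows
        # instead of scanning all 15 million of them.
        bureau = models.IntegerField(db_index=True)
        created = models.DateField(db_index=True)
        description = models.CharField(max_length=200)

    # A report pulls in only the subset it cares about:
    rows = Register.objects.filter(bureau=3, created__year=2007)

This way a schema change is one ALTER TABLE instead of ten, and Postgres gets to cache one table's indexes instead of spreading its memory across ten copies.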
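And a little helper for learning EXPLAIN, run through Django's own database connection so I can test the exact SQL my app will issue. The table name below is just what Django would generate for the sketch above under an app named "myapp"; that's an assumption, not something from a real project.

    from django.db import connection

    def explain(sql, params=()):
        # EXPLAIN ANALYZE runs the query and prints the plan that
        # Postgres chose; a "Seq Scan" over a big table in the output
        # usually means an index is missing.
        cursor = connection.cursor()
        cursor.execute("EXPLAIN ANALYZE " + sql, params)
        for (line,) in cursor.fetchall():
            print(line)

    explain("SELECT * FROM myapp_register WHERE bureau = %s", [3])

I'll also run VACUUM ANALYZE from psql while loading the test data, per the tip above (VACUUM can't run inside a transaction block, so psql is the easy place for it).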

