No one is recommending multi-db because it's not done yet. I'm not sure I understand exactly what you mean by "creating new order and order_item data with keys to data in the warehouse" but that doesn't sound like a recommended approach. A data warehouse (in my opinion) should be read only and no other system should need to know about it. An important concept to keep in mind with a warehouse is that it is never the originator of the base data (it may create new computations of the data but it's always based on data from other systems). As such it is for reporting purposes only and no other system should treat it as the system of record.
If you need to link one order item to other data in the warehouse it should only either be done in the warehouse or between the systems outside of the warehouse. It sounds like you want to modify the order entry application to reference data in other systems and still display the data in the order entry app. If this is the case the only solution I would implement is between the two existing systems. As far as OLAP goes, it's not mutually exclusive from ER. It is simply a method of extracting multidimensional data. An ER model can most certainly be dimensional, you just need to integrate the proper timestamp structure. On Thu, Sep 24, 2009 at 4:35 PM, Tony Schmidt <tschm...@sacfoodcoop.com>wrote: > > Well, I guess the answer to the thread is looking like: > > - a data warehouse is preferable to multiple db connections and cross- > vendor joins > > But that's only because I haven't heard anyone voice support for the > multi-db idea. > > Now the question is: > > How should the warehouse be constructed in tandem with new > functionality and data entities? > > What I've read about data warehouses seems to limit them to read-only > OLAP, with special data modelling (namely dimensional, as opposed to > the ER modelling I'm used to). But in the example I give in my first > post, I'm talking about creating new order and order_item data with > keys to data in the warehouse. Should that data be stored in the > warehouse, as well, so I don't have to go back to multiple db > connections for my joins? If it co-exists with the read-only data, > should it all be modelled the same? (all ER or all dimensional) Or > can it be mixed? > > I would start a new thread, but whereas the multi-db issue related > directly to developing Django technology, I'm not sure the issue of > the warehouse design is an appropriate topic for this group (although > I'm also not sure where I should discuss it). > > On Sep 24, 2:36 am, Joshua Russo <josh.r.ru...@gmail.com> wrote: > > If he wants to discuss it here I can do that too. It was just a little > off > > topic, but not not too much I suppose. > > > > On Thu, Sep 24, 2009 at 7:41 AM, nausikaa <g.n.muel...@gmail.com> wrote: > > > > > Hi Joshua > > > > > Thanks for sharing your knowledge. > > > Too bad, I would've liked to read your suggestion to Tony as well. ; ) > > > I'm a just graduate and at work I'm working on something very similar > > > to what has been discussed here. > > > > > On Sep 23, 9:51 pm, Joshua Russo <josh.r.ru...@gmail.com> wrote: > > > > I have many years of database design experience and it sounds like > you > > > are > > > > getting lost in terminology. > > > > I would start out with a set of tables in the warehouse that mirror > the > > > > existing data. In these tables you can either wipe and reload each > time, > > > or > > > > create a mechanism to track each individual load. > > > > > > You can then simply write your application against these tables or if > you > > > > feel adventurous you can add a more unified set of tables with > linking > > > > foreign keys and such and have your process also update the > cleaner/more > > > > unified data set. This second part get's tricky and takes practice to > > > know > > > > you have a good structure. You will also have to track changes, > > > additions, > > > > and deletions. If you want some help or advice I would be happy to > take a > > > > gander at your structures to show you what I might do. Feel free to > email > > > me > > > > directly at josh.r.ru...@gmail.com. > > > > > > On Wed, Sep 23, 2009 at 6:35 PM, Tony Schmidt < > tschm...@sacfoodcoop.com > > > >wrote: > > > > > > > Thanks for your reply, Joshua. > > > > > > > Complexity is exactly what I'm trying to avoid - but after doing a > > > > > little research on data warehouses (I've never built one), that > seems > > > > > like a very complex route as well! > > > > > > > I would like to just come up with my "dream schema" and run update > > > > > scripts on a daily or event-driven basis, but there seem to be a > > > > > million concepts related to data warehousing: data marts, > dimensions, > > > > > star schemas, snowflakes, EAV tables, and so on. > > > > > > > It seems more straight forward if I could just connect directly to > the > > > > > operational systems. I wonder if all that theory is really > necessary > > > > > for a basic data warehouse... > > > > > > > On Sep 23, 12:09 pm, Joshua Russo <josh.r.ru...@gmail.com> wrote: > > > > > > The multi-db branch is just now in the process of being baked > into > > > the > > > > > core > > > > > > and from what I can tell it's not quite done yet. Even when it is > > > > > completely > > > > > > done I would recommend the data warehouse approach. I view the > > > multi-db > > > > > > functionality more as a last resort, where you really don't have > an > > > > > option > > > > > > to merge the data. It adds a lot of complexity so if you can do > > > without I > > > > > > would recommend an alternative. > > > > > > Just my 2 cents > > > > > > Josh > > > > > > > > On Wed, Sep 23, 2009 at 4:05 PM, Tony Schmidt < > > > tschm...@sacfoodcoop.com > > > > > >wrote: > > > > > > > > > Hi, Nausikaa. Thanks for your reply. > > > > > > > > > Unfortunately, the legacy systems must remain in place until > they > > > are > > > > > > > gradually (if ever) phased out. There's a whole bunch of > > > > > > > functionality that I don't want to have to recreate in those > > > systems > > > > > > > (POS, inventory/accounting, products, etc.). I just want to > build > > > new > > > > > > > functionality that accesses that data (like the example of an > order > > > > > > > entry form that creates new order/item data entities in a new > DB > > > with > > > > > > > keys to entities in the other DBs or to their versions in the > > > > > > > warehouse). Most of it should be read only. > > > > > > > > > I'm starting to hear that a data warehouse is the way to go - > but > > > then > > > > > > > there's the question of data warehouse vs. data mart, Inmon vs. > > > > > > > Kimball, and how I can get started building one in python. I'm > not > > > > > > > hearing many suggestions for the multi-db approach (which makes > me > > > > > > > wonder what the Django muli-db branch is for?) > > > > > > > > > On Sep 22, 2:01 am, nausikaa <g.n.muel...@gmail.com> wrote: > > > > > > > > Hi snfctech > > > > > > > > > > With warehouse I assume you mean keeping the datasources and > > > periodic > > > > > > > > transfer into a central db (the warehouse). > > > > > > > > Why not migrate all your datasources into e.g. a PostgreDQL > db? > > > > > > > > It is easy to write forms and implement logins/access rights > in > > > > > > > > django so that your non-technical users can read or edit the > > > > > > > > data. Besides you'd remove some (unnecessary) heterogenity > and > > > > > thereby > > > > > > > > complexity from your system. > > > > > > > > But since I don't know your system I might be missing the > point > > > > > > > > completely. > > > > > > > > > > Nausikaa > > > > > > > > > > On Sep 22, 3:10 am, snfctech <tschm...@sacfoodcoop.com> > wrote: > > > > > > > > > > > I understand that there is a Django branch being actively > > > worked on > > > > > > > > > for connections to multiple DB vendors, or that Django + > Elixir > > > may > > > > > be > > > > > > > > > a good option. But I'm wondering if building a single data > > > > > warehouse > > > > > > > > > may still be a better way to go? > > > > > > > > > > > Here's an example of some of the relations I'm going to > have to > > > > > build > > > > > > > > > for my project: > > > > > > > > > > > I've got order and order_item tables with their own data > and > > > > > relations > > > > > > > > > to members (Access DB), products (flat file) and employees > > > (MySQL). > > > > > > > > > > > I initially thought that the best way to manage this would > be > > > to > > > > > > > > > create a new DB for the order and order_item tables, and > then > > > > > create > > > > > > > > > cross-vendor joins in the ORM. But then I came across an > > > > > unexpected > > > > > > > > > advantage of having all the data in an updated warehouse - > my > > > semi- > > > > > > > > > technical staff could still use products like OOBase, that > are > > > > > limited > > > > > > > > > to a single vendor connection, to make reports and forms > based > > > on > > > > > the > > > > > > > > > warehouse data. > > > > > > > > > > > So now I'm wondering - are direct connections to multiple > > > databases > > > > > > > > > really the best way to go? Or are there more advantages to > > > > > building a > > > > > > > > > data warehouse (keeping in mind the complexities of the > data > > > > > > > > > replication, scripts for pushing and pulling data, etc.) > > > > > > > > > > > Thanks in advance for any tips. > > > --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to django-users@googlegroups.com To unsubscribe from this group, send email to django-users+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/django-users?hl=en -~----------~----~----~----~------~----~------~--~---