On 27 Apr, 05:01, "Carbon Man" <dar...@nowhere.com> wrote:
> I have a program that is generated from a generic process. Its job is to
> check to see whether records (replicated from another system) exist in a
> local table, and if they don't, to add them. I have 1 of these programs for
> every table in the database.
Some people have discussed whether you need one or many programs. In practice, you're going to need to execute many "tasks" to process all your tables, and potentially this could mean generating many SQL statements, as I will describe below.

[...]

> I am thinking that dynamically generating the programs to run might not be
> such a good idea. It would be a shame to drop it because the system needs to
> be generic and it runs from an XML file, so the resulting code could be
> pretty complex, and I am new to Python. The program did generate a pyc so it
> was able to compile.
> Thoughts anyone?

This problem sounds as if it could be solved adequately using only the database system, although I accept that sometimes some logic or additional processing could be done in Python. If I were to receive updates from a system, presumably as files, I'd want to bulk-copy them into the database and then perform the necessary inserts using SQL. If the raw updates were not directly compatible with the database, perhaps because data representations differ, I would want to process them before doing the bulk copy.

In other words, the workflow for a single table would look like this:

1. Obtain the update files.
2. Process the files to make the data compatible with the database.
3. Create temporary tables for the updates in the database.
4. Bulk-copy the files into the temporary tables (using COPY or LOAD DATA commands).
5. Do the necessary inserts (where you either use the non-standard INSERT OR UPDATE or an INSERT involving a join between the existing and temporary tables).
6. Drop the temporary tables (if that isn't done automatically).

Now, it's almost certain that the details of this workflow would vary from table to table and from update to update. Even in the case where you don't have to process the files, you still need to copy the files into the database and to work with a different table each time.
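To make steps 3-6 concrete, here is a minimal sketch using SQLite's in-memory database from the standard library. The table and column names are invented for illustration, and SQLite has no COPY or LOAD DATA, so executemany stands in for the bulk-load step; on PostgreSQL or MySQL you'd use the server-side bulk command instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Existing local table with one record already present.
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("INSERT INTO customers VALUES (1, 'Alice')")

# Step 3: a temporary table to hold the incoming update batch.
cur.execute("CREATE TEMP TABLE customers_upd (id INTEGER, name TEXT)")

# Step 4: bulk-load the (already processed) update rows.
updates = [(1, "Alice"), (2, "Bob"), (3, "Carol")]
cur.executemany("INSERT INTO customers_upd VALUES (?, ?)", updates)

# Step 5: insert only the rows that don't already exist locally,
# expressed here as an anti-join using NOT EXISTS.
cur.execute("""
    INSERT INTO customers (id, name)
    SELECT u.id, u.name
    FROM customers_upd AS u
    WHERE NOT EXISTS (SELECT 1 FROM customers AS c WHERE c.id = u.id)
""")
conn.commit()

# Step 6: the TEMP table is dropped automatically when the
# connection closes.
print(cur.execute("SELECT id, name FROM customers ORDER BY id").fetchall())
# → [(1, 'Alice'), (2, 'Bob'), (3, 'Carol')]
```

The existing row for Alice is left untouched; only the genuinely new records are inserted.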
This is where generating "something" for each table is unavoidable, and one solution might be to have a generic template and to substitute the table name and update filename into that template as you process each table and its associated data. Whether you actually generate an SQL command file for each table, or whether you have a program issue the commands directly, is a matter of preference.

Anyway, from what you've described, that's how I would approach this problem. Certainly, it's a lot more straightforward than dealing with object-relational abstractions in programs generated by other programs.

Paul
--
http://mail.python.org/mailman/listinfo/python-list
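As a postscript, the generic-template idea above might be sketched with string.Template from the standard library. The SQL (PostgreSQL-style COPY, a join on an assumed `id` key) and all names here are illustrative, not a prescription:

```python
from string import Template

# One generic per-table template; the table name and update filename
# are substituted in for each table being processed.
SQL_TEMPLATE = Template("""\
CREATE TEMP TABLE ${table}_upd AS SELECT * FROM ${table} WHERE 0 = 1;
COPY ${table}_upd FROM '${update_file}';
INSERT INTO ${table}
    SELECT u.* FROM ${table}_upd AS u
    LEFT JOIN ${table} AS t ON t.id = u.id
    WHERE t.id IS NULL;
""")

def render_sql(table, update_file):
    """Generate the SQL command text for one table's update batch."""
    return SQL_TEMPLATE.substitute(table=table, update_file=update_file)

print(render_sql("customers", "/tmp/customers.csv"))
```

You could write each rendered result to a .sql file and feed it to the database's command-line client, or hand the same strings to a DB-API cursor directly; the template is the same either way.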