Martin P. Hellwig wrote:
MacRules wrote:
<cut>
What I am looking for is this.
Oracle DB in data center 1 (LA, west coast)
MSSQL DB in data center 2 (DC, east coast)
So network bandwidth is an issue, I prefer to have gzip fist and
deliver the data.
If bandwidth is really an issue, you should send compressed delta's.
I need 2 python daemons or a web service here in the future.
I will enter the Oracle table name, user id and password.
So the task is dump out Oracle data (Linux) and insert that to MSSQL.
That is assuming the table is the same and the columns of the table have
the same type with the same restrictions and don't even get me started
about encoding differences.
I can try first with 1 daemon python. Take the Oracle data file, and
let the daemon connects to MSSQL (with pyodbc) and load the data in.
I think that you are underestimating your task, I would recommend to
'design' your application first using an UML like approach, whether you
need one, two or bazillion daemons should not be a design start but a
consequence of the design.
Anyway here is a sum up and some further pointers to take in regard for
your design:
- Can I do delta's? (if yes how do I calculate them)
- Are the tables comparable in design
Specifically if there is a limit on the character width, does Oracle
and MS-SQL think the same about newlines? (one versus two characters)
- How about encoding, are you sure it works out right?
- Is ATOMIC an issue
- Is latency an issue, that is how long may the tables be out of sync
- Can it be manual or is it preferably automatic
How about a trigger in the database that sets the sync going, if that
is too much burden how about a trigger that set a flag on the disk and a
scheduled job that reads that flag first
- Is security an issue, do I need an encrypted channel
In the past I've wrote a program that had the same starting point as you
had, over time it grew in design to be a datawarehouse push/pull central
server with all other databases as an agency. The only reason I wrote it
was because the more standard approach like business objects data
integrator was just way too expensive and oracles solutions didn't play
nice with PostgreSQL (though over time this issue seemed to be resolved).
You understand my issue clearly. For now, I only look for 1 time event,
I do not need push/pull yet.
For a good push/pull + compression + security + real time in mind, you
can sell it for decent money.
--
http://mail.python.org/mailman/listinfo/python-list