On Fri, Oct 16, 2009 at 2:31 PM, Carl Trachte <ctrac...@gmail.com> wrote:
> On 10/16/09, Ramdas S <ram...@gmail.com> wrote:
>> Has anyone worked on or seen any project which involves migrating
>> unstructured data, mostly text files, to a reasonably indexed database,
>> preferably written in Python or with Python APIs?
>> I am even OK if it's a commercial project.
>
> FWIW, when I worked in a Microsoft SQL environment, I used DTS with the
> win32com modules for SQL 7 or 2000, and SSIS with IronPython for later
> versions.
>
> It was usually a standard process of gluing together a bunch of data in
> a CSV file with Python, then automating the DTS or SSIS program to dump
> the data to a database table or series of tables.
>
> You could probably do something similar with MySQL or Postgres. The
> hard part was always writing the Python to do the situation-specific
> initial crunch of the data.
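The "situation-specific initial crunch" Carl mentions might look something
like the rough sketch below. It assumes blank-line-separated "key: value"
text records, and the file names and column list are made up; adapt the
parsing to whatever your raw files actually look like.

import csv
import glob

FIELDS = ["id", "name", "value"]  # assumed column set

def records(path):
    # Yield one dict per blank-line-separated "key: value" record.
    rec = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line:          # blank line ends a record
                if rec:
                    yield rec
                    rec = {}
                continue
            key, _, val = line.partition(":")
            rec[key.strip().lower()] = val.strip()
    if rec:
        yield rec

with open("crunched.csv", "w", newline="") as out:
    writer = csv.DictWriter(out, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for path in glob.glob("raw/*.txt"):
        writer.writerows(records(path))

Once crunched.csv exists, the bulk-load side (DTS/SSIS, Postgres COPY,
MySQL LOAD DATA INFILE) is mostly configuration rather than code.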
I believe what you are looking for is an ETL (extraction, transformation and
loading) application. It can be as simple as a couple of Python scripts,
especially if it is a one-off job. You can use web.py's sql module or
SQLAlchemy (more work) to generate the SQL statements if you don't like
writing them yourself; there is a rough sketch at the end of this mail.

If the data loading/cleaning/transformation has to happen on a regular basis,
you may want to investigate something like
http://www.pentaho.com/products/data_integration/. I have had fairly decent
success with the Pentaho Chef suite (link above) doing ETL for telco OLTP
data with PostgreSQL as the destination DB.

+PG
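P.S. A rough, untested sketch of the SQLAlchemy route, loading the crunched
CSV into a table. The table name, columns, and connection URL are all made
up; swap in your own schema and credentials.

import csv
from sqlalchemy import create_engine, MetaData, Table, Column, Integer, String

engine = create_engine("postgresql://user:pass@localhost/etl_demo")  # assumed URL
metadata = MetaData()

records = Table(
    "records", metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String(200)),
    Column("value", String(200)),
)
metadata.create_all(engine)  # creates the table if it does not exist

with open("crunched.csv", newline="") as fh, engine.begin() as conn:
    rows = [dict(r) for r in csv.DictReader(fh)]
    if rows:
        conn.execute(records.insert(), rows)  # SQLAlchemy emits the INSERTs

web.py's sql module works similarly for simple inserts; SQLAlchemy just gives
you more control over the schema and transactions.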