I have it working here and worked on it to improve the dictionary. It
needs more work in the id field area because they are converted to an
integer, also boolean fields which get represented as char(1) would
become string. My testing has been to take the database produced by my
model files and run it through this script to see what I get. This
script came from another person as a project for their own use and
Massimo asked if anyone would help test because he uses postgresql so
it has not had a large selection of databases pushed though it to
verify it works for all schemas. Basically a regular expression
pattern matcher pulls fields out of a line from mysqldump and for the
correct lines and for one of the fields does a dictionary lookup to
see what the correct web2py field type should be in order to emit the
model code.

I usually put a print name, d_type after line 89 so I can see when it
fails and go from there looking into the mysqldump file for the table
being processed. Any patches you come up with are welcome for
consideration.

Ron

On Oct 6, 4:56 pm, DJ <sebastianjaya...@gmail.com> wrote:
> The legacy database to Web2py conversion would be a great add-on. I
> get the following errors when I tried this script on a database with
> tables having primary key 'id' set to autoincrement.
>
> C:\Program Files (x86)\web2py\scripts>extract_mysql_models.py
> bio:b...@nrcf
> Traceback (most recent call last):
>   File "C:\Program Files (x86)\web2py\scripts
> \extract_mysql_models.py", line 106
> , in <module>
>     print mysql(m.group(3),m.group(1),m.group(2))
>   File "C:\Program Files (x86)\web2py\scripts
> \extract_mysql_models.py", line 82,
>  in mysql
>     table_name = re.search('CREATE TABLE .(\S+). \(',
> sql_lines[0]).group(1)
> AttributeError: 'NoneType' object has no attribute 'group'
>
> Has anyone figured out how to get this script working?
>
> Thanks,
> S
>
> On Sep 17, 10:25 am, ron_m <ron.mco...@gmail.com> wrote:
>
> > Program mysqldump produces a line like this for an id field
>
> >   `id` int(11) NOT NULL AUTO_INCREMENT,
>
> > and because there is no special case processing int becomes integer.
>
> > It looks like a possible solution would be to recognise AUTO_INCREMENT
> > and for that case over ride the int translation to 'integer' replacing
> > with 'id' instead.
>
> > I am running the database created by a set of model files through the
> > script to test the result, not exactly a legacy database but a round-
> > trip test is a good test too. In my case the id field should be
> > dropped because it is auto generated. That brings the question if the
> > field is named id and is AUTO_INCREMENT should it be dropped? If the
> > source is a legacy database then that might be okay because the legacy
> > use is consistent with the way web2py uses the id field.
>
> > Another one that would be nice is the varchar(len) could be translated
> > to 'string', length=len in the emitted Field constructor.
>
> > Another problem I see is a boolean in the model becomes a char(1) in
> > the db which becomes a string in the extract but maybe I am pushing
> > the envelope since my source is a model to start with. Would it be
> > better to recognisu a char(1) as a boolean? Possibly not.
>
> > At this point the script provides an initial cut at a model.py file
> > and then "some human intelligence" needs to be applied to finish the
> > result. Of course it would be nice to have nirvana which would be a
> > 100% accurate and complete translation with all possible Field()
> > parameters filled in for a given database which requires a lot more
> > work. I am willing to help out but my workload is high right now. Of
> > course it should not crash as it does for Johann right now.
>
> > If you would like to provide comments on what you think is most
> > important I can help out.
>
> > Ron
>
> > On Sep 17, 6:24 am, mdipierro <mdipie...@cs.depaul.edu> wrote:
>
> > > I see a problem here:
>
> > >     Field('id','integer'),
>
> > > should be
>
> > >     Field('id','id'),
>
> > > On Sep 9, 12:01 am, ron_m <ron.mco...@gmail.com> wrote:
>
> > > > On Sep 8, 1:30 pm, mdipierro <mdipie...@cs.depaul.edu> wrote:
>
> > > > > > So maybe tonight do you want me to go through the manual and find 
> > > > > > all
> > > > > > the missing datatypes and try to add them to the map?
>
> > > > > I would not stop you. ;-)
>
> > > > Here is a replacement data_type_map, it was shuffled a bit to put like
> > > > types together for easier maintenance. I didn't know what to do with
> > > > the YEAR type so it is commented out. I am not sure the decimal type
> > > > should be mapped to integer since the DAL accepts decimal(n,M) as a
> > > > field type.
>
> > > > data_type_map = dict(
> > > >         varchar = 'string',
> > > >         int = 'integer',
> > > >         integer = 'integer',
> > > >         tinyint = 'integer',
> > > >         smallint = 'integer',
> > > >         mediumint = 'integer',
> > > >         bigint = 'integer',
> > > >         float = 'double',
> > > >         double = 'double',
> > > >         char = 'string',
> > > >         decimal = 'integer',
> > > >         date = 'date',
> > > >         #year = 'date',
> > > >         time = 'time',
> > > >         timestamp = 'datetime',
> > > >         datetime = 'datetime',
> > > >         binary = 'blob',
> > > >         blob = 'blob',
> > > >         tinyblob = 'blob',
> > > >         mediumblob = 'blob',
> > > >         longblob = 'blob',
> > > >         text = 'text',
> > > >         tinytext = 'text',
> > > >         mediumtext = 'text',
> > > >         longtext = 'text',
> > > >         )
>
> > > > I also fixed the remaining problem in the line match re.search so line
> > > > 75 or line 87 after above dict is changed
>
> > > > 75c87
> > > > <                 hit = re.search('(\S+)\s+(\S+)( .*)?', line)
> > > > ---
>
> > > > >                 hit = re.search('(\S+)\s+(\S+)(,| )( .*)?', line)
>
> > > > this fixes the matching on lines like
>
> > > > `description` longtext,
>
> > > > The comma immediately after the type got included as part of the type
> > > > string match in error.
>
> > > > Here is the output for the auth_event table
>
> > > > legacy_db.define_table('auth_event',
> > > >     Field('id','integer'),
> > > >     Field('time_stamp','datetime'),
> > > >     Field('client_ip','string'),
> > > >     Field('user_id','integer'),
> > > >     Field('origin','string'),
> > > >     Field('description','text'),
> > > >     migrate=False)
>
> > > > The id field should not be printed since it will be added
> > > > automatically.
>
> > > > The varchar fields have a length in () e.g. varchar(64) which could be
> > > > used to add on a length=N value for the Field constructor. The same
> > > > holds true for int fields but I don't think there is a real use for
> > > > that.
>
> > > > There is no recognition of foreign keys, fixing this would probably be
> > > > a significant effort. Some human intervention required as it
> > > > stands. ;-)
>
> > > > It certainly is a great start to getting a model file for an existing
> > > >MySQLdatabase.
>
> > > > Ron
>
>

Reply via email to