Stephen Moretti <[EMAIL PROTECTED]> wrote on 01/05/2005 07:22:18 AM: > Kentucky Families wrote: > > >Very new to MySQL and this list. Thanks in advance for your > patience with elementary questions. I'm in process of converting a > current web site to a mysql database. As this database will > eventually contain millions of records, I want to set it up properly. > > > >The database will contain transcriptions of birth, marriage and > death records. For surname fields, I would like for my transcribers > to be able to enter multiple values for records that are difficult > to read. For instance, a surname that could be read Stotts or Statts > or Stutts. If I use a VARCHAR or TINYEXT field to enter these values > and I want to be able to retrieve all records where the surname > field contains the whole word Stotts, how would I enter these values: > > > >stotts or statts or stutts > >stotts,statts,stutts > >other? > > > > > You need to do an IN query. > > SELECT column,list,here > FROM tblBMD > WHERE Surname IN ('stotts','statts','stutts') > > http://dev.mysql.com/doc/mysql/en/Comparison_Operators.html#IDX1235 > > Regards > > Stephen >
Stephen, I don't think you understood the question. She wants to allow for multiple values within a single field, not to check several rows of a single-value field. She explained why very well. I can't make a solid recommendation but I would assume that a full-text index on your surname field is what you are after. If you separated each name possibility with a simple space, the FT indexer would pick up on the word breaks and index each name individually (assuming that the name is not on the "stop words" list and that it is at least 4 characters long). An only slightly more complex (but more normalized) design would use a separate table of surnames. Each surname has its own ID. Actually, you would need 2 new tables: a surnames table and a table to associate a surname to a record. The surnames table could look like: CREATE TABLE surname ( id int auto_increment , name varchar(150) not null , PRIMARY KEY (ID) , UNIQUE(name) ) and the table that would link your surnames list to the "record" table (birth record, death record, etc.): CREATE TABLE record_surname ( record_id int not null , surname_id int not null , confidence float , PRIMARY KEY(record_id, surname_id) ) This not only gives you the ability to store each surname individually, it helps with _partial_ name matches(!!), and it gives you the ability to assign a confidence level to what the guess is. For your example if a name looks like stotts, statts, or stutts the transcriptionist could weigh one as being more likely than the others. What you use to represent those levels are up to you (percents, scale 1-to-10, simple order,... it doesn't matter how you rank them but this lets you do it.) It would be very fast to do a surname search and if not found, stop there. If you do find a surname (or list of surnames that match a pattern) you could easily take those ids and query the other tables to retrieve whatever other details you may have. You may consider doing the same thing with "given names" (first names) as they could be just as hard to decipher from the older records. That way if you have Jeb or Job as a first name and Statt or Stott as a last name you could store all 4 combinations in a fairly compact and very searchable way. Just a suggestion, Shawn Green Database Administrator Unimin Corporation - Spruce Pine