Re: re.compile for names

2007-05-21 Thread John Machin
On 22/05/2007 12:02 AM, Paul McGuire wrote:
> On May 21, 8:46 am, brad <[EMAIL PROTECTED]> wrote:
>> The goal of the list is to have enough strings to identify files that
>> may contain the names of people. Missing a name in a file is unacceptable.
>> Seems to me the OP is looking for people-name

Re: re.compile for names

2007-05-21 Thread John Machin
On 22/05/2007 12:09 AM, brad wrote:
> Marc 'BlackJack' Rintsch wrote:
>
>> What about names with letters not in the ASCII range?
>
> Like Asian names? The names we encounter are spelled out in English...
> like Xu, Zu, Li-Cheng, Matsumoto, Wantanabee, etc.

"spelled out in English"? "English" h

Re: re.compile for names

2007-05-21 Thread John Machin
On 21/05/2007 11:46 PM, brad wrote:
> I am developing a list of 3 character strings like this:
>
> and
> bra
> cam
> dom
> emi
> mar
> smi
> ...
>
> The goal of the list is to have enough strings to identify files that
> may contain the names of people. Missing a name in a file is unacceptable.

Re: re.compile for names

2007-05-21 Thread brad
Marc 'BlackJack' Rintsch wrote:
> What about names with letters not in the ASCII range?

Like Asian names? The names we encounter are spelled out in English... like Xu, Zu, Li-Cheng, Matsumoto, Wantanabee, etc. So the ASCII approach would still work. I guess. My first thought was to spell out n

Re: re.compile for names

2007-05-21 Thread Paul McGuire
On May 21, 8:46 am, brad <[EMAIL PROTECTED]> wrote:
> I am developing a list of 3 character strings like this:
>
> and
> bra
> cam
> dom
> emi
> mar
> smi
> ...
>
> The goal of the list is to have enough strings to identify files that
> may contain the names of people. Missing a name in a file is u

Re: re.compile for names

2007-05-21 Thread Marc 'BlackJack' Rintsch
In <[EMAIL PROTECTED]>, brad wrote:
> I am developing a list of 3 character strings like this:
>
> and
> bra
> cam
> dom
> emi
> mar
> smi
> ...
>
> The goal of the list is to have enough strings to identify files that
> may contain the names of people. Missing a name in a file is unacceptable.
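
For readers following the thread, here is a minimal sketch of what compiling such a list into one pattern might look like. It is not the poster's code. It assumes the strings are matched case-insensitively at the start of a word, which the post does not specify, and the names PREFIXES, NAME_HINT and may_contain_name are made up for illustration.

```python
import re

# Hypothetical sample: only the strings shown in the post. The real list
# would be much longer.
PREFIXES = ["and", "bra", "cam", "dom", "emi", "mar", "smi"]

# Build one alternation and compile it once. \b anchors each prefix to the
# start of a word (an assumption; the post does not say where to match).
NAME_HINT = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in PREFIXES) + r")",
    re.IGNORECASE,
)


def may_contain_name(text):
    """Return True if any word in text starts with one of the prefixes."""
    return NAME_HINT.search(text) is not None
```

With this sketch, may_contain_name("Invoice for Brad Smith") is true, but so is may_contain_name("bread and butter"), because "and" is also an ordinary word. Short prefixes keep misses low at the price of many false positives, so a hit only marks a file as worth a closer look.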