Pierre Thomson wrote:
> Bowie Bailey wrote:
>>
>> Some of the medication spams are using an obnoxious html table
>> structure that makes the contents of each cell print vertically.
>>
>> For example:
>> <table>
>> <tr>
>> <td>a d g</td>
>> <td>b e h</td>
>> <td>c f i</td>
>> <td width=100%></td>
>> </tr>
>> </table>
>>
>> This results in:
>> a b c
>> d e f
>> g h i
>>
>> Has anyone else been having this problem? Any rules to catch
>> medication names in those types of tables?
>>
>
> Here's a simple rule I wrote a couple days ago:
>
> body PT_DRUG1 /([CVAXP] ){5}/
> describe PT_DRUG1 Drug names in table of 1-letter columns
> score PT_DRUG1 3.0
>
> It works for me, no FP's yet that I am aware of. There are also
> variants for 2-letter and 3-letter bits of the same drug names.
>

If anyone can formulate a regex to catch these letters in any order,
while avoiding a repeating sequence like "A A A A A ", it would make
this a safer rule.

Pierre
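
One possible way to approach the "any order, but not a repeating sequence" request is a backreference combined with a negative lookahead. The sketch below is only an illustration in Python, whose regex syntax matches Perl's for these constructs. The pattern names and the sample strings are made up for the example, and neither pattern has been tested against real spam or ham. Two variants are shown because "avoiding repeats" can be read two ways: no letter may be followed by the same letter, or the five letters must simply not all be identical.

```python
import re

# Letter set taken from the PT_DRUG1 rule quoted above.
# Variant 1: five "letter + space" units where no unit is immediately
# followed by the same letter + space. \1 is the most recently matched letter.
NO_ADJACENT_REPEAT = re.compile(r"(?:([CVAXP]) (?!\1 )){5}")

# Variant 2: five "letter + space" units that are not all the same letter.
# The lookahead rejects a first letter followed by four copies of itself.
NOT_ALL_SAME = re.compile(r"([CVAXP]) (?!(?:\1 ){4})(?:[CVAXP] ){4}")

# Hypothetical sample strings, not taken from real messages.
samples = [
    "V C X V A ",   # mixed letters
    "A A A A A ",   # the repeating sequence to avoid
    "V V C X A ",   # two names with the same initial next to each other
]

for text in samples:
    print(
        repr(text),
        "no-adjacent-repeat:", bool(NO_ADJACENT_REPEAT.search(text)),
        "not-all-same:", bool(NOT_ALL_SAME.search(text)),
    )

# In a SpamAssassin body rule the pattern would go between the slashes,
# for example (PT_DRUG1_MIXED is a made-up rule name):
#   body PT_DRUG1_MIXED /([CVAXP]) (?!(?:\1 ){4})(?:[CVAXP] ){4}/
```

Both variants reject "A A A A A ". The first is stricter and also skips "V V C X A ", so it would miss a table where two drug names with the same initial sit next to each other. The second accepts that case, but it also fires on a long run of one letter as soon as a different letter from the set follows it. Which trade-off gives fewer FPs would have to be checked against a real corpus.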