Loren Wilton wrote:

>At a guess the table obfuscation stuff will have to be handled after table
>removal, assuming the rendered text ends up looking like the visible text.
>(I haven't checked to see if it does.)  At that point I'd probably go with
>metas on number of different drugs or other key phrases, since probably the
>drug names aren't further munged.
>
Loren, after removal of the HTML tags, the text looks nothing like the
words to be caught.

These spams fragment the words into chunks of 1-4 characters, and then
interleave the chunks of several words.

Jim's example looks like this with the HTML tags removed:

VA
U
AG
 C
IS

Ll
M Vl
RA
lAL
 and many other

What they are doing is interleaving two table rows. I've inserted a
blank line above to delineate the two rows.

If you rearrange each row, you can read the following in a zig-zag fashion:
VA   U       AG    C       IS
   LI   M VI     RA  1AL

And put it together:

VALIUM VIAGRA C1ALIS
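
For anyone who wants to play with this, here is a rough Python sketch of the
zig-zag read. It is only an illustration, not SpamAssassin code: the function
name interleave_rows is made up, and the two chunk lists are typed in by hand
from the rearranged rows above, on the assumption that the table cells have
already been extracted in order.

```python
from itertools import zip_longest


def interleave_rows(top, bottom):
    """Alternate chunks from two table rows, starting with the top row.

    Hypothetical helper for illustration; assumes the chunks of each row
    have already been pulled out of the HTML in left-to-right order.
    """
    merged = []
    # zip_longest pads the shorter row with "" so no trailing chunk is lost.
    for upper, lower in zip_longest(top, bottom, fillvalue=""):
        merged.append(upper)
        merged.append(lower)
    return "".join(merged)


# Chunks copied by hand from the zig-zag layout above (an assumption about
# how the cells split; the leading space in " C" is significant).
top_row = ["VA", "U", "AG", " C", "IS"]
bottom_row = ["LI", "M VI", "RA", "1AL"]

decoded = interleave_rows(top_row, bottom_row)
print(decoded)
```

With the chunks as given, this prints the same string as the hand-assembled
line above. A real rule would still need to get at the per-row cell contents
before the tags are stripped, which is the hard part.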


