Greetings,
A small non-scientific sample of some SPAM subjects ... (and an actual serious question later in this message).
[incomprehensible SPAM]
Subject: ¡Ú±¹³»1À§ Á÷ÀåÀδ롤Ãâ ½ºÆä¼È·Ð 5000¸¸¿ø¿ø±îÁö ³â5~12% 100%½ÂÀÎ!
Subject: ¡á¡áÇö±ÝÀÌ¿À°¡´Â Ä«Áö³ë °í½ºÅé.Æ÷Ä¿¡á¡á[À̹ÌÁöº¸±âŬ¸¯] cvkuhfq
Subject: °í¼Óµµ·Î ¹× °¡¼Ó »ç°í¸¦ ¿¹¹æÇص帳´Ï´Ù »ç°í¹«!@ osouaoyoakx tokpf
Subject: ³×ºñ°ÔÀÌ¼Ç »çÀºÇà»ç ¼³¹® À̺¥Æ®(³×½ºÆÌ ½ºÀ®Æù)
Subject: ½Å¿ëÄ«µå »ç¿ëÀÚ´Â ¹«¹æ¹® ¹«¼·ù ´çÀÏ´ëÃâ 100% °¡´ÉÇÕ´Ï´Ù.
Subject: [¹«¹æ¹®]´©±¸³ª °¡´ÉÇÑ´ëÃâ(Ä«µå,Á÷ÀåÀδëÃâ) °ø½Ä ÃÖÀú¼ö¼ö·á ¾÷ü!!!
Subject: Áö±ÝÀÎÅͳݰ¡ÀÔÇϸé 6°³¿ù¹«·á+Çö±Ý6¸¸¿ø.»ï¼ºMP3.DVD.»ïõ¸®21´ÜÀÚÀü°Å.HPÄ®¶óÇÁ¸°ÅͰ¡ °øÂ¥!!! eocqxl lrmibl
Subject: Re: ocJfD2xLTANk6
Subject: ÃÖ°íÀÇ ³ëÆ®ºÏ 50%ÇÒÀÎ ÆÇ¸Å!
[Russian SPAM]
Subject: =?Windows-1251?B?7/Du5efkIOIg9uXt8vAg8fLu6+j2+w==?=
Subject: =?Windows-1251?B?8e/w4OLu9+3o6iDk6/8g7/Du4/Dl8fHg?=
From: =?Windows-1251?B?2ODt6O3gIMUuzy4=?= <[EMAIL PROTECTED]>
Subject: =?Windows-1251?B?4uX35fAg+PPy7uo=?=
From: =?Windows-1251?B?1+Xw7e7i4CDOLsEu?= <[EMAIL PROTECTED]>
[English not spoken here SPAM]
Subject: Re: Reday 2 Odrer olinne
Subject: willie illegitimacy
Subject: consanguineous alternate
[Makes me think of tricking musicians into performing a P D Q Bach work]
Subject: AWARD NOTIFICATION !!!
The incomprehensible SPAM appears to be Korean or Japanese, judging by the few times one of those messages was accidentally opened.
The Russians at least tag the character set in the headers ... and the Cyrillic lettering in the headers looks nice among the rest of the SPAM.
The English not spoken ... category sometimes supplies a little humor ... the second and third examples in that category were actually offers to refinance my house! ... At competitive rates -- lucky me.
Now for my serious questions ...
(1) Is there a simple rule to detect the incomprehensible category? Hint: for the most part, those letters have code values of 128 or greater, i.e. outside the 7-bit ASCII range.
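To make the hint concrete, here is a rough Python sketch of the kind of test I have in mind. The function names and the 30% threshold are my own guesses, not anything taken from an existing filter, and it assumes the script can get at the raw, undecoded bytes of the Subject header.

```python
def high_bit_ratio(raw_subject: bytes) -> float:
    """Return the fraction of bytes whose value is 128 or greater."""
    if not raw_subject:
        return 0.0
    high = sum(1 for b in raw_subject if b >= 128)
    return high / len(raw_subject)


def looks_incomprehensible(raw_subject: bytes, threshold: float = 0.3) -> bool:
    """Flag a subject when many of its raw bytes are non-ASCII.

    The 0.3 threshold is an arbitrary assumption and would need tuning
    against real mail.
    """
    return high_bit_ratio(raw_subject) >= threshold
```

A subject that is mostly 8-bit bytes would be flagged, while a plain ASCII one such as "AWARD NOTIFICATION !!!" would not. The obvious drawback is that legitimate mail in any non-English language sent as raw 8-bit text would be flagged as well.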
(2) In the same line of thinking, is there a way for the scripts to detect the character set when one is specified in the headers? IOW, could someone code a filter rule that tests for Russian?
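For the tagged headers, something along these lines might do. It is only a sketch using Python's standard email.header.decode_header; the helper names and the list of charsets I treat as "Russian" are my own assumptions and are surely incomplete.

```python
from email.header import decode_header

# Assumed list of charset labels commonly used for Cyrillic text.
CYRILLIC_CHARSETS = {"windows-1251", "cp1251", "koi8-r", "koi8-u", "iso-8859-5"}


def declared_charsets(header_value: str) -> set:
    """Collect the charsets named in the header's =?charset?B?...?= words."""
    return {
        charset.lower()
        for _, charset in decode_header(header_value)
        if charset  # charset is None for plain, untagged text
    }


def claims_cyrillic(header_value: str) -> bool:
    """True if any encoded word in the header declares a Cyrillic charset."""
    return bool(declared_charsets(header_value) & CYRILLIC_CHARSETS)
```

Fed one of the Russian subjects above, claims_cyrillic() should return True, and False for an untagged ASCII subject. Note that it only checks what the header claims about itself, so it does nothing for the untagged 8-bit messages in the first category.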
Thanks for listening ...
Martin