I could not create multi-byte encoded filenames except with the UTF-8 encoding. I tried setting the default language to Chinese, but could not make Outlook produce anything but UTF-8 encoded filenames. This is why I asked for help from anyone with an East Asian code page.
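Lacking a real sample, test headers can be synthesized. The following is a minimal Python sketch (not part of Qmail-Scanner, which is Perl) that builds RFC 2047 base64 encoded-words for made-up filenames in a few East Asian charsets. The function name encoded_word, the sample filenames, and the charset labels shift_jis, big5 and euc-kr are all assumptions for illustration; real mail clients may use other labels and will split long names into several encoded-words.

```python
import base64


def encoded_word(filename, charset):
    """Build one RFC 2047 'B' (base64) encoded-word for filename.

    Hypothetical helper for generating test data only. It does not
    split long names into multiple encoded-words as real clients do.
    """
    raw = filename.encode(charset)
    return "=?{}?B?{}?=".format(charset, base64.b64encode(raw).decode("ascii"))


# Made-up sample names; each must be representable in its charset.
SAMPLES = [
    ("\u05d1\u05d3\u05d9\u05e7\u05d4.wmv", "windows-1255"),  # Hebrew, single-byte
    ("テスト.wmv", "shift_jis"),                              # Japanese, multi-byte
    ("測試.wmv", "big5"),                                     # Chinese, multi-byte
    ("시험.wmv", "euc-kr"),                                   # Korean, multi-byte
]

if __name__ == "__main__":
    for name, charset in SAMPLES:
        raw = name.encode(charset)
        # Show the header value and whether the raw bytes still end in ".wmv".
        print(charset, encoded_word(name, charset), raw.endswith(b".wmv"))
```

The Hebrew sample yields the same windows-1255 value that appears in the quoted original message, which gives some confidence that the synthetic East Asian values have the right shape. They are still no substitute for headers captured from a real client.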
With UTF-8, checking the "raw" base64-decoded bytes still matches the extension, since the English ASCII letters are represented by single-byte characters. However, I am not certain that the only character sets used for encoding filenames are single-byte ones or UTF-8; I just couldn't make Outlook use another multi-byte character set.

I am not familiar with multilingual support on Unix systems that is compatible with simple text editors. It is likely that they simply do not support multi-byte encodings.

It does not seem necessary, at this time, to invest the effort to allow filtering of general filenames (not extensions) in local character sets. It seems to be a lot of work, not least in editing quarantine-attachments.txt in the correct fashion; if the file is to stay in ASCII, some kind of escaping would have to be added.

By the way, I have noticed that long filenames are split into several base64-encoded segments. Does the forthcoming "fix" handle long filenames correctly?

Maybe it's time to set up a CVS or beta version of Qmail Scanner for download, so that the people most bothered by a specific added feature or fix can test it in their environment.

For instance, a long filename which contains English, Russian, Hebrew, Arabic and Greek in UTF-8:

------=_NextPart_000_0007_01C53894.C9F9E540
Content-Type: image/bmp;
 name="=?utf-8?B?zqXOkc6jzqPOn19IRUxMT1/Xqdec15XXnV/Qn9Cg0JjQktCV0KJf2YXYsdiu2Kg=?=
 =?utf-8?B?2KdfzqXOkc6jzqPOn19IRUxMT1/Xqdec15XXnV/Qn9Cg0JjQktCV0KJf2YXYsQ==?=
 =?utf-8?B?2K7YqNinX86lzpHOo86jzp9fSEVMTE9f16nXnNeV151f0J/QoNCY0JLQldCiXw==?=
 =?utf-8?B?2YXYsdiu2KjYp1/Opc6RzqPOo86fX0hFTExPX9ep15zXldedX9Cf0KDQmNCS0JU=?=
 =?utf-8?B?0KJf2YXYsdiu2KjYpy5ibXA=?="
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="=?utf-8?B?zqXOkc6jzqPOn19IRUxMT1/Xqdec15XXnV/Qn9Cg0JjQktCV0KJf2YXYsdiu2Kg=?=
 =?utf-8?B?2KdfzqXOkc6jzqPOn19IRUxMT1/Xqdec15XXnV/Qn9Cg0JjQktCV0KJf2YXYsQ==?=
 =?utf-8?B?2K7YqNinX86lzpHOo86jzp9fSEVMTE9f16nXnNeV151f0J/QoNCY0JLQldCiXw==?=
 =?utf-8?B?2YXYsdiu2KjYp1/Opc6RzqPOo86fX0hFTExPX9ep15zXldedX9Cf0KDQmNCS0JU=?=
 =?utf-8?B?0KJf2YXYsdiu2KjYpy5ibXA=?="

------=_NextPart_000_0007_01C53894.C9F9E540
Content-Type: image/bmp;
 name="=?utf-8?B?zqXOkc6jzqPOn19IRUxMT1/Xqdec15XXnV/Qn9Cg0JjQktCV0KJf2YXYsdiu2Kg=?=
 =?utf-8?B?2KcuYm1w?="
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="=?utf-8?B?zqXOkc6jzqPOn19IRUxMT1/Xqdec15XXnV/Qn9Cg0JjQktCV0KJf2YXYsdiu2Kg=?=
 =?utf-8?B?2KcuYm1w?="

------=_NextPart_000_0007_01C53894.C9F9E540--

Regards,
Moti

-----Original Message-----
From: Jason Haar [mailto:[EMAIL PROTECTED]
Sent: Saturday, April 02, 2005 23:29
To: qmail-scanner-general@lists.sourceforge.net
Cc: Moti
Subject: Re: Anyone With Far Eastern Code Page on this list? Windows encoded filenames and perlscanner extention filtering

Moti wrote:

>Thanks for the reply, I've now read that discussion.
>
>First:
>
>If anyone with a Far Eastern Code Page (Japanese, Chinese, Korean) is on this
>list, please send an example of how the attachment filenames in your local
>code page are sent in MIME.
>It's for your own good :)
>
>It should look something like this (specifically the name and the filename
>fields):
>
>------=_NextPart_000_0032_01C53787.386D0A70
>Content-Type: video/x-ms-wmv;
> name="=?windows-1255?B?4ePp9+Qud212?="
>Content-Transfer-Encoding: 7bit
>Content-Disposition: attachment;
> filename="=?windows-1255?B?4ePp9+Qud212?="
>

I have already "fixed" this problem in the next release by calling MIME::Base64 - but your comments about issues regarding multi-byte encoding are not in my area of expertise, so I'm not sure if that changes anything.

>The extension is always in plain 7-bit ASCII; there are no specific "local
>extensions".
>

Great - so that should mean if we simply "raw" base64 decode the filenames, the extensions should always be trustable?

>In case of multi-byte encoding there is a need to keep only the ASCII
>characters and run perlscan on them, i.e. in some way discard or replace all
>the non-ASCII characters. This seems to be much more complicated. I am not sure
>there is a package for Perl to work with Windows Code Pages.
>

Yuck - I don't like the smell of that!

Can you create some empty files for me with the most exotic multi-byte filenames you can come up with, as examples, along with how you would write those same filenames within quarantine-attachments.txt using vi/vim? I want to know how the choice of Code Page would affect the filename (when you say "Code Page", I assume you are referring to charset?).

Extensions are one thing, but we should also ensure that if you want to block specific filenames, the quarantine-attachments.txt file is capable of handling such filenames too.

How do Unix systems (and editors specifically) handle multi-byte chars? I mean, if I am on a Linux box set for UTF8 (via LANG and LOCALE settings) and say in quarantine-attachments.txt to block:

יקה.wmv<TAB>0<TAB>yucky Windows movie

and someone sends in that filename, but encoded according to the "windows-1255" (Hebrew?) charset, will MIME::Base64 decode that base64-encoded string back to the same "8bit" string referred to within quarantine-attachments.txt (i.e. so they match)?

I mean, I also assume there's a different (older) non-UTF ISO charset for Hebrew that you could have been using instead of UTF8 on your Unix box - would the string be different from the UTF8 version?

This locale thing gets complicated REALLY quickly. :-( [esp for an ignorant English speaker like myself]

--
Cheers

Jason Haar
Information Security Manager, Trimble Navigation Ltd.
Phone: +64 3 9635 377 Fax: +64 3 9635 417
PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1
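
On the question of whether a decoded filename would match an entry in quarantine-attachments.txt, here is a minimal Python sketch. It assumes the rule file is stored as UTF-8, and it uses Python's email.header.decode_header rather than the Perl MIME::Base64 call discussed above. decode_parts, HEBREW and LONG_NAME are hypothetical names; LONG_NAME is the shorter two-segment example from this thread with the mail client's line wrap removed. The sketch suggests that the raw bytes of a windows-1255 name keep their ASCII extension but differ from the UTF-8 bytes of the same name, so a full-filename comparison would need both sides converted to a common form first.

```python
from email.header import decode_header


def decode_parts(header_value):
    """Return (raw_bytes, text) for a name/filename parameter value.

    raw_bytes is the plain base64-decoded payload of every encoded-word,
    concatenated. text is the same payload decoded with the charset each
    encoded-word declares. Hypothetical helper, for illustration only.
    """
    raw, text = b"", ""
    for data, charset in decode_header(header_value):
        if isinstance(data, str):  # value contained no encoded-words
            data = data.encode("ascii", "replace")
        raw += data
        text += data.decode(charset or "ascii", "replace")
    return raw, text


# The windows-1255 value quoted in the thread.
HEBREW = "=?windows-1255?B?4ePp9+Qud212?="

# The two-segment UTF-8 value from the thread, line wrap removed.
LONG_NAME = (
    "=?utf-8?B?zqXOkc6jzqPOn19IRUxMT1/Xqdec15XXnV/Qn9Cg0JjQktCV0KJf2YXYsdiu2Kg=?= "
    "=?utf-8?B?2KcuYm1w?="
)

if __name__ == "__main__":
    raw, text = decode_parts(HEBREW)
    print(raw.endswith(b".wmv"))        # extension check on raw bytes
    print(raw == text.encode("utf-8"))  # raw cp1255 bytes vs UTF-8 bytes

    raw_long, text_long = decode_parts(LONG_NAME)
    print(text_long.endswith(".bmp"))   # extension check after joining segments
```

Each encoded-word is decoded on its own and the results are concatenated, which is one way the multi-segment long filenames could be handled. A declared charset that the decoder does not know would raise an error here; a real filter would have to decide how to treat that case.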