I could not create multi-byte encoded filenames, except with the UTF8
encoding.
I tried setting the default language to Chinese, but was unable to make
Outlook produce anything but UTF8-encoded filenames.
This is why I asked for help from anyone with an East Asian code
page.

With the UTF8 encoding, checking the "RAW" base64 decoding still matches the
extension,
since the English ASCII letters are represented by single-byte characters.
However, I am not certain that the only character sets used for encoding
filenames are single-byte ones or UTF8; I just couldn't make Outlook use
another multi-byte character set.
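
As a rough illustration of the "RAW" check described above, here is a
minimal sketch in Python (used only for illustration; Qmail Scanner itself
is Perl, and raw_extension is a made-up helper name, not part of it). It
base64-decodes a filename payload without any charset handling and looks at
the bytes after the last ASCII dot. The two payloads are taken from the
samples in this thread.

```python
import base64

def raw_extension(b64_payload: bytes) -> bytes:
    """Base64-decode a filename payload and return the bytes after the last dot."""
    raw = base64.b64decode(b64_payload)
    # Split on the ASCII dot only; no charset decoding is attempted.
    _, dot, ext = raw.rpartition(b".")
    return ext.lower() if dot else b""

# Payloads copied from the MIME samples in this thread.
utf8_ext = raw_extension(b"2KcuYm1w")        # last segment of the UTF8 sample
cp1255_ext = raw_extension(b"4ePp9+Qud212")  # the windows-1255 sample

print(utf8_ext)    # b'bmp'
print(cp1255_ext)  # b'wmv'
```

This works for UTF8 because every byte of a multi-byte UTF8 character is
0x80 or above, so an ASCII dot or letter can never be part of one. That
guarantee is specific to UTF8 and single-byte code pages; it is not
established here for other multi-byte character sets.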

I am not familiar with multilingual support on Unix systems that is
compatible with simple text editors. It is likely that they simply do not
support multi-byte encodings.

It does not seem necessary, at this time, to invest the effort to allow
filtering of general filenames (not extensions) in local character sets. It
seems to be a lot of work, not least in editing
quarantine-attachments.txt in the correct fashion. It seems that if the file
is to stay in ASCII, some type of escaping would have to be added.
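
To make the escaping idea concrete, here is a sketch under stated
assumptions, in Python for illustration only. It assumes a hypothetical
convention in which quarantine-attachments.txt stays pure ASCII and stores
non-ASCII names as \uXXXX escapes, and that incoming filenames are decoded to
Unicode using the charset declared in the MIME header before comparison.
Neither the convention nor the helper names (to_ascii_rule, from_ascii_rule,
decode_filename) exist in Qmail Scanner; the base64 payload is the
windows-1255 sample quoted further down.

```python
import base64

def to_ascii_rule(name: str) -> str:
    """Escape a Unicode filename so it can be stored in a pure-ASCII file."""
    return name.encode("unicode_escape").decode("ascii")

def from_ascii_rule(rule: str) -> str:
    """Reverse to_ascii_rule()."""
    return rule.encode("ascii").decode("unicode_escape")

def decode_filename(charset: str, b64_payload: str) -> str:
    """Decode one base64 payload using the charset declared in the header."""
    return base64.b64decode(b64_payload).decode(charset)

# Five Hebrew letters followed by ".wmv", written as escapes to keep this ASCII.
name = "\u05d1\u05d3\u05d9\u05e7\u05d4.wmv"
rule = to_ascii_rule(name)

raw = base64.b64decode("4ePp9+Qud212")
incoming = decode_filename("windows-1255", "4ePp9+Qud212")

print(rule)                               # the ASCII-only form for the rules file
print(raw == name.encode("utf-8"))        # False: raw bytes differ between charsets
print(from_ascii_rule(rule) == incoming)  # True: equal once both are Unicode
```

The point of the sketch is only that a byte-for-byte comparison fails across
charsets, while a comparison after decoding both sides succeeds. Whether
such escaping is worth the editing effort is a separate question.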

By the way, I have noticed that long filenames are split into several
base64-encoded segments.
Does the forthcoming "fix" correctly handle long filenames?
Maybe it's time to make a CVS or Beta version of Qmail Scanner available for
download, so that the people most bothered by a specific added feature or fix
can test it in their environment.

For instance, here is a long filename which contains English, Russian, Hebrew,
Arabic and Greek in UTF8, followed by a shorter one:

------=_NextPart_000_0007_01C53894.C9F9E540
Content-Type: image/bmp;
        name="=?utf-8?B?zqXOkc6jzqPOn19IRUxMT1/Xqdec15XXnV/Qn9Cg0JjQktCV0KJf2YXYsdiu2Kg=?=
        =?utf-8?B?2KdfzqXOkc6jzqPOn19IRUxMT1/Xqdec15XXnV/Qn9Cg0JjQktCV0KJf2YXYsQ==?=
        =?utf-8?B?2K7YqNinX86lzpHOo86jzp9fSEVMTE9f16nXnNeV151f0J/QoNCY0JLQldCiXw==?=
        =?utf-8?B?2YXYsdiu2KjYp1/Opc6RzqPOo86fX0hFTExPX9ep15zXldedX9Cf0KDQmNCS0JU=?=
        =?utf-8?B?0KJf2YXYsdiu2KjYpy5ibXA=?="
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
        filename="=?utf-8?B?zqXOkc6jzqPOn19IRUxMT1/Xqdec15XXnV/Qn9Cg0JjQktCV0KJf2YXYsdiu2Kg=?=
        =?utf-8?B?2KdfzqXOkc6jzqPOn19IRUxMT1/Xqdec15XXnV/Qn9Cg0JjQktCV0KJf2YXYsQ==?=
        =?utf-8?B?2K7YqNinX86lzpHOo86jzp9fSEVMTE9f16nXnNeV151f0J/QoNCY0JLQldCiXw==?=
        =?utf-8?B?2YXYsdiu2KjYp1/Opc6RzqPOo86fX0hFTExPX9ep15zXldedX9Cf0KDQmNCS0JU=?=
        =?utf-8?B?0KJf2YXYsdiu2KjYpy5ibXA=?="


------=_NextPart_000_0007_01C53894.C9F9E540
Content-Type: image/bmp;
        name="=?utf-8?B?zqXOkc6jzqPOn19IRUxMT1/Xqdec15XXnV/Qn9Cg0JjQktCV0KJf2YXYsdiu2Kg=?=
        =?utf-8?B?2KcuYm1w?="
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
        filename="=?utf-8?B?zqXOkc6jzqPOn19IRUxMT1/Xqdec15XXnV/Qn9Cg0JjQktCV0KJf2YXYsdiu2Kg=?=
        =?utf-8?B?2KcuYm1w?="


------=_NextPart_000_0007_01C53894.C9F9E540--
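
For anyone who wants to check the segment handling, here is a minimal sketch
of decoding such a multi-segment filename, in Python for illustration only.
It is not how Qmail Scanner or MIME::Base64 does it; decode_segments is a
hypothetical helper. It assumes every segment uses base64 ("B") encoding and
the same charset, as in the samples above, and it joins the decoded bytes
before converting to text so that a character split across two segments
would still decode. The input is the shorter of the two samples.

```python
import base64
import re

# One encoded-word has the shape =?charset?B?payload?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?[Bb]\?([^?\s]*)\?=")

def decode_segments(value: str) -> str:
    """Join all base64 encoded-words in a header value into one string."""
    raw = b""
    charset = None
    for segment_charset, payload in ENCODED_WORD.findall(value):
        # Assumption: all segments share the charset of the first one.
        charset = charset or segment_charset
        raw += base64.b64decode(payload)
    if charset is None:
        return value  # nothing was encoded
    return raw.decode(charset)

# The two segments of the shorter sample, separated by a folded line.
value = (
    "=?utf-8?B?zqXOkc6jzqPOn19IRUxMT1/Xqdec15XXnV/Qn9Cg0JjQktCV0KJf2YXYsdiu2Kg=?=\r\n"
    "\t=?utf-8?B?2KcuYm1w?="
)
name = decode_segments(value)

print(name.endswith(".bmp"))  # True: the extension sits in the last segment
print("_HELLO_" in name)      # True: the ASCII part sits in the first segment
```

Note that the extension only appears in the last segment, so a check that
looks at the first segment alone would miss it.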

Regards,
Moti

-----Original Message-----
From: Jason Haar [mailto:[EMAIL PROTECTED]
Sent: Saturday, April 02, 2005 23:29
To: qmail-scanner-general@lists.sourceforge.net
Cc: Moti
Subject: Re: Anyone With Far Eastern Code Page on this list? Windows
encoded filenames and perlscanner extention filtering


Moti wrote:

>Thanks for the reply, I've now read that discussion.
>
>First:
>
>If anyone with a Far Eastern Code Page (Japanese, Chinese, Korean) is on this
>list, please send an example of how the attachment filenames in your local
>code page are sent in MIME.
>It's for your own good :)
>
>It should look something like this (specifically the name and the filename
>fields):
>
>------=_NextPart_000_0032_01C53787.386D0A70
>Content-Type: video/x-ms-wmv;
>        name="=?windows-1255?B?4ePp9+Qud212?="
>Content-Transfer-Encoding: 7bit
>Content-Disposition: attachment;
>        filename="=?windows-1255?B?4ePp9+Qud212?="
>
>
>

I have already "fixed" this problem in the next release by calling
MIME::Base64 - but your comments about issues regarding multi-byte
encoding are not in my area of expertise, so I'm not sure if that changes
anything.

>The extension is always in plain 7-bit ASCII; there are no specific "local
>extensions".
>
>
>

Great - so that should mean if we simply "raw" base64 decode the
filenames, the extensions should always be trustable?

>In case of multi-byte encoding there is a need to filter out only the ASCII
>characters and run perlscan on them, i.e. in some way discard or replace all
>the non-ASCII characters. This seems to be much more complicated. I am not
>sure there is a package for Perl to work with Windows Code Pages.
>
>
>
Yuck - I don't like the smell of that!

Can you create some empty files for me with the most exotic multi-byte
filenames you can come up with, as examples, along with how you would
write those same filenames within quarantine-attachments.txt using
vi/vim? I want to know how the choice of Code Page would affect the
filename (when you say "Code Page", I assume you are referring to
charset?). Extensions are one thing, but we should also ensure that if
you want to block specific filenames, the quarantine-attachments.txt
file is capable of handling such filenames too.

How do Unix systems (and editors specifically) handle multi byte chars?
I mean, if I am on a Linux box set for UTF8 (via LANG and LOCALE
settings) and say in quarantine-attachments.txt to block:

יקה.wmv<TAB>0<TAB>yucky Windows movie

and someone sends in that filename - but encoded according to the
"windows-1255" (Hebrew?) charset - will MIME::Base64 decode that base64
encoded string back to the same "8bit" string referred to within
quarantine-attachments.txt (i.e. so they match)?

I mean, I also assume there's a different (older) non-UTF ISO charset
for Hebrew that you could have been using instead of UTF8 on your Unix
box - would the string be different from the UTF8 version?

This locale thing gets complicated REALLY quickly. :-( [esp for an
ignorant English speaker like myself]

--
Cheers

Jason Haar
Information Security Manager, Trimble Navigation Ltd.
Phone: +64 3 9635 377 Fax: +64 3 9635 417
PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1




