Re: mailer.py can produce subject header violates RFC 5321/5322 if truncate_subject is not set

Yasuhito FUTATSUKI Mon, 06 Jan 2020 16:44:17 -0800

On 2020/01/07 6:52, Yasuhito FUTATSUKI wrote:
> By the way, there seems to be another issue with truncate_subject: the
> current implementation may break a UTF-8 multi-byte character sequence,
> but I didn't reproduce it (because I always use ASCII characters only
> for file names...).


It probably needs something like this (but it doesn't handle combining
characters, and I haven't tested it...):
[[[
Index: tools/hook-scripts/mailer/mailer.py
===================================================================
--- tools/hook-scripts/mailer/mailer.py (revision 1872398)
+++ tools/hook-scripts/mailer/mailer.py (working copy)
@@ -159,7 +159,13 @@
       truncate_subject = 0
 
     if truncate_subject and len(subject) > truncate_subject:
-      subject = subject[:(truncate_subject - 3)] + "..."
+      # To avoid breaking a UTF-8 multi-byte character sequence, search
+      # backward for the start of the sequence if the byte at the truncation
+      # point is the second or a later byte of a multi-byte sequence.
+      idx = truncate_subject - 3
+      while 0x80 <= ord(subject[idx]) <= 0xbf:
+        idx -= 1
+      subject = subject[:idx] + "..."
     return subject
 
   def start(self, group, params):
]]]


Cheers,
-- 
Yasuhito FUTATSUKI <futat...@yf.bsdclub.org> / <futat...@poem.co.jp>
