On Mon, Jun 05, 2017 at 02:44:51PM -0700, Kevin J. McCarthy wrote:
> On Mon, Jun 05, 2017 at 10:12:47PM +0200, Andries E. Brouwer wrote:
> > Clearly, this 10% test is completely bogus.
> > 
> > More in particular, I think that a file is binary if it contains even
> > a single NUL byte.
> > 
> > What is the reason for this test?
> > 
> > Should I propose a patch?
> 
> I proposed a patch quite a while ago for this same problem.  Let me see
> if I can dig it up.

It was originally for ticket 2933.  I'm attaching it here.  I think I
didn't push it because one of the other committers suggested looking
into libmagic instead, but I'd be interested if this fixes the problem
for you.

-- 
Kevin J. McCarthy
GPG Fingerprint: 8975 A9B3 3AA3 7910 385C  5308 ADEF 7684 8031 6BDA
# HG changeset patch
# User Kevin McCarthy <ke...@8t8.us>
# Date 1496701910 25200
#      Mon Jun 05 15:31:50 2017 -0700
# Node ID 0053fd3b5296e024ff5821a0a697c1f445c7e85a
# Parent  a11770c2137b4973efe77b4e9d7356f22d2ae5f7
Make attachment type guessing more conservative. (closes #2933).

When guessing the type of an attachment (with no mime type or
extension), mutt currently considers 8-bit characters as "text" when
calculating the percentage of text characters in the file.

In some cases, such as for the sample attachment in this ticket, it leads
to a binary executable being labelled as text/plain, which results in the
attachment being corrupted when mailed.

This patch considers 8-bit characters as binary for the calculation.  In
general, it's probably better to guess wrong on the conservative side
than possibly corrupt attachments.

diff --git a/sendlib.c b/sendlib.c
--- a/sendlib.c
+++ b/sendlib.c
@@ -1382,20 +1382,21 @@
   if ((info = mutt_get_content_info (path, att)) == NULL)
   {
     mutt_free_body (&att);
     return NULL;
   }
 
   if (!att->subtype)
   {
-    if (info->lobin == 0 || (info->lobin + info->hibin + info->ascii)/ info->lobin >= 10)
+    if ((info->lobin == 0 && info->hibin == 0) ||
+        (info->lobin + info->hibin + info->ascii) / (info->lobin + info->hibin) >= 10)
     {
       /*
-       * Statistically speaking, there should be more than 10% "lobin"
+       * Statistically speaking, there should be more than 10% "binary"
        * chars if this is really a binary file...
        */
       att->type = TYPETEXT;
       att->subtype = safe_strdup ("plain");
     }
     else
     {
       att->type = TYPEAPPLICATION;
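
To make the effect of the changed condition easier to see, here is a minimal
standalone sketch of the two tests side by side.  It is not mutt code: the
struct counts type and both function names are hypothetical, and the meaning
of the three counters (low control bytes, bytes with the high bit set,
everything else) is only assumed from the field names used in the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for the counters read from "info" in the patch.
 * Assumed meaning: lobin = low control bytes, hibin = bytes with the
 * high bit set, ascii = all remaining bytes. */
struct counts {
  size_t lobin;
  size_t hibin;
  size_t ascii;
};

/* Old test: only lobin bytes count as binary, so a file consisting
 * mostly of 8-bit bytes can still be guessed as text. */
static int looks_like_text_old (const struct counts *c)
{
  return c->lobin == 0 ||
         (c->lobin + c->hibin + c->ascii) / c->lobin >= 10;
}

/* New test: lobin and hibin bytes both count as binary.  With integer
 * division, the quotient is >= 10 exactly when the total is at least
 * ten times the binary count. */
static int looks_like_text_new (const struct counts *c)
{
  size_t bin = c->lobin + c->hibin;

  return bin == 0 ||
         (c->lobin + c->hibin + c->ascii) / bin >= 10;
}
```

With counts of lobin = 1, hibin = 500 and ascii = 499, the old test computes
1000 / 1 and guesses text, while the new test computes 1000 / 501 and guesses
binary.  The same applies to legitimate text with many 8-bit characters,
which is the conservative trade-off the commit message describes.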
