On Mon, 25 Apr 2016 09:35:48 -0400
Alex wrote:

> Hi all,
> 
> I'd like to get the pdf.pm plugin working to convert the text from
> within a PDF attachment into text to be scanned for bad links, etc.
> 
> I've downloaded it from here:
> 
> http://sa.zmi.at/
> 
> When I try to run, I receive:
> 
> Apr 25 09:29:38.733 [15956] warn: Use of uninitialized value $name in
> lc at /etc/mail/spamassassin/pdf.pm line 54.
> Apr 25 09:29:38.774 [15956] warn: readline() on closed filehandle PDF
> at /etc/mail/spamassassin/pdf.pm line 105.
> 
> It fails to detect the name of the attached PDF. Is someone able to
> take a look at how it works for me and use their perl skills to fix
> it?
> 
> It may even be related to spamassassin itself, as it uses
> parse_content_type() to figure it out:
> 
> my ( $ctype, $boundary, $charset, $name ) =
> Mail::SpamAssassin::Util::parse_content_type(
> $p->get_header('content-type') );
>     $ctype = lc($ctype);
>     $name = lc($name);

The MIME headers for an attachment typically look something like this:

  Content-Type: application/octet-stream; name="79421672.pdf"
  Content-Transfer-Encoding: base64
  Content-Disposition: attachment; filename="79421672.pdf"

with the filename duplicated. While this is very common, the "name" in
the Content-Type is superfluous and can be legitimately omitted. The
plugin doesn't allow for this possibility.

If this is the cause, the plugin may still work well enough for you
to evaluate it, so there's no need to put any effort into fixing it
until you know it's worth it.  Not all of the "uninitialized value"
warnings will come from PDF files anyway; the warning can be triggered
by any application/octet-stream part without "name=".
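
If it does turn out to be worth fixing, here is a minimal sketch of one
way to tolerate a missing "name": fall back to the "filename" in
Content-Disposition, and return an empty string rather than undef so
that lc() has nothing to warn about. Note that attachment_name() is a
hypothetical helper of my own, not something in pdf.pm or SpamAssassin,
and the regex is a simplification that ignores RFC 2231 encoded
parameters.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of pdf.pm or SpamAssassin): return the
# lowercased attachment name from the Content-Type "name" parameter,
# falling back to the Content-Disposition "filename" parameter.
# Returns '' when neither header supplies a name.
sub attachment_name {
    my ( $content_type, $content_disposition ) = @_;

    for my $spec ( [ $content_type, 'name' ],
                   [ $content_disposition, 'filename' ] ) {
        my ( $header, $param ) = @$spec;
        next unless defined $header;

        # Accept both quoted and unquoted parameter values.
        if ( $header =~ /(?:^|;)\s*\Q$param\E\s*=\s*(?:"([^"]*)"|([^\s;]+))/i ) {
            my $value = defined $1 ? $1 : $2;
            return lc $value;
        }
    }

    return '';    # never undef, so later string operations stay warning-free
}

# Content-Type has no "name", so the name comes from Content-Disposition.
print attachment_name(
    'application/octet-stream',
    'attachment; filename="79421672.PDF"',
), "\n";    # prints 79421672.pdf
```

In the plugin itself, the second argument would presumably come from
$p->get_header('content-disposition'), in the same way the existing
code fetches 'content-type'; I haven't checked that against pdf.pm.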
