Hello,

Is it possible to scan PDFs for phishing URLs?

I tried php-clamavlib-0.12a with ClamAV 0.91.2 on Ubuntu 7.10 (x86),
using the standard signatures plus Sanesecurity's phishing and scam
signatures. I modified php-clamavlib to call cl_load with
"CL_DB_STDOPT|CL_DB_PHISHING|CL_DB_PHISHING_URLS" and cl_scanfile with
"CL_SCAN_STDOPT|CL_SCAN_PDF".

As a test, I scanned an HTML e-mail containing a hex-encoded URL, which
was detected as "Phishing.Heuristics.Email.HexURL". I then inserted the
same URL as a hyperlink in an OpenOffice.org 2.2 (Win32) document and
exported it as a PDF. ClamAV didn't detect the phishing URL in the
exported PDF.

Next, I ran the exported PDF through pdftohtml and added some e-mail
headers (Return-path, Content-Type, Subject, Date, To, From) to the
output. The e-mail built from the PDF was detected properly as
"Phishing.Heuristics.Email.HexURL".

I also tried a URL with a spoofed domain from the list in daily.pdb, but
got the same results: detected in e-mails but not in PDFs.
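
For anyone who wants to reproduce the e-mail wrapping step, here is a
rough Python sketch of it. It is not the exact procedure I used, only an
illustration: the function name wrap_html_as_email, the example.com
addresses and the subject line are all made up, and the HTML passed in
is assumed to be whatever pdftohtml produced for the PDF.

```python
from email.message import EmailMessage
from email.utils import formatdate


def wrap_html_as_email(html: str,
                       sender: str = "sender@example.com",
                       recipient: str = "recipient@example.com",
                       subject: str = "PDF phishing test") -> bytes:
    """Wrap an HTML string in a minimal e-mail message.

    Hypothetical helper: the default addresses and subject are
    placeholders, not values from the original test.
    """
    msg = EmailMessage()
    msg["Return-Path"] = f"<{sender}>"
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg["Date"] = formatdate(localtime=False)
    # subtype="html" makes the library emit "Content-Type: text/html".
    msg.set_content(html, subtype="html")
    return msg.as_bytes()


if __name__ == "__main__":
    # Placeholder HTML; in practice this would be the pdftohtml output.
    sample = '<html><body><a href="http://example.com/">link</a></body></html>'
    print(wrap_html_as_email(sample).decode("ascii"))
```

The resulting bytes can be written to a file and scanned like any other
saved e-mail message.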

--
Tom Cort
Systems Developer
Vermont Department of Taxes

