RE: Plugin extracting text from docs (was: new spam using large images)

2009-07-01 Thread Giampaolo Tomassoni
> > Jason Haar wrote: > > > Speaking of image/

RE: Plugin extracting text from docs (was: new spam using large images)

2009-07-01 Thread Rosenbaum, Larry M.
> Jason Haar wrote: > > Speaking of image/rtf/word attachment spam; is there any work going on to standardize this so t

Re: Plugin extracting text from docs (was: new spam using large images)

2009-06-25 Thread Matus UHLAR - fantomas
> Jason Haar wrote: >> Speaking of image/rtf/word attachment spam; is there any work going on to standardize this so that the textual output of such attachments could be fed back into SA? On 24.06.09 19:33, Jonas Eckerman wrote: > Just as a note: > I'm currently working on a modular plug

Plugin extracting text from docs (was: new spam using large images)

2009-06-24 Thread Jonas Eckerman
Jason Haar wrote: Speaking of image/rtf/word attachment spam; is there any work going on to standardize this so that the textual output of such attachments could be fed back into SA? Just as a note: I'm currently working on a modular plugin for extracting text and adding it to SA message parts.

Re: new spam using large images

2009-06-19 Thread LuKreme
On 19 Jun, 2009, at 14:38 , Karsten Bräckelmann wrote: On Fri, 2009-06-19 at 13:57 -0600, LuKreme wrote: On 19 Jun, 2009, at 06:12 , Karsten Bräckelmann wrote: I just received this: http://pastebin.com/m54006b68 420K in size - standard configuration of SA wouldn't have even run over this

Re: new spam using large images

2009-06-19 Thread Karsten Bräckelmann
On Fri, 2009-06-19 at 13:57 -0600, LuKreme wrote: > On 19 Jun, 2009, at 06:12 , Karsten Bräckelmann wrote: > >> I just received this: http://pastebin.com/m54006b68 420K in size - standard configuration of SA wouldn't have even run over this message. [...] > > SA would have scann

Re: new spam using large images

2009-06-19 Thread LuKreme
On 19 Jun, 2009, at 06:12 , Karsten Bräckelmann wrote: On Fri, 2009-06-19 at 13:04 +1200, Jason Haar wrote: Hi there, just a FYI I just received this: http://pastebin.com/m54006b68 420K in size - standard configuration of SA wouldn't have even run over this message. [...] SA would have sc

Re: new spam using large images

2009-06-19 Thread Theo Van Dinter
Once you have a part you can use the documented methods in Message::Node to access data (see "perldoc Mail::SpamAssassin::Message::Node"). You will probably want $p->decode() which returns a decoded (base64, quoted-printable) string of the part contents. On Fri, Jun 19, 2009 at 7:00 PM, Rosenbau
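For readers unfamiliar with what "decoded" means here, the idea can be illustrated outside SpamAssassin. The sketch below uses Python's standard `email` package, not the Perl `Mail::SpamAssassin::Message::Node` API; the sample message, the `BOUND` boundary and the `decoded_leaf_parts` helper are invented for illustration. It walks the MIME tree and undoes the base64 or quoted-printable transfer encoding of each leaf part, which is roughly what `$p->decode()` is described as returning.

```python
import base64
from email import message_from_string

# Hypothetical two-part message: one quoted-printable text part and one
# base64 "attachment". Nothing here comes from the original thread.
ATTACHMENT_B64 = base64.b64encode(b"hello attachment").decode("ascii")

RAW = "\n".join([
    "MIME-Version: 1.0",
    'Content-Type: multipart/mixed; boundary="BOUND"',
    "",
    "--BOUND",
    'Content-Type: text/plain; charset="us-ascii"',
    "Content-Transfer-Encoding: quoted-printable",
    "",
    "price =3D 5",
    "--BOUND",
    "Content-Type: application/octet-stream",
    "Content-Transfer-Encoding: base64",
    "",
    ATTACHMENT_B64,
    "--BOUND--",
    "",
])


def decoded_leaf_parts(raw):
    """Return (content_type, decoded_bytes) for every non-multipart part."""
    msg = message_from_string(raw)
    result = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        # decode=True reverses the Content-Transfer-Encoding
        # (base64 / quoted-printable) and returns bytes.
        result.append((part.get_content_type(), part.get_payload(decode=True)))
    return result


if __name__ == "__main__":
    for ctype, data in decoded_leaf_parts(RAW):
        print(ctype, data)
```

Text extracted from the decoded bytes of an attachment is what a plugin would then hand back to the scanner; that step is SpamAssassin-specific and is not shown here.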

RE: new spam using large images

2009-06-19 Thread Rosenbaum, Larry M.
> From: felic...@kluge.net On Behalf Of Theo Van Dinter > On Fri, Jun 19, 2009 at 3:04 AM, Jason Haar wrote: > > Speaking of image/rtf/word attachment spam; is there any work going on to standardize this so that the textual output of such attachments could be fed back into SA?

Re: new spam using large images

2009-06-19 Thread Theo Van Dinter
On Fri, Jun 19, 2009 at 4:42 PM, Charles Gregory wrote: > H. Big question for developers: Does the performance 'burden' of a large e-mail come from the 'reading' of that mail into spamassassin and initial processing? Or is the 'cost' of a large message only 'paid' when SA attempts to run

Re: new spam using large images

2009-06-19 Thread Charles Gregory
On Fri, 19 Jun 2009, Jason Haar wrote: Hi there, just a FYI I just received this: http://pastebin.com/m54006b68 420K in size... H. Big question for developers: Does the performance 'burden' of a large e-mail come from the 'reading' of that mail into spamassassin and initial processing? Or

Re: new spam using large images

2009-06-19 Thread Karsten Bräckelmann
On Fri, 2009-06-19 at 13:04 +1200, Jason Haar wrote: > Hi there, just a FYI > I just received this: http://pastebin.com/m54006b68 > 420K in size - standard configuration of SA wouldn't have even run over this message. [...] SA would have scanned it by default just fine. The default size li

Re: new spam using large images

2009-06-19 Thread Theo Van Dinter
On Fri, Jun 19, 2009 at 3:04 AM, Jason Haar wrote: > Speaking of image/rtf/word attachment spam; is there any work going on to standardize this so that the textual output of such attachments could be fed back into SA? That functionality already exists (has for almost 3 years, actually), but as

new spam using large images

2009-06-18 Thread Jason Haar
Hi there, just a FYI I just received this: http://pastebin.com/m54006b68 420K in size - standard configuration of SA wouldn't have even run over this message. Also the inline image is too large for FuzzyOCR to trigger - I would guess FuzzyOCR has the (screen) size limit as a mechanism to reduce F