On Wed, 12 Feb 2014, Axb wrote:

On 02/12/2014 10:46 PM, John Hardin wrote:
 On Wed, 12 Feb 2014, Axb wrote:

>  On 02/12/2014 10:06 PM, John Hardin wrote:
> > > > Perhaps something like this:
> > > >
> > > >   body      __HEXHASHWORD   /\b[0-9a-f]{30,}\s[a-z]{1,10}\b/
> > > >   tflags    __HEXHASHWORD   multiple maxhits=5
> > > >   meta      HEXHASH_WORD    __HEXHASHWORD > 4
> > > >   describe  HEXHASH_WORD    Hexadecimal hash followed by a word
> > > >
> > > > Added to my sandbox, just in case.
> >
> > John,
> >
> > Isn't {30,} (without a limit) dangerously expensive?

 Potentially expensive; the character class and the fact that the
 following atom is not in that class limits the risk - backtracking isn't
 a possibility. However, point taken - recommend {30,64} instead.

imo, you don't even need to count that much - I'd stop at sweet 16; anything above is pink noise, and I wouldn't waste time chasing spaces & co.

That increases the FP risk, though. Having just hex strings in an email is not inherently a good spam sign, I would think, thus the desire to match a long hex string + word with no intervening punctuation.
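
For reference, the rule with the bounded quantifier would look something like the sketch below. It has not been tested against a corpus; the only change from the earlier version is {30,64} in place of {30,}:

  # Sketch only: the upper bound of 64 follows the suggestion above; not corpus-tested
  body      __HEXHASHWORD   /\b[0-9a-f]{30,64}\s[a-z]{1,10}\b/
  tflags    __HEXHASHWORD   multiple maxhits=5
  meta      HEXHASH_WORD    __HEXHASHWORD > 4
  describe  HEXHASH_WORD    Hexadecimal hash followed by a word

One side effect of the upper bound: because the leading \b pins the match to the start of a hex run, a run longer than 64 characters no longer matches at all. That still covers MD5 (32), SHA-1 (40) and SHA-256 (64) length strings, but not SHA-512 (128).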

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  WSJ on the Financial Stimulus package: "...today there are 700,000
  fewer jobs than [the administration] predicted we would have if we
  had done nothing at all."
-----------------------------------------------------------------------
 Today: Abraham Lincoln's and Charles Darwin's 205th Birthdays
