On Wed, 12 Feb 2014, Axb wrote:
> On 02/12/2014 10:46 PM, John Hardin wrote:
> > On Wed, 12 Feb 2014, Axb wrote:
> > > On 02/12/2014 10:06 PM, John Hardin wrote:
> > > >
> > > > Perhaps something like this:
> > > >
> > > > body __HEXHASHWORD /\b[0-9a-f]{30,}\s[a-z]{1,10}\b/
> > > > tflags __HEXHASHWORD multiple maxhits=5
> > > > meta HEXHASH_WORD __HEXHASHWORD > 4
> > > > describe HEXHASH_WORD Hexadecimal hash followed by a word
> > > >
> > > > Added to my sandbox, just in case.
> > >
> > > John,
> > >
> > > Isn't {30,} (without a limit) dangerously expensive?
> >
> > Potentially expensive; the character class and the fact that the
> > following atom is not in that class limits the risk - backtracking isn't
> > a possibility. However, point taken - recommend {30,64} instead.
>
> imo, you don't even need to count that much - I'd stop at sweet 16, anything
> above is pink noise and not waste time chasing spaces & co.

That increases the FP risk, though. Having just hex strings in an email
is not inherently a good spam sign, I would think, thus the desire to
match long hex string + word with no intervening punctuation.
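
To make the FP concern concrete, here is a rough Python sketch. It is not
SpamAssassin (which uses Perl regexes), only an approximation for comparing
the patterns. The sample strings and the names LONG_HASH_WORD, SHORT_HASH
and hexhash_word are made up for illustration, and reading "stop at 16" as
a bare 16-character hex match with no trailing word is my assumption.

```python
import re

# Rule body from the thread, with the {30,64} upper bound applied.
LONG_HASH_WORD = re.compile(r"\b[0-9a-f]{30,64}\s[a-z]{1,10}\b")

# Assumed reading of "stop at 16": any 16 hex characters, no word required.
SHORT_HASH = re.compile(r"\b[0-9a-f]{16}")

# Hypothetical samples, not taken from real mail.
spammy = "d41d8cd98f00b204e9800998ecf8427e pills"
short_id = "commit 3f2a9c1b7d4e5f60 fixed the bug"
checksum = "sha1: da39a3ee5e6b4b0d3255bfef95601890afd80709."


def hexhash_word(text):
    """Rough stand-in for 'tflags multiple maxhits=5' plus 'meta ... > 4'."""
    hits = min(len(LONG_HASH_WORD.findall(text)), 5)
    return hits > 4


for sample in (spammy, short_id, checksum):
    print(
        bool(LONG_HASH_WORD.search(sample)),
        bool(SHORT_HASH.search(sample)),
        sample,
    )
```

Under these assumptions the short pattern hits all three samples, including
the commit ID and the checksum followed by a period, while the long
hash + word pattern hits only the first. The meta only fires once that
pattern has matched five times in the same message.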
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
WSJ on the Financial Stimulus package: "...today there are 700,000
fewer jobs than [the administration] predicted we would have if we
had done nothing at all."
-----------------------------------------------------------------------
Today: Abraham Lincoln's and Charles Darwin's 205th Birthdays