On Wed, 23 Nov 2016, Rich Wales wrote:
/The RE at that line looks pretty firmly anchored... Can you gzip up a
sample that fails for you and send it to me?/
Sure. See the attachment.
OK, I can repro on trunk:
Nov 23 19:17:00.141 [18349] dbg: message: HTML::Parser utf8_mode on (assumed UTF-8 octets)
Nov 23 19:17:00.187 [18349] warn: Complex regular subexpression recursion limit (32766) exceeded at lib/Mail/SpamAssassin/HTML.pm line 745.
Nov 23 19:17:00.193 [18349] dbg: message: spaces (octets) in HTML: 952 out of 3954
It's that very long block of QP-encoded blanks right at the end of the
message. If you edit out all those =20s after the </td>, it stops emitting
that warning.
That would be a workaround for you to make sa-learn shut up about your
corpus until the problem is fixed. Blanks don't affect Bayes (at least,
not until we implement multi-word tokens) so it shouldn't affect what gets
learned.
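If hand-editing is too tedious for a whole corpus, something along these
lines might do it. This is only a rough sketch, not anything that ships
with SpamAssassin: the function name collapse_qp_blank_runs and the
threshold of 10 are made up for illustration, and it assumes the blanks
show up in the raw message as literal "=20" tokens, possibly interleaved
with QP soft line breaks ("=" at end of line). Run it on a copy of the
corpus, not the originals.

```python
import re

# Hypothetical threshold: runs shorter than this are left alone.
MIN_RUN = 10

# A run of MIN_RUN or more "=20" tokens, each optionally followed by a
# quoted-printable soft line break ("=" immediately before the newline).
_BLANK_RUN = re.compile(rb"(?:=20(?:=\r?\n)?){%d,}" % MIN_RUN)


def collapse_qp_blank_runs(raw: bytes) -> bytes:
    """Replace each long run of QP-encoded blanks with a single "=20".

    Works on the raw (still encoded) message bytes, so headers and other
    body text are untouched unless they contain such a run themselves.
    """
    return _BLANK_RUN.sub(b"=20", raw)
```

To use it, read each message in binary mode, pass the bytes through
collapse_qp_blank_runs(), and write the result to a new file before
feeding that to sa-learn. Removing a soft line break can leave the joined
line a little longer than the QP line-length limit, which should not
matter for learning purposes.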
Please open a bug and attach that spample as a repro test case. I'm not
too familiar with that bit of the code so I don't have a fast fix.
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
338 days since the first successful real return to launch site (SpaceX)