Re: UTF8 character in [] doesn't match

2018-12-24 Thread John Hardin
body LOCAL_JANO /\bj<A>no?\b/i replace_rules LOCAL_JANO unfortunately, this doesn't work for Ján, only for Jano. I find that odd. Okay, review of the replacetags replacements shows that a lot of basic UTF8 codepoints are missing - I wasn't thorough enough the las

Re: UTF8 character in [] doesn't match

2018-12-24 Thread Henrik K
On Mon, Dec 24, 2018 at 06:48:51PM +, RW wrote: > On Mon, 24 Dec 2018 10:16:58 +0200 > Henrik K wrote: > > > On Sun, Dec 23, 2018 at 11:11:39PM +, RW wrote: > > > On Sun, 23 Dec 2018 20:04:28 +0100 > > > Matus UHLAR - fantomas wrote: > > > > > > > Hello, > > > > > > > > I have tried to

Re: UTF8 character in [] doesn't match

2018-12-24 Thread RW
On Mon, 24 Dec 2018 10:16:58 +0200 Henrik K wrote: > On Sun, Dec 23, 2018 at 11:11:39PM +, RW wrote: > > On Sun, 23 Dec 2018 20:04:28 +0100 > > Matus UHLAR - fantomas wrote: > > > > > Hello, > > > > > > I have tried to create rule that will match names "ján" and > > > "jano" (john and john

Re: UTF8 character in [] doesn't match

2018-12-24 Thread Matus UHLAR - fantomas
s no difference. If message is UTF-8, neither of them will match anyway. The only correct solution: (?:[aá]|\xc3\xa1) works for any message in latin1 or utf8, regardless of normalize_charset setting. Config contents are never converted to anything, you need to make sure regex contains raw byt

Re: UTF8 character in [] doesn't match

2018-12-24 Thread Pedro David Marco
On Monday, December 24, 2018, 9:49:11 AM GMT+1, Henrik K wrote: >... so for general file portability this would be even better: > >(?:[a\xe1]|\xc3\xa1) I fully agree with Henrik, but would add a small detail... in some cases i have found problems using BODY to locate special chars  (most li
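The portable form quoted above, (?:[a\xe1]|\xc3\xa1), can be tried outside SpamAssassin. The following is a minimal sketch in Python rather than Perl, assuming only that matching is done on raw bytes as described in the thread; the names JANO and matches are hypothetical and exist only for this illustration.

```python
import re

# Byte-level pattern: either one byte from the class ("a" or Latin-1 "á", 0xE1)
# or the two-byte UTF-8 sequence for "á" (0xC3 0xA1).
# JANO and matches() are illustrative names, not part of SpamAssassin.
JANO = re.compile(rb"\bJ(?:[a\xe1]|\xc3\xa1)no?\b", re.IGNORECASE)


def matches(text: str, charset: str) -> bool:
    """Encode text in the given charset and test the resulting raw bytes."""
    return JANO.search(text.encode(charset)) is not None


if __name__ == "__main__":
    for charset in ("utf-8", "latin-1"):
        print(charset, matches("Ján", charset), matches("jano", charset))
```

Because the pattern is written entirely with ASCII and \x escapes, it does not depend on the encoding of the file it is stored in, which is the portability point made above. Note that on byte patterns the case-insensitive flag only folds ASCII letters, so an uppercase "Á" would need its own alternatives.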

Re: UTF8 character in [] doesn't match

2018-12-24 Thread Henrik K
> > > any idea what can cause this? > > > > > > normalize_charset converts to UTF-8 but the tests are still done on > > > bytes, so á isn't a character, it's a string. You need (?:a|á) instead > > > of [aá]. > > > > Nope, that makes no

Re: UTF8 character in [] doesn't match

2018-12-24 Thread Henrik K
n't a character, it's a string. You need (?:a|á) instead > > of [aá]. > > Nope, that makes no difference. If message is UTF-8, neither of them will > match anyway. > > The only correct solution: (?:[aá]|\xc3\xa1) works for any message in latin1 > or utf8, regardl

Re: UTF8 character in [] doesn't match

2018-12-24 Thread Henrik K
> any idea what can cause this? > > normalize_charset converts to UTF-8 but the tests are still done on > bytes, so á isn't a character, it's a string. You need (?:a|á) instead > of [aá]. Nope, that makes no difference. If message is UTF-8, neither of them will match anywa

Re: UTF8 character in [] doesn't match

2018-12-23 Thread RW
On Sun, 23 Dec 2018 20:04:28 +0100 Matus UHLAR - fantomas wrote: > Hello, > > I have tried to create rule that will match names "ján" and > "jano" (john and johnny in slovak languages). > > I have created rule: > > body LOCAL_JANO /\bJ[aá]no\b/i > > however, it does not match. > > Ap

Re: UTF8 character in [] doesn't match

2018-12-23 Thread Matus UHLAR - fantomas
o. even J[á]n did not match "Ján" - is there a problem with perl and utf8? # echo Ján | perl -ne 'if (/J[á]n/) {print "OK\n"} else { print "KO\n"}' KO after consulting google: https://stackoverflow.com/questions/21092427/perl-regex-replace-with-utf-8-c
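A likely explanation for the "KO" result is that the pattern is being treated as bytes: in UTF-8, "á" is the two bytes 0xC3 0xA1, and a bracket class only ever consumes one of them. The sketch below reproduces that effect in Python with byte patterns; it assumes the pattern source itself is UTF-8 encoded, and the variable names are hypothetical.

```python
import re

text = "Ján".encode("utf-8")  # b'J\xc3\xa1n': "á" occupies two bytes

# Seen as bytes, [á] is really the class [\xc3\xa1]. A class matches exactly
# one byte, so it can never span both bytes of the encoded "á".
as_class = re.compile("J[á]n".encode("utf-8"))

# Written as a sequence, the same two bytes must appear in order, which works.
as_sequence = re.compile("J(?:á)n".encode("utf-8"))

class_hit = as_class.search(text) is not None
sequence_hit = as_sequence.search(text) is not None

if __name__ == "__main__":
    print("class:", "OK" if class_hit else "KO")
    print("sequence:", "OK" if sequence_hit else "KO")
```

With character semantics enabled on both the pattern and the input, a class containing "á" behaves as a single character again; the sketch only covers the byte-oriented case.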

Re: UTF8 character in [] doesn't match

2018-12-23 Thread John Hardin
On Sun, 23 Dec 2018, Matus UHLAR - fantomas wrote: Hello, I have tried to create rule that will match names "ján" and "jano" (john and johnny in slovak languages). I have created rule: body LOCAL_JANO /\bJ[aá]no\b/i however, it does not match. The "o" is not optional in that RE, s

UTF8 character in [] doesn't match

2018-12-23 Thread Matus UHLAR - fantomas
Hello, I have tried to create rule that will match names "ján" and "jano" (john and johnny in slovak languages). I have created rule: body LOCAL_JANO /\bJ[aá]no\b/i however, it does not match. Apparently the [á] does not match even when normalize_charset is set to '1'. any idea what

Re: amavis[14826]: (14826-10) SA info: dns: new_dns_packet: domain is utf8 flagged:

2018-10-08 Thread Benny Pedersen
zahn wrote on 2018-10-08 12:49: amavis[14826]: (14826-10) SA info: dns: new_dns_packet: domain is utf8 flagged: ns2.yandex.net Is this an error message or can I ignore it - thanks for the reply. SA info, in case of error imho it would say SA error its harmless info

Re: amavis[14826]: (14826-10) SA info: dns: new_dns_packet: domain is utf8 flagged:

2018-10-08 Thread Kevin A. McGrail
4826-10) SA info: dns: new_dns_packet: domain is utf8 > flagged: ns2.yandex.net > > Is this an error message or can I ignore it - thanks for the reply. > > -- > > Kind regards from Oberdiessbach Martin Zahn > > Akadia AG > Martin Zahn > Software Ing. HTL > Oracle Certificate P

amavis[14826]: (14826-10) SA info: dns: new_dns_packet: domain is utf8 flagged:

2018-10-08 Thread zahn
Hello I migrated spamassassin from 3.4.1 to 3.4.2 and now I get the following message in the logfile. amavis[14826]: (14826-10) SA info: dns: new_dns_packet: domain is utf8 flagged: ns2.yandex.net Is this an error message or can I ignore it - thanks for the reply. -- Kind regards from

Re: Can SpamAssasin convert UTF8 into ISO-8859-1?

2015-05-20 Thread Philip Prindeville
On Apr 15, 2015, at 7:07 PM, @lbutlr wrote: > On Apr 13, 2015, at 09:03, John Hardin wrote: >> The proper place for that sort of thing would be the tool that does final >> delivery to the user's mailbox. > > There is no proper place for that. > No, it’s not. But Mimedefang is. -Philip

Re: Can SpamAssasin convert UTF8 into ISO-8859-1?

2015-04-15 Thread @lbutlr
On Apr 13, 2015, at 09:03, John Hardin wrote: > The proper place for that sort of thing would be the tool that does final > delivery to the user's mailbox. There is no proper place for that. -- LOOSE TEETH DON'T NEED MY HELP Bart chalkboard Ep. AABF16

Re: Can SpamAssasin convert UTF8 into ISO-8859-1?

2015-04-13 Thread John Hardin
On Mon, 13 Apr 2015, Winfried wrote: I was wondering if SpamAssasin could be configured to convert UTF-8 e-mails into ISO-8859-1 prior to sanitization? No. That isn't SA's job. It is a scanning tool. Please don't try to turn it into a swiss army knife. See if there are some options in your

Re: Can SpamAssasin convert UTF8 into ISO-8859-1?

2015-04-13 Thread Benny Pedersen
Winfried wrote on 2015-04-13 12:22: I'm using an old e-mail client that doesn't support UTF-8, which means that any incoming e-mails that used UTF-8 has all its accented characters turned into garbage. +1 I was wondering if SpamAssasin could be configured to convert UTF-8 e-mails into IS

Re: Can SpamAssasin convert UTF8 into ISO-8859-1?

2015-04-13 Thread Reindl Harald
ndering if SpamAssasin could be configured to convert UTF-8 e-mails into ISO-8859-1 prior to sanitization? besides that no MTA / filter should mangle incoming mail to not break signatures and so on it's technically impossible to convert UTF8 in a sane way to ISO-8859-1 because it supports a lot of
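One concrete way such a conversion breaks down: ISO-8859-1 has only 256 code points, so any UTF-8 text containing characters outside that set cannot be represented. A minimal Python sketch follows, assuming nothing about any particular mail tool; to_latin1 is a hypothetical helper name.

```python
def to_latin1(text: str) -> bytes:
    """Strict conversion: raises UnicodeEncodeError when a character
    has no ISO-8859-1 code point."""
    return text.encode("iso-8859-1")


# Western European accents fit into ISO-8859-1.
ok = to_latin1("Ján")

# Cyrillic does not, so a strict conversion has to fail.
try:
    to_latin1("Привет")
    convertible = True
except UnicodeEncodeError:
    convertible = False

# A lossy conversion "succeeds" only by replacing every such character.
lossy = "Привет".encode("iso-8859-1", errors="replace")

if __name__ == "__main__":
    print(ok, convertible, lossy)
```

The lossy variant is the best a converter could do for such mail, and the result is no more readable than the garbage the old client already shows.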

Can SpamAssasin convert UTF8 into ISO-8859-1?

2015-04-13 Thread Winfried
n? If not, does someone know of a POP3 proxy (Windows or Linux) that I could add to the mix? Thank you. -- View this message in context: http://spamassassin.1065346.n5.nabble.com/Can-SpamAssasin-convert-UTF8-into-ISO-8859-1-tp115759.html Sent from the SpamAssassin - Users mailing list

Re: utf8

2009-01-19 Thread Bogun Dmitriy
On Sat, 17/01/2009 at 18:43 +0300, Sergey Kovalev wrote: > Bogun Dmitriy wrote: > > I have upgraded to 3.59(was 3.56). But it not help... it still not > > converting body and not match my test rule. I have tried with utf8, > > koi8-r, cp1251... all not working. But

Re: utf8

2009-01-17 Thread Sergey Kovalev
Bogun Dmitriy wrote: I have upgraded to 3.59(was 3.56). But it not help... it still not converting body and not match my test rule. I have tried with utf8, koi8-r, cp1251... all not working. But when I have disabled normalize_charset, message in UTF8 hit into my rule... all other(koi8-r

Re: utf8

2009-01-16 Thread Bogun Dmitriy
> > 3.59 here in my gentoo I have upgraded to 3.59(was 3.56). But it not help... it still not converting body and not match my test rule. I have tried with utf8, koi8-r, cp1251... all not working. But when I have disabled normalize_charset, message in UTF8 hit into my rule... all other(koi8-r,cp1

Re: utf8

2009-01-15 Thread Benny Pedersen
On Thu, January 15, 2009 17:27, Bogun Dmitriy wrote: > perldoc Mail::SpamAssassin::Conf say, that I need Encode::Detect 1.01 here > HTML::Parser version 3.46 or later. I have them both. 3.59 here in my gentoo -- Benny Pedersen Need more webspace ? http://www.servage.net/?coupon=cust37098

Re: utf8

2009-01-15 Thread Benny Pedersen
On Thu, January 15, 2009 12:03, Justin Mason wrote: > it should work, assuming you have the required CPAN module > installed. what cpan module is it ? i have also seen problems with some utf-7 :/ -- Benny Pedersen Need more webspace ? http://www.servage.net/?coupon=cust37098

Re: utf8

2009-01-15 Thread Bogun Dmitriy
> On Wed, Jan 14, 2009 at 21:27, Bogun Dmitriy wrote: > > Hello. > > > > Is there any way to make configuration option "normalize_charset" working? > > As I understand it didn't work because of broken utf8 support. But without > > it, there is no wa

Re: utf8

2009-01-15 Thread Justin Mason
it should work, assuming you have the required CPAN module installed. --j. On Wed, Jan 14, 2009 at 21:27, Bogun Dmitriy wrote: > Hello. > > Is there any way to make configuration option "normalize_charset" working? > As I understand it didn't work because of broke

utf8

2009-01-14 Thread Bogun Dmitriy
Hello. Is there any way to make configuration option "normalize_charset" working? As I understand it didn't work because of broken utf8 support. But without it, there is no way to normal use of spamassassin for not English messages. I am not like rules like this. #bo

Re: plugin: eval failed: Can't locate utf8.pm in @INC

2008-06-08 Thread Matus UHLAR - fantomas
On 07.06.08 16:58, Michał Jęczalik wrote: > To: Matus UHLAR - fantomas <[EMAIL PROTECTED]> > cc: users@spamassassin.apache.org And, please, do NOT send me private copies. I do not need nor want them. If I wouldn't read this list, I would not answer your message because I wouldn't know about it.

Re: plugin: eval failed: Can't locate utf8.pm in @INC

2008-06-08 Thread Matus UHLAR - fantomas
> On Sat, 7 Jun 2008, Matus UHLAR - fantomas wrote: > > >>But spamassassin tries to find it in 5.8.8. And that's probably the > >>reason. How to tell perl (or spamassassin?) to include > >>/usr/share/perl/5.10.0 in @INC as well? Now this variable has only 5.8.8 > >>paths. I've created a symlink 5.

Re: plugin: eval failed: Can't locate utf8.pm in @INC

2008-06-07 Thread Michal Jeczalik
On Sat, 7 Jun 2008, Jari Fredriksson wrote: Aaargh, I guess somebody at debian developers team did not coordinated perl and spamassassin packages somehow, but don't know why and how to fix it. ;( $ perl --version This is perl, v5.8.8 built for i486-linux-gnu-thread-multi Copyright 1987-2006,

Re: plugin: eval failed: Can't locate utf8.pm in @INC

2008-06-07 Thread Jari Fredriksson
only 5.8.8 paths. >>> I've created a symlink 5.8.8 -> 5.10.0, maybe it will >>> work as a temporary solution. >> >> Haven't you upgraded perl w/o reinstalling spamassassin? >> Or, doesn't spamassassin use old version of libperl? > > >

Re: plugin: eval failed: Can't locate utf8.pm in @INC

2008-06-07 Thread Michał Jęczalik
-> 5.10.0, maybe it will work as a temporary solution. Haven't you upgraded perl w/o reinstalling spamassassin? Or, doesn't spamassassin use old version of libperl? OK, now I have a symlink and there's another problem: plugin: eval failed: Undefined subroutine utf8::SWASHGET call

Re: plugin: eval failed: Can't locate utf8.pm in @INC

2008-06-07 Thread Michał Jęczalik
On Sat, 7 Jun 2008, Matus UHLAR - fantomas wrote: But spamassassin tries to find it in 5.8.8. And that's probably the reason. How to tell perl (or spamassassin?) to include /usr/share/perl/5.10.0 in @INC as well? Now this variable has only 5.8.8 paths. I've created a symlink 5.8.8 -> 5.10.0, may

Re: plugin: eval failed: Can't locate utf8.pm in @INC

2008-06-07 Thread Matus UHLAR - fantomas
> On Sat, 7 Jun 2008, Matus UHLAR - fantomas wrote: > > >% locate utf8.pm > >/usr/share/perl/5.8.8/DBM_Filter/utf8.pm > >/usr/share/perl/5.8.8/utf8.pm > >% dlocate /usr/share/perl/5.8.8/utf8.pm > >perl-base: /usr/share/perl/5.8.8/utf8.pm > > > >se

Re: plugin: eval failed: Can't locate utf8.pm in @INC

2008-06-07 Thread Michał Jęczalik
On Sat, 7 Jun 2008, Matus UHLAR - fantomas wrote: % locate utf8.pm /usr/share/perl/5.8.8/DBM_Filter/utf8.pm /usr/share/perl/5.8.8/utf8.pm % dlocate /usr/share/perl/5.8.8/utf8.pm perl-base: /usr/share/perl/5.8.8/utf8.pm seems to be from perl, at least from debian's perl package... Which

Re: plugin: eval failed: Can't locate utf8.pm in @INC

2008-06-07 Thread Matus UHLAR - fantomas
> > spamassassin 3.2.4-2, debian package. I'm getting > > occasional errors about missing utf8.pl file: > > > > plugin: eval failed: Can't locate utf8.pm in @INC (@INC > > contains: /usr/share/perl5 /etc/perl > > /usr/local/lib/perl/5.8.8 /usr/loca

Re: plugin: eval failed: Can't locate utf8.pm in @INC

2008-06-07 Thread Jari Fredriksson
> spamassassin 3.2.4-2, debian package. I'm getting > occasional errors about missing utf8.pl file: > > plugin: eval failed: Can't locate utf8.pm in @INC (@INC > contains: /usr/share/perl5 /etc/perl > /usr/local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 > /usr/l

plugin: eval failed: Can't locate utf8.pm in @INC

2008-06-07 Thread Michał Jęczalik
spamassassin 3.2.4-2, debian package. I'm getting occasional errors about missing utf8.pl file: plugin: eval failed: Can't locate utf8.pm in @INC (@INC contains: /usr/share/perl5 /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/lib/perl/5.8 /usr/

Problem with ERROR: invalid byte sequence for encoding "UTF8": 0x8a

2007-07-30 Thread Andrew R Jackson
I keep seeing these in my postgresql log file. What did I do wrong? ERROR: invalid byte sequence for encoding "UTF8": 0xd255 HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding"
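Both byte sequences mentioned here (0x8a in the subject, 0xd255 in the log line) are indeed not valid UTF-8, which is consistent with data arriving in some other encoding. A minimal Python sketch for checking bytes before they reach the database follows; is_valid_utf8 is a hypothetical helper, not a PostgreSQL or SpamAssassin function.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    # 0x8a is a continuation byte and cannot start a character.
    print(is_valid_utf8(b"\x8a"))
    # 0xd2 announces a two-byte character, but 0x55 is not a continuation byte.
    print(is_valid_utf8(b"\xd2\x55"))
    # The same text is valid or invalid depending on how it was encoded.
    print(is_valid_utf8("Ján".encode("utf-8")), is_valid_utf8("Ján".encode("latin-1")))
```

If a check like this fails for stored values, either the data needs converting to UTF-8 first or "client_encoding" needs to name the encoding the data is really in, as the HINT suggests.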

utf8

2007-06-25 Thread Tom Allison
I'm not sure how/if this is done. But I was wondering if anyone has looked into decoding all the charsets into utf8 for bayesian analysis. octets is not readily visible to the user the way it's done today.

Trouble installing 3.0.2 (failed t/body_mod and t/utf8)

2004-12-22 Thread Parker Morse
I tried a CPAN installation of SA 3.0.2 and both those tests (t/body_mod and t/utf8) failed all sub-tests. `make test' otherwise ran OK. The failure, in both cases, started with "Modification of a read-only value attempted at..." and then a series of "compilation abor