Logging module gives duplicate log entries
Hi,

I am getting duplicate log entries with the logging module.

The following behaves as expected, leading to one log entry for each logged event:

    logging.basicConfig(level=logging.DEBUG, filename='/tmp/foo.log')

But this results in two entries for each logged event:

    applog = logging.getLogger()
    applog.setLevel(logging.DEBUG)
    hdl = logging.FileHandler('/tmp/foo.log')
    applog.addHandler(hdl)

The app is based on the web.py framework, so I guess my problem may be connected to some interaction with other uses of logging within the framework. This is not specific to the root logger; the same happens with logging.getLogger('foo').

Any clue would be more than welcome.

best,
ShiaoBu
--
http://mail.python.org/mailman/listinfo/python-list
Re: Logging module gives duplicate log entries
> You need to remove the handler from the logging object
>
>     # remove the handler once you are done
>     applog.removeHandler(hdl)
>
> Cheers,
> amit.

I'm not sure how this could help.
Re: Logging module gives duplicate log entries
Maybe my question wasn't very clear. What I meant is that these four lines lead in my case to two entries per logged event:

    applog = logging.getLogger()
    applog.setLevel(logging.DEBUG)
    hdl = logging.FileHandler('/tmp/foo.log')
    applog.addHandler(hdl)

However if I REPLACE the above by:

    logging.basicConfig(level=logging.DEBUG, filename='/tmp/foo.log')

things work as expected.
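One possible explanation, offered as a guess and not a diagnosis of the web.py setup: if the four setup lines run twice (for example because the module is loaded twice), addHandler attaches a second FileHandler and every record is written twice, whereas basicConfig does nothing when the root logger already has handlers. A minimal sketch of a guard follows. The helper name add_file_handler_once and the temporary log path (standing in for /tmp/foo.log) are assumptions made for illustration.

```python
import logging
import os
import tempfile


def add_file_handler_once(logger, path):
    """Hypothetical helper: attach a FileHandler for `path` unless one is already attached."""
    target = os.path.abspath(path)
    for handler in logger.handlers:
        # FileHandler stores the absolute path of its file in baseFilename.
        if isinstance(handler, logging.FileHandler) and handler.baseFilename == target:
            return handler
    handler = logging.FileHandler(path)
    logger.addHandler(handler)
    return handler


# Temporary file used here in place of /tmp/foo.log (assumption for the demo).
log_path = os.path.join(tempfile.mkdtemp(), "foo.log")

applog = logging.getLogger("foo")
applog.setLevel(logging.DEBUG)
applog.propagate = False  # keep the demo independent of any root handlers

add_file_handler_once(applog, log_path)
add_file_handler_once(applog, log_path)  # simulates the setup code running a second time

applog.debug("hello")
```

Printing applog.handlers just before the first log call is a quick way to check whether a handler has been attached more than once in the real application.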
Unicode regex and Hindi language
The regex below identifies words in all languages I tested, but not in Hindi:

    # -*- coding: utf-8 -*-
    import re

    pat = re.compile(r'^(\w+)$', re.U)
    langs = ('English', '中文', 'हिन्दी')
    for l in langs:
        m = pat.search(l.decode('utf-8'))
        print l, m and m.group(1)

Output:

    English English
    中文 中文
    हिन्दी None

From this I assumed that the Hindi text contains punctuation or other characters that prevent the word match. Now, even more puzzling is this:

    pat = re.compile(r'^(\W+)$', re.U)  # note: now \W
    for l in langs:
        m = pat.search(l.decode('utf-8'))
        print l, m and m.group(1)

Output:

    English None
    中文 None
    हिन्दी None

How can the Hindi be both not a word and "not not a word"??

Any clue would be much appreciated!

Best.
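A likely reading of the two results above: the Hindi string mixes letters with combining marks (vowel signs and the virama). As far as I know, \w in the re module covers letters and digits but not combining marks, so the letters match \w, the marks match \W, and the string as a whole matches neither ^\w+$ nor ^\W+$. The Python 3 sketch below lists the Unicode category of each code point. The helper is_word_with_marks is a hypothetical name introduced only for illustration, not part of re or unicodedata.

```python
import unicodedata

# The Hindi word from the post, spelled out as code points to avoid any
# ambiguity about source-file encoding.
word = "\u0939\u093f\u0928\u094d\u0926\u0940"

# General category of each code point: L* are letters, M* are combining marks.
categories = [unicodedata.category(ch) for ch in word]
for ch, cat in zip(word, categories):
    print("U+%04X" % ord(ch), cat, unicodedata.name(ch))


def is_word_with_marks(text):
    """Hypothetical helper: non-empty text made only of letters, marks, numbers or '_'."""
    return bool(text) and all(
        ch == "_" or unicodedata.category(ch)[0] in ("L", "M", "N")
        for ch in text
    )


print(is_word_with_marks(word))
```

Treating the M* categories as word characters is one way to accept such text. Whether that is the right definition of a "word" depends on the application.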
Identifying unicode punctuation characters with Python regex
Hello,

I'm trying to build a regex in Python to identify punctuation characters in all languages. Some regex implementations support an extended syntax \p{P} that does just that. As far as I know, Python's re doesn't. Any idea of a possible alternative?

Apart from manually including the punctuation character range for each and every language, I don't see how this can be done.

Thanks in advance for any suggestions.

John
Re: Identifying unicode punctuation characters with Python regex
On Nov 14, 11:27 am, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > I'm trying to build a regex in python to identify punctuation
> > characters in all the languages. Some regex implementations support an
> > extended syntax \p{P} that does just that. As far as I know, python re
> > doesn't. Any idea of a possible alternative?
>
> You should use character classes. You can generate them automatically
> from the unicodedata module: check whether unicodedata.category(c)
> starts with "P".
>
> Regards,
> Martin

Thanks Martin. I'll do this.
Re: Identifying unicode punctuation characters with Python regex
On Nov 14, 12:30 pm, "Mark Tolonen" <[EMAIL PROTECTED]> wrote:
> "Mark Tolonen" <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]
>
> > "Shiao" <[EMAIL PROTECTED]> wrote in message
> > news:[EMAIL PROTECTED]
> >> Hello,
> >> I'm trying to build a regex in python to identify punctuation
> >> characters in all the languages. Some regex implementations support an
> >> extended syntax \p{P} that does just that. As far as I know, python re
> >> doesn't. Any idea of a possible alternative?
> >>
> >> Apart from manually including the punctuation character range for each
> >> and every language, I don't see how this can be done.
> >>
> >> Thank in advance for any suggestions.
> >>
> >> John
> >
> > You can always build your own pattern. Something like (Python 3.0rc2):
> >
> > >>> import unicodedata
> > >>> Po=''.join(chr(x) for x in range(65536) if unicodedata.category(chr(x)) == 'Po')
> > >>> import re
> > >>> r=re.compile('['+Po+']')
> > >>> x='我是美國人。'
> > >>> x
> > '我是美國人。'
> > >>> r.findall(x)
> > ['。']
> >
> > -Mark
>
> This was an interesting problem. Need to escape \ and ] to find all the
> punctuation correctly, and it turns out those characters are sequential in
> the Unicode character set, so ] was coincidentally escaped in my first
> attempt.
>
> IDLE 3.0rc2
> >>> import re
> >>> import unicodedata as u
> >>> A=''.join(chr(i) for i in range(65536))
> >>> P=''.join(chr(i) for i in range(65536) if u.category(chr(i))[0]=='P')
> >>> len(A)
> 65536
> >>> len(P)
> 491
> >>> len(re.findall('['+P+']',A))  # ] was naturally escaped
> 490
> >>> set(P)-set(re.findall('['+P+']',A))  # so only missing \
> {'\\'}
> >>> P=P.replace('\\','\\\\').replace(']','\\]')  # escape both of them.
> >>> len(re.findall('['+P+']',A))
> 491
>
> -Mark

Mark,

Many thanks. I feel almost ashamed I got away with it so easily :-)
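The same idea can be sketched for Python 3 without hand-written escaping. This is a variant of the approach quoted above, not a replacement for it, and it makes two assumptions: re.escape is applied to each character instead of escaping \ and ] by hand, and the scan runs up to sys.maxunicode instead of stopping at 65536. The names punct_chars and punct_re are illustrative.

```python
import re
import sys
import unicodedata

# Every code point whose general category starts with "P" (Pc, Pd, Ps, Pe, Pi, Pf, Po).
punct_chars = [
    chr(cp)
    for cp in range(sys.maxunicode + 1)
    if unicodedata.category(chr(cp)).startswith("P")
]

# re.escape takes care of characters that are special inside a character
# class, such as backslash, ], ^ and -.
punct_re = re.compile("[" + "".join(re.escape(c) for c in punct_chars) + "]")

print(punct_re.findall("我是美國人。"))
print(len(punct_chars))
```

The number of characters found depends on the Unicode database shipped with the interpreter, so it will differ between Python versions.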