New submission from Alessandro Vesely:

SYMPTOM: When used in a multithreaded program, instances of a class derived from HTMLParser may convert an entity or leave it alone, in an apparently random fashion.

CAUSE: The class has a class-level (static) attribute, entitydefs, which is initialized on first use from None to a dictionary of entity definitions. Initialization is not atomic: the attribute stops being None before the dictionary is fully populated. Instances in concurrent threads therefore assume that initialization is complete and catch a KeyError if the entity at hand hasn't been set yet. In that case, the entity is left alone as if it were invalid.

WORKAROUND:

class Dummy(HTMLParser):
    """this class is defined here so that we can initialize its base class"""
    def __init__(self):
        HTMLParser.__init__(self)

# Initialize HTMLParser by loading htmlentitydefs
dummy = Dummy()
dummy.feed('<a href="&">')
del dummy, Dummy

----------
components: Library (Lib)
messages: 291256
nosy: ale2017
priority: normal
severity: normal
status: open
title: HTMLParser class is not thread safe
type: behavior
versions: Python 2.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30011>
_______________________________________
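
The pattern described under CAUSE, and one possible way to avoid it, can be sketched as follows. This is a minimal illustration, not the HTMLParser source and not a proposed patch: the names EntityTable and resolve are hypothetical, and the code uses Python 3 standard library names (on Python 2.7 the module is htmlentitydefs and chr would be unichr). The idea is to build the whole dictionary in a local variable under a lock and bind it to the class attribute only once it is complete, so no thread can observe a half-filled table.

```python
import threading
from html.entities import name2codepoint  # htmlentitydefs on Python 2.7


class EntityTable(object):
    """Hypothetical stand-in for a lazily built, class-level entity table."""

    _defs = None               # shared by all instances, filled on first use
    _lock = threading.Lock()   # serializes the one-time initialization

    @classmethod
    def get(cls):
        defs = cls._defs
        if defs is None:
            with cls._lock:
                # Re-check: another thread may have finished while we waited.
                defs = cls._defs
                if defs is None:
                    # Build the full table in a local variable first...
                    defs = {'apos': u"'"}
                    for name, codepoint in name2codepoint.items():
                        defs[name] = chr(codepoint)
                    # ...and publish it only when it is complete.
                    cls._defs = defs
        return defs


def resolve(name):
    """Return the character for an entity name, or the entity unchanged."""
    try:
        return EntityTable.get()[name]
    except KeyError:
        # Unknown entity: leave it alone, as the report describes.
        return '&' + name + ';'
```

With this arrangement a KeyError can only mean that the entity is genuinely unknown, never that initialization was still in progress in another thread.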