Greetings,
I have a Python 3.6 script on Windows that scrapes comment history from
a website. It's currently set up this way:
Requestor (threads) -> list -> Parser (threads) -> queue -> CVSWriter
(single thread)
It takes 15 minutes to process ~11,000 comments.
When I replaced the list between the Requestor and Parser with a queue
to speed things up, BeautifulSoup stopped working.
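
For illustration, the queue-based layout described above might look
roughly like the sketch below. It is a simplified sketch under
assumptions, not the actual script: fetch_page, parse_comments,
run_pipeline, the worker count, and the sentinel shutdown are all
hypothetical names and choices, and parse_comments is a stand-in for the
place where the real code would call BeautifulSoup(contents, "lxml").

```python
import queue
import threading

SENTINEL = None  # hypothetical shutdown marker, one per parser thread


def fetch_page(url):
    # Stand-in for the real HTTP request; returns fake HTML text.
    return "<p>comment from {}</p>".format(url)


def parse_comments(contents):
    # Stand-in for BeautifulSoup(contents, "lxml"); expects a str of HTML.
    return contents.replace("<p>", "").replace("</p>", "")


def requestor(urls, page_queue):
    # Producer: push each downloaded page onto the queue.
    for url in urls:
        page_queue.put(fetch_page(url))


def parser(page_queue, row_queue):
    # Consumer: pull one page at a time until the sentinel arrives.
    while True:
        contents = page_queue.get()  # the item itself, not the queue object
        if contents is SENTINEL:
            break
        row_queue.put(parse_comments(contents))


def run_pipeline(urls, n_parsers=2):
    page_queue = queue.Queue()
    row_queue = queue.Queue()

    parsers = [threading.Thread(target=parser, args=(page_queue, row_queue))
               for _ in range(n_parsers)]
    for t in parsers:
        t.start()

    # A single requestor thread keeps the sketch short.
    req = threading.Thread(target=requestor, args=(urls, page_queue))
    req.start()
    req.join()

    # One sentinel per parser so every parser thread exits.
    for _ in parsers:
        page_queue.put(SENTINEL)
    for t in parsers:
        t.join()

    # The writer stage is collapsed into the main thread here.
    rows = []
    while not row_queue.empty():
        rows.append(row_queue.get())
    return sorted(rows)
```

In this sketch the parser hands the string returned by page_queue.get()
to the parse step, and the results are sorted only because the order in
which threads finish is not guaranteed.
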
When I changed BeautifulSoup(contents, "lxml") to
BeautifulSoup(contents), I got a UserWarning that no parser was
explicitly specified, along with a reference to line 80 in threading.py
(which puts it in the RLock factory function).
When I switched back to using a list between the Requestor and Parser,
the Parser worked again.
Does BeautifulSoup not work with a threaded input queue?
Thank you,
Chris Reimer