On Sun, Mar 5, 2017 at 3:22 AM, Wanderer <wande...@dialup4less.com> wrote:
> I mostly just lurk and view the post titles to see if something interesting 
> is being discussed. This code gets me a web page without the spam. You need 
> to compile it to a pyc file and create a bookmark. Probably not useful for 
> most people who don't use their browsers the way I do, but here it is.
>
> # remove authors with mostly caps
>
> USERAGENTBASE = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:40.0) Gecko/20100101 '
> BROWSERPATH = 'C:\\"Program Files"\\Waterfox\\waterfox.exe'
> FILENAME = 'C:\\PyStuff\\pygroup.htm'
> WEBPAGE = "https://groups.google.com/forum/?_escaped_fragment_=forum/comp.lang.python"
>

Interesting. Any particular reason to screen-scrape Google Groups
rather than start with the netnews protocol? You can get a
machine-readable version of the newsgroup much more simply that way, I
would have thought.
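
For illustration only, here is a rough sketch of what the netnews route
might look like using the standard library's nntplib. Several things in it
are assumptions and not taken from the quoted script: the host and group
arguments are placeholders for whatever news server you have access to, the
helper names (mostly_caps, filter_overviews, fetch_recent) are made up, and
"mostly caps" is read as "more than half the letters in the display name
are uppercase". Note also that nntplib was deprecated in Python 3.11 and
removed from the standard library in 3.13, so on newer versions it would
have to come from somewhere else.

```python
from email.utils import parseaddr


def mostly_caps(author, threshold=0.5):
    """Return True if the author's display name is mostly uppercase.

    The 0.5 threshold is an assumption; the quoted script does not
    say how "mostly caps" is decided.
    """
    name, address = parseaddr(author)
    text = name or address  # fall back to the address if there is no name
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return False
    upper = sum(1 for c in letters if c.isupper())
    return upper / len(letters) > threshold


def filter_overviews(overviews, is_spam=mostly_caps):
    """Keep (subject, author) pairs whose author does not look like spam.

    `overviews` has the shape returned by nntplib's NNTP.over():
    a list of (article_number, fields) pairs, where `fields` is a
    dict with keys such as 'subject' and 'from'.
    """
    kept = []
    for _number, fields in overviews:
        author = fields.get("from", "")
        if not is_spam(author):
            kept.append((fields.get("subject", ""), author))
    return kept


def fetch_recent(host, group, count=50):
    """Fetch overview data for the last `count` articles and filter it.

    `host` and `group` are placeholders supplied by the caller.
    """
    import nntplib  # imported here so the helpers above work without it

    with nntplib.NNTP(host) as server:
        _resp, _count, first, last, _name = server.group(group)
        start = max(first, last - count + 1)
        _resp, overviews = server.over((start, last))
    return filter_overviews(overviews)
```

Something like fetch_recent("news.example.invalid", "comp.lang.python")
would then give a list of subject/author pairs to write into the HTML
page, with no scraping involved. Header values can arrive MIME-encoded,
so a real version would probably want to pass them through
nntplib.decode_header() first.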

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
