Re: Scraping Wikipedia with Python

2009-08-13 Thread Andre Engels
On Tue, Aug 11, 2009 at 8:53 PM, David C Ullrich wrote: > Try reading a little there! Starting there I went to > > http://en.wikipedia.org/wiki/Wikipedia:Creating_a_bot > > where I found a section on existing bots, comments on how the "scraping" > is not what you want, and even a Python section wi

Re: Scraping Wikipedia with Python

2009-08-12 Thread Paul Rubin
Dotan Cohen writes: > > maybe you want dbpedia. > I did not know about this. Thanks! You might also like freebase/metaweb. -- http://mail.python.org/mailman/listinfo/python-list

Re: Scraping Wikipedia with Python

2009-08-12 Thread Dotan Cohen
> http://pypi.python.org/pypi?%3Aaction=search&term=wikipedia ? > Thanks, Thorsten, I will go through those. I did not know about that resource, I am not a regular coder. One more resource to add to the toolbox! -- Dotan Cohen http://what-is-what.com http://gibberish.co.il -- http://mail.pyth

Re: Scraping Wikipedia with Python

2009-08-12 Thread Dotan Cohen
> maybe you want dbpedia. I did not know about this. Thanks! That is the reason why I ask. This list has an unbelievable collective knowledge and I am certain that asking "how much is 2+2" would net an insightful answer that would teach me something. Thank you, Paul, and thank you to the entire

Re: Scraping Wikipedia with Python

2009-08-12 Thread Thorsten Kampe
* Dotan Cohen (Tue, 11 Aug 2009 21:29:40 +0300) > > Wikipedia has an API for computer access. See > > > > http://www.mediawiki.org/wiki/API > > > > Yes, I am aware of this as well. Does anyone know of a Python class > for easily interacting with it, or do I need to roll my own? http://

Re: Scraping Wikipedia with Python

2009-08-11 Thread Paul Rubin
Dotan Cohen writes: > Thanks. I read the first bit of that page, but did not finish it. > Grepping it for Python led to what I need. maybe you want dbpedia.

Re: Scraping Wikipedia with Python

2009-08-11 Thread Brian
On Tue, Aug 11, 2009 at 12:29 PM, Dotan Cohen wrote: > > Wikipedia has an API for computer access. See > > > > http://www.mediawiki.org/wiki/API > > > > Yes, I am aware of this as well. Does anyone know of a Python class > for easily interacting with it, or do I need to roll my own? >

Re: Scraping Wikipedia with Python

2009-08-11 Thread Dotan Cohen
> Try reading a little there! Starting there I went to > > http://en.wikipedia.org/wiki/Wikipedia:Creating_a_bot > > where I found a section on existing bots, comments on how the "scraping" > is not what you want, and even a Python section with a link to something > labelled PyWikipediaBot... > T

Re: Scraping Wikipedia with Python

2009-08-11 Thread David C Ullrich
On Tue, 11 Aug 2009 21:29:40 +0300, Dotan Cohen wrote: >> Wikipedia has an API for computer access. See >> >> http://www.mediawiki.org/wiki/API >> >> > Yes, I am aware of this as well. Does anyone know of a Python class for > easily interacting with it, or do I need to roll my own? Try

Re: Scraping Wikipedia with Python

2009-08-11 Thread Dotan Cohen
> Wikipedia has an API for computer access. See > > http://www.mediawiki.org/wiki/API > Yes, I am aware of this as well. Does anyone know of a Python class for easily interacting with it, or do I need to roll my own? -- Dotan Cohen http://what-is-what.com http://gibberish.co.il -- h

Re: Scraping Wikipedia with Python

2009-08-11 Thread John Nagle
Dotan Cohen wrote: I plan on making a geography-learning Anki [1] deck, and Wikipedia has the information that I need in nicely formatted tables on the side of each country's page. Has someone already invented a wheel to parse and store that data (scrape)? Wikipedia has an API for computer
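
For anyone who does end up rolling their own, here is a minimal sketch of asking the API for a page's raw wikitext using only the standard library. Everything in it is an assumption rather than something from this thread: the English Wikipedia endpoint, the present-day query parameters (`rvslots`, `formatversion`), the User-Agent string, and the helper names `build_query_url` and `fetch_wikitext`. It has no error handling, and an existing bot framework may well be the better choice.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: the English Wikipedia endpoint; other wikis have their own api.php.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_query_url(title, api_url=API_URL):
    """Return a URL asking the MediaWiki API for the current wikitext of `title`."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        "titles": title,
        "format": "json",
        "formatversion": "2",
    }
    return api_url + "?" + urlencode(params)


def fetch_wikitext(title, user_agent="anki-geo-deck-sketch/0.1 (example only)"):
    """Fetch raw wikitext for one page. Needs network access; no error handling."""
    request = Request(build_query_url(title), headers={"User-Agent": user_agent})
    with urlopen(request) as response:
        data = json.load(response)
    # With formatversion=2 the pages come back as a list, not a dict keyed by id.
    page = data["query"]["pages"][0]
    return page["revisions"][0]["slots"]["main"]["content"]
```

Calling `fetch_wikitext("New Zealand")` should return the page source as one string, which can then be searched for the infobox template.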

Scraping Wikipedia with Python

2009-08-11 Thread Dotan Cohen
I plan on making a geography-learning Anki [1] deck, and Wikipedia has the information that I need in nicely formatted tables on the side of each country's page. Has someone already invented a wheel to parse and store that data (scrape)? It is probably not difficult to code, and within the Wikipedi
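
To illustrate what the parsing side could look like: the tables on the side of each country's page are written in the page source as template parameters of the form `| key = value`. The sketch below is a deliberately naive take that collects such lines into a dict. The function name `parse_infobox_fields` and the sample text are made up for illustration, and it assumes one field per line with no nested templates or multi-line values, which real pages often violate.

```python
import re

# Matches lines such as "| capital = Example City" (one field per line).
FIELD_RE = re.compile(r"^\s*\|\s*([^=|]+?)\s*=\s*(.*?)\s*$")


def parse_infobox_fields(wikitext):
    """Collect `| key = value` lines into a dict (naive: ignores nesting)."""
    fields = {}
    for line in wikitext.splitlines():
        match = FIELD_RE.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


# Made-up sample in the general shape of an infobox; not copied from a real page.
SAMPLE = """{{Infobox country
| common_name = Exampleland
| capital = Example City
| population_estimate = 1,234,567
}}"""

fields = parse_infobox_fields(SAMPLE)
print(fields["capital"])  # prints "Example City"
```

The resulting dict can be written out as tab-separated text for import into a flashcard deck. For anything beyond a quick experiment, a proper wikitext parser or a structured source such as dbpedia is likely to be more robust than a regular expression.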