Re: Parsing Baseball Stats

2006-07-26 Thread Schronos

Hi.

  The webpage you need to parse is not very well-formed (I think), but
that's no problem. Perhaps the best option is to locate the portion of HTML
you want, in this case from "Actual Pitching
Statistics " to "". Between these you have a few entries
like this one: " 19 http://www.baseballprospectus.com/dt//1914BOS-A.shtml>1914
BOS-A   2   1   0   3.914396   23.0   21   12   101
   73   0   0   0   0   1   0".
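
As a rough sketch of that first step, the section can be cut out with plain
string searches. The function name slice_between, the sample page string and
the end marker used here are assumptions for illustration only; the real end
marker is whatever closes the statistics block on the page.

```python
def slice_between(html, start_marker, end_marker):
    """Return the text found between start_marker and end_marker.

    Returns an empty string if start_marker is missing, and everything
    after start_marker if end_marker is missing.
    """
    start = html.find(start_marker)
    if start == -1:
        return ""
    start += len(start_marker)
    end = html.find(end_marker, start)
    if end == -1:
        return html[start:]
    return html[start:end]


# Hypothetical page content; the real page will look different.
page = "header Actual Pitching Statistics row1 row2 </pre> footer"
section = slice_between(page, "Actual Pitching Statistics", "</pre>")
print(repr(section))
```

Using str.find keeps this step independent of how badly formed the rest of
the page is, since nothing outside the two markers has to parse.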

Here is a small piece of code using the re module that I think will help
you develop the rest of the app.

import re
data = (" 19 http://www.baseballprospectus.com/dt//1914BOS-A.shtml>1914 "
        "BOS-A   2   1   0   3.914396   23.0   21   12   101 "
        "   73   0   0   0   0   1   0")
pt = re.compile("(|)")  # this and the next line delete the HTML tags
data1 = pt.sub("", data)  # now data1 doesn't contain any HTML tags
pt = re.compile(" +")  # this and the next line replace runs of spaces with "-"
data2 = pt.sub("-", data1)
arrange_data = data2.split("-")  # this makes a list with the data

After these few statements you'll have a list with the data you need,
like this one:
['', '19', '1914', 'BOS', 'A', '2', '1', '0', '3.91', '4', '3', '96',
'23.0', '21', '12', '10', '1', '7', '3', '0', '0',
'0', '0', '1', '0']
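
A possible variation, only as a sketch: splitting on "-" also cuts "BOS-A"
into two items, so you may prefer to split on whitespace instead. The
pattern TAG_RE, the function row_to_fields and the sample row are my own
assumptions for illustration, not the exact markup of the page.

```python
import re

# Assumption: anything between "<" and ">" is an HTML tag.
TAG_RE = re.compile(r"<[^>]*>")


def row_to_fields(row):
    """Replace HTML tags with a space, then split the row on whitespace."""
    text = TAG_RE.sub(" ", row)
    return text.split()


# Hypothetical row; the real markup on the page may differ.
sample = '<a href="1914BOS-A.shtml">1914 BOS-A</a>   2   1   0   3.91'
fields = row_to_fields(sample)
print(fields)
```

Calling split() with no argument drops the empty strings at the ends, so
the list does not start with '' as it does above.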

I think it is a good start for you.

Tell me if you can solve the problem with this or if you need
more help.

Bye

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Connecting to internet under proxy

2006-07-28 Thread Schronos

I don't know where the problem is, but I tried the same thing you posted
and it failed.

I tested it under cygwin, a cmd prompt, and Linux, with Python versions 2.2,
2.3 and 2.4. I think that the problem is my corporate proxy. In these
statements we always use an HTTP proxy, but perhaps it is not the suitable
kind of proxy. :-( I don't know, it is only an idea, because in my other
programs I always use an HTTP proxy and it works properly.

Thanks for everything.

 Chema.

