Re: string encoding regex problem

2014-08-23 Thread Peter Otten
Philipp Kraus wrote: > I have create a short script: > > - > #!/usr/bin/env python > > import re, urllib2 > > > def URLReader(url) : > f = urllib2.urlopen(url) > data = f.read() > f.close() > return data > > > print re.match( "\.*\<\/small\>", > URLReader("http://sour

Re: string encoding regex problem

2014-08-23 Thread Philipp Kraus
Hi, On 2014-08-16 09:01:57 +, Peter Otten said: > Philipp Kraus wrote: >> The code works till last week correctly, I don't change the pattern. > Websites' contents and structure change sometimes. >> My question is, can it be a problem with string encoding? > Your regex is all-ascii. So an encod

Re: string encoding regex problem

2014-08-16 Thread Peter Otten
Philipp Kraus wrote: > The code works till last week correctly, I don't change the pattern. Websites' contents and structure change sometimes. > My question is, can it be a problem with string encoding? Your regex is all-ascii. So an encoding problem is very unlikely. > found = re.search( "
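A small sketch of why an all-ASCII pattern is unlikely to be affected by encoding, assuming Python 3. The pattern, the page fragment, and the helper name find_download_path are made up for illustration and are not the code from the thread. ASCII bytes decode to the same characters under UTF-8 and Latin-1, so the match is the same even when the page is decoded with the wrong one of the two:

```python
import re

# Hypothetical all-ASCII pattern, loosely modeled on the one quoted in the thread.
PATTERN = r'title="/boost/([^"]*)"'

# Made-up page fragment containing one non-ASCII character (the "é" in "Café").
raw_bytes = 'Caf\u00e9 <a title="/boost/1.56.0/boost_1_56_0.zip">get</a>'.encode("utf-8")


def find_download_path(text):
    """Return the first captured group, or None if the pattern does not match."""
    match = re.search(PATTERN, text)
    return match.group(1) if match is not None else None


# Decode the same bytes two ways; only the non-ASCII character comes out differently.
as_utf8 = find_download_path(raw_bytes.decode("utf-8"))
as_latin1 = find_download_path(raw_bytes.decode("latin-1"))
print(as_utf8 == as_latin1)
```

If a pattern like this stops matching, a changed page structure is the more likely cause, which is the point made in the reply above.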

Re: string encoding regex problem

2014-08-15 Thread Steven D'Aprano
Philipp Kraus wrote: > The code works till last week correctly, I don't change the pattern. My > question is, can it be > a problem with string encoding? Did I mask the question mark and quotes > correctly? If you didn't change the code, how could the *exact same code* not mask the question mark

Re: string encoding regex problem

2014-08-15 Thread Roy Smith
In article , Philipp Kraus wrote: > The code works till last week correctly, I don't change the pattern. OK, so what did you change? Can you go back to last week's code and compare it to what you have now to see what changed? > My question is, can it be a problem with string encoding? Did I

Re: string encoding regex problem

2014-08-15 Thread Philipp Kraus
On 2014-08-16 00:48:46 +, Roy Smith said: In article , Philipp Kraus wrote: found = re.search( "http://sourceforge.net/projects/boost/files/boost/") ) if found == None : raise MyError.StopError("Boost Download URL not found") But found is always None, so I cannot get the correc

Re: string encoding regex problem

2014-08-15 Thread Roy Smith
In article , Philipp Kraus wrote: > found = re.search( " href=\"/projects/boost/files/latest/download\?source=files\" > title=\"/boost/(.*)", > Utilities.URLReader("http://sourceforge.net/projects/boost/files/boost/") > ) > if found == None : > raise MyError.StopError("Boost Download U

string encoding regex problem

2014-08-15 Thread Philipp Kraus
Hello, I have defined a function with: def URLReader(url) : try : f = urllib2.urlopen(url) data = f.read() f.close() except Exception, e : raise MyError.StopError(e) return data which gets the HTML source code from a URL. I use this to get a part of a HTML
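For readers on Python 3, the approach described in this post could be sketched roughly as follows. This is a port under assumptions, not the poster's code: urllib.request replaces urllib2, StopError is a hypothetical stand-in for MyError.StopError, and extract_first_group, sample_html, and sample_pattern are invented for the example. The fetch function is defined but not called, so the sketch runs without network access:

```python
import re
import urllib.request


class StopError(Exception):
    """Hypothetical stand-in for the thread's MyError.StopError."""


def url_reader(url):
    """Return the body of `url` as text; wrap any failure in StopError."""
    try:
        with urllib.request.urlopen(url) as response:
            # Assumption: the page is UTF-8; undecodable bytes are replaced.
            return response.read().decode("utf-8", errors="replace")
    except Exception as exc:
        raise StopError(exc) from exc


def extract_first_group(pattern, html):
    """Search `html` for `pattern` and return group 1, or raise StopError."""
    match = re.search(pattern, html)
    if match is None:
        raise StopError("pattern not found")
    return match.group(1)


# Made-up fragment standing in for the fetched page.
sample_html = '<a href="/projects/boost/files/latest/download?source=files" title="/boost/1.56.0/x.zip">'
# The question mark is escaped so it is matched literally.
sample_pattern = r'download\?source=files" title="/boost/([^"]*)"'
result = extract_first_group(sample_pattern, sample_html)
print(result)
```

Keeping the fetch and the search in separate functions makes it possible to save a copy of the page and test the pattern against it offline when the live site changes.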