On 6/28/10 10:29 AM, Ken D'Ambrosio wrote:
Hi, all.  I've got a file which, in turn, contains a couple thousand
filenames.  I'm writing a web front-end, and I want to return all the
filenames that match a user-input value.  In Perl, this would be something
like,

if (/$value/){print "$_ matches\n";}

But trying to put a variable into regex in Python is challenging me --
and, indeed, I've seen a bit of scorn cast upon those who would do so in
my Google searches ("You come from Perl, don't you?").

First of all, if you're doing this, you have to be aware that it is *very* possible to write a pathological regular expression which can kill your app and maybe your web server.

So if you're letting users write regular expressions and they aren't people you have good reason to trust, be wary.
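
To make that concrete, here is a small sketch of the kind of pattern I mean. The pattern and the helper name time_failed_match are my own illustration, not anything from your code: nested quantifiers such as (a+)+ force the engine to try a huge number of ways to split the input before it can give up, and the time grows very quickly with input length.

```python
import re
import time

# Illustrative pattern only: nested quantifiers are a classic backtracking trap.
PATTERN = re.compile(r"(a+)+$")

def time_failed_match(n):
    """Try to match n 'a's followed by a 'b' (which can never match) and time it."""
    text = "a" * n + "b"
    start = time.perf_counter()
    result = PATTERN.match(text)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Input sizes are kept small on purpose; each extra 'a' makes this noticeably slower.
for n in (10, 14, 18):
    result, elapsed = time_failed_match(n)
    print(n, result, "%.4f seconds" % elapsed)
```

Every call returns None (no match), but the time taken climbs steeply as n grows. A user who types something like that into your form, against a long enough line, can tie up a worker for a very long time.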

Here's what I've got (in ugly, prototype-type code):

file=open('/tmp/event_logs_listing.txt', 'r')  # List of filenames
seek = form["serial"].value                    # Value from web form
for line in file:
    match = re.search((seek)",(.*),(.*)", line) # Stuck here

Now, if you don't need the full power of regular expressions, then what about:

    name, foo, bar = line.split(",")
    if seek in name:
        # do something with foo and bar

The 'in' test is True if the value of seek appears anywhere in the first field of what appears to be a comma-delimited line.

Or maybe, if you're worried about case sensitivity:

    if seek.lower() in name.lower():
        # la la la

You can do a lot without ever bothering to mess around with regular expressions. A lot.

It's also faster if you're doing simpler things :)
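
Putting the plain-string approach together, a minimal sketch might look like the following. The function name find_matches, the sample lines, and the choice to silently skip lines that don't have exactly three fields are all my assumptions; I'm guessing at your file format from your regex.

```python
def find_matches(lines, seek, ignore_case=True):
    """Yield (name, foo, bar) for each line whose first field contains seek."""
    needle = seek.lower() if ignore_case else seek
    for line in lines:
        # Strip the trailing newline so it doesn't end up in the last field.
        fields = line.rstrip("\n").split(",")
        if len(fields) != 3:
            continue  # assumption: skip malformed lines instead of raising
        name, foo, bar = fields
        haystack = name.lower() if ignore_case else name
        if needle in haystack:
            yield name, foo, bar

# Hypothetical sample data standing in for the lines of the listing file.
sample = [
    "SN1234.log,2010-06-28,ok\n",
    "sn9999.log,2010-06-27,failed\n",
    "garbage line\n",
]
print(list(find_matches(sample, "sn12")))
# prints [('SN1234.log', '2010-06-28', 'ok')]
```

Since an open file iterates over its lines, you could pass the file object straight in as the first argument.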

What if they don't need the full power of regular expressions, just simple globbing? Then maybe change it to:

   seeking = re.escape(seek)
   seeking = seeking.replace("\\*", ".*")
   seeking = seeking.replace("\\?", ".")

   match = re.search(seeking + ",(.*),(.*)", line)

First, we escape the user input so they can't put in any crazy regular expression characters. Then we go and *unescape* "\\*" and turn it into ".*" -- because when a user enters "*", they really mean ".*" in traditional glob style. Then we do the same with the question mark, turning it into a dot.

Then we run our highly restricted regular expression just as you were doing before -- the only thing missing from your version was concatenating the 'seek' string to the rest of your expression.
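
Wrapped up as a function, the glob approach described above might look roughly like this. The name glob_to_regex and the sample line are hypothetical; the escaping and replacement steps are the same ones as in the snippet.

```python
import re

def glob_to_regex(seek):
    """Compile a glob-style user string into a restricted regular expression."""
    escaped = re.escape(seek)                 # neutralize every regex metacharacter
    escaped = escaped.replace("\\*", ".*")    # glob '*' -> any run of characters
    escaped = escaped.replace("\\?", ".")     # glob '?' -> any single character
    return re.compile(escaped + ",(.*),(.*)")

pattern = glob_to_regex("SN12*.log")
match = pattern.search("SN1234.log,2010-06-28,ok")
print(match.groups() if match else None)
# prints ('2010-06-28', 'ok')
```

Two things to keep in mind with this sketch: re.search is unanchored, so the user's text can match anywhere in the line, and ".*" will happily run across commas into later fields. If the wildcard should stay inside the first field, replacing with "[^,]*" instead of ".*" is one option.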

If you must give them full regex power and you know they won't try to bomb you, just leave out the 'seeking = ' lines, use 'seek' directly in the re.search call, and cross your fingers.
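
Even with trusted users, a mistyped pattern will raise re.error when it is compiled, so you probably want to catch that rather than let the page blow up. A rough sketch, where compile_user_pattern is a made-up helper name and returning None on failure is just one possible choice:

```python
import re

def compile_user_pattern(seek):
    """Compile the raw user regex plus the two trailing fields; None if invalid."""
    try:
        return re.compile(seek + ",(.*),(.*)")
    except re.error:
        # Assumption: the caller turns None into a "bad pattern" message.
        return None

print(compile_user_pattern("(") is None)
# prints True
```

Note this only guards against invalid patterns; it does nothing about valid ones that are pathologically slow.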


--

   ... Stephen Hansen
   ... Also: Ixokai
   ... Mail: me+list/python (AT) ixokai (DOT) io
   ... Blog: http://meh.ixokai.io/
