Re: Regular Expression problem

Paul McGuire Thu, 13 Jul 2006 23:15:51 -0700

Pyparsing is also good for recognizing basic HTML tags and their
attributes, regardless of the order of the attributes.


-- Paul

testText = """sldkjflsa;faj

<link href="mystylesheet.css" rel="stylesheet" type="text/css">

here it would be 'mystylesheet.css'. I used the following regex to get
this value(I dont know if it

I thought I was doing fine until I got stuck by this tag >>

<link rel="stylesheet" href="mystylesheet.css" type="text/css">  : same

tag but with 'href=' part

tags are like these? >>

<link rel="stylesheet" href="mystylesheet.css" type="text/css">
-OR-
<link href="mystylesheet.css" rel="stylesheet" type="text/css">
-OR-
<link type="text/css" href="mystylesheet.css" rel="stylesheet">

"""
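from pyparsing import makeHTMLTags, line

# makeHTMLTags returns (openTag, closeTag); only the opening tag is needed here.
linkTag = makeHTMLTags("link")[0]
for toks, s, e in linkTag.scanString(testText):
    print(toks.href)          # attribute looked up by name, not by position
    print(line(s, testText))  # the full input line containing the match
    print()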

Prints out:

mystylesheet.css
<link href="mystylesheet.css" rel="stylesheet" type="text/css">

mystylesheet.css
<link rel="stylesheet" href="mystylesheet.css" type="text/css">  : same


mystylesheet.css
<link rel="stylesheet" href="mystylesheet.css" type="text/css">

mystylesheet.css
<link href="mystylesheet.css" rel="stylesheet" type="text/css">

mystylesheet.css
<link type="text/css" href="mystylesheet.css" rel="stylesheet">
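For comparison, here is a rough sketch of the same order-independent lookup using only the standard library's html.parser module instead of pyparsing. It is not from the original post; the class name StylesheetLinkCollector and the sample string are made up for illustration.

```python
from html.parser import HTMLParser


class StylesheetLinkCollector(HTMLParser):
    """Collect href values from <link> tags, whatever the attribute order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs; turning it into a
        # dict makes the lookup independent of the order they were written in.
        if tag == "link":
            attr_map = dict(attrs)
            if "href" in attr_map:
                self.hrefs.append(attr_map["href"])


# Hypothetical sample input: the three attribute orderings from the question.
sample = """
<link rel="stylesheet" href="mystylesheet.css" type="text/css">
<link href="mystylesheet.css" rel="stylesheet" type="text/css">
<link type="text/css" href="mystylesheet.css" rel="stylesheet">
"""

collector = StylesheetLinkCollector()
collector.feed(sample)
print(collector.hrefs)
```

As with the pyparsing version, the attribute is found by name, so no pattern has to anticipate where href falls within the tag.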

-- 
http://mail.python.org/mailman/listinfo/python-list
