Hi Massimo, Good to know you finally made it! :-)

Although I don't yet know where and when to use this new feature, I came up
with an HTML optimizer along the lines of [1], in a dozen lines of web2py code.

[1] http://www.iwebtool.com/html_optimizer

Put this inside your controller:

def easter():  # This code is released into the public domain
    from gluon.html import web2pyHTMLParser
    # A textarea for the raw HTML plus a submit button
    form = FORM(
        TEXTAREA(_name='input'), BR(),
        INPUT(_type='submit', _value='Optimize!'))
    result = ''
    if form.accepts(request.vars, keepvalues=True):
        # Parse the submitted HTML into a tree of web2py helpers
        result = web2pyHTMLParser(form.vars.input).tree
    # str() serializes the tree back to HTML for display
    return {'': DIV(
        'Insert your HTML code to optimize:',
        form,
        FIELDSET(PRE(str(result))))}

Well, it is not exactly an HTML optimizer, because our version does not
strip spaces inside text content. Just for fun.
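
For anyone who wants the missing space-stripping step, here is a minimal
sketch. It is an assumption-laden illustration, not part of web2py: it uses
Python 3's standard-library html.parser instead of web2pyHTMLParser, and the
names WhitespaceCollapser and collapse_whitespace are made up for this
example. It collapses runs of whitespace in text nodes and leaves the
contents of pre, textarea, script and style alone.

```python
import re
from html.parser import HTMLParser

# Elements whose text must be kept verbatim (an assumption of this sketch)
PRESERVE = {'pre', 'textarea', 'script', 'style'}


class WhitespaceCollapser(HTMLParser):
    """Re-emit HTML, collapsing whitespace runs in text nodes (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=False so entities such as &amp; pass through unchanged
        super().__init__(convert_charrefs=False)
        self.out = []
        self.preserve_depth = 0  # > 0 while inside a PRESERVE element

    def handle_starttag(self, tag, attrs):
        if tag in PRESERVE:
            self.preserve_depth += 1
        self.out.append(self.get_starttag_text())  # original tag text

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags like <br /> are copied as written
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in PRESERVE and self.preserve_depth:
            self.preserve_depth -= 1
        self.out.append('</%s>' % tag)

    def handle_data(self, data):
        if not self.preserve_depth:
            data = re.sub(r'\s+', ' ', data)
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append('&%s;' % name)

    def handle_charref(self, name):
        self.out.append('&#%s;' % name)

    def handle_comment(self, data):
        self.out.append('<!--%s-->' % data)

    def handle_decl(self, decl):
        self.out.append('<!%s>' % decl)


def collapse_whitespace(html):
    parser = WhitespaceCollapser()
    parser.feed(html)
    parser.close()
    return ''.join(parser.out)


print(collapse_whitespace('<div>  hello \n   world </div>'))
# prints: <div> hello world </div>
```

It only collapses whitespace to a single space rather than removing it, since
a space between inline elements can be significant. Processing instructions
and other rarer constructs are not handled in this sketch.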

Regards,
Iceberg

On May 25, 4:27 am, mdipierro <mdipie...@cs.depaul.edu> wrote:
> Good suggestion. Now you can do
>
>     >>> from gluon.html import web2pyHTMLParser
>     >>> tree = web2pyHTMLParser('hello<div a="b">world</div>').tree
>     >>> tree.element(_a='b')['_c']=5
>     >>> str(tree)
>     'hello<div a="b" c="5">world</div>'
>
> works great!
>
> On May 24, 5:11 am, Iceberg <iceb...@21cn.com> wrote:
>
> > I did not try, but I assume the built-in Python module HTMLParser
> > already handles at least (1) tags like <input />; I am not sure about
> > (2) and (3).
>
> > On May 24, 4:32 am, mdipierro <mdipie...@cs.depaul.edu> wrote:
>
> > > hmmm.... somehow I did not save comments in the file.
>
> > > This does not handle well:
>
> > > 1) tags like <input />
> > > 2) attributes that contain > in quotes <a onclick="if(a>b)alert()">
> > > 3) attributes that contain escaped quotes <a onclick="var a=\"x\"">
>
> > > On May 23, 10:46 am, Massimo Di Pierro <mdipie...@cs.depaul.edu>
> > > wrote:
>
> > > > Anybody interested in helping with this?
>
> > > > > It scrapes an HTML file and converts it into a tree hierarchy of web2py
> > > > > helpers
>
> > > > '<div>xxx</div>' -> DIV('xxx')
>
> > > > > It kind of works but fails in the three cases described in the file.
>
> > > > Massimo
>
> > > >  parsehtml.py
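
On the three problem cases quoted above (tags like <input />, a > inside a
quoted attribute, and backslash-escaped quotes), the guess about the
standard-library parser can be checked with a small sketch like the one
below. It assumes Python 3's html.parser (the Python 2 module was named
HTMLParser), and TagRecorder and record are hypothetical names used only for
this check. Note that a backslash is not an escape character in HTML
attribute values, so the third case is printed for inspection rather than
expected to round-trip.

```python
from html.parser import HTMLParser


class TagRecorder(HTMLParser):
    """Record (event, tag, attrs) tuples for each tag seen (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(('start', tag, attrs))

    def handle_startendtag(self, tag, attrs):
        # Called for self-closing tags such as <input />
        self.events.append(('startend', tag, attrs))


def record(html):
    parser = TagRecorder()
    parser.feed(html)
    parser.close()
    return parser.events


# Case 1: self-closing tag
print(record('<input />'))
# Case 2: a > inside a quoted attribute value
print(record('<a onclick="if(a>b)alert()">'))
# Case 3: backslash-escaped quotes (inspect the result yourself)
print(record('<a onclick="var a=\\"x\\"">'))
```

With the standard-library parser, case 1 is reported through
handle_startendtag and case 2 keeps the whole attribute value, > included.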
