I changed the syntax. Now it is more flexible:

>>> a=TAG('<h1>Header</h1><p>this is a test</p>')
>>> def markdown(text,tag=None,attributes={}):
...     if tag==None: return re.sub('\s+',' ',text)
...     elif tag=='h1': return '#'+text+'\n\n'
...     elif tag=='p': return text+'\n'
...     return text
...
>>> a.flatten(markdown)
'#Header\n\nthis is a test\n'

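To make the callback contract concrete, here is a minimal standalone sketch of how a single render callback could be applied while flattening. It is not web2py's implementation: `_TreeBuilder`, `flatten_node` and `flatten_html` are hypothetical names, the tree is built with the standard library's `html.parser`, and it is written for Python 3. The only thing taken from the session above is the callback signature `(text, tag, attributes)`, with `tag` set to None for plain text.

```python
import re
from html.parser import HTMLParser

# Tags that never have a closing tag; they are not pushed on the stack.
VOID_TAGS = {'br', 'hr', 'img', 'input', 'meta', 'link'}


class _TreeBuilder(HTMLParser):
    """Builds a minimal tree; each node is [tag, attributes, children]."""

    def __init__(self):
        super().__init__()
        self.root = [None, {}, []]
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = [tag, dict(attrs), []]
        self.stack[-1][2].append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; stray close tags are ignored.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        self.stack[-1][2].append(data)


def flatten_node(node, render=None):
    """Flatten depth-first: text nodes first, then the enclosing tag."""
    if isinstance(node, str):
        return render(node, None, {}) if render else node
    tag, attributes, children = node
    text = ''.join(flatten_node(child, render) for child in children)
    if render and tag is not None:
        return render(text, tag, attributes)
    return text


def flatten_html(html, render=None):
    builder = _TreeBuilder()
    builder.feed(html)
    builder.close()
    return flatten_node(builder.root, render)


def markdown(text, tag=None, attributes=None):
    # Same logic as the callback in the session above.
    if tag is None:
        return re.sub(r'\s+', ' ', text)
    elif tag == 'h1':
        return '#' + text + '\n\n'
    elif tag == 'p':
        return text + '\n'
    return text


result = flatten_html('<h1>Header</h1><p>this is a test</p>', markdown)
```

In this sketch the callback sees each text node first (with `tag=None`) and then the already-flattened content of every enclosing tag, which is why one function can cover both text cleanup and per-tag rendering.
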
Moreover elements accept jQuery syntax:

>>> a=TAG('<div><span><a id="1">hello</a></span><p class="this is a test">world</p></div>')
>>> for e in a.elements('div a#1, p.is'): print e.flatten()
hello
world
>>> a.elements('a[id=1]')[0].xml()
'<a id="1">hello</a>'

Please check it.

On May 25, 12:24 pm, Iceberg <iceb...@21cn.com> wrote:
> On May26, 12:35am, mdipierro <mdipie...@cs.depaul.edu> wrote:
>
> > I cannot push it until tonight but I have this:
> >
> > >>> a=TAG('<h1>Header</h1><p>this is a test</p>')
> > >>> print a
> > <h1>Header</h1><p>this is a test</p>
> > >>> a.flatten()
> > 'Headerthis is a test'
> > >>> a.flatten(filter=lambda x: re.sub('\s+',' ',x))
> > 'Headerthis is a test'
> > >>> a.flatten(filter=lambda x: re.sub('\s+','-',x))
> > 'Headerthis-is-a-test'
> > >>> a.flatten(render=dict(h1=lambda x: '#'+x+'\n\n'),filter=lambda x: x.replace(' ','-'))
> > '#Header\n\nthis-is-a-test'
> >
> > filter is applied to text and render is applied to tags.
> > So your
> >
> > result = web2pyHTMLParser(form.vars.input).tree
> >
> > could be written as
> >
> > result = TAG(form.vars.input).flatten(filter=lambda x: re.sub('\s+',' ',x), render=dict(br=lambda x:'\n',p=lambda x: x+'\n'))
> >
> > Can somebody propose better names for "filter" and "render"? I could
> > not come up with anything better.
> >
> > Massimo
>
> Since render={...} renders HTML tags into another form, I think
> "render" is a good name.
>
> filter=lambda... is not very good because Python already has a
> built-in function named filter() which follows a different logic.
> We should avoid the conflict and the confusion. How about we just use
> "replace"? I mean .flatten(replace=lambda x:x, render={...})
>
> Regards,
> Iceberg
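
For reference, the two-argument style from the quoted message maps onto the single-callback style mechanically. The sketch below assumes Python 3; `combine` is a hypothetical helper, not part of web2py, and it uses the "replace" name proposed above for the text callback. It also checks the naming concern directly: `filter` is a built-in function name, not a reserved keyword, so it is legal as a parameter name but shadows the built-in inside the function.

```python
import keyword
import re

# `filter` can be used as an argument name because it is not a keyword;
# doing so only hides the built-in filter() within that function body.
filter_is_keyword = keyword.iskeyword('filter')


def combine(replace=None, render=None):
    """Hypothetical adapter: merge a text callback and a per-tag dict
    into one callback with the (text, tag, attributes) signature."""
    render = render or {}

    def callback(text, tag=None, attributes=None):
        if tag is None:
            # Plain text: apply the text callback if one was given.
            return replace(text) if replace else text
        if tag in render:
            # Known tag: apply its renderer to the flattened content.
            return render[tag](text)
        return text

    return callback


cb = combine(
    replace=lambda x: re.sub(r'\s+', '-', x),
    render=dict(h1=lambda x: '#' + x + '\n\n'),
)
```

Calling `cb('this is a test')` handles a text node, while `cb('Header', 'h1')` handles the content of an `<h1>` tag; tags without an entry in `render` pass through unchanged.
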