Thomas Ploch wrote:
> John Nagle wrote:
>> Very true. HTML is LALR(0), that is, you can parse it without
>> looking ahead. Parsers for LALR(0) languages are easy, and
>> work by repeatedly getting the next character and using that to
>> drive a single state machine. The first character-l
John Nagle wrote:
>
> Very true. HTML is LALR(0), that is, you can parse it without
> looking ahead. Parsers for LALR(0) languages are easy, and
> work by repeatedly getting the next character and using that to
> drive a single state machine. The first character-level parser
> yields toke
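
The character-driven state machine described above can be sketched roughly as follows. This is only an illustration, not code from the thread: the function name `tokenize` and the two states "text" and "tag" are assumptions, and the sketch ignores attributes containing ">", comments, and entities.

```python
def tokenize(text):
    """Split simple markup into ('tag', name) and ('text', data) tokens.

    Hypothetical sketch: one state variable, one character at a time,
    no lookahead and no backtracking.
    """
    tokens = []
    state = "text"   # current state: "text" or "tag"
    buf = []         # characters collected for the current token
    for ch in text:
        if state == "text":
            if ch == "<":
                # A tag starts: flush any pending text, then switch state.
                if buf:
                    tokens.append(("text", "".join(buf)))
                buf = []
                state = "tag"
            else:
                buf.append(ch)
        else:  # state == "tag"
            if ch == ">":
                # The tag ends: emit it and return to the text state.
                tokens.append(("tag", "".join(buf)))
                buf = []
                state = "text"
            else:
                buf.append(ch)
    # Flush trailing text; an unterminated tag is silently dropped here.
    if buf and state == "text":
        tokens.append(("text", "".join(buf)))
    return tokens
```

For example, `tokenize("<p>Hi <b>you</b></p>")` produces a flat token list that a second, token-level stage could consume.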
Thomas Ploch wrote:
> Marc 'BlackJack' Rintsch wrote:
>
>>In <[EMAIL PROTECTED]>, Thomas Ploch
>>wrote:
>>>Alright, my prof said '... to process documents written in structural
>>>markup languages using regular expressions is a no-no.'
Very true. HTML is LALR(0), that is, you can parse it
In <[EMAIL PROTECTED]>, Thomas Ploch
wrote:
> This is what my regexes look like:
>
> import re
>
> class Tags:
>     def __init__(self, sourceText):
>         self.source = sourceText
>         self.curPos = 0
>         self.namePattern = "[A-Za-z_][A-Za-z0-9_.:-]*"
>         self.tagPattern = re
Marc 'BlackJack' Rintsch wrote:
> In <[EMAIL PROTECTED]>, Thomas Ploch
> wrote:
>
>> Alright, my prof said '... to process documents written in structural
>> markup languages using regular expressions is a no-no.' (Because of
>> nested Elements? Can't remember) So I think he wants us to use rege