This is kind of an "I'm tired of thinking about this and not making much progress for the amount of time I'm putting in" question, but here it is:
I'm trying to parse descriptions from HTML meta elements. I can't use Soup because there isn't a working GemStone port.

I've got it to work with the structure:

    <meta name="description" content="my description">

and

    <meta name="Description" content="my description">

but I'm running into instances of:

    <meta http-equiv="description" content="my description">

and

    <meta http-equiv="Description" content="my description">

and am having trouble adapting my parsing code (such as it is). The parsing code that addresses the first two cases is:

    parseHtmlPageForDescription: htmlString
        | startParser endParser ppStream descParser result text lower str doubleQuoteIndex |
        lower := 'escription' asParser.
        startParser := '<meta name=' asParser , #'any' asParser , #'any' asParser.
        endParser := '>' asParser.
        ppStream := htmlString readStream asPetitStream.
        descParser := ((#'any' asParser starLazy: startParser , lower)
            , (#'any' asParser starLazy: endParser)) ==> #'second'.
        result := descParser parse: ppStream.
        text := (result
            inject: (WriteStream on: String new)
            into: [ :stream :char |
                stream nextPut: char.
                stream ]) contents trimBoth.
        str := text copyFrom: (text findString: 'content=') + 9 to: text size.
        doubleQuoteIndex := 8 - ((str last: 7) indexOf: $").
        ^ str copyFrom: 1 to: str size - doubleQuoteIndex

I can't figure out how to change the startParser parser to accept the second idiom. And maybe there's a better approach altogether.

Anyway. If anyone has any ideas on different approaches I'd appreciate learning them.

Thanks for giving it some thought

Paul
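
One possible direction for the second idiom, offered as an untested sketch rather than a confirmed fix: PetitParser's ordered-choice operator (/) can combine the two tag openings so that either one is accepted. This assumes the GemStone port of PetitParser provides / on parsers as the Pharo version does. The tagStart variable and the sample string are hypothetical and only there to try the parser in a workspace.

```smalltalk
| lower startParser tagStart |
lower := 'escription' asParser.

"Ordered choice: try the name= form first, then fall back to http-equiv=.
 The parentheses matter, because binary messages evaluate left to right."
startParser := ('<meta name=' asParser / '<meta http-equiv=' asParser)
	, #'any' asParser , #'any' asParser.

"Hypothetical check, not part of the original method: parse only the start of a tag.
 The two #any parsers consume the opening quote and the d or D."
tagStart := startParser , lower.
tagStart parse: '<meta http-equiv="Description" content="my description">'
```

If this works in the port, the rest of parseHtmlPageForDescription: should be able to stay as it is, since only the start of the tag changes and the content= extraction does not depend on the attribute name.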