On Wed, Aug 6, 2014 at 5:46 PM, Mark Morgan Lloyd
<markmll.fpc-pas...@telemetry.co.uk> wrote:
> Marcos Douglas wrote:
>>
>> On Wed, Aug 6, 2014 at 2:54 PM, Rainer Stratmann
>> <rainerstratm...@t-online.de> wrote:
>>>
>>> On Wednesday 06 August 2014 19:50:44 you wrote:
>>>>
>>>> Hi,
>>>>
>>>> Someone knows a fast html parser to use in Pascal code?
>>>>
>>>> I need something like this:
>>>>
>>>> HTML:
>>>> <select name="sel_x">
>>>> <option>1</option>
>>>> <option>2</option>
>>>> </select>
>>>>
>>>> I need a function/object to give me only the values:
>>>> 1
>>>> 2
>>>>
>>>> Something like:
>>>> S := GetHTMLValues('sel_x');
>>>
>>> It's not that difficult to write yourself.
>>
>>
>> You're right. But I'm searching the faster HTML parser to use in huge
>> HTML files... thousands of files.
>
>
> I disagree: it's damn difficult if one isn't working with tightly
> constrained input, and the original question says HTML without specifying
> it's a subset.
>
> There's a couple of places where I parse HTML files that I've created
> myself, i.e. I know exactly what's in them, using- basically- a simple
> recursive-descent parser with some rather flexible ideas about comments
> (i.e. in the above example, name="sel_x" could be lost as a comment).
> However if I'm doing a brute-force job over a large number of files I
> usually use Lynx as a preprocessor, which allows me to use standard
> text-processing utilities to pull named rows out of tabulated reports.

I know the tokens to search for, but the HTML files could be very
different from each other.
I can't use an external tool. It needs to be part of the application
(which already exists).

Thanks,

Marcos Douglas
_______________________________________________
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal