[fpc-pascal] Re: html link extractor

L Tue, 03 Jul 2007 00:12:19 -0700

> > Is there a unit somewhere that can extract links from html pages? I want
> > to be able to recursively add pages to a chm archive.
>
> I created a program called GetLinks in a couple minutes:


I updated the files and changed the htmlutil functions a bit.

I also created a recursive example that uses Synapse and keeps following web
links, with no depth limit, until it finds no more (it uses a nested Pascal
function for the recursion; Torvalds hates those).
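
For anyone who has not used nested functions for this, here is a rough sketch
of the idea. It is not the code from the zip: FetchPage is a made-up stand-in
that returns canned HTML (the real demo fetches pages over HTTP with Synapse),
the page names are invented, and the href scan is deliberately naive. A list
of visited pages is assumed here so that pages linking back to each other do
not loop forever.

```pascal
program RecursiveLinks;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, StrUtils;

{ Hypothetical stand-in for an HTTP fetch: returns canned HTML for a few
  relative paths so the sketch runs without a network. }
function FetchPage(const Path: string): string;
begin
  if Path = 'index.html' then
    Result := '<a href="a.html">A</a> <a href="b.html">B</a>'
  else if Path = 'a.html' then
    Result := '<a href="b.html">B</a> <a href="index.html">home</a>'
  else
    Result := '';  // b.html and unknown pages: no links
end;

{ Collects every page reachable from Start into Visited. }
procedure Crawl(const Start: string; Visited: TStrings);

  { Nested procedure: it can see Visited from the enclosing scope. }
  procedure Visit(const Path: string);
  const
    Marker = 'href="';
  var
    Html: string;
    P, Q: Integer;
  begin
    if Visited.IndexOf(Path) >= 0 then
      Exit;  // already seen; this is what stops cycles
    Visited.Add(Path);
    Html := FetchPage(Path);
    P := PosEx(Marker, Html, 1);
    while P > 0 do
    begin
      Inc(P, Length(Marker));        // first character of the link
      Q := PosEx('"', Html, P);      // closing quote of the attribute
      if Q = 0 then
        Break;                       // unterminated attribute, give up
      Visit(Copy(Html, P, Q - P));   // recurse into the linked page
      P := PosEx(Marker, Html, Q + 1);
    end;
  end;

begin
  Visit(Start);
end;

var
  Pages: TStringList;
  I: Integer;
begin
  Pages := TStringList.Create;
  try
    Crawl('index.html', Pages);
    for I := 0 to Pages.Count - 1 do
      WriteLn(Pages[I]);
  finally
    Pages.Free;
  end;
end.
```

With the canned pages above, the crawl visits index.html, a.html and b.html
once each and then stops, because every remaining link points at a page that
is already in the list.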

The recursive demo probably won't work with file:// style links, since it
fetches pages through Synapse, and only simple relative HTTP paths work as is.
(I'm not sure whether CHM files use file:// style links; that's a guess.)

The latest download includes the recursive extractor and the GetLinks demo:
http://z505.com/download/pascal/html/fast-html-parser-jul-02-2007.zip
http://sourceforge.net/project/showfiles.php?group_id=145841&package_id=212708&release_id=520417


_______________________________________________
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
