I'm trying to do a strange thing with PHP:

extract the URLs from the Internet Explorer index.dat history files (don't ask why!).

Here is a snippet of the file contents:
ð­:2002020720020208:[EMAIL PROTECTED]://www.lyricsworld.com/cgi-bin/search.cgi?q=Bruce
+Springsteen&m=phrase&ps=20&o=0&wm=wrd&ul=&wf=22221ð­ð­ð­ð­

It breaks down into the following elements: the :date: part, in the format yyyymmddyyyymmdd, followed by the URL.
The expression I'm using for the date is
(":16[0-9]:")
i.e. matching a : followed by 16 digits and another :
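
For reference, in PCRE syntax a repeat count is written in braces after the character class, so "16 digits" would be `[0-9]{16}` rather than `16[0-9]`, and the pattern also needs delimiters. A minimal sketch, assuming the date stamp is always exactly 16 digits between two colons; `$sample` is a made-up stand-in for the file contents, not real index.dat data:

```php
<?php
// Hypothetical sample standing in for a chunk of index.dat.
$sample = "\x00\x01:2002020720020208: rest of record";

// ":" + exactly 16 digits + ":"; the two 8-digit halves are captured separately.
if (preg_match('/:([0-9]{8})([0-9]{8}):/', $sample, $m)) {
    echo $m[1], " to ", $m[2], "\n";   // prints "20020207 to 20020208"
}
```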

To match the URL part I'm using the expression below (taken from the PHP manual, thanks). However, I'm getting all the illegal characters (nulls, binary, etc.) back, since it's a binary file.

preg_match("/^(.*:\/\/)?([^:\/]+):?([0-9]+)?(.*)/", $contents, $match);

I also want to allow the / ~ ? & characters, as well as the http: and file: protocols.

The complete call is
preg_match(":16[0-9]:/^(.*:\/\/)?([^:\/]+):?([0-9]+)?(.*)/", $contents, $match);


Of course it doesn't work!
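
One likely reason for the failure: the leading `:` is taken as the pattern delimiter, so the pattern ends at the second `:` and everything after it is read as modifiers. The `^` anchor in the middle and the greedy `.*` would also pull in the binary bytes. A possible direction, as a minimal sketch only: use one delimiter pair, match the date stamp, then capture a run of printable ASCII starting at the scheme. The record layout (`:dates:` followed by a user name, `@`, and the URL), the user name `pete`, and the shortened URL are assumptions for illustration.

```php
<?php
// Hypothetical record: binary junk, date stamp, made-up user name, URL, more junk.
$contents = "\xF0\xAD:2002020720020208: pete@http://www.lyricsworld.com/cgi-bin/search.cgi?q=Bruce+Springsteen&m=phrase\xF0\xAD\x00\x00";

// "~" as delimiter avoids escaping every "/".
// 1) ":" + 16 digits + ":"            -> date stamp
// 2) lazily skip non-control bytes    -> e.g. " pete@"
// 3) http(s):// or file:// followed by printable ASCII (0x21-0x7E) only,
//    so the match stops at the first space, null, or binary byte.
$pattern = '~:([0-9]{16}):[^\x00-\x1F]*?((?:https?|file)://[\x21-\x7E]+)~';

$urls = [];
if (preg_match_all($pattern, $contents, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        $urls[] = ['dates' => $m[1], 'url' => $m[2]];
    }
}
print_r($urls);
```

The printable-ASCII class already covers / ~ ? & and the other URL punctuation, so no separate whitelist is needed. Real index.dat records may differ from the layout assumed here, so the pattern would need checking against actual files.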

Any help would be appreciated.

Pete

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php


