Owen Townend wrote:
2008/9/3 Andre Majorel <[EMAIL PROTECTED]>:
Is there a program to make all links relative in HTML documents
saved in wget -x fashion? (http://foo.com/a/b.html saved as
./foo.com/a/b.html.)
For example,
- if ./foo.com/a/b.html contains <img src="/images/d.jpg">
  and ./foo.com/images/d.jpg exists, replace that tag by
  <img src="../images/d.jpg">
- if ./foo.com/a/b.html contains <a href="http://bar.org/c.html">
  and ./bar.org/c.html exists, replace that tag by
  <a href="../../bar.org/c.html">
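
The two rules above can be sketched in Python as follows. This is only an illustration of the path arithmetic, not an existing tool: the function name relative_link and the exists callback are made up for the example, and it assumes the mirror layout is exactly host/path with no query strings, ports, or directory URLs that wget would save as index.html.

```python
import posixpath
from urllib.parse import urljoin, urlsplit


def relative_link(page_path, href, exists):
    """Return a relative replacement for href, or None to leave it alone.

    page_path: mirror-relative path of the page, e.g. 'foo.com/a/b.html'
    href:      attribute value found in that page
    exists:    callable telling whether a mirror-relative path is on disk
               (hypothetical hook; os.path.exists would do when run from
               the mirror's top directory)
    """
    # Rebuild the page's original URL so relative and absolute hrefs
    # resolve the same way a browser would resolve them.
    # Assumption: the site was fetched over plain http.
    base_url = "http://" + page_path
    target = urlsplit(urljoin(base_url, href))
    if target.scheme not in ("http", "https") or not target.netloc:
        return None
    # Assumption: wget -x stored the file as <host><path>.
    target_path = target.netloc + target.path
    if not exists(target_path):
        return None
    return posixpath.relpath(target_path, posixpath.dirname(page_path))


on_disk = {"foo.com/images/d.jpg", "bar.org/c.html"}
print(relative_link("foo.com/a/b.html", "/images/d.jpg", on_disk.__contains__))
# ../images/d.jpg
print(relative_link("foo.com/a/b.html", "http://bar.org/c.html", on_disk.__contains__))
# ../../bar.org/c.html
```

Links whose target is not in the mirror come back as None, so the original tag would be left untouched. Finding and rewriting the attributes in the HTML itself is a separate step that this sketch does not cover.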
I know about wget -k and it doesn't do what I need. My goal is to use
wget or some such to have an exact mirror of the web site and then
make a _copy_ of the mirror that can be navigated off-line.
One way to do this which would save downloading twice might be
something like this:
1) wget from foo.com to bar.local as an exact mirror
2) Apache virtual host serving the exact mirror as foo.com
3) temporary hosts line/DNS entry, either on bar.local or on your
workstation, aliasing foo.com to bar.local
4) wget -k foo.com would then pull from the local exact copy and
produce a local mirror with relative links.
An easier way would be to just run wget against the main site again, this
time with the -k and -nc options. Just make sure you are in the same
starting directory as when you ran the original command. wget will not
download any file that is already present, but will instead read it
locally from the disk and make the link conversions.
Better make a backup though, in case something goes wrong.
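
A rough Python sketch of that sequence (back up first, then re-run wget in place) might look like the following. The function names, the -r flag, and the directory arguments are assumptions for illustration; use whatever options the original run used, adding only -k and -nc. Some wget releases may treat the combination of -k and -nc differently (for instance by ignoring -nc), so it is worth trying this on a scratch copy first.

```python
import shutil
import subprocess
from pathlib import Path


def build_wget_command(url):
    """Build the re-run command: skip files already on disk, convert links.

    Assumption: the original mirror was made with -r -x; adjust to match.
    """
    return ["wget", "-r", "-x", "-nc", "-k", url]


def convert_mirror(start_dir, backup_dir, url, run=subprocess.run):
    """Copy the mirror aside, then re-run wget from the original start dir.

    start_dir:  directory wget was originally run from (contains foo.com/)
    backup_dir: where the untouched copy goes; must not exist yet
    run:        injectable for dry runs; defaults to subprocess.run
    """
    start_dir = Path(start_dir)
    # The backup doubles as the exact mirror, since -k edits files in place.
    shutil.copytree(start_dir, backup_dir)
    return run(build_wget_command(url), cwd=start_dir, check=True)
```

Called as convert_mirror("mirror", "mirror.orig", "http://foo.com/"), this would leave the exact mirror in mirror.orig and the converted copy in mirror.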
--
If you can't explain it simply, you don't understand it well enough.
-- Albert Einstein