I'd actually skip checking where the links are coming from, and just pretend any link could be .htm or .html.

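Check for both cases, either with a regex or by fixing the .html cases first, so that a later .htm replacement cannot turn them into .phpl. That means two passes, reassigning the result each time because str.replace returns a new string:

Pass 1: Data = Data.replace('.html', '.php')
Pass 2: Data = Data.replace('.htm', '.php')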
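
For the regex route, here is a minimal single-pass sketch. The names HTM_EXT and htm_to_php are made up for illustration, and the lookahead is my own addition: it assumes you never want to touch an extension that is followed by another letter or digit.

```python
import re

# Match ".htm" or ".html", but only when no letter or digit follows,
# so something like ".htmx" is left alone.
HTM_EXT = re.compile(r'\.html?(?![A-Za-z0-9])')

def htm_to_php(data):
    """Return data with every .htm/.html extension replaced by .php."""
    return HTM_EXT.sub('.php', data)
```

Because the optional "l" is consumed as part of the match, there is no pass ordering to get wrong.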

I make no claims about what the surrounding code should look like. You've gotten a number of responses on that already.

Cheers,
Cliff

Sébastien N wrote:
That's true, but I still went with the solution that overwrites every
.htm, because it's a really big site and about 70-80% of the links
are internal, so we'll save time this way.

There's probably a way to analyse whether a link is internal or external,
but I needed something fast. Still, I would be interested in
knowing how to do such a thing in the future.
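
One possible way to tell internal links from external ones is to parse each href and look at its host. This is only a rough sketch: SITE_HOSTS, is_internal and fix_links are hypothetical names, the host names are placeholders, and it assumes hrefs are double-quoted and that Python 3's urllib.parse is available.

```python
import re
from urllib.parse import urlsplit

# Placeholder host names; replace with the hosts your own site answers to.
SITE_HOSTS = {'www.example.com', 'example.com'}

HREF = re.compile(r'(href\s*=\s*")([^"]*)(")', re.IGNORECASE)
# .htm/.html at the end of the URL or right before a query or fragment.
EXT = re.compile(r'\.html?(?=$|[?#])')

def is_internal(href):
    """True for relative links and for links to one of SITE_HOSTS."""
    parts = urlsplit(href)
    if parts.scheme not in ('', 'http', 'https'):
        return False  # mailto:, ftp:, javascript: and so on
    host = parts.netloc.lower()
    return host == '' or host in SITE_HOSTS

def fix_links(data):
    """Rewrite .htm/.html to .php in internal links only."""
    def repl(match):
        url = match.group(2)
        if is_internal(url):
            url = EXT.sub('.php', url, count=1)
        return match.group(1) + url + match.group(3)
    return HREF.sub(repl, data)
```

A real HTML parser would be more robust than the href regex, but for a one-off conversion this may be enough.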

On 8/24/07, J. Cliff Dyer <[EMAIL PROTECTED]> wrote:
 Tim Williams wrote:
 On 23/08/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:


 Hi,

I have a bunch of files that have changed from standard htm files to
php files but all the links inside the site are now broken because
they point to the .htm files while they are now .php files.

Does anyone have an idea how to write a simple script that changes
each .htm in a given file to .php?

Thanks a lot in advance


 Something like:

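Infile = open(f_name, 'r+')
Data = Infile.read()
Infile.seek(0)  # rewind, otherwise the write lands after the old content
Infile.write(Data.replace('.htm', '.php'))
Infile.truncate()  # drop any leftover bytes past the new content
Infile.close()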

:)

 Yeah, but you'd better make darn sure that *all* links point to .htm files
(including external links), because you could very easily end up pointing to
http://some.othersite.com/index.phpl

 And that's just no good.





-- 
http://mail.python.org/mailman/listinfo/python-list
