I had a long discussion with Infra today to find out why a change I had applied was not appearing on the site. It turns out that we have a problem with .htm files (as opposed to .html files), and it is already visible on over 400 pages.

The problem is easy to reproduce:
1) Edit a .htm file, for example as in this change:
http://svn.apache.org/viewvc/openoffice/ooo-site/trunk/content/pt/about/newsletter.htm?r1=1413471&r2=1422592&diff_format=h

2) Publish the change. The page now exists under two URLs:

http://www.openoffice.org/pt/about/newsletter.htm
(the existing URL, ending in .htm, not updated)

http://www.openoffice.org/pt/about/newsletter.html
(a new URL, containing the fix)

This silent change of URLs is quite scary and we already have 401 "duplicate" pages. For other examples see

http://www.openoffice.org/fr/Documentation/liens.htm
http://www.openoffice.org/fr/Documentation/liens.html

or

http://www.openoffice.org/ui/proposals/Readonly_mode.htm
http://www.openoffice.org/ui/proposals/Readonly_mode.html
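
To get a full list of the affected pages rather than spot-checking URLs, something like the following could be run against a listing of the published paths. This is only a rough sketch: the function name find_htm_html_pairs and the sample list are made up for illustration, and the real listing would have to come from the published tree.

```python
from pathlib import PurePosixPath


def find_htm_html_pairs(paths):
    """Return sorted (htm, html) pairs whose names differ only in extension."""
    published = set(paths)
    pairs = []
    for path in published:
        # Only exact ".htm" suffixes are considered; ".html" has suffix ".html".
        if PurePosixPath(path).suffix == ".htm":
            twin = path + "l"
            if twin in published:
                pairs.append((path, twin))
    return sorted(pairs)


# Sample input built from the URLs mentioned in this mail (paths only).
sample = [
    "/pt/about/newsletter.htm",
    "/pt/about/newsletter.html",
    "/fr/Documentation/liens.htm",
    "/fr/Documentation/liens.html",
    "/ui/proposals/Readonly_mode.htm",
    "/ui/proposals/Readonly_mode.html",
    "/index.html",
]

for htm, html in find_htm_html_pairs(sample):
    print(htm, "<->", html)
```

A page that exists only as .html (like /index.html in the sample) is not reported, so the output should contain only the real duplicates.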

Daniel Shahaf, who investigated the problem, suggests that we take a look at our path.pm.

Looking at it, I think the place to start investigating is line 14 of
http://svn.apache.org/viewvc/openoffice/ooo-site/trunk/lib/path.pm?revision=1413471&view=markup
which seems to turn .htm files into .html files. It is probably best that someone familiar with the CMS makes the change, since I definitely don't want to break the website.
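
To illustrate what I suspect is happening, here is a toy model of the behaviour. It is not the code in path.pm and not how the CMS is implemented: output_name, publish and the dictionary standing in for the live site are all hypothetical, and the only assumption taken from the above is that .htm sources get written out as .html.

```python
def output_name(source_name):
    """Hypothetical rule: a .htm source is emitted under a .html name."""
    if source_name.endswith(".htm"):
        return source_name + "l"
    return source_name


def publish(site, source_name, content):
    """Write the page under its output name; existing entries are left alone."""
    site[output_name(source_name)] = content


# The page was already live under its .htm URL before the edit.
site = {"pt/about/newsletter.htm": "old text"}

# Publishing the edited .htm source creates a second, .html entry.
publish(site, "pt/about/newsletter.htm", "fixed text")

for url in sorted(site):
    print(url, "->", site[url])
```

Under that assumption the old .htm copy is never overwritten, which would match the stale page at the existing URL and the fixed page at the new one.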

Regards,
  Andrea.
