2014-09-04 22:46 GMT+04:00 Daniel Mikusa <dmik...@pivotal.io>:
> On Thu, Sep 4, 2014 at 1:48 PM, David P. Caldwell <
> da...@code.davidpcaldwell.com> wrote:
>
>> I have a small program that downloads and installs an arbitrary
>> version of Tomcat, using the API provided by Apache to select the
>> proper mirror, and so forth.
>>
>> The script currently takes the Tomcat version as an argument. My
>> script provides a default (which in my case is the latest version of
>> Tomcat 7), but I have to manually update that default whenever I
>> notice a new version has been released.
>>
>> What would be the best way for the script itself to determine the
>> latest available version? Obviously I would give points for "easy" and
>> points for "robust," knowing that those two things might be in
>> conflict.
>>
>> I can think of many horrifying ways to do it:
>>
>> * loop through integers starting with the last known version,
>> attempting to download 7.0.x, until getting a 404
>> * scraping and parsing the HTML at
>> http://archive.apache.org/dist/tomcat/tomcat-7/, which I expect is
>> rather stable
>>
>
> I did this recently for Tomcat 8.  Here's the command I used, which works
> on my Mac.
>
>    LATEST_VERSION=$(curl -s http://tomcat.apache.org/download-80.cgi | grep "<h3 id=\"8.0." | xpath '/h3/text()' 2>/dev/null)
>
> A slight variation works on Ubuntu if you install xpath.
>
>    LATEST_VERSION=$(curl -s http://tomcat.apache.org/download-80.cgi | grep "<h3 id=\"8.0." | xpath -e '/h3/text()' 2>/dev/null)
>
> I'm sure there are other ways to do it, this was just the first one I put
> together that worked for me.
>

There are also the following XML file and page:
http://tomcat.apache.org/doap_Tomcat.rdf
http://tomcat.apache.org/whichversion.html

>> So my challenge isn't coming up with *a* way to do it, but coming up
>> with the best way.

I would say that download-nn.cgi is the most reliable of the sources
above. A version cannot be released without updating the download page.
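
As a rough sketch of reading the version from that page: the function
below takes the page HTML on stdin and picks the highest version that
appears in a binary download link. The function name is hypothetical,
and it assumes the links end in apache-tomcat-<version>.tar.gz, so
check that against the live page before relying on it.

```shell
# Read download-page HTML on stdin and print the highest version found
# in links of the form apache-tomcat-X.Y.Z.tar.gz.
# Assumption: binary download links use that file name pattern.
latest_from_links() {
    grep -o 'apache-tomcat-[0-9][0-9.]*\.tar\.gz' \
        | sed -e 's/^apache-tomcat-//' -e 's/\.tar\.gz$//' \
        | sort -t . -k 1,1n -k 2,2n -k 3,3n \
        | tail -n 1
}

# Usage (needs network access):
#   curl -s http://tomcat.apache.org/download-80.cgi | latest_from_links
```

The numeric sort on each dotted field is there so that 8.0.12 ranks
above 8.0.9, which a plain text sort would get wrong.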

But there are the following concerns:
- server generates the page dynamically,
(but as a bonus it gives you a list of best mirrors for your IP address)
- stability of markup.
(I would prefer to parse a download link, instead of <h3> header tag)

If those are of concern, then the doap_Tomcat.rdf XML file would be a
better source of information.
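
A minimal sketch of reading the DOAP file, under the assumption that
each release is listed with a <revision>X.Y.Z</revision> element that
sits on a single line. I have not checked that against the current
file, and the function name is made up for illustration. A real XML
parser would be more robust than sed.

```shell
# Read DOAP/RDF XML on stdin and print the highest version that starts
# with the given prefix (for example "8.0.").
# Assumption: versions appear as <revision>X.Y.Z</revision> on one line.
latest_from_doap() {
    prefix=$1
    sed -n 's/.*<revision>\([0-9][0-9.]*\)<\/revision>.*/\1/p' \
        | awk -v p="$prefix" 'index($0, p) == 1' \
        | sort -t . -k 1,1n -k 2,2n -k 3,3n \
        | tail -n 1
}

# Usage (needs network access):
#   curl -s http://tomcat.apache.org/doap_Tomcat.rdf | latest_from_doap 7.0.
```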


Regarding parsing the download page of a mirror (and
archive.apache.org is one of those mirrors):
- An announcement and tomcat.a.o site update are usually postponed by
a day to let the mirrors sync.

If you parse the page of a mirror, you may get a version number that
has already been released to the mirrors but has not yet been
announced. The version may be absent from other mirrors.

- If you parse the page of a mirror and there are several Tomcat
versions available (e.g. N and N-1), and your user chooses version
"N-1", you can download version "N-1" from the mirror instead of
from the archive.a.o site.

By the way, do you verify the MD5 hash or PGP signature of the files
downloaded from mirrors?
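
In case it helps, here is a small sketch of the hash check. The
function name is hypothetical, and it assumes GNU md5sum is installed
(on a Mac the equivalent is "md5 -q"). The expected digest should be
fetched from the main Apache site rather than from the mirror itself.

```shell
# Compare a file's MD5 digest with an expected hex digest.
# Returns 0 on a match and non-zero otherwise.
# Assumption: md5sum (GNU coreutils) is on the PATH.
verify_md5() {
    file=$1
    # Normalise case so that upper-case digests also compare equal.
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    actual=$(md5sum "$file" | awk '{print $1}')
    [ "$actual" = "$expected" ]
}

# A PGP check would look roughly like this, once the project's KEYS
# file has been imported with "gpg --import KEYS":
#   gpg --verify apache-tomcat-x.y.z.tar.gz.asc apache-tomcat-x.y.z.tar.gz
```

The signature check is the stronger of the two, since a hash only
guards against a corrupted download.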

> * loop through integers starting with the last known version,
> attempting to download 7.0.x, until getting a 404

There have been several reports of mirrors that did not respond with
a proper 404, but instead produced a redirection to some advertisement
page. Such behaviour is against ASF mirror policies, and those mirrors
have been unlisted by the ASF infrastructure team, but it may happen
again.

Sometimes a MiTM responds with a redirect (e.g. a mobile operator may
do so when there are some problems with your account).
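
If you do probe URLs, a sketch along these lines treats anything other
than a plain 200 or 404 as inconclusive instead of following the
redirect. Both function names are made up for illustration.

```shell
# Map an HTTP status code to what it tells us about a candidate URL.
classify_status() {
    case "$1" in
        200) echo "available" ;;
        404) echo "missing" ;;
        3??) echo "redirect" ;;   # could be an ad page or a MiTM
        *)   echo "unknown" ;;
    esac
}

# Print only the status code of a HEAD request. Redirects are not
# followed because -L is deliberately left out.
probe_url() {
    curl -s -I -o /dev/null -w '%{http_code}' "$1"
}

# Usage (needs network access):
#   classify_status "$(probe_url "$url")"
```

Only "missing" should end the loop. A "redirect" or "unknown" result
says nothing about whether the version exists.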


Best regards,
Konstantin Kolinko

