Francis GALIEGUE wrote:
On Tue, Oct 4, 2011 at 21:40, André Warnier <a...@ice-sa.com> wrote:
[...]
I am not sure that I follow the depths of the Java implementation of all of
this, but please note that "\.googlebot\.com$" is a regexp /anchored/ at the
end of the string.
In other words, I would be surprised (and disappointed) if this did not
match the hostnames "bot1.googlebot.com" and "bot123.bots.googlebot.com".
It's quite simple really: .matches(), which is used, anchors the regex
at the beginning and end. .matches("re") is equivalent to
.lookingAt("^re$"), even if your re is already anchored.
Unfortunately, this method's misleading name and the prevalence of
Java have led a lot of people to believe that regex matching is always
done on the whole input, which is of course false.
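To make the difference concrete, here is a minimal sketch using only java.util.regex. It contrasts Matcher.matches(), which requires the whole input to match, with Matcher.find(), which searches for the pattern anywhere in the input. The class name AnchorDemo and its helper methods are made up for this illustration; this is not the Tomcat valve code.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class AnchorDemo {
    // Whole-input match: true only if the regex matches the entire string.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // Substring search: true if the regex matches somewhere in the string,
    // subject to any explicit anchors such as ^ and $ in the regex itself.
    static boolean partialMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String re = "\\.googlebot\\.com$";
        String[] hosts = { "bot1.googlebot.com", "bot123.bots.googlebot.com" };
        for (String host : hosts) {
            System.out.println(host
                + " matches()=" + wholeMatch(re, host)
                + " find()=" + partialMatch(re, host)
                + " matches() with leading .*=" + wholeMatch(".*" + re, host));
        }
    }
}
```

With matches(), the pattern as written is rejected for both hostnames, because the "bot1" or "bot123.bots" prefix is not covered by the regex. Either find() or a leading ".*" gives the behaviour expected in the earlier message.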
Having now consulted the java.util.regex package documentation (as mentioned in the Tomcat
Valves documentation), here are my own remarks:
I agree with Francis that the way the documentation is written is confusing for anyone
who has not dedicated his life to Java programming (like the sysadmins and other Perl
programmers who have to use this to configure Tomcat). In classical regex usage, if you want
something anchored, you have to say so explicitly. And if you do use anchors such as ^ and $,
you expect them to take effect, not to be silently ignored.
One thing that strikes me is that the Pattern documentation at
http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html
says "Instances of this class are immutable and are safe for use by multiple concurrent
threads. Instances of the Matcher class are not safe for such use."
(But the documentation of the Matcher class itself seems silent about this.)
And it seems that the Pattern class, with its own .matches() method, does work in the way
that a non-exclusively-Java programmer would expect, anchors and all.
So my question is: which of Matcher or Pattern is really used in the Valve's code?
Furthermore, about the Tomcat Valve documentation, I would suggest one of the following:
- either the documentation remains as it is, and the code should use the Pattern class
for matching (and thus not anchor automatically, but allow the use of explicit anchors in
the patterns provided for allow and deny);
- or the documentation should be amended to indicate that the expression provided for
allow and deny is automatically anchored at the beginning and end of the string.
(And also that this is not thread-safe, and may occasionally miss a host?)
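On the thread-safety remark, the usual way to reconcile the two sentences quoted from the Pattern documentation is to compile the Pattern once and share it, and to create a fresh Matcher for every check. The sketch below shows that arrangement. HostFilter is a hypothetical class written for this illustration. Whether the Tomcat valve is structured this way is an assumption that has not been checked against its source.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.regex.Pattern;

// Hypothetical class for illustration; not the Tomcat valve code.
class HostFilter {
    // Pattern is immutable, so one compiled instance can be shared by all threads.
    private final Pattern allow;

    HostFilter(String allowRegex) {
        this.allow = Pattern.compile(allowRegex);
    }

    boolean isAllowed(String host) {
        // A new Matcher is created for each call and never stored in a field,
        // so no Matcher is ever shared between threads.
        return allow.matcher(host).matches();
    }

    public static void main(String[] args) throws InterruptedException {
        // matches() needs the whole hostname to match, hence the leading ".*".
        HostFilter filter = new HostFilter(".*\\.googlebot\\.com");
        AtomicInteger mismatches = new AtomicInteger();

        // Each thread repeats the same two checks and counts unexpected results.
        Runnable check = () -> {
            for (int i = 0; i < 10000; i++) {
                if (!filter.isAllowed("bot1.googlebot.com")) mismatches.incrementAndGet();
                if (filter.isAllowed("example.org")) mismatches.incrementAndGet();
            }
        };

        Thread a = new Thread(check);
        Thread b = new Thread(check);
        a.start(); b.start();
        a.join(); b.join();

        System.out.println("mismatches=" + mismatches.get());
    }
}
```

If the code instead kept a single Matcher in a field and called reset() on it from several request threads, the warning in the Pattern documentation would apply.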