Hi Luca,

Sorry for my late reply, but your suggestion worked well!
It really was a problem with the (.*) pattern.

Kind regards
Uwe


From: Luca Toscano [mailto:toscano.l...@gmail.com] 
Sent: Tuesday, February 14, 2017 4:37 PM
To: users@httpd.apache.org
Subject: Re: [users@httpd] mod_substitute only replaces first pattern match

Hi!

2017-02-06 17:25 GMT+01:00 <mailto:uwe.pol...@amann.com>:
Hi,

I am setting up a reverse proxy server based on Apache httpd v2.4 on the most 
recent release of CentOS:

# httpd -version
Server version: Apache/2.4.6 (CentOS)
Server built:   Nov 14 2016 18:04:44

# uname -a
Linux hostname.domain.tld 3.10.0-514.6.1.el7.x86_64 #1 SMP Wed Jan 18 13:06:36 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

# cat /etc/centos-release
CentOS Linux release 7.3.1611 (Core)

Within this configuration I have to use mod_substitute to rewrite URLs from 
some applications.
For this I am using mod_filter with the SUBSTITUTE Filter as follows:

  ProxyRequests Off
  ProxyPass /my-location https://my-server.domain.tld/

  <Location /my-location/>
    ProxyPassReverse    /my-location

    FilterDeclare       AGFILTER

    FilterProvider      AGFILTER SUBSTITUTE "%{resp:Content-Type} =~ m#^text/html#"
    FilterProvider      AGFILTER SUBSTITUTE "%{resp:Content-Type} =~ m#.*/css#"
    FilterProvider      AGFILTER SUBSTITUTE "%{resp:Content-Type} =~ m#.*/json#"
    FilterProvider      AGFILTER SUBSTITUTE "%{resp:Content-Type} =~ m#.*/javascript#"

    FilterChain         AGFILTER

    Substitute          "s#/(css|js|images|management|system|help)/(.*)#/my-location/$1/$2#fi"
  </Location>

It works fine if there is only one occurrence of the search pattern in a line 
of the HTML code: that occurrence is replaced properly.
However, if there are two or more occurrences of the search pattern in one HTML 
line, only the first one is replaced, as in this example:

<tr><th colspan=3 nowrap></th><th colspan=3 nowrap><a 
href="overview.epl?hide=1"><img border=0 src="/my-location/images/hide.gif" 
alt=" Spalte ausblenden"></a> <a href="overview.epl?cl=1&pl=1"><img 
src="/images/right.gif" border=0 alt=" Spalte nach rechts 
schieben"></a></th><th colspan=3 nowrap><a href="overview.epl?cl=2&pl=-1"><img 
src="/images/left.gif" alt=" Spalte nach links schieben" border=0 ></a> <a ....

Here you see: The first one is replaced, the second image URL is the same as 
before.

Is this working as designed?

I think that the issue is the (.*) in your regex. In your example it will 
match the first occurrence of the pattern (like "images/") and will then end up 
eating all the remaining characters on the line (greedy behavior, as far as I 
can see), so nothing is left for a second match. The following matches more 
than one occurrence in your example string, because it checks for the 
.something extension:

(css|js|images|management|system|help)\/(\w+\.\w)

So mod_substitute seems to be working fine; the regex just needs a bit of 
tuning imo. The documentation might need to mention the greedy behavior, but I 
need to triple check that what I just said makes sense :)
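
To make the greedy behavior concrete, here is a minimal sketch that compares 
the two patterns on a shortened version of the example line. It uses Python's 
re module only as a stand-in for the regex engine behind mod_substitute, which 
is an assumption; the sample line and the variable names are made up for 
illustration.

```python
import re

# One HTML line with two image URLs, shortened from the example in the question.
line = '<img src="/images/right.gif" border=0> <img src="/images/left.gif" border=0>'

prefixes = r"(css|js|images|management|system|help)"

# Original pattern: the greedy (.*) swallows everything up to the end of the line,
# so the whole rest of the line becomes part of the first (and only) match.
greedy = re.compile(r"/" + prefixes + r"/(.*)", re.IGNORECASE)

# Suggested pattern: the second group stops right after "name.x",
# which leaves the rest of the line available for further matches.
bounded = re.compile(r"/" + prefixes + r"/(\w+\.\w)", re.IGNORECASE)

replacement = r"/my-location/\1/\2"

# subn() returns the new string and the number of substitutions made.
greedy_result, greedy_count = greedy.subn(replacement, line)
bounded_result, bounded_count = bounded.subn(replacement, line)

print(greedy_count, greedy_result)
print(bounded_count, bounded_result)
```

With the greedy pattern only the first URL gets the /my-location prefix, while 
the bounded pattern rewrites both.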

Hope that helps!

Luca
