Id agree with this. Its easy enough to always get the pristine url by just 
making the api call.  However getting the remapped url is not as easy. So it 
makes sense that the remapped becomes the default and if any plugin needs the 
pristine then it can get it anyway

On 2/11/19, 1:42 PM, "Gancho Tenev" <gan...@apache.org> wrote:

    
    I looked into the issues #4929, #4026, #2877 and PR #4531 which are all 
related to this proposal about TSRemapRequestInfo::requestUrl (requestUrl for 
short) and plugin ordering and I would like to revive the thread. (links to 
issues, PRs, IRC logs referenced at the end of this email)
    
    I definitely support *not* having the first plugin special but would like 
to propose to rewrite the url *before* running all plugins instead of *after* 
which would guarantee that:
    all plugins would get the same requestUrl (first plugin not special case)
    all plugins would treat requestUrl the same way consistently as a 
*remapped* URL which makes the first plugin logic really no different from the 
rest
    there would be a remapped URL default in case the remap rule had no plugins 
OR none of the plugins modified the remapped URL.
    
    
    
    
    The following are some thoughts and information which I hope would justify 
the proposal.
     
    Plugin design:
    ==============
    
    Based on my understanding of the relevant code and the comments in it, the 
plugin execution by design allows the plugins to be ordered in a pipeline which 
results in “calculating” the final remapped URL. requestUrl is meant to be used 
in read/write mode as the *input* and the *output* for each individual plugin 
in the pipeline.
    
    Each plugin from the pipeline could read requestUrl if it needs to check 
what would be the eventual remapped URL if not modified and to decide to modify 
it if necessary (as a whole or only parts of it by using TS API calls). Then 
the next plugin in the pipeline could follow the same read and/or modify action 
and so on and so forth. The pipeline progress can be continued or aborted based 
on any plugin return code which also marks if the remapped URL was modified.
    
    At least this was the behavior before PR #4531, except that for some 
reason, not well understood, the special first plugin got the *pristine* URL as 
input and the rest of the plugins got the *remapped* URL.
     
    
    Even after the PR #4531 fix the first plugin still feels special?
    =================================================================
     
    Now let us consider the following use case after the PR #4531 fix. 
    
    The first plugin gets the *pristine* URL (as before instead of the 
*remapped* URL) which means it will have to treat its input as *pristine* URI. 
If the first plugin decides to check (read input) and modify (write output) 
requestUrl then the second plugin will have to treat the input requestUrl 
differently as *remapped* URL since the first plugin already modified it and if 
no other plugin modifies requestUrl the content of requestUrl will end up being 
the final *remapped* URL.
    
    Wait! The first plugin gets *pristine* URL and the rest get the *remapped* 
URL? This still feels like inconsistent design since it makes the first plugin 
a special case again which was something that we tried to fix with PR #4531.
    
    Also PR #4531 kept the backward compatibility for the first plugin and 
broke compatibility for the rest. We cannot avoid breaking compatibility but I 
would prefer to break compatibility for the one (first) plugin in the chain 
rather than for the rest (most) of the plugins. Based on what I have seen, most 
of the configurations run with much more than a single plugin per remap. 
    
    Looking further into the usage, the overhead of setting a default by 
overwriting URL before all plugins run on every matching transaction will be 
negligible since most of the plugins don’t modify the requestUrl or if they do, 
it is the case for a relatively small number of the transactions, hence we 
would hit the overwrite most of the time regardless. The only exception are 
plugins like regex_remap which constantly overwrite URL but even then setting 
the default before regex_remap modifies the URL does not seem too expensive.
    
    I have a PR #4964 which I believe should address all the aforementioned 
issues. 
    
    Please let me know what you think!
    
    Cheers,
    --Gancho
    
    
    Reference:
    https://github.com/apache/trafficserver/issues/4929 - “Cachekey plugin 
assumes requestUrl in TSRemapRequestInfo is remapped URL”
    https://github.com/apache/trafficserver/issues/4026 - “Plugin ordering 
changes remap URI even with empty plugin”
    https://github.com/apache/trafficserver/issues/2877 - 
“Unexpected/wrong/undocumented behavior when using regex_remap plugin”
    https://github.com/apache/trafficserver/pull/4531 - “Rewrite url after 
running all remap plugin”
    https://github.com/apache/trafficserver/pull/4964 
<https://github.com/apache/trafficserver/pull/4964> - "Rewrite url before all 
remap plugins run” (NEW)
    https://wilderness.apache.org/channels/?f=traffic-server/2018-11-29 - IRC 
discussion about “Rewrite url after running all remap plugin”
    
    
    
    
    > On Dec 3, 2018, at 12:09 PM, Alan Carroll 
<solidwallofc...@oath.com.INVALID> wrote:
    > 
    > This came up again on IRC. The effect of the PR is
    > 
    > Before:
    > If the first remap plugin does not do a remap, then the URL is remapped to
    > the "from" URL in the remap rule, then the rest of the remap plugins run.
    > 
    > After:
    > All remap plugins are run, and if *none* of them remap, then the "from" 
URL
    > from the remap rule is used.
    > 
    > The primary point of this is to not have the first remap plugin be 
special.
    > In particular this means a plugin that's change from first to second
    > doesn't see different behavior.
    > 
    > 
    > On Mon, Dec 3, 2018 at 5:50 AM Takuya Kitano <tkit...@yahoo-corp.jp> 
wrote:
    > 
    >> Thank you. This is exactly what I want to fix.
    >> 
    >> 
    >> 
    >> 2018/12/02 21:59 に、"宋辰伟" <616955...@qq.com> を書き込みました:
    >> 
    >>    Actually there is a problem that URL is only overwritten in first
    >> plugin in current logic. That means the request_url is overwritten in 
first
    >> plugin(which return NO_REMAP ) and never changed even if the remain 
plugin
    >> set map_to or map_form, except you set request_url directly.  If my
    >> understanding  is correctly.
    >> 
    >>    scw00 - Song Chenwei
    >> 
    >>> 在 2018年12月2日,上午11:26,Bret Palsson <bre...@gmail.com> 写道:
    >>> 
    >>> I don’t normally comment, but feel like I need to chime in here.
    >>> 
    >>> I agree in keeping with the stack behavior. This change has the
    >> potential to break a lot of plugin logic and will cause a lot of work
    >> across many dev teams that maintain many plugins. The pristine URL should
    >> be used in subsequent plugins as needed.
    >>> 
    >>> Having said that, I am also interested to know and understand the
    >> cause for wanting this behavior.
    >>> 
    >>> -Bret
    >>> 
    >>>> On Dec 1, 2018, at 20:15, Yongming Zhao <ming....@gmail.com> wrote:
    >>>> 
    >>>> I’d like to take a little step back ask about what is the root
    >> causing about this change:
    >>>> 
    >>>> in the origin design
    >>>> 1, the remap plugin works in ordered, which means you can
    >> stop(consider it done) remapping at any plugin if you want, and the
    >> following plugins will not run after.
    >>>> 2, the remap plugin works in stacks, which means the seconds plugin
    >> will continue work on the URL which is rewritten in the first plugin.
    >>>> 
    >>>> when you talking about the origin URL, there is always a pristine
    >> URL you can copy, if you really want, so the second plugin can see the
    >> origin un-mapped URL. this pristine URL is design for logging, but I 
think
    >> you can take it for your purpose if you like
    >>>> 
    >>>> 
    >>>> so, here rise my question:
    >>>> can the pristine URL full file your second plugin needs? if so, can
    >> we keep the origin design?
    >>>> 
    >>>> 
    >>>> - Yongming Zhao 赵永明
    >>>> 
    >>>>> 在 2018年11月5日,上午11:06,Takuya Kitano <tkit...@yahoo-corp.jp> 写道:
    >>>>> 
    >>>>> Hi,
    >>>>> 
    >>>>> I’d like to propose a change about the timing that ATS runs remap
    >> plugins.
    >>>>> 
    >>>>> Currently, we configure remap.config like the following snippet,
    >>>>> and each plugins use `rri->requestUrl`,
    >>>>> the behavior of a remap plugin as the first plugin is different
    >> from that as the second plugin.
    >>>>> 
    >>>>> ```
    >>>>> map http://before-remap.com/ http://after-remap.com/
    >> @plugin=<first plugin>.so @pparam=... @plugin=<second plugin>.so 
@pparam=...
    >>>>> ```
    >>>>> 
    >>>>> In detail, the first plugin get pre-remapped url information from
    >> `rri->requestUrl`,
    >>>>> but the second plugin get post-remapped one.
    >>>>> 
    >>>>> The cause of this behavior is ATS executes
    >> `url_rewrite_remap_request` function after running the first remap 
plugin.
    >>>>> 
    >>>>> My proposal is that ATS should execute all remap plugins and then
    >> rewrite url.
    >>>>> 
    >>>>> More ditails are in
    >>>>> - Issue : https://github.com/apache/trafficserver/issues/2877
    >>>>> - Pull Request : https://github.com/apache/trafficserver/pull/4531
    >>>>> 
    >>>>> 
    >>>>> What do you think about this?
    >>>>> 
    >>>>> Thank you.
    >>>>> 
    >>>> 
    >>> 
    >>> .
    >> 
    >> 
    >> 
    >> 
    >> 
    >> 
    > 
    > -- 
    > *Beware the fisherman who's casting out his line in to a dried up 
riverbed.*
    > *Oh don't try to tell him 'cause he won't believe. Throw some bread to the
    > ducks instead.*
    > *It's easier that way. *- Genesis : Duke : VI 25-28
    
    

Reply via email to