DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG-RELATED
COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=36275>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=36275

           Summary: JSP TAG Markup Character Entities
                    attr="&lt;&amp;&quot;&gt;" passed into API badly
           Product: Tomcat 5
           Version: 5.5.9
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Jasper
        AssignedTo: tomcat-dev@jakarta.apache.org
        ReportedBy: [EMAIL PROTECTED]


Originally posted to tomcat-users to ask whether my understanding that this is
a bug is correct.


The following example JSP page seems to be interpreted incorrectly by the time
the attribute values are passed into a setDynamicAttribute() API call.

<%@ page language="java" contentType="text/html; charset=UTF-8"
pageEncoding="UTF-8"%>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<%@ taglib prefix="g" uri="http://domain.com/taglibs/generic-0.1" %>
<html>
<head>
<title>TAG GENERIC PAGE</title>
</head>
<body>
<g:generic attrOne="1" attrTwo="2" attrThree="&lt;&amp;&quot;&gt;"
attrFour="<&&quot;>"/>&lt; &lt;Tag Here<br/>
</body>
</html>

Logged calls to setDynamicAttribute(), showing the three arguments passed:

DEBUG 10:22:40,279 (GenericTag.java:setDynamicAttribute:47)  -null attrOne 1
DEBUG 10:22:40,285 (GenericTag.java:setDynamicAttribute:47)  -null attrTwo 2
DEBUG 10:22:40,287 (GenericTag.java:setDynamicAttribute:47)  -null attrThree
&lt;&amp;"&gt;
DEBUG 10:22:40,298 (GenericTag.java:setDynamicAttribute:47)  -null attrFour <&">

It seems that &quot; is correctly converted into " but the other character
entities are not.  It is my understanding that all markup file parsing should
proceed in this order:

* decode the file into characters according to its encoding (UTF-8, etc.)
* tokenize the character stream, looking for character entities and
substituting the characters they represent (no matter where they are in the
file); a substituted character may not act as a token that delimits markup
elements in the next step
* now parse the markup in the resulting file

For performance reasons it probably doesn't happen exactly like that.  I am
expecting output like:

null attrThree <&">
null attrFour <&">
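
To make the expected result concrete, here is a minimal Java sketch of decoding
the five predefined XML entities in an attribute value that has already been
delimited.  It is only an illustration of the output I am expecting, not a
description of how Jasper works internally; the class name EntityDecoder and
the method decode() are hypothetical.

```java
import java.util.Map;

// Hypothetical helper, for illustration only; not part of Tomcat or Jasper.
final class EntityDecoder {
    // The five entities predefined by XML 1.0.
    private static final Map<String, String> PREDEFINED = Map.of(
        "lt", "<", "gt", ">", "amp", "&", "quot", "\"", "apos", "'");

    // Decodes predefined entities in a single left-to-right pass.
    // Decoded characters are never rescanned, so "&amp;lt;" yields "&lt;".
    static String decode(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        int i = 0;
        while (i < raw.length()) {
            char c = raw.charAt(i);
            int semi = (c == '&') ? raw.indexOf(';', i + 1) : -1;
            String replacement = (semi > 0)
                ? PREDEFINED.get(raw.substring(i + 1, semi))
                : null;
            if (replacement != null) {
                out.append(replacement);
                i = semi + 1;          // skip past the whole entity
            } else {
                out.append(c);         // not a known entity: copy as-is
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("&lt;&amp;&quot;&gt;")); // prints <&">
        System.out.println(decode("<&&quot;>"));           // prints <&">
    }
}
```

Both attrThree and attrFour from the page above come out as the same four
characters under this sketch, which is the behaviour I would expect.  A bare &
that does not start a known entity is copied through unchanged, which is an
assumption made here for leniency rather than something a strict parser would
accept.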


I also notice that it seems commonplace to use JSP tags like this:

<img height="10" src="<foo:tag name="value"/>" width="10"/>

Is the above reliable when nested recursively, like this:  <x:outer attr="<x:middle
attr="<x:inner attr="foo"/>"/>"/>

A purist representation of the same thing that would remain reliable under
recursive nesting might look something like this:

<merge:img>
<merge:attr height="10"/>
<merge:attr-body name="src"><foo:tag name="value"/></merge:attr-body>
<merge:attr width="10"/>
</merge:img>

I appreciate the former may be done as lazy shorthand, but it appears to break
something else that is a more strongly binding standard.  There must be many
possible alternative approaches in JSP to this problem that won't conflict with
other elements of all the standards that come into play.

Is it possible to force a purist approach to this problem and switch off this
mode, to get back reliable behaviour (even if it does seem like I have to take
the long way around)?  Call this idealized behaviour, if you will.

Ultimately our JSP authoring tools will be powerful enough to automatically hide
complex tag constructs like this and let us see at a glance the representation
we most like to see, while what is really saved in the raw file may be the
unrolled purist version.

-- 
Darryl L. Miles

-- 
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
