Hi,

right now we have the following methods in StringEscapeUtils:

escapeXml(String)
escapeHtml3(String)
escapeHtml4(String)

These methods only escape the basic XML/HTML entities, so they may still
produce invalid XML/HTML. LANG-955 [1] proposes adding new methods that
only produce valid XML: they should throw an exception if they encounter a
character that cannot be represented in XML (not even by escaping).

Since the set of valid characters differs between XML 1.0 and XML 1.1, we
need two methods:

escapeXml_1_0(String)
escapeXml_1_1(String)
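
To make the intended behavior concrete, here is a rough sketch of what a
strict XML 1.0 variant could look like. The class name, method names and
exception type are my assumptions for illustration, not the implementation
proposed in LANG-955. The validity check follows the Char production of the
XML 1.0 specification; an XML 1.1 variant would need a different range check.

```java
// Hypothetical sketch, not the actual Commons Lang code.
final class Xml10EscapeSketch {

    // Char production from the XML 1.0 specification:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isValidXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Escapes the five basic entities and rejects anything XML 1.0 cannot
    // represent. The exception type is an assumption.
    static String escapeXml10Strict(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            if (!isValidXml10(cp)) {
                throw new IllegalArgumentException(
                        "Not representable in XML 1.0: U+" + Integer.toHexString(cp));
            }
            switch (cp) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.appendCodePoint(cp);
            }
            // Advance by one or two chars, depending on the code point.
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

Iterating by code point means an unpaired surrogate is reported as invalid
instead of being copied through silently.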

To clarify the behavior of the old method I've created LANG-963 [2]. The
idea is to rename escapeXml(String) to escapeXmlEntities(String) and
deprecate the old method.
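
The rename could be done along these lines, with the old name kept as a
deprecated delegate. This is only a sketch: the class name is made up and
the replace chain is a simplified stand-in for the real escaping logic.

```java
// Hypothetical sketch of the rename proposed in LANG-963.
final class RenameSketch {

    // New, more descriptive name. '&' is replaced first so that the
    // entities produced by the later replacements are not escaped again.
    static String escapeXmlEntities(String input) {
        return input.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    /** @deprecated use {@link #escapeXmlEntities(String)} instead */
    @Deprecated
    static String escapeXml(String input) {
        // The old name just delegates, so existing callers keep working.
        return escapeXmlEntities(input);
    }
}
```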

Now I'm tempted to rename the HTML counterparts as well, leading to one
of the following:

escapeHtml3Entities(String)
escapeHtml4Entities(String)

or:

escapeHtml_3_Entities(String)
escapeHtml_4_Entities(String)

or:

escapeHtml_3_0_Entities(String)
escapeHtml_4_0_Entities(String)

I find none of the three very appealing, but for code symmetry we should
change this as well. Which one would you prefer?

Benedikt

P.S.: I'm planning to redesign large parts of the API. The "static util"
pattern is outdated and it is better to encode the information we're
trying to express here via a fluent API. My proposal for lang 4.0 would be:

StringEscaping.escape(str).with(Escaping.HTML_4_0)
StringEscaping.escape(str).with(Escaping.XML_ENTITIES)

This way we don't have to encode everything into method names. I've created
LANG-964 [3] for this.
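
A minimal sketch of how such a fluent API might be wired up. Only the names
StringEscaping, Escaping, HTML_4_0 and XML_ENTITIES come from the proposal
above; everything else (apply, basic, the simplified escaping rules) is an
assumption for illustration. In particular, HTML_4_0 here only handles the
four basic characters and none of the named HTML 4.0 entities.

```java
// Hypothetical sketch of the fluent API idea, not a finished design.
enum Escaping {
    XML_ENTITIES {
        @Override
        String apply(String s) {
            return basic(s).replace("'", "&apos;");
        }
    },
    HTML_4_0 {
        @Override
        String apply(String s) {
            // Simplified: a real version would also map the named entities.
            return basic(s);
        }
    };

    // Each constant carries its own escaping strategy.
    abstract String apply(String s);

    // Shared helper for the characters escaped by both variants.
    static String basic(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }
}

final class StringEscaping {
    private final String input;

    private StringEscaping(String input) {
        this.input = input;
    }

    // Entry point: captures the string to be escaped.
    static StringEscaping escape(String input) {
        return new StringEscaping(input);
    }

    // Terminal step: applies the chosen escaping and returns the result.
    String with(Escaping escaping) {
        return escaping.apply(input);
    }
}
```

Adding a new format then means adding an enum constant instead of another
static method with the format encoded in its name.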

[1] https://issues.apache.org/jira/browse/LANG-955
[2] https://issues.apache.org/jira/browse/LANG-963
[3] https://issues.apache.org/jira/browse/LANG-964

-- 
http://people.apache.org/~britter/
http://www.systemoutprintln.de/
http://twitter.com/BenediktRitter
http://github.com/britter
