[LANG] Wanted - spec lawyer.
Now that the StringEscape system has a foundation to support whatever's needed (one hopes), the next step is to define exactly what escaping XML should do. As Jörg notes in LANG-66, XML is different for XML 1.0 and 1.1. Great, let's support both then. StringEscapeUtils can support the old method (for now) with whatever legacy we have to put in there, but EscapeUtils and UnescapeUtils can be 'correct'.

A core question is what to do about Unicode characters above 0x7f. Escaping them seems bad, yet we did it a lot: in escapeJava, in escapeXml, in escapeHtml.

Also, a really easy patch in this area, if someone wants to do it, is LANG-507.

Hen

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org
Re: [LANG] Wanted - spec lawyer.
Hi Hen,

Henri Yandell wrote on Tuesday, 30 June 2009 09:15:

> Now that the StringEscape system has a foundation to support
> whatever's needed (one hopes) the next step is to define exactly what
> escaping XML should do. As Jörg notes in LANG-66, XML is different for
> XML 1.0 and 1.1. Great, let's support both then. StringEscapeUtils can
> support the old method (for now) with whatever legacy we have to put
> in there, but EscapeUtils and UnescapeUtils can be 'correct'.
>
> A core question is what to do about > 0x7f unicode characters.
> Escaping them seems bad, yet we did it a lot. In escapeJava, in
> escapeXml, in escapeHtml.

As pointed out, http://www.w3.org/TR/2006/REC-xml11-20060816/#charsets and http://www.w3.org/TR/2006/REC-xml11-20060816/#charsets define the valid characters for XML 1.0 and 1.1.

However, the escape functionality is actually different. Whether you transport XML (or HTML) in a UTF-8 encoded text file or in one encoded as 7-bit ASCII makes a big difference. In the former case you don't have to encode anything, while in the latter you have to encode everything above 0x7f. And this applies equally to XML, HTML, and Java source files.

The character set definition of the two XML versions is a vertical condition set. An attempt to encode a character outside the XML definition is actually a situation that cannot be handled and should raise an exception (like every XML parser will do anyway).

Therefore the question is whether (Un)EscapeUtils should actually be an instance initialized with the target character encoding. And that raises the question of how close we actually are to reimplementing java.nio.Charset.encode.

- Jörg
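As a minimal sketch of the "raise an exception for characters outside the XML definition" idea, the ranges below follow the Char productions of the XML 1.0 and XML 1.1 recommendations. The class and method names (XmlChars, isXml10Char, isXml11Char, requireValid) are hypothetical and are not part of Commons Lang; this is an illustration under those assumptions, not a proposed API.

```java
/** Hypothetical helper; the names are illustrative and not part of Commons Lang. */
final class XmlChars {
    private XmlChars() {}

    /** XML 1.0 Char: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]. */
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /**
     * XML 1.1 Char: [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF].
     * XML 1.1 also restricts some control characters to character references;
     * that rule (RestrictedChar) is not modelled here.
     */
    static boolean isXml11Char(int cp) {
        return (cp >= 0x1 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Throws IllegalArgumentException on the first code point the chosen XML version forbids. */
    static void requireValid(CharSequence input, boolean xml11) {
        for (int i = 0; i < input.length(); ) {
            int cp = Character.codePointAt(input, i);
            boolean ok = xml11 ? isXml11Char(cp) : isXml10Char(cp);
            if (!ok) {
                throw new IllegalArgumentException(String.format(
                        "U+%04X at index %d is not allowed in XML %s",
                        cp, i, xml11 ? "1.1" : "1.0"));
            }
            i += Character.charCount(cp);
        }
    }
}
```

An unpaired surrogate is returned by Character.codePointAt as a value in the D800-DFFF range, so both checks reject it.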
[g...@vmgump]: Project commons-configuration-test (in module apache-commons) failed
To whom it may engage...

This is an automated request, but not an unsolicited one. For more information please visit http://gump.apache.org/nagged.html, and/or contact the folk at gene...@gump.apache.org.

Project commons-configuration-test has an issue affecting its community integration. This issue affects 1 projects, and has been outstanding for 185 runs. The current state of this project is 'Failed', with reason 'Build Failed'. For reference only, the following projects are affected by this:
- commons-configuration-test : Apache Commons

Full details are available at:
http://vmgump.apache.org/gump/public/apache-commons/commons-configuration-test/index.html

That said, some information snippets are provided here.

The following annotations (debug/informational/warning/error messages) were provided:
-WARNING- Overriding Maven2 settings: [/srv/gump/public/workspace/apache-commons/configuration/gump_mvn_settings.xml]
-DEBUG- (Gump generated) Maven2 Settings in: /srv/gump/public/workspace/apache-commons/configuration/gump_mvn_settings.xml
-INFO- Failed with reason build failed
-DEBUG- Maven POM in: /srv/gump/public/workspace/apache-commons/configuration/pom.xml
-INFO- Project Reports in: /srv/gump/public/workspace/apache-commons/configuration/target/surefire-reports

The following work was performed:
http://vmgump.apache.org/gump/public/apache-commons/commons-configuration-test/gump_work/build_apache-commons_commons-configuration-test.html
Work Name: build_apache-commons_commons-configuration-test (Type: Build)
Work ended in a state of : Failed
Elapsed: 1 min 59 secs
Command Line: mvn --batch-mode --settings /srv/gump/public/workspace/apache-commons/configuration/gump_mvn_settings.xml test
[Working Directory: /srv/gump/public/workspace/apache-commons/configuration]
CLASSPATH: /usr/lib/jvm/java-6-sun/lib/tools.jar:/srv/gump/public/workspace/apache-commons/configuration/target/commons-configuration-1.7-SNAPSHOT.jar
---------------------------------------------
testInitCopy(org.apache.commons.configuration.TestXMLConfiguration)
testSaveWithDelimiterParsingDisabled(org.apache.commons.configuration.TestXMLConfiguration)
testSetRootAttribute(org.apache.commons.configuration.TestXMLConfiguration)
testLoadAndSaveFromFile(org.apache.commons.configuration.TestXMLConfiguration)
testSaveToURL(org.apache.commons.configuration.TestXMLConfiguration)
testSaveToStream(org.apache.commons.configuration.TestXMLConfiguration)
testAutoSave(org.apache.commons.configuration.TestXMLConfiguration)
testSaveAttributes(org.apache.commons.configuration.TestXMLConfiguration)
testCloneWithSave(org.apache.commons.configuration.TestXMLConfiguration)
testEmptyElements(org.apache.commons.configuration.TestXMLConfiguration)
testSaveWithEncoding(org.apache.commons.configuration.TestXMLConfiguration)
testSaveWithNullEncoding(org.apache.commons.configuration.TestXMLConfiguration)
testSaveWithDoctype(org.apache.commons.configuration.TestXMLConfiguration)
testSaveWithDoctypeIDs(org.apache.commons.configuration.TestXMLConfiguration)
testSubsetWithReload(org.apache.commons.configuration.TestXMLConfiguration)
testConfigurationAtWithReload(org.apache.commons.configuration.TestXMLConfiguration)
testConfigurationsAtWithReload(org.apache.commons.configuration.TestXMLConfiguration)
testGetKeysWithReload(org.apache.commons.configuration.TestXMLConfiguration)
testSetTextRootElement(org.apache.commons.configuration.TestXMLConfiguration)
testClearTextRootElement(org.apache.commons.configuration.TestXMLConfiguration)
testAutoSaveWithSubnodeConfig(org.apache.commons.configuration.TestXMLConfiguration)
testAutoSaveWithSubSubnodeConfig(org.apache.commons.configuration.TestXMLConfiguration)
testSaveDelimiterParsingDisabled(org.apache.commons.configuration.TestXMLConfiguration)
testSaveDelimiterParsingDisabledAttrs(org.apache.commons.configuration.TestXMLConfiguration)
testMultipleAttrValuesEscaped(org.apache.commons.configuration.TestXMLConfiguration)
testAutoSaveWithReloadingStrategy(org.apache.commons.configuration.TestXMLConfiguration)
testAutoSaveAddNodes(org.apache.commons.configuration.TestXMLConfiguration)
testAddNodesAndSave(org.apache.commons.configuration.TestXMLConfiguration)
testRegisterEntityId(org.apache.commons.configuration.TestXMLConfiguration)
testSaveAfterCreateWithCopyConstructor(org.apache.commons.configuration.TestXMLConfiguration)
testCopyRootName(org.apache.commons.configuration.TestXMLConfiguration)
testCopyRootNameNoDocument(org.apache.commons.configuration.TestXMLConfiguration)
testSaveWithValidation(org.apache.commons.configuration.TestXMLConfiguration)
testSaveWithValidationFailure(org.apache.commons.configuration.TestXMLConfiguration)

Tests run: 1432, Failures: 2, Errors: 54, Skipped: 0

[INFO]
[ERROR] BUILD FAILURE
[INFO]
Re: svn commit: r789561 - /commons/proper/lang/trunk/src/java/org/apache/commons/lang/text/translate/EntityArrays.java
On 30/06/2009, bay...@apache.org wrote:
> Author: bayard
> Date: Tue Jun 30 05:09:01 2009
> New Revision: 789561
>
> URL: http://svn.apache.org/viewvc?rev=789561&view=rev
> Log:
> Added todo note
>
> Modified:
> commons/proper/lang/trunk/src/java/org/apache/commons/lang/text/translate/EntityArrays.java
>
> Modified: commons/proper/lang/trunk/src/java/org/apache/commons/lang/text/translate/EntityArrays.java
> URL: http://svn.apache.org/viewvc/commons/proper/lang/trunk/src/java/org/apache/commons/lang/text/translate/EntityArrays.java?rev=789561&r1=789560&r2=789561&view=diff
> ==
> --- commons/proper/lang/trunk/src/java/org/apache/commons/lang/text/translate/EntityArrays.java (original)
> +++ commons/proper/lang/trunk/src/java/org/apache/commons/lang/text/translate/EntityArrays.java Tue Jun 30 05:09:01 2009
> @@ -22,6 +22,7 @@
>  *
>  * @since 3.0
>  */
> +// TODO: These need to be public - make methods to return them for security purposes?

These really ought to be private, with a public getter, to prevent accidental or malicious changes to the entries.

> class EntityArrays {
>
> static final String[][] ISO8859_1_ESCAPE = {
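To illustrate the private-array-plus-getter suggestion, here is a minimal sketch. The class name EntityTables, the method basicEscape, and the table contents are assumptions made for the example; they are not the actual EntityArrays code. The getter returns a deep copy, because cloning only the outer array would still expose the inner rows to modification.

```java
/** Hypothetical illustration; not the actual EntityArrays class. */
final class EntityTables {
    // Kept private so callers cannot reassign or mutate the shared table.
    private static final String[][] BASIC_ESCAPE = {
        {"\"", "&quot;"},
        {"&", "&amp;"},
        {"<", "&lt;"},
        {">", "&gt;"},
    };

    private EntityTables() {}

    /** Returns a deep copy, so changes made by the caller never reach the original. */
    static String[][] basicEscape() {
        String[][] copy = new String[BASIC_ESCAPE.length][];
        for (int i = 0; i < copy.length; i++) {
            copy[i] = BASIC_ESCAPE[i].clone();
        }
        return copy;
    }
}
```

The cost is one small allocation per call; callers that need the table repeatedly can keep their own copy.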
Re: svn commit: r789561 - /commons/proper/lang/trunk/src/java/org/apache/commons/lang/text/translate/EntityArrays.java
On 30/06/2009, sebb wrote:
> On 30/06/2009, bay...@apache.org wrote:
> > Author: bayard
> > Date: Tue Jun 30 05:09:01 2009
> > New Revision: 789561
> >
> > URL: http://svn.apache.org/viewvc?rev=789561&view=rev
> > Log:
> > Added todo note
> >
> > Modified:
> > commons/proper/lang/trunk/src/java/org/apache/commons/lang/text/translate/EntityArrays.java
> >
> > Modified: commons/proper/lang/trunk/src/java/org/apache/commons/lang/text/translate/EntityArrays.java
> > URL: http://svn.apache.org/viewvc/commons/proper/lang/trunk/src/java/org/apache/commons/lang/text/translate/EntityArrays.java?rev=789561&r1=789560&r2=789561&view=diff
> > ==
> > --- commons/proper/lang/trunk/src/java/org/apache/commons/lang/text/translate/EntityArrays.java (original)
> > +++ commons/proper/lang/trunk/src/java/org/apache/commons/lang/text/translate/EntityArrays.java Tue Jun 30 05:09:01 2009
> > @@ -22,6 +22,7 @@
> >  *
> >  * @since 3.0
> >  */
> > +// TODO: These need to be public - make methods to return them for security purposes?
>
> These really ought to be private, with public getter, to prevent
> accidental or malicious changes to the entries.

I see you have done just that in a later commit. Sorry for the noise.

> > class EntityArrays {
> >
> > static final String[][] ISO8859_1_ESCAPE = {
Re: [LANG] Wanted - spec lawyer.
Jörg Schaible wrote:
> As pointed out http://www.w3.org/TR/2006/REC-xml11-20060816/#charsets and
> http://www.w3.org/TR/2006/REC-xml11-20060816/#charsets define the valid
> characters for XML 1.0 and 1.1.
>
> However, the escape functionality is actually different. If you transport
> XML (or HTML) in a UTF-8 encoded text file or one encoded by ASCII-7 is a
> big difference. In the former you don't have to encode anything, while you
> have to encode anything above 0x7f in the latter case. And this applies to
> XML, HTML or Java source files at equal level.
>
> The character set definition of the two XML versions is a vertical condition
> set. An attempt to encode a character outside the XML definition is
> actually a situation that cannot be handled and should raise an exception
> (like every XML parser will do anyway).
>
> Therefore the question is, whether (Un)EscapeUtils should actually be an
> instance initialized with the target character encoding. And that raises
> the question how close we're actually at reimplementing
> java.nio.Charset.encode.

As I understand it, the basic idea of StringEscapeUtils.escapeXml() is to convert arbitrary character data from memory (a String) into a character sequence that has the same meaning when it appears literally in XML character data. This is a conversion from character data to character data, so character encoding is not directly relevant for this use (and this is a fundamental difference from Charset.encode()). The characters that must be escaped for this purpose are well defined by the XML specifications.

The appearance of an encoding attribute in the xml declaration notwithstanding, the character encoding of an XML document is a property of a representation of the document, not a property of the document itself. There is therefore a *separate*, albeit related, consideration of escaping characters that cannot be expressed in a particular character encoding, so as to be able to encode the document to a byte sequence without data loss. This is a useful thing to do, and it is compatible with the main objective, but I think it would be well to avoid conflating the two as an indivisible task. They can be performed in one pass by one method, but they are logically distinct behaviors.

If StringEscapeUtils wants to support the second use, then it needs a way for the user to tell it which additional characters to escape. One possibility would be to pass it a Charset which the user intends to apply (later) to encode the characters. StringEscapeUtils could then escape those input characters for which Charset.canEncode() returns false.

Yet another separate question has arisen as to how to handle input characters which cannot appear in any way in a well formed XML (1.0 / 1.1) document, even as character references (e.g. U+). I'm not so certain that StringEscapeUtils needs to be concerned about that, and it would simplify things immensely if it considered that out of scope. Among other effects, I believe that would moot the distinction between XML 1.0 and XML 1.1 (and future versions) for this class. In addition, I strongly suspect that there are multiple production applications that (mis)use XML in a way that would be broken if character references to characters outside the XML character set were flagged as application errors; it would be considerate for StringEscapeUtils to be compatible with such (mis)use.

Best Regards,

John

--
John Bollinger
thinma...@yahoo.com
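The two behaviors described above (escaping markup characters, then escaping whatever the target encoding cannot represent) can be sketched in one pass as follows. This is an illustration under stated assumptions, not the StringEscapeUtils implementation: the class CharsetAwareXmlEscaper and its escape method are hypothetical names. One detail worth noting is that in java.nio the per-character test lives on CharsetEncoder (canEncode(char) and canEncode(CharSequence)); Charset.canEncode() takes no argument and only reports whether the charset supports encoding at all.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

/** Hypothetical sketch; class and method names are illustrative only. */
final class CharsetAwareXmlEscaper {
    private CharsetAwareXmlEscaper() {}

    /**
     * Escapes the five XML markup characters, and replaces every other code point
     * that the target charset cannot encode with a hexadecimal character reference.
     * Whether a code point is legal in XML at all is treated as out of scope here.
     */
    static String escape(CharSequence input, Charset target) {
        CharsetEncoder encoder = target.newEncoder();
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); ) {
            int cp = Character.codePointAt(input, i);
            i += Character.charCount(cp);
            switch (cp) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:
                    String s = new String(Character.toChars(cp));
                    if (encoder.canEncode(s)) {
                        out.append(s);
                    } else {
                        // Not representable in the target charset: emit &#xHHHH;
                        out.append("&#x")
                           .append(Integer.toHexString(cp).toUpperCase())
                           .append(';');
                    }
            }
        }
        return out.toString();
    }
}
```

With US-ASCII as the target, "é" comes out as a character reference; with UTF-8 it passes through unchanged, which matches the point that only the second step depends on the encoding.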
Re: [Commons-JCI]
People use it, but I personally did not have the cycles (or a use case) for further development. What would you like to see fixed?

cheers
--
Torsten

On Mon, Jun 29, 2009 at 21:43, Liam Coughlin wrote:
> Is this project still active? there's a 1.1 release sort of sitting dormant
> since ~2007...