Re: Xerces.

2006-08-11 Thread keshlam
The bug here is in your expectations. Order of attributes is not significant in XML, and no XML application promises to preserve a specific order. __ "... Three things see no end: A loop with exit code done wrong, A semaphore untested, And the change that com

Re: xpath and namespace?

2006-08-16 Thread keshlam
That's correct behavior. To do XPaths against namespaced nodes, you must use prefixes and provide a namespace context... or (horribly ugly solution) write the XPaths with wildcards and test namespace URI and localname in predicates.
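A minimal sketch of both approaches, using only the standard JAXP XPath API. The namespace URI `urn:example:ns`, the prefix `ex`, and the sample document are assumptions made up for illustration; the prefix only has to match what the NamespaceContext reports, not what the document itself uses.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class NamespacedXPathDemo {
    // Hypothetical namespace and sample document, chosen only for this sketch.
    static final String NS = "urn:example:ns";
    static final String XML = "<doc xmlns='" + NS + "'><item>hello</item></doc>";

    static String evaluate(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for namespace-sensitive XPath
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override public String getNamespaceURI(String prefix) {
                    return "ex".equals(prefix) ? NS : XMLConstants.NULL_NS_URI;
                }
                @Override public String getPrefix(String uri) {
                    return NS.equals(uri) ? "ex" : null;
                }
                @Override public Iterator<String> getPrefixes(String uri) {
                    return NS.equals(uri)
                            ? Collections.singletonList("ex").iterator()
                            : Collections.<String>emptyIterator();
                }
            });
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Prefixed path: matches the namespaced elements.
        System.out.println(evaluate("/ex:doc/ex:item"));
        // Unprefixed path: selects nothing, so the string value is empty.
        System.out.println(evaluate("/doc/item").isEmpty());
        // The "ugly" wildcard form, testing namespace URI and local name in predicates.
        System.out.println(evaluate(
                "/*[local-name()='doc' and namespace-uri()='urn:example:ns']/*[local-name()='item']"));
    }
}
```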

Re: AW: AW: HowTo find out all schema validation errors, not only the first?

2006-08-30 Thread keshlam
There are very few parsers, for any language, which can "report all errors" reliably. Hitting a parse error often makes determining whether later code is correct very difficult. It's possible to write parsers which attempt to recover and continue, but there are generally performance costs in doing

Re: Custom validation

2006-09-20 Thread keshlam
Create a subset of the schema-for-schemas which implements your restrictions, and validate the schema document(s) against that?

RE: Additionnal #Text node

2006-09-27 Thread keshlam
> > Value > > > My DOM tree will look like: > > + ELEMENT: Root > + #TEXT: > + ELEMENT: FirstElement > + #TEXT: Value > + #TEXT: That is correct. Xerces, and the DOM specification, make no assumptions regarding whether your particular application considers the whitespace to be meaningful or n
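To see the behavior for yourself, here is a small sketch that lists the children of the root element after a plain (non-validating) JAXP parse. The element names follow the example quoted above; the indentation in the sample string is an assumption about what the original file looked like.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class WhitespaceNodesDemo {
    // Returns the node names of the root element's children, in document order.
    static List<String> childNodeNames(String xml) {
        try {
            NodeList children = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement().getChildNodes();
            List<String> names = new ArrayList<>();
            for (int i = 0; i < children.getLength(); i++) {
                names.add(children.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The line breaks and indentation around FirstElement become #text nodes.
        System.out.println(childNodeNames(
                "<Root>\n  <FirstElement>Value</FirstElement>\n</Root>"));
    }
}
```

If the application does not care about that whitespace, it can simply skip text nodes that contain only whitespace while walking the tree.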

Re: Xerces and Xpath

2006-10-02 Thread keshlam
>I think XPath is implemented by Xalan, not Xerces.  You can also look into XPath >API's such as Jaxen and commons-JXPath. Just wondering: Does Xerces currently implement the proposed DOM XPath API (http://www.w3.org/TR/DOM-Level-3-XPath/)? And if so, does it do so by invoking Xalan? I remember

RE: Getting SAX Exception Content is not allowed in prolog. when using my code on Unix platform

2006-10-17 Thread keshlam
>SAX Exception Content is not allowed in prolog. This means you have something before the root element which is not permitted to appear there according to the XML spec. Make sure that NOTHING comes before the XML declaration except the optional byte-order mark, and that NOTHING comes between the

Re: Reading UTF-16: Content is not allowed in prolog

2006-10-30 Thread keshlam
[Fatal Error] output.xml:1:40: Content is not allowed in prolog. You have something other than the Byte Order Mark, the XML Declaration, Processing Instructions, or whitespace before the document's root element. Fix the file so it's well-formed XML.
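A quick way to reproduce this class of error is to run a bare well-formedness check with a SAX parser. This is only a sketch: the helper name `isWellFormed` and the sample strings are made up for illustration, and the exact error message text depends on the parser.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class PrologCheckDemo {
    // Returns true if the parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // DefaultHandler rethrows fatal errors such as stray content in the prolog.
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<r/>"));     // clean document
        System.out.println(isWellFormed("junk<r/>")); // text before the root element
    }
}
```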

Re: XMLSerializer running out of memory....

2006-11-02 Thread keshlam
Just to get the stupid question out of the way : Have you tried using the -Xmx (or equivalent) option on your JVM to increase the amount of heap memory it's allowed to use before giving up?

Re: serializing entity defs with quotes

2006-11-13 Thread keshlam
>   >     'wan-gw'              > > > coming out of the serialization it looks like > > wan-gw"> Uhm. An XSLT transformation, even an identity transformation, really shouldn't be preserving entity declarations at all; they aren't part of the XPath/XSLT data model, so they should be expanded on t

Re: Entities

2006-11-15 Thread keshlam
> > I removed the "encoding", but am still getting the same result.  (The > source > > file is plain old ASCII but also using several of the characters in the > > range 128-255.  I'm not getting any problem with them.) > > Why dont'y you try the encoding apropriate to the characters you use ? Ol

Re: question on parsing of an XML document

2006-11-28 Thread keshlam
At a quick guess, that sounds more like a problem in how you're building the InputSource than in the parser itself...

Re: Validate a SOAP 1.1 (fault)response message

2006-11-30 Thread keshlam
For what it's worth, recent discussion in comp.text.xml suggests that XMLSpy's validation is somewhat buggy.

Re: no such method XMLSchemaLoader.loadGrammar(XMLInputSource) ?

2006-12-05 Thread keshlam
Probably the usual problem: http://xerces.apache.org/xerces2-j/faq-general.html#faq-4

Re: VTD-XML thoughts?

2006-12-13 Thread keshlam
How does this differ from all the (many!) other attempts to come up with a less verbose mapping of XML content? When that's been tried in the past, it has generally turned out to be not much more efficient than just processing compressed XML, sometimes less so... except in those cases where the au

Re: Problem with targetNamespace in schemavalidation

2007-01-19 Thread keshlam
The first problem to fix is that this isn't a legal namespace name. Namespace names must be absolute URI references -- which means they have to start with a scheme, and then should follow that scheme's syntax. They also use URI character escaping conventions when necessary; I haven't checked the s

Re: Problem with targetNamespace in schemavalidation

2007-01-19 Thread keshlam
On Friday, 01/19/2007 at 05:58 GMT, Mark Goodhand <[EMAIL PROTECTED]> wrote: > Where is the "absolute URI reference" requirement? Is it an XML > Namespaces constraint or a Schema constraint? XML Namespaces errata, after an extended and very painful debate over how namespaces should be tested fo

DST question

2007-01-22 Thread keshlam
> Could you please provide me some info about the impact of using xerces > regarding to DST (Daylight Saving Time)? Could you clarify your question? Are you asking whether Xerces is going to have trouble with the new definition of DST that takes effect this year?

Re: importNode()/adoptNode() and getElementById()

2007-01-30 Thread keshlam
> Joe-user expects an ID that > exists the source document to continue to exist in the destination document, > especially when the two documents use the same schema. Correction: You expect it. Not everyone does. Not everyone wants it. That's a clear argument for the DOM not doing it unless told t

Re: importNode()/adoptNode() and getElementById()

2007-01-30 Thread keshlam
Posted my understanding of this into that Jira entry. If anyone has doubts, checking with the DOM WG rather than taking my word for it would be perfectly reasonable. I didn't close the issue because I'm no longer part of the core Xerces development team, so it shouldn't be my call. The Xerces DOM

Re: Well Formed Checking

2007-02-07 Thread keshlam
Are you sure this isn't just a change in the error message? The document contains an unexpected end-tag (because the begin-tag is missing, but there's no way the lexer can determine that). The current message tells you what end-tag was expected rather than which one was found, but it's correct.

Re: possible error in the faq

2007-02-22 Thread keshlam
>The example imports org.w3c.dom.DOMImplementationRegistry. That API name changed between the working draft and the final DOM Level 3 spec. The current name is indeed org.w3c.dom.bootstrap.DOMImplementationRegistry, and the example should be updated. See http://www.w3.org/TR/DOM-Level-3-Core/java
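For reference, a minimal sketch of bootstrapping a DOM through the final-spec class name. It assumes the JDK's bundled DOM implementation (or Xerces on the classpath) is discoverable by the registry; the feature string "XML 3.0" and the root element name are illustrative choices.

```java
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;

final class RegistryDemo {
    // Creates an empty document with the given root element via the DOM Level 3 bootstrap API.
    static Document newDocument(String rootName) {
        try {
            DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
            DOMImplementation impl = registry.getDOMImplementation("XML 3.0");
            return impl.createDocument(null, rootName, null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(newDocument("root").getDocumentElement().getNodeName());
    }
}
```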

Re: How to find the group information by SAX-Parsing

2007-03-14 Thread keshlam
The standard solution is to structure the data properly, so you have a single element which contains all the information about a person, rather than relying on sequence of children to imply grouping. If you must rely on sequence, basically you're writing a simple FSM that accumulates data, acting on that accum
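A rough sketch of that kind of hand-coded accumulator, for a flat document where grouping is implied only by order. The element names (`people`, `name`, `phone`) and the rule "a new name starts a new group" are assumptions invented for the example, not anything from the original question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class GroupingHandler extends DefaultHandler {
    final List<Map<String, String>> groups = new ArrayList<>();
    private Map<String, String> current;                  // group being accumulated, if any
    private final StringBuilder text = new StringBuilder();

    @Override public void startElement(String uri, String local, String qName, Attributes atts) {
        if ("name".equals(qName)) {                       // assumed rule: <name> opens a new group
            flush();
            current = new LinkedHashMap<>();
        }
        text.setLength(0);
    }

    @Override public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);                   // may be called several times per element
    }

    @Override public void endElement(String uri, String local, String qName) {
        if (current != null && !"people".equals(qName)) {
            current.put(qName, text.toString().trim());
        }
    }

    @Override public void endDocument() {
        flush();                                          // act on the last accumulated group
    }

    private void flush() {
        if (current != null) {
            groups.add(current);
            current = null;
        }
    }

    static List<Map<String, String>> parse(String xml) {
        try {
            GroupingHandler handler = new GroupingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.groups;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("<people><name>Ann</name><phone>1</phone>"
                + "<name>Bob</name><phone>2</phone></people>"));
    }
}
```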

Re: How to find the group information by SAX-Parsing

2007-03-14 Thread keshlam
> if there have been already some standard solutions from SAX, No off-the-shelf libraries that I know of at the SAX level, since it's usually easy to hand-code. There are also data binding tools, which focus on parsing XML into application-specific data structures. Those do generally produce str

Re: Xerces problem

2007-03-22 Thread keshlam
>We would like to maintain the HTML text in the XML file. Don't think of it as text, then; think of it (and build it as) XML document structure. Note that you will have to use XHTML rather than HTML (or some approximation of XHTML), or your file will almost certainly be ill-formed and hence not u

Re: Ignoring missing end tag errors

2007-04-18 Thread keshlam
If end tags are missing, the data simply isn't XML and you shouldn't expect XML tools to handle it. Part of the point of moving from SGML to XML was precisely to drive folks toward writing well-formed documents rather than trying to guess past their errors. If it's HTML (which is based on SGML),

Re: Passing XML as input parameter to XSLT Transformer

2007-05-16 Thread keshlam
Are you sure you're building a namespace-aware DOM?
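The usual culprit is that JAXP's DocumentBuilderFactory is not namespace-aware by default. A small sketch showing the difference; the sample document and the `urn:example` namespace are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

final class NamespaceAwareDemo {
    // Parses a tiny namespaced document and reports the root element's namespace URI.
    static String rootNamespace(boolean namespaceAware) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(namespaceAware);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader("<p:root xmlns:p='urn:example'/>")))
                    .getDocumentElement().getNamespaceURI();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootNamespace(true));  // the declared namespace URI
        System.out.println(rootNamespace(false)); // null: nodes carry no namespace information
    }
}
```

A DOM built the second way looks fine when serialized, but namespace-sensitive consumers such as an XSLT processor will not match its elements as expected.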

Re: Got the Issue

2007-07-24 Thread keshlam
Per the W3C spec, the DOM API does not promise to support multithread access. Doing so would impose performance/implementation limitations on all applications, even single-threaded ones, which was considered undesirable. Also, in most cases what you really need is threadsafety at a much higher lev

Re: Got the Issue

2007-07-24 Thread keshlam
Right. I wouldn't expect nodelists to be threadsafe, but do we reuse nodelists, or would independent calls to retrieve the same nodelist return separate (non-entangled) objects? I would expect the latter, since there could be several nodelist accesses in progress at once even for a single thread.

Re: Got the Issue

2007-07-25 Thread keshlam
http://www.w3.org/DOM/faq.html#SAXandDOM

Re: Got the Issue

2007-07-26 Thread keshlam
When proposing changes to Nodelist, make sure you've considered its "live view" semantics, where changes to the model are immediately visible in the list. That behavior seriously complicates implementing this interface, and it's required for a correct DOM implementation. The Xerces DOM has experim
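To make the "live view" requirement concrete, here is a short sketch: the list obtained before a mutation reflects the mutation afterwards, without being fetched again. The element names are arbitrary.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class LiveNodeListDemo {
    // Returns the length of the same NodeList object before and after the tree is modified.
    static int[] lengthsBeforeAndAfterAppend() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("r");
            doc.appendChild(root);
            root.appendChild(doc.createElement("a"));

            NodeList list = doc.getElementsByTagName("a"); // obtained once
            int before = list.getLength();
            root.appendChild(doc.createElement("a"));      // mutate the model
            int after = list.getLength();                  // same object, new contents
            return new int[] { before, after };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] lengths = lengthsBeforeAndAfterAppend();
        System.out.println(lengths[0] + " -> " + lengths[1]);
    }
}
```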

Re: Got the Issue

2007-07-27 Thread keshlam
The DOM is not promised to be threadsafe. The Xerces DOMs are not designed to be threadsafe; they are designed to be reasonably fast and compact. The fact that Node read access happens to be safe in one of our DOM implementations does not require that Nodelist also be made safe -- and in fact it

Re: Questions about XML Parser for Java

2007-08-01 Thread keshlam
>Would you be so kind as to provide me a rough estimate of the man hours that expended in developing the XML Parser Probably not possible, but it's a significant number of man-years. Xerces started off as an early prototype of IBM's XML4J parser, which went through several complete redesigns and

Re: Equivalent of DOMWriterImpl in Xerces2.7

2007-08-10 Thread keshlam
As far as XML is concerned, there is absolutely no semantic difference between a character and its Numeric Character Reference form. XML parsers generally discard this distinction; XML serializers generally write out the character unless the encoding can't represent it (forcing the numeric form).

RE: importNode, out of memory, big document

2007-10-09 Thread keshlam
As you said: Your original document is being parsed into Xalan's internal DTM data model, which is MUCH more compact than the more common Java-object-based DOM implementation. When you import it into a DOM, it's likely to get larger; depending on the structure of the document and the details of th

Re: Has the bug of static textNode on AttrImpl breaks thread-safety, mutations of independent documents been fixed or not

2007-10-23 Thread keshlam
The DOM API does not promise threadsafety. If that's an issue for you, perform application-level locking on access to the DOM.
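One possible shape for application-level locking, as a sketch only: every thread that touches the tree takes the same lock (here the Document object itself, an arbitrary choice) around each group of related DOM operations. The thread counts and element names are illustrative.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class LockedDomDemo {
    // Several threads append children; all DOM access is guarded by one application-chosen lock.
    static int appendConcurrently(int threads, int perThread) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            doc.appendChild(root);

            Thread[] workers = new Thread[threads];
            for (int t = 0; t < threads; t++) {
                workers[t] = new Thread(() -> {
                    for (int i = 0; i < perThread; i++) {
                        synchronized (doc) { // create + append form one guarded unit
                            root.appendChild(doc.createElement("item"));
                        }
                    }
                });
                workers[t].start();
            }
            for (Thread worker : workers) {
                worker.join();
            }
            synchronized (doc) {             // reads go through the same lock
                return root.getChildNodes().getLength();
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(appendConcurrently(4, 250));
    }
}
```

In a real application the guarded block would usually cover a whole logical transaction rather than a single append.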

Re: Create XML Schema with Xerces?

2007-10-24 Thread keshlam
XML Schema is an XML language. You should be able to use standard DOM or SAX programming techniques to build a Schema document. (I don't know whether the Xerces Schema-specific data model can be serialized back to XML syntax. If it can, that might be another approach -- but would be nonportable.)

Re: Split CDATA Sections and the division Symbol (x00f7)

2007-11-07 Thread keshlam
For what it's worth, 0xF7 is one of the characters which both the XML 1.0 and 1.1 recommendations suggest should be avoided by document authors. "... They are either control characters or permanently undefined Unicode characters" (http://www.w3.org/TR/REC-xml/#charsets)

Re: Split CDATA Sections and the division Symbol (x00f7)

2007-11-08 Thread keshlam
>I don't see 0xF7 in that list. You're right. Apparently I've still got a bit of dyslexia; I managed to misread #x7F as #xF7. Sorry about the confusion. ("Caution: To avoid damage to reputation, engage brain before putting fingers in gear.")

Re: Creating internal DTD subset with Xerces-J

2008-01-15 Thread keshlam
>I believe, this is a critical shortcoming in DOM (i.e., not able to >create/modify internal DTD subset in standard DOM). Is there some >place, where I can ask for this functionality in DOM? DOM Level 3 considered adding DTD/schema support, but in the end the validation module was all that surviv

Re: Problem parsing attribute with character 7F

2008-01-25 Thread keshlam
7F is a legal XML 1.0 character and XML 1.0 should accept it. And, yes, I believe that in UTF8 (are you SURE you're reading the file as UTF8 rather than some other encoding?) it should be a legitimate single byte. However, the XML 1.0 spec's section 2.2 says "Document authors are encouraged to av

Re: Configure xerces to accept '&' character

2008-02-19 Thread keshlam
In XML, a stand-alone & character must be escaped to keep it from being interpreted as introducing an entity reference or numeric character reference. See http://www.w3.org/TR/xml/#syntax Fix your input document.

Re: Help with predefined entities and character references

2008-02-21 Thread keshlam
If you're working with XML tools, these entity references and numeric character references WILL be expanded. If that isn't what you intended, your document should have escaped the & character.

Re: Xinclude

2008-03-03 Thread keshlam
&#13; is the carriage return character. Some systems use the sequence &#13;&#10; to break lines (MS systems among others); some just use &#10; (Unix systems, among others), and there are a few rare cases that use something else. XML parsers are able to tolerate any of these on input and will convert them all into

Re: How To Configure a Schema Validating Parser for Use With JAXP

2008-03-10 Thread keshlam
>Hmm. I can see the point about properties affecting everything running >under a single JVM in an app server but it still seems like there ought >to be a convenient way to do this from the command line where the JVM >instance is of course only going to apply to the one process being run. If you're

Re: xerces-j2 samples : What should one conclude from the o/p of these

2008-03-18 Thread keshlam
The samples illustrate the use of the APIs by performing simple (often trivial) tasks. Their main purpose is additional documentation. They are not necessarily intended to do anything actually useful, though they may contain code worth reusing. They are not intended to be a full tutorial in the u

Re: How to disable attribute normalization

2008-03-30 Thread keshlam
If you need to know entity-reference boundaries, you can get that information by asking the parser to generate a DOM which has Entity Reference Nodes and then looking at the children of the attribute nodes. (I'm not sure offhand whether there's a SAX equivalent.) If you want to suppress the oth

Re: How to disable attribute normalization

2008-03-30 Thread keshlam
> > If you need to know entity-reference boundaries, you can get that > > information by asking the parser to generate a DOM which has Entity > > Reference Nodes and then looking at the children of the attribute > > nodes. > > Xerces-J doesn't support that. See the rationale here [1] from Andy Cl

Re: New line & tab characters getting replaced by space

2008-04-18 Thread keshlam
Whitespace will always be normalized in XML attribute values. If you don't want that happening, put the text in a child element instead. http://www.w3.org/TR/REC-xml/#AVNormalize
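A small sketch of the difference, assuming a non-validating parse with no DTD (so the attribute is treated as CDATA): the literal newline and tab inside the attribute each come back as a space, while the same characters in element content survive. The element and attribute names are arbitrary.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class AttributeNormalizationDemo {
    // Returns { attribute value, child element text } for the same literal "x\n\ty".
    static String[] attributeAndElementText() {
        try {
            String xml = "<e a='x\n\ty'><c>x\n\ty</c></e>";
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return new String[] {
                root.getAttribute("a"),               // newline and tab normalized to spaces
                root.getFirstChild().getTextContent() // element content keeps them
            };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] values = attributeAndElementText();
        System.out.println("[" + values[0] + "]");
        System.out.println(values[1].contains("\n"));
    }
}
```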

Re: encoded character references

2008-04-22 Thread keshlam
As far as XML is concerned, numeric character references are identical to the characters they represent. The XML APIs shouldn't make any distinction. For implementation reasons, you *may* find that SAX delivers these as separate characters() events. But that is not guaranteed. No XML applicati
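A quick check of that equivalence through the DOM, as a sketch: the same text written with and without a numeric character reference yields the same content once parsed. The sample strings are illustrative.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

final class CharacterReferenceDemo {
    // Returns the text content of the root element of the given document.
    static String text(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // &#65; is the numeric character reference for "A".
        System.out.println(text("<t>&#65;BC</t>").equals(text("<t>ABC</t>")));
    }
}
```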

Re: encoded character references

2008-04-22 Thread keshlam
> &amp; might be treated as being the same as &#38;, but these are both > distinct from ordinary text As far as XML is concerned, neither is "distinct from ordinary text" -- they're just representations of the & character. For comparison, consider &#65;. XML doesn't distinguish between this and a simple

Re: Processing instrunction outside the root element

2008-05-19 Thread keshlam
> processing instruction outside the root element (at the very end of the document) By XML's grammar rules, nothing meaningful may follow the root element. That includes PIs. Any tool which is processing that PI is actually behaving incorrectly. Fix your document design?

Re: Processing instrunction outside the root element

2008-05-19 Thread keshlam
>That is not true. The definition of document in the XML 1.1 spec is: > ( prolog element Misc* ) Hmmm. You're right; my error. That's true even in 1.0. Tim Bray, in his Annotated XML Specification, said: "The fact that you're allowed some trailing junk after the root element, I decid

Re: document order guarantees in JAXP

2008-08-01 Thread keshlam
The order of _elements_ is guaranteed in any XML API. The order of _attributes_ is not.

Re: document order guarantees in JAXP

2008-08-01 Thread keshlam
SequenceType and ItemType checked in. The class Javadocs spell out my understanding of how we're dividing these; critiques welcome. I believe the interpreter currently runs but compiler doesn't, due to my having made SequenceType no longer a stream at the FIL level. (On the other hand, I may ma

Re: Reading Hexadecimal values

2008-09-10 Thread keshlam
> That's a Numeric Character Reference. XML correctly converted it to the referenced character. If you want to represent hex data, don't use &#; -- instead, use some other notation and convert it in your application.

Re: XInclude with unorthodox tags.

2008-09-27 Thread keshlam
> XInclude support in Xerces is only available as a parsing feature. > If there is some processing you need to do prior to XInclude you > need to serialize the document (after you've done your > preprocessing) and then feed it back to the parser. Alternatively, do the processing and then implem

Re: How to get exact location of schema validation errors?

2008-10-27 Thread keshlam
> Thanks for your response, but I'm not looking for column and row > numbers in the XML document. I might have been a little ambiguous > in using the word "location". What I need is the reference to the > Node object in the DOM tree that caused the validation error. The > perfect solution wo

Re: How to preserve an empty text node?

2008-12-16 Thread keshlam
As far as XPath/XSLT is concerned, there is no such thing as "an empty text node" -- if it's empty, it's absent. So the Xalan serializer will almost certainly treat empty text nodes as not existing. Any properly written XML application should treat an empty-element tag and an empty start-tag/end-tag pair as IDENTICAL. If you really need to draw

Re: How to preserve an empty text node?

2008-12-16 Thread keshlam
The other kluge-around would be a postprocessor that converted the XML empty-element tag into its SGML form. But if the problem really is that the next stage is an SGML tool, I'd try HTML mode serialization and see if you can get away with it.

Re: XInclude and XPointer

2009-02-03 Thread keshlam
> If you need support for other kinds of XPointers then you may want to > take a look on the net at some of the other XInclude processors that > are available. e.g. perhaps XOM [4] supports what you're looking for. Another quick idea: It's possible to implement an XInclude subset as an XSLT styl

Re: Parsing problem

2009-02-19 Thread keshlam
> > TV Cap > HDTV > x264 > 720p > Action/Adv > Drama > English > > > The "report:attributes" node is returning as having 0 children, as > this debug output shows: Hard to believe, since this so

Re: carriage return in attribute

2009-02-26 Thread keshlam
Carriage return is ASCII 13, so &#13; or &#xD; will represent that character. However, be sure you understand XML's rules for whitespace normalization in attribute values. Depending on what you're trying to do, you may want to replace that attribute with a child element... or replace the offending

Re: carriage return in attribute

2009-03-02 Thread keshlam
The purpose of an XML parser is to read correct XML. Get whoever's generating that file to produce XML that expresses their intent correctly, or throw in a filtering stage that corrects their error. Personally, I would apply a clue-by-four to the author of whatever's generating that document r

Re: carriage return in attribute

2009-03-02 Thread keshlam
Actually, &#13; or &#xD; are technically "numeric character references", not entity references. Check the spec, but if I'm remembering the whitespace rules correctly, these may get converted early enough not to help in this case. You may need an actual &CR; entity defined in the DTD.

Re: DOM: Are namespace declaration attributes real attributes?

2009-04-07 Thread keshlam
In the DOM, namespace declaration attributes are displayed as real attributes -- but are in fact optional in many cases. See current version of the DOM spec for discussions of Namespace Well-Formedness, Namespace Normalization, and normalization during serialization. In the XPath data model, na

Re: parsing an xml document chunk by chunk

2009-04-08 Thread keshlam
My solution would be to tell the parser to read from an in-memory stream acting as a FIFO buffer, and run it in its own thread; then push data into that stream from the communications thread as it becomes available. Of course the hard thing is going to be carrying this handshaking through to t
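A sketch of that arrangement using the JDK's piped streams as the FIFO: the parser runs in its own thread and blocks on the pipe until the producer pushes more bytes. The helper name, the chunk boundaries, and the element-counting handler are assumptions for illustration; a real application would also need error reporting from the parser thread.

```java
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

final class ChunkedParseDemo {
    // Feeds the document to the parser piece by piece and returns the number of elements seen.
    static int parseChunks(String... chunks) {
        try {
            PipedOutputStream out = new PipedOutputStream();
            PipedInputStream in = new PipedInputStream(out);
            AtomicInteger elements = new AtomicInteger();

            Thread parserThread = new Thread(() -> {
                try {
                    SAXParserFactory.newInstance().newSAXParser().parse(in, new DefaultHandler() {
                        @Override public void startElement(String uri, String local,
                                                           String qName, Attributes atts) {
                            elements.incrementAndGet();
                        }
                    });
                } catch (Exception e) {
                    throw new IllegalStateException(e);
                }
            });
            parserThread.start();

            // Stand-in for the communications thread: push data as it "arrives".
            for (String chunk : chunks) {
                out.write(chunk.getBytes(StandardCharsets.UTF_8));
                out.flush();
            }
            out.close();          // end of stream lets the parser finish
            parserThread.join();
            return elements.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseChunks("<a><b/>", "<c/></a>"));
    }
}
```

Note that parsers typically read ahead in blocks, so SAX events may lag behind the data that has been pushed.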

Re: doubt about utf8 and charactrers method in DefaultHandler (SaxParser)

2009-04-23 Thread keshlam
> UTF-8 and UTF-16 are character encodings [1], representing the > characters defined by Unicode as sequences of bytes. These encodings > have a representation for every character in Unicode. Like any of > the other encodings they're decoded into Java chars on input so it's > all the same to the

Re: Writing without a DOM?

2009-06-01 Thread keshlam
Xerces supports SAX.

Re: Bug in Xerces 2.7.1 AbstractDOMParser.java line 1651 ?

2009-06-02 Thread keshlam
Since prefixes are considered "syntactic sugar", this is arguably legal; in a properly namespace-sensitive application it should have the right result. But not everything is fully namespace-aware (sigh) and I agree that this appears to be a typo.

Re: Class casting an interface is bad

2009-06-02 Thread keshlam
> Thanks for the explanation although I am a bit confused as to why we > would bother to have an interface when we require a specific > implementation. Sorry, but this really is the way the W3C intends it to work. The interface standardizes most usages of the DOM. But for efficiency of implemen

Re: Name-value pairs in xml root tag

2009-06-29 Thread keshlam
Those are all namespace declarations. If you register the right SAX handler, you should have no trouble seeing them.

Re: Wrapped by ?

2009-07-15 Thread keshlam
Are you sure you're reading the file you think you're reading? Try printing out the contents rather than parsing them. (This isn't something Xerces should be doing for XML, so I'm betting that you either are reading the wrong thing or are reading it from a server which is wrapping it as an HTML

Re: repairing document while parsing?

2009-07-22 Thread keshlam
Closest thing I can think of is the W3C's "tidy" tool, which repairs some of the common/obvious errors.

Re: IE does not comprehend default namespace in XML

2009-08-26 Thread keshlam
> is that if I use the Internet Explorer to open this xml, it does not > render anything except the hardcoded text that I have in the XSLT, > when I have the xmlns="www.ncr.com/ocz" attribute in the root node > of the xml. As soon as I remove this attibute, it works fine and the > xslt gets appli

Re: xml include

2009-09-16 Thread keshlam
Parse without XInclude processing, walk the tree to find the XIncludes, fetch the referenced documents and attach them to the include element?

Re: Encoding problem

2009-10-05 Thread keshlam
To guarantee UTF-8 output (assuming the processor is writing directly out to the file rather than producing a SAX or DOM output which other code then writes out), specify the encoding in the stylesheet's xsl:output directive. Though I'd be sorta surprised if UTF-8 isn't the default...

Re: Encoding problem

2009-10-05 Thread keshlam
> There is no stylesheet, I'm not using any XSLT file. It is simply SAX reading the XML file and writing to standard output. Sorry; I'm used to thinking in terms of Xalan rather than Xerces and gave the wrong answer. Can you confirm whether the problem is occurring in the parser or on the the

Re: Encoding problem

2009-10-05 Thread keshlam
Yeah, that will do it. If you want to fix it at this level, you need to set the output stream to use UTF8 encoding rather than the JVM's default for that platform.
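A sketch of what "set the output stream to use UTF8" can look like in Java: wrap the byte stream in a Writer with an explicit charset instead of relying on the platform default. The helper name and the byte-array target are illustrative; the same wrapper works around System.out or a FileOutputStream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

final class Utf8OutputDemo {
    // Writes the text through a Writer that is explicitly bound to UTF-8.
    static byte[] encodeUtf8(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            writer.write(text);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) takes two bytes in UTF-8.
        System.out.println(encodeUtf8("\u00e9").length);
    }
}
```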

Re: Special characters problem while writing XML files using JAXP DOM Parser

2009-12-04 Thread keshlam
Or, better, escape the individual troublesome character by expressing it as &lt;. XML also considers > and & to be reserved characters; they should be expressed as &gt; and &amp;. <![CDATA[...]]> sections, which provide a block-escaping mechanism, are sometimes useful for hand-generated XML; less so for ma
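When the XML is written by a serializer rather than by string concatenation, the escaping is done for you. A minimal sketch using a JAXP identity Transformer; the element name and text are arbitrary, and the exact output formatting can vary between serializers.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class EscapingDemo {
    // Builds <t>text</t> in a DOM and serializes it; the serializer escapes reserved characters.
    static String serialize(String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element element = doc.createElement("t");
            element.setTextContent(text); // raw, unescaped text goes into the tree
            doc.appendChild(element);

            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize("a < b & c"));
    }
}
```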

Re: How to retain Entity References in Attribute Nodes while parsing ?

2009-12-23 Thread keshlam
Xerces-J has an option to control whether EntityReference nodes are generated in the DOM output. See http://xerces.apache.org/xerces-j/features.html ... specifically, the create-entity-ref-nodes feature. I haven't checked whether Xerces-C offers the same choice, but I would be a bit surp

Re: Parser passes garbage to characters() callback for XML containing character entities

2010-01-22 Thread keshlam
> I can reproduce a problem parsing certain XML 1.1 files that contain lots of > character entities (escaped control chars like ""). > At some point in the file the parser calls my characters() method with > garbage text. In XML 1.0, most control characters were simply illegal. Did we ever updat

Re: Can Xalan2.7.1 be used with Java JRE 6.0.05

2010-02-22 Thread keshlam
See http://xml.apache.org/xalan-j/getstarted.html. The instructions haven't been updated since Java 1.4, but as far as I know everything should still work roughly the same way. (You'll generally get more/better answers if you ask Xalan questions on the Xalan list rather than the Xerces list.)

Re: File not found exception during xml transformation

2010-06-02 Thread keshlam
First obvious question: You say you have the document in a Map; I presume you mean a Java Map indexed by the filename. Xerces and Xalan won't look there unless you have plugged in a user-written Resolver which recognizes the URI you are requesting and retrieves the document from the Map object
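A sketch of such a user-written resolver for the XSLT side, using the standard javax.xml.transform.URIResolver interface. The class name, the idea of keying the Map by the exact href string, and holding documents as Strings are all assumptions for illustration.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Map;
import javax.xml.transform.Source;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;

final class MapResolver implements URIResolver {
    private final Map<String, String> documents; // href -> XML text (assumed layout)

    MapResolver(Map<String, String> documents) {
        this.documents = documents;
    }

    @Override public Source resolve(String href, String base) {
        String xml = documents.get(href);
        if (xml == null) {
            return null; // not ours: let the processor fall back to its default lookup
        }
        return new StreamSource(new StringReader(xml), href);
    }

    public static void main(String[] args) {
        MapResolver resolver =
                new MapResolver(Collections.singletonMap("memory:a.xml", "<a/>"));
        System.out.println(resolver.resolve("memory:a.xml", null) != null);
        System.out.println(resolver.resolve("missing.xml", null) == null);
    }
}
```

The resolver only takes effect once it is registered, for example with `transformer.setURIResolver(...)`; on the parser side, the analogous hook is an EntityResolver.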

Re: namespace prefixes

2010-10-21 Thread keshlam
I presume you don't need to be reminded that the prefix used in the instance document may be different from the prefix used in the schema document, and that in fact either or both of these may have several prefixes bound to the same namespace simultaneously and/or sequentially...

Re: Difference between Document.getNodeValue() and Document.getTextContent()?

2011-04-15 Thread keshlam
The node value of a Document node is null. [1] The text content of a Document node is also null; for element, attribute, entity, entity reference, and document fragment nodes it is the concatenation of the textContent value of every child node, excluding COMMENT_NODE and PROCESSING_INSTRUCTION_NODE nodes. [2] [1] http://www.w3.org/TR/DOM-Level-3-Core/core.html#ID-1950641247 [2] http://www

Re: DOM thread safety issues & disapearring children

2011-06-08 Thread keshlam
As a reminder: The reasons the DOM was specified by the W3C as not being threadsafe are twofold. 1) Performance would be significantly impacted by excess locking. (Since we're talking about Xalan-J, notice that Java users have almost completely given up use of the inherently threadsafe hash and

Re: DOM thread safety issues & disapearring children

2011-06-08 Thread keshlam
> > Since we're talking about Xalan-J > We're talking about Xerces-J. :-) Thanks. "To avoid damage, engage mind before putting fingers in gear." I think I should be talking about a cup of coffee right now. (T-t-t-alking 'bout code generation... no, that's also Xalan.)

Re: dismissing characters such as carriage returns and spaces after an ending and before an starting tag ...

2011-07-11 Thread keshlam
If you are validating against a DTD, and IF the enclosing element does not have mixed content, look at the SAX/DOM definitions of "ignorable whitespace" and how to handle it. (The term is unfortunate; it's better described as "whitespace in element-only content".) If you are not validating th

Re: dismissing characters such as carriage returns and spaces after an ending and before an starting tag ...

2011-07-11 Thread keshlam
Interesting, Mike; didn't know that. Makes a certain amount of sense, since it's based on the definition of the containing element rather than what it actually contains. (I've rarely counted on it; I get too many documents thrown at me without DTDs, or am processing in a context where I want to

RE: DOM thread safety issues & disapearring children

2011-07-19 Thread keshlam
In 99% of the use cases, locking the individual DOM objects/operations would be the wrong level of granularity -- what you really need to prevent unexpected results is transaction locking for a group of related changes. That really does have to be done at the application level. Locking every in

Re: Stripping the Text Declaration from the external parsed entity

2011-08-15 Thread keshlam
Happens automagically when the external entity is expanded. __ "You build world of steel and stone I build worlds of words alone Skilled tradespeople, long years taught: You shape matter; I shape thought." (http://www.songworm.com/lyrics/songworm-parody/Shapeso

Re: Data graph from xml document

2011-12-15 Thread keshlam
There are a number of good tutorials on XML programming in Java at http://www.ibm.com/xml, along with a lot of articles on more specific techniques. For basic "how do I get started" questions, I'd recommend reading some of those. XML itself is a strict tree structure, with the only "links" bein

Re: Attribute validation bug ?

2012-01-19 Thread keshlam
Speaking as someone who was on the DOM Working Group at the time: Michael is entirely correct. The DOM Level 1 non-namespace-aware nodes and methods should be considered DEPRECATED. They are simply not interoperable with DOM Level 2 code. The only justification for continuing to use them is if

Re: AW: Error Parsing xml document

2012-05-15 Thread keshlam
> in your XML-Documet change //$xrd*($v*2.0) to a valid URI Websearching for "URI RFC" will find the formal specification for URIs, including the grammar that defines what is and isn't legal; namespace names must meet the syntactic constraints of URI References. You may have to escape some char

Re: AW: Error Parsing xml document

2012-05-15 Thread keshlam
I sit corrected... though if you want the document to be interoperable with other XML tools, you should hew close to the standard. BTW, I was just reminded that XML Namespaces 1.1 declared that namespace names should be IRIs, not just URIs. Of course how many tools tracked that change is an op