On Fri, Aug 16, 2013 at 02:31:53PM +0200, Age Jan Kuperus wrote:
> We are using libxml2/lib(e)xslt since 2004, and very happy with it
> in general. Recently we discovered that str:tokenize and str:split
> do not always meet our expectations. The problem we have is that
> empty elements are silently removed. As an example,
> str:tokenize('abcdef,fghij, klmnop, ,,qrstuvw , xyz, ,,', ',')
> generates a node-set with seven elements instead of the ten we
> expected. Some applications (conversion of .csv based files is the
> obvious example) really need to know where empty fields are present.
> A second enhancement we would like to have (in str:tokenize only) is
> an indication (in an attribute of the token) of the delimiter that
> was present between two tokens. What is your opinion about this?

  This might be an oversight in the implementation; however, the definition at
  http://www.exslt.org/str/functions/tokenize/

states "The str:tokenize function splits up a string and returns a node
set of token elements, each containing one token from the string."

The problem is that in an XML context a token is usually understood
to follow this definition:

http://www.w3.org/TR/REC-xml/#NT-Nmtoken

    [7]     Nmtoken    ::=      (NameChar)+

and, hum, that doesn't allow for an empty string.
I guess the best thing at this point would be to check what the other
implementations are doing and try to follow the majority, because I
don't think there is much maintenance on EXSLT at this point.
The other option is to stick to the XSLT 2.0 semantics for the
equivalent function, and indeed it seems to do what you expect, e.g.
Example 3 in http://zvon.org/comp/r/ref-XSLT_2.html#Functions~tokenize
is clear on that.
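
To make the two behaviours concrete, here is a minimal Python sketch.
It is not the libexslt code, and the function names are made up for
illustration. It assumes that every character of the second argument
acts as a single-character delimiter, and it runs the example string
from the original report both ways:

```python
def split_keep_empty(text, delimiters):
    """Split on any delimiter character and keep empty tokens."""
    tokens, current = [], []
    for ch in text:
        if ch in delimiters:
            # Close the current token, even if it is empty.
            tokens.append("".join(current))
            current = []
        else:
            current.append(ch)
    tokens.append("".join(current))  # trailing token, possibly empty
    return tokens


def split_drop_empty(text, delimiters):
    """Same split, but empty tokens are silently removed."""
    return [t for t in split_keep_empty(text, delimiters) if t != ""]


sample = "abcdef,fghij, klmnop, ,,qrstuvw , xyz, ,,"
print(len(split_keep_empty(sample, ",")))  # 10
print(len(split_drop_empty(sample, ",")))  # 7
```

Tokens consisting only of a space survive in both variants; only the
three truly empty ones make the difference between ten and seven.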

  So it sounds like it can be characterized as a bug :-) but it's a bit fuzzy.

Daniel
-- 
Daniel Veillard      | Open Source and Standards, Red Hat
veill...@redhat.com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | virtualization library  http://libvirt.org/