Re: [HACKERS] XML type and XPath

Nikolay Samokhvalov Mon, 29 Jan 2007 14:12:38 -0800

BTW,

Moreover, I would like xpath_string() which return


On 1/29/07, Peter Eisentraut <[EMAIL PROTECTED]> wrote:
[...]


So, while I realize that I've been arguing for a lean core recently, I
want to propose that we add a small set of XPath support functions to
the core.  This would come down to approximately the following set

xpath_boolean(query, xml)
xpath_number(query, xml)
xpath_string(query, xml)
xpath_nodeset(query, xml) -- API and return type still unclear


As for the last one, I am for xml[] as the result type, especially if
we have xpath* in contrib. These are not XQuery sequences, but at least
this lets the user see all the XML fragments (and manage them somehow -- if
he wants, he can concatenate them into one value using a corresponding
function).
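
To illustrate the xml[] idea outside the database, here is a rough Python
sketch. It is not PostgreSQL code: the function name xpath_nodeset and the
(expression, xml) argument order are borrowed from the proposal, the sample
document is made up, and ElementTree supports only a subset of XPath,
evaluated relative to the root element.

```python
import copy
import xml.etree.ElementTree as ET

def xpath_nodeset(expr, xml_text):
    """Return every matching element as a serialized fragment (a list stands in for xml[])."""
    root = ET.fromstring(xml_text)
    fragments = []
    for node in root.findall(expr):
        clone = copy.copy(node)   # shallow copy so the parsed tree is not modified
        clone.tail = None         # drop text that follows the element's end tag
        fragments.append(ET.tostring(clone, encoding="unicode"))
    return fragments

# Hypothetical sample document.
doc = "<doc><item>a</item> <item>b</item></doc>"
fragments = xpath_nodeset("item", doc)

# The caller decides what to do with the fragments, e.g. concatenate them.
combined = "".join(fragments)
```

The point is only that the caller gets every fragment and chooses how to
combine them, instead of the function silently keeping the first one.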

As for #1-3 -- they are very simple things; I do not like them,
because they return only one scalar value, the first one
encountered. I do not think they are very useful functions at all...
Moreover, in the case of xpath_string() I think it should work in the
following manner:
 1. Find all nodes that match the given expression. In the general
case this will be a set of nodes; OK, let's take only the first one, as
we do with the other functions...
 2. For this node, retrieve all text nodes that are its descendants. This
will be an ordered set of text values.
 3. Concatenate all these values and return them as a single string.
I suppose only such behaviour complies with the XML data model --
as an example, consider the following XML fragment: '<a>most
<b>advanced</b> open source database</a>'. The string value of <a> here
should be the whole phrase, not just the text before <b>.
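
The three steps above can be sketched in Python as follows. This is only an
illustration of the intended semantics, not the proposed implementation: it
uses ElementTree, which handles only a subset of XPath and evaluates
expressions relative to the root element, and returning None when nothing
matches is my own assumption.

```python
import xml.etree.ElementTree as ET

def xpath_string(expr, xml_text):
    """String value of the first node matching expr, or None if nothing matches."""
    root = ET.fromstring(xml_text)
    matches = root.findall(expr)        # step 1: all matching nodes
    if not matches:
        return None
    first = matches[0]                  # ...of which only the first is used
    # Steps 2 and 3: collect all descendant text in document order and join it.
    return "".join(first.itertext())

fragment = "<a>most <b>advanced</b> open source database</a>"

# "." selects the root element <a> itself.
full_value = xpath_string(".", fragment)

# For contrast: taking only the first text node of <a> loses most of the text.
first_text_only = ET.fromstring(fragment).text
```

Here full_value is the whole phrase, while first_text_only stops right
before <b>, which is the conformance problem I mean.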

So, for xpath_string() I see two issues -- 1) a lack of usability if
it returns only one (the first) value from a possible sequence of
values; 2) bad conformance if it takes only the one text node that belongs
to the first context node.

BTW, maybe it would be useful to have several functions, with every
behaviour that can be useful.

Also, I think it'd be better not to use the word "query" when speaking of
XPath; "XPath expression" is much better (to avoid confusion with XML
Query).

We also have prospects that later on we might get fancy GIN-based
indexing support for XPath, which might need another xpath_matches()
function or operator of some kind.


Now I'm trying to collect all my thoughts regarding indexes and express them
in a short message (what types of queries should be considered; what
types of indexes would support those queries).

BTW, do not forget that one type of index is already available -- it's
simply functional indexes on xpath_*() with a static XPath expression
(i.e. one known a priori as a constant value).
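
As an in-memory analogy only (this is Python, not PostgreSQL, and all names
and sample rows below are made up), a functional index over a constant
expression amounts to computing the function once per row and keeping a
lookup structure keyed by the result:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def string_value(expr, xml_text):
    """Concatenated text of the first element matching expr, or None."""
    node = ET.fromstring(xml_text).find(expr)
    return None if node is None else "".join(node.itertext())

def build_index(rows, expr):
    """Map each computed value to the ids of the rows that produce it."""
    index = defaultdict(list)
    for row_id, xml_text in rows.items():
        index[string_value(expr, xml_text)].append(row_id)
    return index

# Hypothetical table contents: row id -> XML value.
rows = {
    1: "<book><title>PostgreSQL</title></book>",
    2: "<book><title>XPath</title></book>",
    3: "<book><title>PostgreSQL</title></book>",
}

# The expression is fixed when the index is built, so it can only serve
# lookups that use that same expression.
title_index = build_index(rows, "title")
matching_ids = title_index["PostgreSQL"]
```

This also shows the limitation: the index helps only for the one expression
it was built with, which is why more general indexing is still worth
discussing.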

As far as contrib/xml2 is concerned, I'm not going to make any efforts
to make the interface compatible because that module has a rather
pragmatic design, whereas I'd rather just provide the raw operations
that can be assembled easily by the user to achieve some of the things
that contrib/xml2 does now.  Once some description of transition steps
has been developed, I'd deprecate the contrib/xml2 module and probably
remove it after 8.3.

In the wiki we have collected some random ideas of other interesting
operations on XML types
(http://developer.postgresql.org/index.php/XML_Todo, near the bottom).
That list at the moment says:

DTD validation
Relax-NG
XSLT
XML Canonical (to compare XML values)
Pretty-printing XML (e.g., indenting)


I've added "Shredding with annotated schemas" to this list (with a brief
description of why it could be needed).

Also, in the long term I see such items as
 - integration/support in pl/perl and other pl-langs that can work with XML;
 - work with web services (maybe it'd be better to use pl/perl here).
Maybe it is too early to add such things even to the bottom of the Todo list :-)

--
Best regards,
Nikolay
