On 1/29/07, Peter Eisentraut <[EMAIL PROTECTED]> wrote: [...]
So, while I realize that I've been arguing for a lean core recently, I want to propose that we add a small set of XPath support functions to the core. This would come down to approximately the following set:

xpath_boolean(query, xml)
xpath_number(query, xml)
xpath_string(query, xml)
xpath_nodeset(query, xml) -- API and return type still unclear
As for the last one, I am for xml[] as a result type, especially if we have xpath* in contrib. These are not XQuery sequences, but at least this lets the user see all XML fragments and manage them somehow (if he wants, he can concatenate them into one value using a corresponding function).

As for #1-3 -- they are very simple things; I do not like them, because they return only one scalar value, namely the first one encountered. I do not think they are very useful functions at all...

Moreover, in the case of xpath_string() I think it should work in the following manner:

1. Find all nodes that match the given expression. In the general case this will be a set of nodes; OK, let's take only the first one, as we do with the other functions...
2. For this node, retrieve all text nodes that are its descendants. This will be an ordered set of text values.
3. Concatenate all these values and return them as a single string.

I suppose only such behaviour complies with the XML data model -- as an example, consider the following XML fragment: '<a>most <b>advanced</b> open source database</a>'.

So, for xpath_string() I see two issues:

1) a lack of usability if it returns only one (the first) value from a possible sequence of values;
2) bad conformance if it takes only one text node, the one that belongs to the first context node.

BTW, maybe it would be useful to have several functions, one for each behaviour that can be useful.

Also, I think it'd be better not to use the word "query" when speaking of XPath; "XPath expression" is much better (to avoid confusion with XML Query).
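The xpath_string() behaviour described above can be sketched in Python, purely as an illustration. The names `first_match`, `xpath_string_concat` and `xpath_string_first_text` are hypothetical and are not the proposed PostgreSQL API, and the standard library's `xml.etree.ElementTree` understands only a limited subset of XPath, which happens to be enough for this fragment.

```python
import xml.etree.ElementTree as ET

def first_match(expr, xml):
    """Step 1: evaluate the expression and keep only the first matching node."""
    matches = ET.fromstring(xml).findall(expr)
    return matches[0] if matches else None

def xpath_string_concat(expr, xml):
    """Steps 2 and 3: concatenate all descendant text nodes in document order."""
    node = first_match(expr, xml)
    if node is None:
        return None
    return "".join(node.itertext())

def xpath_string_first_text(expr, xml):
    """The behaviour argued against: return only the node's first text node."""
    node = first_match(expr, xml)
    if node is None:
        return None
    return node.text

fragment = "<a>most <b>advanced</b> open source database</a>"

# "." selects the root element <a> itself in ElementTree's XPath subset.
print(repr(xpath_string_concat(".", fragment)))
print(repr(xpath_string_first_text(".", fragment)))
```

With the fragment from the text, the concatenating variant yields the whole sentence, while the first-text-node variant stops right before <b>.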
We also have prospects that later on we might get fancy GIN-based indexing support for XPath, which might need another xpath_matches() function or operator of some kind.
Now I'm trying to collect all thoughts regarding indexes and express them in a short message (what types of queries should be considered; what types of indexes would support those queries). BTW, do not forget that one type of index is already available -- simply functional indexes on xpath_*() with a static XPath expression (i.e. one known a priori as a constant value).
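The idea of a functional index over a constant XPath expression can be illustrated with a small runnable sketch. Note the assumptions: it uses SQLite through Python's `sqlite3` module rather than PostgreSQL, the `xpath_string` function registered here is a hypothetical stand-in written for this example, and the `docs` table and its contents are invented.

```python
import sqlite3
import xml.etree.ElementTree as ET

def xpath_string(expr, xml):
    """Hypothetical stand-in: text content of the first node matching expr."""
    node = ET.fromstring(xml).find(expr)
    return None if node is None else "".join(node.itertext())

conn = sqlite3.connect(":memory:")
# SQLite only accepts user functions in an index expression if they are
# registered as deterministic.
conn.create_function("xpath_string", 2, xpath_string, deterministic=True)

conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO docs (body) VALUES (?)",
    [("<doc><title>alpha</title></doc>",),
     ("<doc><title>beta</title></doc>",)],
)

# The XPath expression 'title' is a constant, so the indexed value depends
# only on the row's XML content.
conn.execute("CREATE INDEX docs_title_idx ON docs (xpath_string('title', body))")

# A query that repeats the same expression can be answered from the index.
rows = conn.execute(
    "SELECT id FROM docs WHERE xpath_string('title', body) = 'beta'"
).fetchall()
print(rows)
```

The index only helps queries that use exactly the indexed expression, which is why the XPath expression has to be known in advance.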
As far as contrib/xml2 is concerned, I'm not going to make any efforts to make the interface compatible because that module has a rather pragmatic design, whereas I'd rather just provide the raw operations that can be assembled easily by the user to achieve some of the things that contrib/xml2 does now. Once some description of transition steps has been developed, I'd deprecate the contrib/xml2 module and probably remove it after 8.3.

In the wiki we have collected some random ideas of other interesting operations on XML types (http://developer.postgresql.org/index.php/XML_Todo, near the bottom). That list at the moment says:

DTD validation
Relax-NG
XSLT
XML Canonical (to compare XML values)
Pretty-printing XML (e.g., indenting)
I've added "Shredding with annotated schemas" to this list (with a brief description of why it could be needed). Also, in the long term I see such items as:

- integration/support in pl/perl and other PLs that can work with XML;
- work with web services (maybe it'd be better to use pl/perl here).

Maybe it is too early to add such things even to the bottom of the Todo list :-)

--
Best regards,
Nikolay