On Sep 3, 2016, at 8:06 AM, Hernán Morales Durand <hernan.mora...@gmail.com>
wrote:
Thank you Monty for the clarification. I should say the original XPath package
was written by Phil Hargett and I just added a couple of methods. Glad you
rewrote the lib!
Cheers,
Hernán
2016-09-03 3:01 GMT-03:00 monty <mon...@programmer.net>:
Hernan, the PharoExtras/XPath repo has a major rewrite of your package. It supports all of XPath 1.0 plus XPath 2.0 extensions such as the element() and attribute() type tests and namespace literals in name tests like '{namespaceURI}localName'. A rewrite was needed because the old lib only implemented a small subset of the spec and would loop forever on some inputs.
Sent: Thursday, September 01, 2016 at 3:56 PM
From: "Hernán Morales Durand" <hernan.mora...@gmail.com>
To: "Any question about pharo is welcome" <pharo-users@lists.pharo.org>
Subject: Re: [Pharo-users] Coding XPath as Smalltalk
2016-09-01 16:51 GMT-03:00 PBKResearch <pe...@pbkresearch.co.uk>:
Hi Hernan
I don’t understand your first question – I can’t see a connection between SPARQL and what I am doing.
You could get the Wiktionary data by querying a SPARQL endpoint http://wiktionary.dbpedia.org/sparql instead of scraping web pages (which seems more difficult).
I downloaded XPath from http://smalltalkhub.com/mc/PharoExtras/XPath/. However, I am probably using a somewhat out-of-date version; I downloaded it about a year ago.
I don't know about that version. I copied an old version from SqueakSource (with permission) and updated it from time to time, but there is not much to it. There is also an XPath2 repository which you may try.
Hernán
Peter
From: Pharo-users [mailto:pharo-users-boun...@lists.pharo.org] On Behalf Of Hernán Morales Durand
Sent: 01 September 2016 18:54
To: Any question about pharo is welcome <pharo-users@lists.pharo.org>
Subject: Re: [Pharo-users] Coding XPath as Smalltalk
Hi Peter,
2016-09-01 10:26 GMT-03:00 PBKResearch <pe...@pbkresearch.co.uk>:
Hello
I am using XPath as a way of dissecting web pages, especially from Wiktionary.
Any specific reason to not use the SPARQL endpoint?
Generally I get good results, but I could get useful extra flexibility by using
the binary Smalltalk operators to represent XPath, as mentioned at the end of
the class comment for XPath. However, the description there is very terse, and
I am having difficulty seeing how to include more complex expressions,
especially attribute tests.
Which XPath version are you using? How did you install it?
I have put some of my XPath expressions through the XPath compiler and looked
at the output, and out of that I have found expressions which work but look
very clumsy. As an example, I have used the fragment:
document xPath: '//div[@id=''catlinks'']//li//text()'
and found that an equivalent is:
document // 'div' ?? [:node :x :y | (node attributeAt: 'id') = 'catlinks'] // 'li' // [:n | n isStringNode].
(I had to put two dummy arguments in the three-argument block to get it to
work.)
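Smalltalk evaluates binary messages strictly left to right, so the chained expression above can be split into intermediate steps without changing what it computes. The following is only a sketch. It assumes `document` is the parsed page from the fragment above and that `//` and `??` behave as they do in the version of the package used here. The temporary names (`divs`, `catlinks`, `items`, `texts`) are made up for illustration.

```smalltalk
| divs catlinks items texts |

"Step 1: every descendant 'div' element of the document."
divs := document // 'div'.

"Step 2: keep only the divs whose id attribute is 'catlinks'.
 The two extra block arguments are the dummies mentioned above;
 they are unused here."
catlinks := divs ?? [:node :x :y | (node attributeAt: 'id') = 'catlinks'].

"Step 3: every descendant 'li' element of those divs."
items := catlinks // 'li'.

"Step 4: every descendant string node of those list items."
texts := items // [:n | n isStringNode].
```

Inspecting each temporary in turn shows which step selects which nodes, which may help when building up more complex expressions.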
Is there a more extensive explanation of the use of these binary operators? If not, could some kind person show me the most concise translation of the sample XPath above, to give me a start in working out more complex cases?
Many thanks for any help.
Peter Kenny