Hi Cédrik
I started out using Soup, but I found that it does what its name suggests and jumbles up the contents of the pages. I now parse the pages with XMLHTMLParser, which preserves the original structure exactly.

The point of XPath is that it is a convenient way of specifying a route through the structure to the desired information. So the XPath I cited says: 'find a DIV node, at any depth, which has id="catlinks"; then find any descendant of it which is an LI node; then find every text node among the descendants of that LI.'

Peter

Hi Peter,

I have never used XPath, so I cannot help there. I just wonder whether you could not use Soup to "dissect" your web pages?

http://www.smalltalkhub.com/#!/~PharoExtras/Soup

HTH,
Cédrik

On 1 Sept 2016, at 15:26, PBKResearch <pe...@pbkresearch.co.uk> wrote:

Hello

I am using XPath as a way of dissecting web pages, especially from Wiktionary. Generally I get good results, but I could get useful extra flexibility by using the binary Smalltalk operators to represent XPath, as mentioned at the end of the class comment for XPath. However, the description there is very terse, and I am having difficulty seeing how to include more complex expressions, especially attribute tests.

I have put some of my XPath expressions through the XPath compiler and looked at the output, and from that I have found expressions which work but look very clumsy. As an example, I have used the fragment:

    document xPath: '//div[@id=''catlinks'']//li//text()'

and found that an equivalent is:

    document //'div' ?? [:node :x :y | (node attributeAt: 'id') = 'catlinks'] //'li' // [:n | n isStringNode].

(I had to put two dummy arguments in the three-argument block to get it to work.)

Is there a more extensive explanation of the use of these binary operators? If not, could some kind person show me the most concise translation of the sample XPath above, to give me a start in working out more complex cases?

Many thanks for any help.

Peter Kenny
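
For anyone following the thread who wants to try the string form of the expression, here is a minimal sketch rather than a definitive recipe. It assumes the XMLParserHTML and XPath packages are loaded in the image. The HTML fragment is made up for illustration and is not real Wiktionary markup, and xPath: is the selector as written in the message above, so its spelling may differ in other versions of the XPath package.

```smalltalk
| html document texts |

"A made-up fragment standing in for a Wiktionary page (illustrative only)."
html := '<html><body>
<div id="catlinks"><ul><li><a href="#">English nouns</a></li><li><a href="#">English verbs</a></li></ul></div>
<div id="other"><ul><li>ignored</li></ul></div>
</body></html>'.

"Parse the page into a DOM tree that keeps the original nesting."
document := XMLHTMLParser parse: html.

"Route: any DIV with id='catlinks', then any LI beneath it,
 then any text node beneath that LI.
 The selector xPath: is taken from the message above (assumption)."
texts := document xPath: '//div[@id=''catlinks'']//li//text()'.

"Open an inspector on the result to see which text nodes were selected."
texts inspect.
```

The intent is that only text under the DIV with id="catlinks" is selected, while the LI inside the other DIV is left out. Inspecting the result is a quick way to check that the route matches what you expect before translating it into the binary-operator form.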