Tx, and one day we can turn it into another little booklet :)

Stef

On Sun, Nov 12, 2017 at 3:04 PM, Alistair Grant <akgrant0...@gmail.com> wrote:
> Hi Stef,
>
> On 12 November 2017 at 14:47, Stephane Ducasse <stepharo.s...@gmail.com>
> wrote:
>> exampleNavigation
>>     | chrome page logger |
>>     logger := InMemoryLogger new.
>>     logger start.
>>     chrome := GoogleChrome new
>>         debugOn;
>>         debugSession;
>>         open;
>>         yourself.
>>     page := chrome tabPages first.
>>     page enablePage.
>>     page enableDOM.
>>     page navigateTo: 'http://pharo.org'.
>>     page getDocument.
>>     page getMissingChildren.
>>     page updateTitle.
>>     logger stop.
>>     ^{ chrome. page. logger. }
>>
>> but in fact I realised that I would like to have a simple doc :)
>>
>> On Sun, Nov 12, 2017 at 2:44 PM, Stephane Ducasse
>> <stepharo.s...@gmail.com> wrote:
>>> Hi Alistair
>>>
>>> this is cool.
>>> Do you have one little example so that we can see how we can use it?
>>>
>>> Stef
>
> Fair enough :-)
>
> I'll try and extend the readme to include some basic documentation.
>
> Cheers,
> Alistair
>
>>> On Sat, Nov 11, 2017 at 4:38 PM, Alistair Grant <akgrant0...@gmail.com>
>>> wrote:
>>>> On 9 November 2017 at 00:00, Kjell Godo <squeakl...@gmail.com> wrote:
>>>>> i like to collect some newspaper comics from an online newspaper
>>>>> but it takes really long to do it by hand
>>>>> i tried Soup but i didn’t get anywhere
>>>>> the pictures were hidden behind a script or something
>>>>> is there anything to do about that?
>>>>
>>>> Most of the web pages I want to scrape use javascript to construct the
>>>> DOM, which makes Soup, XMLHTMLParser, etc. useless.
>>>>
>>>> I've extended Torsten's Pharo-Chrome library and use that to navigate
>>>> the DOM in a way similar to Soup:
>>>>
>>>> https://github.com/akgrant43/Pharo-Chrome
>>>>
>>>> This gets around the issue with javascript since it waits for the
>>>> browser to load the page, run the javascript and construct the DOM.
>>>>
>>>> HTH,
>>>> Alistair
>>>>
>>>>> i don’t want to collect them all
>>>>> i have the XPath .pdf but i haven’t read it yet
>>>>>
>>>>> these browsers seem to gobble up memory
>>>>> and while open they just keep getting bigger till the OS session
>>>>> crashes
>>>>> might there be a browser that is more minimal?
>>>>>
>>>>> Vivaldi seems better at not bloating up RAM