Hi Stef,

On 12 November 2017 at 14:47, Stephane Ducasse <stepharo.s...@gmail.com> wrote:
> exampleNavigation
>     | chrome page logger |
>     logger := InMemoryLogger new.
>     logger start.
>     chrome := GoogleChrome new
>         debugOn;
>         debugSession;
>         open;
>         yourself.
>     page := chrome tabPages first.
>     page enablePage.
>     page enableDOM.
>     page navigateTo: 'http://pharo.org'.
>     page getDocument.
>     page getMissingChildren.
>     page updateTitle.
>     logger stop.
>     ^{ chrome. page. logger. }
>
> but in fact I realised that I would like to have a simple doc :)
>
>
> On Sun, Nov 12, 2017 at 2:44 PM, Stephane Ducasse
> <stepharo.s...@gmail.com> wrote:
>> Hi Alistair
>>
>> this is cool.
>> Do you have one little example so that we can see how we can use it?
>>
>> Stef

Fair enough :-)

I'll try and extend the readme to include some basic documentation.

Cheers,
Alistair



>> On Sat, Nov 11, 2017 at 4:38 PM, Alistair Grant <akgrant0...@gmail.com> 
>> wrote:
>>> On 9 November 2017 at 00:00, Kjell Godo <squeakl...@gmail.com> wrote:
>>>> i like to collect some newspaper comics from an online newspaper
>>>>      but it takes really long to do it by hand
>>>> i tried Soup but i didn’t get anywhere
>>>>      the pictures were hidden behind a script or something
>>>> is there anything to do about that?
>>>
>>> Most of the web pages I want to scrape use javascript to construct the
>>> DOM, which makes Soup, XMLHTMLParser, etc. useless.
>>>
>>> I've extended Torsten's Pharo-Chrome library and use that to navigate
>>> the DOM in a way similar to Soup:
>>>
>>> https://github.com/akgrant43/Pharo-Chrome
>>>
>>> This gets around the issue with javascript since it waits for the
>>> browser to load the page, run the javascript and construct the DOM.
>>>
>>> HTH,
>>> Alistair
>>>
>>>
>>>
>>>>         i don’t want to collect them all
>>>> i have the XPath .pdf but i haven’t read it yet
>>>>
>>>> these browsers seem to gobble up memory
>>>>      and while open they just keep getting bigger till the OS session crashes
>>>>      might there be a browser that is more minimal?
>>>>
>>>> Vivaldi seems better at not bloating up RAM
>>>
>
