Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-08 Thread Ben Coman
On Wed, 8 Jan 2020 at 06:32, LawsonEnglish wrote:
> “Simple inspect” works fine.
>
> The trace is:
>
> UndefinedObject(Object)>>doesNotUnderstand: #new
> Message>>sentTo:
> UndefinedObject(Object)>>doesNotUnderstand: #new
> XMLDocumentHighlightDefaults class(XMLHighlightDefaults class)>>textCol

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-07 Thread LawsonEnglish
te which means
>>>
>>> - "ingredientsXML" is defined as a workspace variable as soon as you
>>> evaluate
>>> - the contents of "ingredientsXML" is preserved over different
>>> evaluations within the workspace / playground
>

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-07 Thread Sven Van Caekenberghe
o Torsten – I agree I was slipshod in my drafting – I was in a hurry.
>> Instead of saying ‘can screw things up’ I should have said ‘can produce
>> counter-intuitive results’, as exemplified by the fact that, in your first
>> example, ‘ingredientsXML’ can mean differ

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-07 Thread LawsonEnglish
r a line at a time.
>
> From: Pharo-users On Behalf Of LawsonEnglish
> Sent: 07 January 2020 20:55
> To: Any question about pharo is welcome
> Subject: Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub
>
> I deleted the playground and entered the text thusl

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-07 Thread Sven Van Caekenberghe
playground)
>> and when you later want to use it again you can just inspect it or
>> evaluate the second line in the same playground.
>>
>> If you like you can open a second playground which can have its own
>> "ingredientsXML" workspace variable.
>

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-07 Thread LawsonEnglish
the second line in the same playground.
>
> If you like you can open a second playground which can have its own
> "ingredientsXML" workspace variable.
>
> Workspace variables (or "playground variables") are convenient for
> experimenting - as they are p

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-07 Thread PBKResearch
epending on whether you execute it all in one go or a line at a time.

From: Pharo-users On Behalf Of LawsonEnglish
Sent: 07 January 2020 20:55
To: Any question about pharo is welcome
Subject: Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

I deleted the playground and ente

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-07 Thread LawsonEnglish
ybe the document could not be retrieved on your machine.
>
> Bye
> T.
>
>> Sent: Tuesday, 07 January 2020 at 04:42
>> From: "LawsonEnglish"
>> To: pharo-users@lists.pharo.org
>> Subject: Re: [Pharo-users] [ANN] XMLParserHTML

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-07 Thread Torsten Bergmann
e convenient for experimenting - as they are preserved - but yes they might confuse you when you can't remember what was done with them last.

Bye
T.

> Sent: Tuesday, 07 January 2020 at 09:55
> From: "PBKResearch"
> To: "'Any question about pharo is welcome'

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-07 Thread PBKResearch
-users On Behalf Of Torsten Bergmann
Sent: 07 January 2020 07:47
To: pharo-users@lists.pharo.org
Cc: pharo-users@lists.pharo.org
Subject: Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

Works without a problem (Pharo 8 on Windows), see attached. So it looks like a local problem.

Just

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-06 Thread LawsonEnglish
Torsten Bergmann wrote
> Hi,
>
> You can load using
>
>    Metacello new
>      baseline: 'XMLParserHTML';
>      repository: 'github://pharo-contributions/XML-XMLParserHTML/src';
>      load.
>
> Bye
> T.

Hi, I'm trying to use the sample code in the pharo screen scraping booklet - h

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2019-11-30 Thread PBKResearch
:43
To: pharo-users@lists.pharo.org
Subject: Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

cedreek wrote
> To me, far better than using Soup.

Ah, interesting! I use Soup almost exclusively. What did you find superior about XMLParserHTML? I may give it a try...

cedreek wrote
>

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2019-11-30 Thread Cédrick Béler
I couldn’t get it from Zn as (I think) there are some JS libs that defer the full rendering. I have the same problem with a site in France (leboncoin). They use https://datadome.co to complicate web scraping. So a headless browser is the only solution I know.

Cheers,
C

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2019-11-30 Thread Esteban Maringolo
Why use Chrome instead of ZnClient? To get a "real" render of the content (including JS and whatnot)?

Regards!

Esteban A. Maringolo

On Sat, Nov 30, 2019 at 8:11 PM Cédrick Béler wrote:
> >
> > Also interesting! Any publicly available examples? How does one load "Google
> > chrome pharo in

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2019-11-30 Thread Cédrick Béler
> Also interesting! Any publicly available examples? How does one load "Google
> chrome pharo integration"?

https://github.com/astares/Pharo-Chrome
https://github.com/akgrant43/Pharo-Chrome

Cheers,
Cédrick

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2019-11-30 Thread Cédrick Béler
> cedreek wrote
>> To me, far better than using Soup.
>
> Ah, interesting! I use Soup almost exclusively. What did you find superior
> about XMLParserHTML? I may give it a try...
>

It’s mainly XPath, which I find easier than navigating the HTML tree with Soup or even the XMLHTMLParser. I us

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2019-11-30 Thread Sean P. DeNigris
cedreek wrote
> To me, far better than using Soup.

Ah, interesting! I use Soup almost exclusively. What did you find superior about XMLParserHTML? I may give it a try...

cedreek wrote
> Google chrome pharo integration helps top to scrap complex full JS web
> site like google ;)

Also interestin

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2019-11-29 Thread Esteban Maringolo
Great! I just added a link to the README.md of the project and created a PR, because it is very likely that if you're parsing HTML you're doing some scraping. :-)

Esteban A. Maringolo

On Fri, Nov 29, 2019 at 2:18 PM Cédrick Béler wrote:
>
> Stef and other wrote this book a while ago:
>
> http

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2019-11-29 Thread Cédrick Béler
Stef and others wrote this book a while ago:

http://books.pharo.org/booklet-Scraping/html/scrapingbook.html

Basically XMLHTMLParser + XPath.

To me, far better than using Soup.

The Google Chrome Pharo integration also helps to scrape complex full-JS web sites like Google ;)

Cheers,
Cedrick

> Le 29

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2019-11-29 Thread Esteban Maringolo
Thank you Torsten, I wasn't aware of this tool. I'm already using it to scrape content from a website and feed a Pharo-driven system :)

The XML integration in the Inspector is great too.

Regards!

Esteban A. Maringolo

On Tue, Nov 19, 2019 at 8:40 AM Torsten Bergmann wrote:
>
> Hi,
>
> the STHub

[Pharo-users] [ANN] XMLParserHTML moved to GitHub

2019-11-19 Thread Torsten Bergmann
Hi,

the STHub -> PharoExtras project "XMLParserHTML" has now been moved from

   http://smalltalkhub.com/#!/~PharoExtras/XMLParserHTML

to

   https://github.com/pharo-contributions/XML-XMLParserHTML

including the FULL HISTORY.

The old STHub repo was marked as obsolete - but is linking to the new one.

I've a