2017-07-29 22:16 GMT+02:00 Sannyasin Brahmanathaswami:
> you want to extract from the <head> of the document the openGraph tags
>
> <meta property="og:site_name" content="YouTube">
> <meta property="og:url" content="https://www.youtube.com/user/kauaiaadheenam">
> <meta property="og:title" content="Kauai's Hindu Monastery">
> <meta property="og:image" content="https://yt3.ggpht.com/-p766LczvKHY/AAAAAAAAAAI/AAAAAAAAAAA/SIu6ZAJbMDc/s900-c-k-no-mo-rj-c0xffffff/photo.jpg">
> <meta property="og:description" content="{where hinduism meets the future}">
>
> c) you also cannot depend on the output being line delimited, because some
> CMS's delivery "agents" will minimize this to
>
> <meta property="og:site_name" content="YouTube"><meta property="og:url" content="https://www.youtube.com/user/kauaiaadheenam"><meta property="og:title" content="Kauai's Hindu Monastery"><meta property="og:image" content="https://yt3.ggpht.com/-p766LczvKHY/AAAAAAAAAAI/AAAAAAAAAAA/SIu6ZAJbMDc/s900-c-k-no-mo-rj-c0xffffff/photo.jpg"><meta property="og:description" content="{where hinduism meets the future}">
>
> Has anyone rolled up a parser/scraper for this? Looks like "idiot simple
> text extraction"

Hi,

Here is a quickly written piece of code, tested only on your URL.
I wrote this regex based on the data you provided in your email.

> I see the other thread on scraping pages generated by JS and suspect
> perhaps some wizard among us already has this done…would save a bit of time
> here.
>
> BR
>

Every time you see any kind of scraping/search/extraction/transformation
in JS, you can be sure it's possible to do it in LiveCode.

So, here is the code:

   local Rx, _Html, OG, p1, p2, p3, p4

   -- fetch the page source
   put URL "https://www.youtube.com/user/kauaiaadheenam" into _Html

   -- \x{22} is the double-quote character;
   -- group 1 is the og: property name, group 2 is its content
   put "(?ms)<meta\s+property=\x{22}og:(.+?)\x{22}\s+content=\x{22}(.+?)\x{22}>" into Rx

   -- matchChunk returns the start/end character positions of each group
   repeat while matchChunk( _Html, Rx, p1, p2, p3, p4 )
      put char p3 to p4 of _Html into OG[ char p1 to p2 of _Html ]
      -- drop the text already scanned so the next pass finds the next tag
      delete char 1 to p4 of _Html
   end repeat

and you can test it this way:

   combine OG using return and ":"
   put OG into fld 1

HTH, and feel free to ask any questions...

Kind regards,

Thierry

--
Thierry Douez - sunny-tdz.com
sunnYrex - sunnYtext2speech - sunnYperl - sunnYmidi - sunnYmage
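
A possible variation on the snippet above, offered only as an untested sketch: the same matchChunk loop wrapped in a function, with the lazy `.+?` groups swapped for negated character classes so a match cannot run past a closing quote into a later tag, and with the trailing `>` dropped from the pattern so a self-closing ` />` does not block a match. The handler name extractOpenGraph and the t/p variable names are hypothetical, chosen here for illustration; it still assumes that property comes before content and that both use double quotes, as in the sample tags quoted in the question.

```livecode
-- Hypothetical helper (not from the original post): returns an array
-- keyed by og: property name, e.g. tTags["title"], tTags["image"].
function extractOpenGraph pHtml
   local tRegex, tTags, tKeyStart, tKeyEnd, tValStart, tValEnd

   -- \x{22} is the double-quote character.
   -- [^\x{22}]+ stops at the next quote, so a match stays inside one attribute.
   put "<meta\s+property=\x{22}og:([^\x{22}]+)\x{22}\s+content=\x{22}([^\x{22}]+)\x{22}" into tRegex

   repeat while matchChunk(pHtml, tRegex, tKeyStart, tKeyEnd, tValStart, tValEnd)
      put char tValStart to tValEnd of pHtml into tTags[char tKeyStart to tKeyEnd of pHtml]
      -- pHtml is a local copy, so trimming it does not affect the caller's text
      delete char 1 to tValEnd of pHtml
   end repeat

   return tTags
end extractOpenGraph

-- Example use, e.g. in a button script:
on mouseUp
   local tTags
   put extractOpenGraph(URL "https://www.youtube.com/user/kauaiaadheenam") into tTags
   combine tTags using return and ":"
   put tTags into field 1
end mouseUp
```

Because the pattern never relies on line breaks, it should treat the line-delimited and the minimized forms of the tags the same way.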