GitHub user psa-neutronian closed a discussion: JavaScript pages
I'm attempting to use Storm Crawler for JavaScript pages and am having some difficulty creating a generalised solution. It also seems that Storm Crawler doesn't play nicely with the WebDriver for Firefox, aka GeckoDriver (or perhaps with non-Puppeteer WebDrivers in general; I haven't gotten that far yet). So, I have a two-part question:

1. Has anyone else successfully used Storm Crawler with GeckoDriver?
2. Any suggestions on where to start for getting Storm Crawler to do a generalised search through rendered HTML for a link matching a description (e.g. "about us" or "terms and conditions"), then following that link, which may be an XHR, and grabbing said text?

For links that are plain paths to files, grabbing the text should be fairly easy (it's finding the links when they're JavaScript-based that's challenging); for overlays, I'm completely stumped.

GitHub link: https://github.com/apache/incubator-stormcrawler/discussions/874
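As a starting point for question 2, the "search rendered HTML for a link matching a description" part can be prototyped independently of Storm Crawler. The sketch below is a hypothetical helper (not part of Storm Crawler's API): it assumes you already have the post-JavaScript page source, e.g. the string returned by a WebDriver after rendering, and scans it with a rough regex for `<a>` elements whose visible text contains one of the target descriptions. A real implementation would use a proper HTML parser rather than a regex, and in a Storm Crawler topology this kind of logic would most naturally live in a custom ParseFilter applied to the rendered content.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: find hrefs of anchors whose visible text matches
// a description such as "about us" or "terms and conditions".
public class LinkFinder {

    // Rough sketch: matches <a ... href="..." ...>inner</a>; an HTML
    // parser would be more robust against unquoted/odd attributes.
    private static final Pattern ANCHOR = Pattern.compile(
            "<a\\b[^>]*href\\s*=\\s*[\"']([^\"']*)[\"'][^>]*>(.*?)</a>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns href values of anchors whose text contains any description. */
    public static List<String> findLinks(String renderedHtml,
                                         List<String> descriptions) {
        List<String> hits = new ArrayList<>();
        Matcher m = ANCHOR.matcher(renderedHtml);
        while (m.find()) {
            // Strip nested tags from the anchor's inner HTML to get the
            // plain visible text, then normalise whitespace and case.
            String text = m.group(2).replaceAll("<[^>]+>", " ")
                                    .replaceAll("\\s+", " ")
                                    .trim().toLowerCase();
            for (String d : descriptions) {
                if (text.contains(d.toLowerCase())) {
                    hits.add(m.group(1)); // the href value
                    break;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String html = "<html><body>"
                + "<a href=\"/about\"><span>About Us</span></a>"
                + "<a href=\"javascript:void(0)\" onclick=\"openTerms()\">"
                + "Terms and Conditions</a>"
                + "</body></html>";
        // Note the second hit: a JavaScript-based link still surfaces here,
        // though following it requires driving the browser, not fetching a URL.
        System.out.println(findLinks(html, List.of("about us", "terms")));
    }
}
```

Hits whose href is a `javascript:` pseudo-URL (or empty, with an `onclick` handler) are exactly the hard cases from the question: they can be detected this way, but following them means clicking the element via the WebDriver and re-reading the rendered DOM, rather than scheduling the href as a new URL.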