jnioche commented on PR #1898: URL: https://github.com/apache/stormcrawler/pull/1898#issuecomment-4377682358
This is great. Quick question: why not have the redirection bolt point back to the FetcherBolt instead, so that the URL gets refetched straight away? Obviously we would have to make sure it does not get into an endless loop, by checking whether it has already been fetched by Playwright. Should any outlinks with the same hostname inherit the flag? Should we have a URL filter to that effect?
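
To make the two questions concrete, here is a rough, language-neutral sketch in Python (StormCrawler itself is Java, and none of this is taken from the PR). The metadata key `use.playwright`, the target names `"fetcher"` and `"status"`, and both function names are hypothetical and chosen only for illustration. The first function shows the loop guard: a URL is sent back to the fetcher at most once, because the flag is set on the first pass and checked on the next. The second shows one possible way for outlinks on the same hostname to inherit the flag.

```python
from urllib.parse import urlsplit

# Hypothetical metadata key; the key actually used in the PR may differ.
PLAYWRIGHT_FLAG = "use.playwright"


def route_for_refetch(metadata):
    """Decide where a URL goes next and return (target, updated_metadata).

    If the flag is already set, the URL has been sent to the fetcher once
    before, so it is not sent again; this is the guard against an endless loop.
    """
    if metadata.get(PLAYWRIGHT_FLAG) == "true":
        return "status", dict(metadata)
    updated = dict(metadata)
    updated[PLAYWRIGHT_FLAG] = "true"
    return "fetcher", updated


def inherit_flag(parent_url, parent_metadata, outlink_url):
    """Return metadata for an outlink, copying the flag only when the
    outlink has the same hostname as its flagged parent."""
    same_host = urlsplit(parent_url).hostname == urlsplit(outlink_url).hostname
    if same_host and parent_metadata.get(PLAYWRIGHT_FLAG) == "true":
        return {PLAYWRIGHT_FLAG: "true"}
    return {}
```

In this sketch the inheritance logic is a plain function of the parent and outlink URLs, which is roughly the shape a URL filter would need if the flag were propagated at filtering time.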
