klockla opened a new issue, #1353: URL: https://github.com/apache/incubator-stormcrawler/issues/1353
The URLFrontier Spout (https://github.com/apache/incubator-stormcrawler/blob/main/external/urlfrontier/src/main/java/org/apache/stormcrawler/urlfrontier/Spout.java) does not take into account the crawl ID that can be specified in the configuration parameters (URLFRONTIER_CRAWL_ID_KEY = "urlfrontier.crawlid", defined in org.apache.stormcrawler.urlfrontier.Constants). This results in a mix of URLs coming from distinct frontiers in URLFrontier.
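For illustration only, here is a minimal sketch of how the crawl ID might be looked up from the topology configuration before the Spout requests URLs. The key string is the one quoted above; the class name, the resolveCrawlId helper and the "DEFAULT" fallback are assumptions made for this sketch and are not taken from the StormCrawler or URLFrontier code. The sketch deliberately stops at the lookup and does not show how the value would be passed to URLFrontier.

```java
import java.util.HashMap;
import java.util.Map;

public class CrawlIdSketch {
    // Key quoted in the issue (Constants.URLFRONTIER_CRAWL_ID_KEY).
    static final String URLFRONTIER_CRAWL_ID_KEY = "urlfrontier.crawlid";

    // Hypothetical fallback, assumed for this sketch only.
    static final String FALLBACK_CRAWL_ID = "DEFAULT";

    // Returns the configured crawl ID, or the fallback when the key is
    // missing or blank.
    static String resolveCrawlId(Map<String, Object> conf) {
        Object value = conf.get(URLFRONTIER_CRAWL_ID_KEY);
        if (value == null) {
            return FALLBACK_CRAWL_ID;
        }
        String crawlId = value.toString().trim();
        return crawlId.isEmpty() ? FALLBACK_CRAWL_ID : crawlId;
    }

    public static void main(String[] args) {
        Map<String, Object> conf = new HashMap<>();
        conf.put(URLFRONTIER_CRAWL_ID_KEY, "news-crawl");
        // The resolved value would then be attached to every request the
        // Spout sends, so that only URLs from this crawl are returned.
        System.out.println(resolveCrawlId(conf));
    }
}
```

With a value resolved once when the Spout is opened, each subsequent request could carry the same crawl ID, which is the behaviour the issue asks for.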