Hi,
today I went through the Wiki and updated all links still pointing to the old
repository or to the old Java package path ("com/digitalpebble/storm-crawler"
instead of "org/apache/stormcrawler").
After running a link checker, I've also fixed a few other broken links (e.g. to
Guava's Javadoc).
However, I didn't touch links below "Resources": "Powered-By" and
"Presentations". Shall I?
Two other points also require an update:
1. the CrawlTopology was removed [1] but is still mentioned in the Wiki:
- I'll point to crawler.flux instead
- and link to the archetype README [2]
2. similarly, for the Elasticsearch module [3]:
Depending on the context,
- I'll point to urlfrontier instead
- possibly also to OpenSearch
- if possible, linking to the corresponding READMEs [4,5]
to avoid too much duplicated information
Just asking whether this is ok. Thanks for any comments or recommendations!
I'll also check whether any of the updates apply to the StormCrawler
code itself. For those, I'll open issues / PRs. :-)
Best,
Sebastian
[1] https://github.com/apache/stormcrawler/issues/1401
[2] https://github.com/apache/stormcrawler/blob/main/archetype/src/main/resources/archetype-resources/README.md