GitHub user rzo1 added a comment to the discussion: Don't index sitemap in opensearch

Yes, you are right. The current implementation is a bit weird in that sense.

From my POV, filtering is exclusion-based, not inclusion-based, i.e. “exclude documents matching these conditions”.

That means:
- If a document has key=value matching the filter, it should be filtered out
- If the key is not present, the document should not be filtered
- Multiple filters should be supported (as the docs claim); I guess it would be **or** between multiple filters.

For each filter entry:
- If key not present -> ignore
- If key present and value matches -> reject document

In that sense, I think we should fix it (or introduce a new config option to stay backward compatible).

GitHub link: https://github.com/apache/stormcrawler/discussions/1803#discussioncomment-15744130
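
To make the proposed semantics concrete, here is a minimal sketch of the exclusion rules described above. It is not the current StormCrawler code: the class `ExclusionFilterSketch`, the method `shouldReject`, and the metadata keys `isSitemap` and `noindex` are hypothetical names chosen for illustration, and document metadata is modelled as a plain `Map<String, List<String>>` rather than the project's own metadata type.

```java
import java.util.List;
import java.util.Map;

// Hypothetical illustration of exclusion-based filtering; not StormCrawler's API.
public class ExclusionFilterSketch {

    /**
     * Returns true when the document should be rejected (not indexed).
     * Filter entries are combined with OR: one matching entry is enough.
     */
    public static boolean shouldReject(Map<String, List<String>> metadata,
                                       Map<String, String> filters) {
        for (Map.Entry<String, String> filter : filters.entrySet()) {
            List<String> values = metadata.get(filter.getKey());
            if (values == null) {
                // Key not present: this filter entry is ignored.
                continue;
            }
            if (values.contains(filter.getValue())) {
                // Key present and value matches: reject the document.
                return true;
            }
        }
        // No filter entry matched, so the document is kept.
        return false;
    }

    public static void main(String[] args) {
        // Assumed filter configuration: two entries, OR-ed together.
        Map<String, String> filters = Map.of("isSitemap", "true", "noindex", "true");

        Map<String, List<String>> sitemap = Map.of("isSitemap", List.of("true"));
        Map<String, List<String>> page = Map.of("isSitemap", List.of("false"));
        Map<String, List<String>> bare = Map.of();

        System.out.println(shouldReject(sitemap, filters)); // true: key present, value matches
        System.out.println(shouldReject(page, filters));    // false: key present, value differs
        System.out.println(shouldReject(bare, filters));    // false: key not present
    }
}
```

With these rules a document lacking the key is always kept, which is the behaviour the comment argues for; a backward-compatible variant could switch between this and the existing behaviour via a separate config option.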
