GitHub user rzo1 added a comment to the discussion: Don't index sitemap in 
opensearch

Yes, you are right. The current implementation is a bit weird in that sense. 

>From my POV, filtering is exclusion-based, not inclusion-based, i.e. “exclude 
>documents matching these conditions”.

That means:
- If a document has key=value matching the filter, it should be filtered out
- If the key is not present, the document should not be filtered
- Multiple filters should be supported (as the docs claim); I guess they would 
be combined with **or**, i.e. a match on any one filter excludes the document.

For each filter entry:
- If key not present -> ignore
- If key present and value matches -> reject document
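
To make the proposed semantics concrete, here is a minimal sketch in plain Java. It is not the current StormCrawler code: the class name `ExclusionFilterSketch`, the method `shouldReject`, and the representation of filters as key/value entries are all assumptions made for illustration, as is the `isSitemap` key used in the example.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch of exclusion-based filtering; names are illustrative
// and do not correspond to StormCrawler's actual classes or config keys.
public class ExclusionFilterSketch {

    /**
     * Returns true if the document should be rejected, i.e. at least one
     * filter's key is present in the metadata AND one of its values matches.
     * Filters are combined with OR.
     */
    public static boolean shouldReject(
            Map<String, List<String>> metadata,
            List<Map.Entry<String, String>> filters) {
        for (Map.Entry<String, String> filter : filters) {
            List<String> values = metadata.get(filter.getKey());
            if (values == null) {
                continue; // key not present -> ignore this filter
            }
            if (values.contains(filter.getValue())) {
                return true; // key present and value matches -> reject document
            }
        }
        return false; // no filter matched -> keep the document
    }

    public static void main(String[] args) {
        // Assumed example filter: exclude documents flagged as sitemaps.
        List<Map.Entry<String, String>> filters =
                List.of(Map.entry("isSitemap", "true"));

        Map<String, List<String>> sitemapDoc = Map.of("isSitemap", List.of("true"));
        Map<String, List<String>> plainDoc = Map.of("title", List.of("Home"));

        System.out.println(shouldReject(sitemapDoc, filters)); // true
        System.out.println(shouldReject(plainDoc, filters));   // false
    }
}
```

With this shape, a document that lacks the key passes through untouched, which is the behaviour described above.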

In that sense, I think we should fix it (or introduce a new config option 
to stay backward compatible).

GitHub link: 
https://github.com/apache/stormcrawler/discussions/1803#discussioncomment-15744130
