Thanks, but I'm not looking for de-duplication while adding documents, but
for de-duplication while querying.
There is DuplicateFilter in the contrib lib, but filters are not used
anymore in newer Lucene versions, so no luck there... :(
I assume I would maybe need to implement my own Collector, but that seems
to me a fairly advanced thing to do, so if anyone has a suggestion...
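One way to approximate query-time de-duplication without a custom Collector might be to fetch more hits than needed and drop the duplicates afterwards in application code. Below is a minimal sketch in plain Java, assuming the hits are already in rank order and that the value of the uniqueness field can be read for each hit. The class name FirstPerKey and the method firstPerKey are made up for illustration and are not part of Lucene.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper, not a Lucene class: post-filters an already ranked hit list.
class FirstPerKey {

    /**
     * Returns the hits in their original order, keeping only the first hit
     * seen for each key and dropping every later hit with the same key.
     *
     * @param rankedHits hits in rank order (best first)
     * @param keyOf      extracts the value of the uniqueness field from a hit
     */
    static <T, K> List<T> firstPerKey(List<T> rankedHits,
                                      Function<? super T, ? extends K> keyOf) {
        Set<K> seen = new HashSet<>();
        List<T> unique = new ArrayList<>();
        for (T hit : rankedHits) {
            // Set.add returns false when the key was already present.
            if (seen.add(keyOf.apply(hit))) {
                unique.add(hit);
            }
        }
        return unique;
    }
}
```

With hits represented as "id|key" strings, a call such as FirstPerKey.firstPerKey(hits, s -> s.substring(s.indexOf('|') + 1)) keeps one hit per key. The trade-off is that total hit counts and paging become approximate, because duplicates are only removed after the search has run.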
On 09/21/2016 05:40 AM, Đạt Cao Mạnh wrote:
Solr already supports de-duplication when adding new documents. You can
refer to the doc at
https://cwiki.apache.org/confluence/display/solr/De-Duplication
On Tue, Sep 20, 2016 at 12:18 PM Vjeran Marcinko <
vjeran.marci...@email.t-com.hr> wrote:
Hello,
I'm pretty much a Lucene newbie, so I'm looking for some short guidelines on
how to implement duplicate document filtering based on a field that
defines uniqueness, where the first document stays and the other
duplicates are filtered out.
I know a 3rd party contrib lib for that existed before, but
it has been abandoned/deprecated in these newer versions of Lucene.
Regards,
Vjeran
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org