Unfortunately, in my architecture I cannot rely on a database or on an
updated/created time field. There is a potentially infinite stream of
documents with a possibly huge amount of duplication.
So avoiding the indexing of duplicate documents should (I suppose) improve
performance.
On Fri, 5 A
——
At this point it would be interesting to see how this Processor would
increase the indexing performance when you have many duplicates
- When it comes to indexing performance, a duplicate is no different from
a new document. The original is marked as deleted, and the new one
replaces
Hi Koji, thank you so much for the details.
At first glance, looking at the Javadoc, I didn't realize two things: I can use
SignatureUpdateProcessorFactory on a signatureField different from 'id',
and also, very importantly, that there is an “overwriteDupes” parameter.
In my current schema I cannot ch
Hi Vincenzo,
I see. Then I still think SignatureUpdateProcessorFactory is the one you are
looking for.
I tried to look for an explanation of how it works in its Javadoc and the Solr
Ref Guide, but no luck.
Then I found a good one, which was written by the contributor when SignatureUpdateProcessorFac
I mean, the problem I need to solve is how to avoid a second update when
there are no changes in the document; in other words, to update a document
only if one or more fields differ from the stored document.
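
To make the idea concrete, here is a rough client-side sketch in Python of "update only if something changed". It is not how SignatureUpdateProcessorFactory is implemented; it only illustrates the signature comparison. The names `signature`, `ChangeFilter` and `needs_update` are made up for this example, and the in-memory dictionary is an assumption that would not scale to an unbounded stream without some eviction policy or external store.

```python
import hashlib
import json


def signature(doc, fields):
    """Hash the chosen fields in a fixed order, so equal content gives an equal digest."""
    payload = json.dumps([[f, doc.get(f)] for f in sorted(fields)], ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ChangeFilter:
    """Hypothetical helper: remembers the last signature seen for each document id."""

    def __init__(self, fields):
        self.fields = list(fields)
        self.last_seen = {}  # id -> signature; grows with the number of distinct ids

    def needs_update(self, doc):
        sig = signature(doc, self.fields)
        if self.last_seen.get(doc["id"]) == sig:
            return False  # same content as last time: skip the update
        self.last_seen[doc["id"]] = sig
        return True


if __name__ == "__main__":
    change_filter = ChangeFilter(["title", "body"])
    stream = [
        {"id": "1", "title": "hello", "body": "world"},
        {"id": "1", "title": "hello", "body": "world"},    # exact duplicate
        {"id": "1", "title": "hello", "body": "world!"},   # one field changed
    ]
    to_index = [d for d in stream if change_filter.needs_update(d)]
    print(len(to_index), "of", len(stream), "documents would be sent")
```

Only the fields passed to `ChangeFilter` take part in the comparison, so a change in any other field would not trigger an update in this sketch.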
On Tue, Aug 2, 2022 at 6:16 AM Koji Sekiguchi
wrote:
> Hi Vincenzo,
>
> I cannot underst
Hi Vincenzo,
I cannot understand what "the second update" means...
Koji
On 2022/08/02 0:39, Vincenzo D'Amore wrote:
Koji, on second thought, this SignatureUpdateProcessorFactory does not
avoid the second update...
On Mon, Aug 1, 2022 at 5:36 PM Vincenzo D'Amore wrote:
Hi Koji, thanks! It i
Sorry, Vincenzo. I have no idea. Don't hesitate to post the answer if you
find out.
On Tue, Aug 2, 2022 at 1:50 AM Vincenzo D'Amore wrote:
> Thanks for sharing this Mikhail.
> Do you know how big the overhead is for Solr in handling documents that do
> not have a new version?
> For example, we
Thanks for sharing this Mikhail.
Do you know how big the overhead is for Solr in handling documents that do
not have a new version?
For example, we have to update ten thousand documents, but only 100 of them
have a newer version.
How does Solr behave?
On Sun, Jul 31, 2022 at 2:16 AM Mikhail Khludn
Koji, on second thought, this SignatureUpdateProcessorFactory does not
avoid the second update...
On Mon, Aug 1, 2022 at 5:36 PM Vincenzo D'Amore wrote:
> Hi Koji, thanks! It is exactly what I was looking for!
>
> On Mon, Aug 1, 2022 at 4:28 AM Koji Sekiguchi
> wrote:
>
>> Hi Vincenzo,
>>
>> I
Hi Koji, thanks! It is exactly what I was looking for!
On Mon, Aug 1, 2022 at 4:28 AM Koji Sekiguchi
wrote:
> Hi Vincenzo,
>
> I think SignatureUpdateProcessor is what you are looking for.
>
>
> https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/update/processor/Signatur
Hi Vincenzo,
I think SignatureUpdateProcessor is what you are looking for.
https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/update/processor/SignatureUpdateProcessorFactory.java
Koji
On 2022/07/30 18:41, Vincenzo D'Amore wrote:
Hi all,
As far as I know it is not po
Hi, Vincenzo.
I can only think of version control by checking a particular field.
https://solr.apache.org/guide/solr/latest/indexing-guide/partial-document-updates.html#document-centric-versioning-constraints
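
As a rough illustration of what "version control via a particular field" means, here is a small in-memory model in Python. It is only a sketch of the general rule (accept an update when its version value is greater than the stored one), not Solr's actual code. The function name `apply_batch` and the field name `my_version_l` are assumptions made for the example.

```python
def apply_batch(stored_versions, batch, version_field="my_version_l"):
    """Accept a document only if its version is newer than the stored one.

    stored_versions maps document id -> last accepted version and is updated in place.
    Returns two lists of ids: (accepted, skipped).
    """
    accepted, skipped = [], []
    for doc in batch:
        current = stored_versions.get(doc["id"])
        incoming = doc[version_field]
        if current is None or incoming > current:
            stored_versions[doc["id"]] = incoming
            accepted.append(doc["id"])
        else:
            skipped.append(doc["id"])  # same or older version: nothing to do
    return accepted, skipped


if __name__ == "__main__":
    stored = {"a": 1, "b": 5}
    batch = [
        {"id": "a", "my_version_l": 2},  # newer than stored
        {"id": "b", "my_version_l": 5},  # same as stored
        {"id": "c", "my_version_l": 1},  # not seen before
    ]
    accepted, skipped = apply_batch(stored, batch)
    print("accepted:", accepted, "skipped:", skipped)
```

Note that this approach needs a trustworthy version value on every document, which may not be available in every setup.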
On Sun, Jul 31, 2022 at 2:52 AM Vincenzo D'Amore wrote:
> Hi all,
>
> As far as I know
12 matches