You could use the UUIDUpdateProcessorFactory to automatically add a UUID to 
each document and use that as the tie-breaker field.

https://solr.apache.org/guide/8_1/update-request-processors.html#uuidupdateprocessorfactory
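
A minimal sketch of how the sort parameter might look once such a field exists. The field name `doc_uuid` is hypothetical (use whatever name the update processor is configured to populate), and `random_1234` and `price` are taken from the example quoted below:

```python
from urllib.parse import urlencode

def build_sort(seed: int, tie_breaker: str = "doc_uuid") -> str:
    """Build a Solr sort clause: random order first, then price,
    then a unique field so that no two documents ever tie.
    `doc_uuid` is a hypothetical field name, not a Solr default."""
    return f"random_{seed} desc, price desc, {tie_breaker} asc"

# Query parameters for a select request; the handler URL is left out on purpose.
params = {"q": "*:*", "sort": build_sort(1234), "rows": 10}
query_string = urlencode(params)
print(query_string)
```

Putting the unique field last means it only decides the order when every earlier criterion ties.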

The probability of a UUID collision is well understood, and collisions are highly unlikely.

https://en.wikipedia.org/wiki/Universally_unique_identifier#Collisions
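
For a rough feel of the numbers, here is a back-of-the-envelope birthday-problem sketch. It assumes the sort values are uniformly distributed over the full range, which is an assumption and not something verified against RandomSortField (it hashes rather than draws from a generator, so the real tie rate may differ). It compares a 32-bit int with the 122 random bits of a version 4 UUID:

```python
import math

def expected_tied_pairs(n_docs: int, bits: int) -> float:
    """Expected number of document pairs sharing a value,
    assuming values are uniform over 2**bits possibilities."""
    return n_docs * (n_docs - 1) / 2 / 2**bits

def prob_any_tie(n_docs: int, bits: int) -> float:
    """Birthday-bound approximation of P(at least one tie):
    1 - exp(-expected_tied_pairs)."""
    return -math.expm1(-expected_tied_pairs(n_docs, bits))

for n in (10_000, 1_000_000, 10_000_000):
    print(
        f"n={n:>10,}  "
        f"32-bit: pairs={expected_tied_pairs(n, 32):.4g}, p={prob_any_tie(n, 32):.4g}  "
        f"122-bit: p={prob_any_tie(n, 122):.4g}"
    )
```

Under that uniformity assumption, a result set of a million documents would be expected to contain on the order of a hundred tied pairs with a 32-bit value, so a later sort criterion such as "price DESC" would occasionally matter; with a UUID it effectively never would.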



> On 31 Aug 2021, at 14:04, rgamarra <rgama...@gmail.com> wrote:
> 
> hi,
> 
>> Random ≠ unique.
> 
> Agree. They are not the same. I don't want a tie breaker, I want to know
> how many ties I would face.
> 
> The implementation where it's being used has some other (posterior) sorting
> criteria. So the question can be rephrased as whether posterior orders have
> any effect or not.
> 
> For example, given
> 
> sort= random_1234 DESC, price DESC
> 
> At the end of the day, does the "price DESC" have any effect (which
> translates to how often ties in the random do happen)?
> 
> I took a glimpse at
> https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/schema/RandomSortField.java
> and I conclude that
> - an int is being used.
> - it's a hash of the #doc + seed, rather than a random number generator with
> a certain distribution.
> 
> Best. Thanks.
> 
> 
> --
> Rodolfo Federico Gamarra
> 
> 
> On Tue, Aug 31, 2021 at 3:00 AM Thomas Corthals <tho...@klascement.net>
> wrote:
> 
>> Hi Rodolfo
>> 
>> Random ≠ unique. If you really need a tie breaker, you'll have to sort on
>> the uniqueKey field.
>> 
>> What is your use case here? When using a cursor, sorting on a random field
>> will yield confusing results.
>> 
>> Thomas
>> 
>> On Mon 30 Aug 2021 at 17:33, rgamarra <rgama...@gmail.com> wrote:
>> 
>>> Hi there! I'm using random fields (eg sort=random_1234 DESC) as a tie
>>> breaker.
>>> 
>>> I'm wondering how many digits the underlying random sequence uses for each
>>> generated number.
>>> 
>>> My result sets may contain (in principle) millions of results, so I would
>>> like to have an estimate of possible clashes (i.e. two results ending with
>>> the same random number, and then being a tie in the result set).
>>> 
>>> Best regards.
>>> 
>>> --
>>> Rodolfo Federico Gamarra
>>> 
>> 
