Re: Strategies for Real-Time Data Updates in Solr Without Compromising Latency

2023-08-28 Thread Mikhail Khludnev
Hi, here are a few considerations.
You can try updating only the price column, via docValues in-place updates, as a
whole (it's worth updating as many docs as possible in one pass).
https://solr.apache.org/guide/solr/latest/indexing-guide/partial-document-updates.html#in-place-updates
Note: docValues-only fields are still searchable. This approach ignores the caches
(it can't make them useful).
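
A minimal SolrJ sketch of such an in-place update, assuming a numeric price field
declared with docValues=true, indexed=false, stored=false (the collection, field
and id values below are placeholders, not a specific setup):

import java.util.Collections;

import org.apache.solr.client.solrj.impl.Http2SolrClient;
import org.apache.solr.common.SolrInputDocument;

public class PriceInPlaceUpdate {
  public static void main(String[] args) throws Exception {
    try (Http2SolrClient solr =
        new Http2SolrClient.Builder("http://localhost:8983/solr").build()) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "SKU-42");
      // "set" on a docValues-only (indexed=false, stored=false) numeric field
      // is executed as an in-place update instead of a full document rewrite.
      doc.addField("price", Collections.singletonMap("set", 19.99));
      solr.add("products", doc);
      solr.commit("products");
    }
  }
}
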
Another idea is to extract the updatable fields into a separate index/core,
update it separately, and join it to the main docs on every request. This
requires
https://solr.apache.org/guide/solr/latest/query-guide/join-query-parser.html#joining-multiple-shard-collections
available since 9.3.
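
A rough sketch of what such a per-request join could look like, assuming a main
"products" collection and a separate "prices" collection sharing the same id field.
The names are placeholders, and the exact join method and any required
configuration (e.g. allow-lists for cross-collection joins) should be checked
against the linked docs:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.Http2SolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class JoinedPriceQuery {
  public static void main(String[] args) throws Exception {
    try (Http2SolrClient solr =
        new Http2SolrClient.Builder("http://localhost:8983/solr").build()) {
      SolrQuery q = new SolrQuery("title:laptop");
      // Restrict main-collection docs by price kept in the separate "prices"
      // collection, joined on the shared id field at query time.
      q.addFilterQuery(
          "{!join method=crossCollection fromIndex=prices from=id to=id}price:[0 TO 500]");
      QueryResponse rsp = solr.query("products", q);
      System.out.println("hits: " + rsp.getResults().getNumFound());
    }
  }
}
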
The problem is that committing changes to the price core makes the join query
entries in the main core's caches obsolete, so they get evicted, which impacts
query times. However, there's an idea and an implementation to warm the
"to-side" join index caches: https://issues.apache.org/jira/browse/SOLR-16242


On Fri, Aug 25, 2023 at 9:16 PM Neeraj giri  wrote:

> Greetings fellow forum members,
>
> Our team is currently working with Solr 8.11 in cloud mode to power our
> search system, built using Java Spring at the application layer. We're
> facing a challenge in maintaining up-to-date pricing information for our
> ecommerce platform, which experiences frequent data changes throughout the
> day. While attempting to achieve real-time data updates, we've encountered
> issues related to Solr's latency and overall system performance.
>
> As of now, we've implemented a process that halts data writes during the
> day. Instead, we retrieve updated pricing data from a separate microservice
> that maintains a cached and current version of the information. However, we
> believe this approach isn't ideal due to its potential impact on system
> efficiency.
>
> We're seeking guidance on designing an architecture that can seamlessly
> handle real-time updates to our Solr index without compromising the search
> latency that our users expect. Writing directly to Solr nodes appears to
> increase read latency, which is a concern for us. Our goal is to strike a
> balance between keeping our pricing information up-to-date and maintaining
> an acceptable level of system responsiveness.
>
> We would greatly appreciate any insights, strategies, or best practices
> from the community that can help us tackle this challenge. How can we
> optimize our approach to real-time data updates while ensuring Solr's
> latency remains within acceptable limits? Any advice or suggestions on
> architecture, techniques, or tools would be invaluable.
>
> Thank you in advance for your expertise and assistance.
>
> Regards,
>
> Neeraj giri
>


-- 
Sincerely yours
Mikhail Khludnev


Re: Re-index after upgrade

2023-08-28 Thread Jan Høydahl
Are you sure that 9.x refuses to open an index first created in 7.x? I thought
that strict policy was only needed in 8.0 due to a particular lossy data
structure change, and that 9.x is more lenient?

Jan

> 24. aug. 2023 kl. 22:38 skrev Shawn Heisey :
> 
> Version 9 behaves the same as 8.  If the index was ever touched by a version 
> before 8.0, version 9 will not read the index.



Re: solrj client memory leak via ThreadLocal in solrj.impl.Http2SolrClient?

2023-08-28 Thread Vincenzo D'Amore
Hi Tim, have you figured out the problem? Just curious to know what you
have done at the end.

On Fri, Aug 25, 2023 at 4:48 PM Vincenzo D'Amore  wrote:

> Just my 2 cents: I have always used Solr clients as singletons. You have
> to instantiate them only once and reuse them forever.
>
> On Fri, 25 Aug 2023 at 15:35, Tim Funk  wrote:
>
>> Update - It looks like the ThreadLocal leak is different and unrelated to
>> creating / closing a new Http2SolrClient every request.  Even using a shared
>> Http2SolrClient for my webapp, I noticed the same issue of leaking
>> ThreadLocals in a QA environment.  Falling back to HttpSolrClient is,
>> optimistically, the fix so far.
>>
>> Client is OpenJDK 11.0.17
>>
>> -Tim
>>
>> On Wed, Aug 23, 2023 at 9:46 AM Tim Funk  wrote:
>>
>> > Cool - For now I'll either revert to HttpSolrClient or use a single client
>> > (depending on what I have to refactor).
>> >
>> > My only concern with a shared client is if one calls close() "accidentally";
>> > I don't see an easy way to query the client to see if it was closed, so I
>> > can destroy it and create a new one (without resorting to a webapp restart).
>> >
>> > -Tim
>> >
>> > On Tue, Aug 22, 2023 at 6:42 PM Shawn Heisey 
>> wrote:
>> >
>> >>
>> >> That kind of try-with-resources approach should take care of the
>> >> problem, because it would run the close() method on the SolrClient object.
>> >>
>> >> The classes in the error are Jetty classes.  This probably means that
>> >> the problem is in Jetty, but I couldn't guarantee that.
>> >>
>> >> You do not need multiple client objects just because you have multiple
>> >> cores.  You only need one Http2SolrClient object per hostname:port
>> >> combination used to access Solr, and you should only need to create them
>> >> when the application starts and close them when the application ends.
>> >>
>> >>
>>
>

-- 
Vincenzo D'Amore


Re: solrj client memory leak via ThreadLocal in solrj.impl.Http2SolrClient?

2023-08-28 Thread Tim Funk
I reverted to HttpSolrClient. That seems to have plugged the leak.
As for root cause, I haven't had time to dig further. Since this happens
regardless of reusing the SolrClient vs instantiating a new one, I'm hoping
that's a data point of interest. But as for constructing a "simple" test
to reproduce, I'm not sure I'll find the time in the near future given
other $work priorities.

As for future triage, I'd try any of the following:
- Change my endpoint and use HTTP/2 (i.e. drop the builder.useHttp1_1(true) call)
- Revert to Http2SolrClient and add a timer / logger in existing app servers,
counting ThreadLocals and looking for patterns
- Write a standalone client, single thread. See if I can count the
ThreadLocals over time (a rough sketch follows below).
- Write a standalone client - make all executions in new / different threads,
with occasional reuse of a thread
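
For the standalone single-threaded variant, a rough sketch (URL and collection
name are placeholders) would be to hammer one Http2SolrClient from the main
thread and watch ThreadLocal growth with a profiler or periodic heap dumps:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.Http2SolrClient;

public class SingleThreadLeakProbe {
  public static void main(String[] args) throws Exception {
    try (Http2SolrClient solr =
        new Http2SolrClient.Builder("http://localhost:8983/solr").build()) {
      SolrQuery q = new SolrQuery("*:*");
      q.setRows(1);
      for (int i = 1; i <= 100_000; i++) {
        solr.query("techproducts", q);
        if (i % 1_000 == 0) {
          // Pause point: take a heap dump / attach a profiler here and compare
          // the size of the thread's ThreadLocal map across iterations.
          System.out.println("queries issued: " + i);
        }
      }
    }
  }
}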

-Tim


On Mon, Aug 28, 2023 at 7:17 AM Vincenzo D'Amore  wrote:

> Hi Tim, have you figured out the problem? Just curious to know what you
> have done at the end.
>
> On Fri, Aug 25, 2023 at 4:48 PM Vincenzo D'Amore 
> wrote:
>
> > Just my 2 cents: I have always used Solr clients as singletons. You have
> > to instantiate them only once and reuse them forever.
> >
>
>


Re: solrj client memory leak via ThreadLocal in solrj.impl.Http2SolrClient?

2023-08-28 Thread Vincenzo D'Amore
Hi Tim, thanks for letting me know. I experienced the same problem: my
application became unstable and crashed.
My first implementation was very similar to yours and relied heavily
on Java try-with-resources statements with CloudSolrClient.
As I said in my previous email, I ended up using the Solr clients as
singletons, reusing one instance per Solr instance/collection.
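
A minimal sketch of that singleton approach -- one client per Solr base URL,
created once and closed only at application shutdown (the URL is a placeholder):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.Http2SolrClient;

public final class SolrClientHolder {
  // One shared client per Solr base URL, reused by every request in the webapp.
  private static final Http2SolrClient CLIENT =
      new Http2SolrClient.Builder("http://localhost:8983/solr").build();

  private SolrClientHolder() {}

  public static SolrClient get() {
    return CLIENT;
  }

  // Invoke exactly once, e.g. from a shutdown hook or @PreDestroy method.
  public static void shutdown() {
    CLIENT.close();
  }
}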


On Mon, Aug 28, 2023 at 1:48 PM Tim Funk  wrote:

> I reverted to HttpSolrClient. That seems to have plugged the leak.
> As for root cause, I haven't had time to dig farther. Since this happens
> regardless of reusing SolrClient vs instantiating a new one, I'm hoping
> that's a data point of interest. But as for constructing a "simple" test
> to reproduce, I'm not sure I'll find the time in the near future given
> other $work priorities.
>
> As for future triage, I'd try any of the following:
> - Change my endpoint and use Http2 ( disable: builder.useHttp1_1(true))
> - Revert to Http2Client and add a timer / logger in existing apps servers
> counting threadlocals and look for patterns
> - Write a standalone client, single thread. See if I can count the
> threadlocals over time.
> - Write a standalone client - Make all executions in new different threads
> with occasional reuse of thread
>
> -Tim
>
>
> On Mon, Aug 28, 2023 at 7:17 AM Vincenzo D'Amore 
> wrote:
>
> > Hi Tim, have you figured out the problem? Just curious to know what you
> > have done at the end.
> >
> > On Fri, Aug 25, 2023 at 4:48 PM Vincenzo D'Amore 
> > wrote:
> >
> > > Just my 2 cents: I have always used Solr clients as singletons. You have
> > > to instantiate them only once and reuse them forever.
> > >
> >
> >
>


-- 
Vincenzo D'Amore


Re: Re-index after upgrade

2023-08-28 Thread Shawn Heisey

On 8/28/23 05:03, Jan Høydahl wrote:

Are you sure that 9.x refuses to open an index first created in 7.x? I thought
that strict policy was only needed in 8.0 due to a particular lossy data
structure change, and that 9.x is more lenient?


I haven't actually tried it, but I believe N-1 is enforced for all 
versions starting with 8.0, including 9.x and 10.0.0-SNAPSHOT.  That 
would need to be verified by someone who is more familiar with Lucene 
than I am.


Thanks,
Shawn


Re: Weird issue -- pulling results with cursorMark gets fewer documents than numFound

2023-08-28 Thread Chris Hostetter


: Schema meets the requirements for Atomic Update, so we are doing a migration
: by querying the old cluster and writing to the new cluster. We are doing it in
: batches by filtering on one of the fields, and using cursorMark to efficiently
: page through the results.
...
: The query thread gets batches of 1 documents and dumps them on a 
...
: One of the batches always indexes 5 fewer documents than numFound.  It's
: consistent -- always 5 documents.  Updates are paused during the migration.
: On the last run, numFound for this batch was 3824942 and the indexed count was
: 3824937.

I assume you mean one of the batches always indexes 5 fewer documents than 
the 'rows=N' param (ie: the query batch size) ... correct?

You're talking about the total numFound being higher than the indexed count?

: The other idea I have is that there could be a uniqueKey value that appears in
: more than one shard.  This doesn't seem likely, as the compositeId router

Also possible is that some shards are out of sync with their leader -- ie: 
for some shardX, replica1 has a doc that replica2 doesn't, and replica1 is 
used for the initial phase of the request to get the "top N sorted doc 
uniqueKey at cursorMark=ZZZ" but replica2 is used in the second phase to 
fetch all of the field values.  (But if that were the case, you'd expect 
that at least some of the time you'd get "lucky" and the two phases would 
both hit replicas that agreed with each other -- even if they didn't agree 
with the leader -- and the problem wouldn't reliably reproduce every time.)

: should keep that from happening.  Is there a way to detect this situation?  I

I would log every cursorMark request URL and the number of docs in the 
response.

If, at the end of the run, you see a cursorMark value that didn't return 
the same number of docs as your rows param (ignoring the last batch, which 
you expect to be smaller), then manually re-run that query against every 
replica of every shard using `distrib=false` and diff the responses from 
each replica of the same shard.
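
For reference, a minimal SolrJ sketch of that kind of per-batch logging with
cursorMark paging (the collection name, rows value and sort field are
placeholders, not the actual migration settings):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.Http2SolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.params.CursorMarkParams;

public class CursorMarkAudit {
  public static void main(String[] args) throws Exception {
    try (Http2SolrClient solr =
        new Http2SolrClient.Builder("http://localhost:8983/solr").build()) {
      SolrQuery q = new SolrQuery("*:*");
      q.setRows(10000);
      q.setSort("id", SolrQuery.ORDER.asc);  // cursorMark needs a uniqueKey sort
      String cursor = CursorMarkParams.CURSOR_MARK_START;
      long total = 0;
      while (true) {
        q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursor);
        QueryResponse rsp = solr.query("source_collection", q);
        int got = rsp.getResults().size();
        total += got;
        // Log every cursorMark and its batch size; any short batch (other than
        // the last) can then be re-run per replica with distrib=false and diffed.
        System.out.println("cursorMark=" + cursor + " docs=" + got);
        String next = rsp.getNextCursorMark();
        if (cursor.equals(next)) {
          break;  // cursor did not advance: no more results
        }
        cursor = next;
      }
      System.out.println("total fetched: " + total);
    }
  }
}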



-Hoss
http://www.lucidworks.com/


Re: Re-index after upgrade

2023-08-28 Thread Rahul Goswami
Yep, that check is present in Lucene 9.x as well. It will refuse to open an
index created in 7.x.

https://github.com/apache/lucene/blob/releases/lucene/9.4.0/lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java#L345

https://github.com/apache/lucene/blob/releases/lucene/9.4.0/lucene/core/src/java/org/apache/lucene/util/Version.java#L262
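
For anyone who wants to check a concrete index directory, a small sketch using
Lucene's SegmentInfos (the path is a placeholder). It prints the major version
the index was originally created with; per the check linked above, Lucene 9
throws IndexFormatTooOldException when that version is older than 8:

import java.nio.file.Paths;

import org.apache.lucene.index.IndexFormatTooOldException;
import org.apache.lucene.index.SegmentInfos;
import org.apache.lucene.store.FSDirectory;

public class IndexVersionCheck {
  public static void main(String[] args) throws Exception {
    // Path to a core's data/index directory (placeholder).
    try (FSDirectory dir = FSDirectory.open(Paths.get("/var/solr/data/mycore/data/index"))) {
      try {
        SegmentInfos infos = SegmentInfos.readLatestCommit(dir);
        System.out.println("index created with Lucene major version: "
            + infos.getIndexCreatedVersionMajor());
      } catch (IndexFormatTooOldException e) {
        System.out.println("index is too old for this Lucene version: " + e.getMessage());
      }
    }
  }
}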

-Rahul

On Mon, Aug 28, 2023 at 2:31 PM Shawn Heisey  wrote:

> On 8/28/23 05:03, Jan Høydahl wrote:
> > Are you sure that 9.x refuse to open an index first created in 7.x? I
> thought that strict policy was only needed in 8.0 due to a particular lossy
> data structure change, and that 9.x is more lenient?
>
> I haven't actually tried it, but I believe N-1 is enforced for all
> versions starting with 8.0, including 9.x and 10.0.0-SNAPSHOT.  That
> would need to be verified by someone who is more familiar with Lucene
> than I am.
>
> Thanks,
> Shawn
>


Re: Weird issue -- pulling results with cursorMark gets fewer documents than numFound

2023-08-28 Thread Shawn Heisey

On 8/28/23 11:42, Chris Hostetter wrote:

I assume you mean one of the batches always indexes 5 fewer documents than
the 'rows=N' param (ie: the query batch size) ... correct?

You're talking about the total numFound being higher than the indexed count?


The query uses rows=1, which is configurable via a commandline option.

The source collection's numFound is 5 higher than the number of 
documents indexed to the target.  I was assured that all updates to the 
source collection were paused during the most recent migration test.



Also possible is that some shards are out of sync with their leader -- ie:
for some shardX, replica1 has a doc that replica2 doesn't, and replica1 is
used for the initial phase of the request to get the "top N sorted doc
uniqueKey at cursorMark=ZZZ" but replica2 is used in the second phase to
fetch all of the field values.  (But if that were the case, you'd expect
that at least some of the time you'd get "lucky" and the two phases would
both hit replicas that agreed with each other -- even if they didn't agree
with the leader -- and the problem wouldn't reliably reproduce every time.)


We did make sure that the numDocs was the same on all replicas for each 
shard.  A comprehensive check of ID values across replicas has not been 
done.  I should be able to write a program to do that.
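
A rough sketch of the kind of per-replica check meant here -- query each replica
core directly with distrib=false and compare which cores return a given id (the
core URLs and the id are placeholders, not the real cluster layout):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.Http2SolrClient;

public class ReplicaIdCheck {
  public static void main(String[] args) throws Exception {
    String id = "some-doc-id";
    String[] coreUrls = {
        "http://host1:8983/solr/coll_shard1_replica_n1",
        "http://host2:8983/solr/coll_shard2_replica_n3"
    };
    for (String url : coreUrls) {
      try (Http2SolrClient solr = new Http2SolrClient.Builder(url).build()) {
        SolrQuery q = new SolrQuery("id:\"" + id + "\"");
        q.set("distrib", "false");  // query only this core, no distributed fan-out
        long found = solr.query(q).getResults().getNumFound();
        System.out.println(url + " -> numFound=" + found);
      }
    }
  }
}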



: should keep that from happening.  Is there a way to detect this situation?  I

I would log every cursorMark request URL and the number of docs in the
response.


It has been verified that each cursorMark batch is 1 docs except the 
last batch, by checking the size of the SolrDocumentList object 
retrieved from the response.  Added some debug-level logging to show 
that along with the cursorMark value.


I have finished my SolrJ program using Http2SolrClient that will look 
for IDs that exist in more than one shard.  I had hoped to have it get 
the list of core URLs from ZK, but couldn't figure that out, so now the 
commandline options accept multiple core-specific URLs, with the idea 
that one replica core from each shard will be presented.  I have tested 
it against my little Solr install, with the first URL pointing at the 
collection alias and the second pointing at the real core.  It's a 
single-shard collection on a single node.  As expected, it reported that 
every ID was duplicated.  We'll try it for real in the wee hours of the 
morning.


I put the program on github if anyone is interested in taking a look.

https://github.com/elyograg/shard_duplicate_finder

Thanks,
Shawn


Re: Re-index after upgrade

2023-08-28 Thread Jan Høydahl
Ok, thanks for the clarification.

Jan

> 28. aug. 2023 kl. 20:43 skrev Rahul Goswami :
> 
> Yep, that check is present in Lucene 9.x as well. It will refuse to open an
> index created in 7.x.
> 
> https://github.com/apache/lucene/blob/releases/lucene/9.4.0/lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java#L345
> 
> https://github.com/apache/lucene/blob/releases/lucene/9.4.0/lucene/core/src/java/org/apache/lucene/util/Version.java#L262
> 
> -Rahul
> 
> On Mon, Aug 28, 2023 at 2:31 PM Shawn Heisey  wrote:
> 
>> On 8/28/23 05:03, Jan Høydahl wrote:
>>> Are you sure that 9.x refuse to open an index first created in 7.x? I
>> thought that strict policy was only needed in 8.0 due to a particular lossy
>> data structure change, and that 9.x is more lenient?
>> 
>> I haven't actually tried it, but I believe N-1 is enforced for all
>> versions starting with 8.0, including 9.x and 10.0.0-SNAPSHOT.  That
>> would need to be verified by someone who is more familiar with Lucene
>> than I am.
>> 
>> Thanks,
>> Shawn
>> 



Registration open for Community Over Code North America

2023-08-28 Thread Rich Bowen
Hello! Registration is still open for the upcoming Community Over Code
NA event in Halifax, NS! We invite you to register for the event:
https://communityovercode.org/registration/

Apache Committers, note that you have a special discounted rate for the
conference at US$250. To take advantage of this rate, use the special
code sent to the committers@ list by Brian Proffitt earlier this month.

If you are in need of an invitation letter, please consult the
information at https://communityovercode.org/visa-letter/

Please see https://communityovercode.org/ for more information about
the event, including how to make reservations for discounted hotel
rooms in Halifax. Discounted rates will only be available until Sept.
5, so reserve soon!

--Rich, for the event planning team