3x+ performance reduction for the prefixed wildcard fl (like fl=abc_*) in 9.5.0 compared to 9.4.1

2024-02-23 Thread Oleksandr Tkachuk
We have ~17 dynamic fields like abc_xxx, and requests like
/select?fl=abc_* took ~180ms with 9.4.1, but after upgrading to 9.5.0
such requests now take ~620ms to execute.

It seems that 9.5.0 uses
org.apache.solr.common.util.GlobPatternUtil.matches
instead of
org.apache.commons.io.FilenameUtils.wildcardMatch
which leads to a large loss in performance.

Here is the call tree from the profiler:
9.4.1
https://i.imgur.com/2gubfDr.png

9.5.0
https://i.imgur.com/JIZ1E9u.png
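
For anyone who wants to reproduce the effect outside Solr, here is a minimal sketch using only JDK classes. It assumes, without having been checked against the Solr source, that the slow path builds a new java.nio glob PathMatcher for every field name. The class and method names (GlobTimingSketch, matchPerCall, compile, matchPrefix) are made up for illustration and are not Solr or commons-io APIs. It is a rough timing loop, not a proper benchmark (no JMH, no warm-up control), so treat the numbers as indicative only.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.List;

public class GlobTimingSketch {

    // Assumed slow path: compile a new glob matcher for every field name.
    public static boolean matchPerCall(String glob, String fieldName) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(fieldName));
    }

    // Compile the glob once so the matcher can be reused across field names.
    public static PathMatcher compile(String glob) {
        return FileSystems.getDefault().getPathMatcher("glob:" + glob);
    }

    // Shortcut that is only valid for globs of the form "prefix*".
    public static boolean matchPrefix(String glob, String fieldName) {
        String prefix = glob.substring(0, glob.length() - 1);
        return fieldName.startsWith(prefix);
    }

    public static void main(String[] args) {
        // Hypothetical field names; two of the four match abc_*.
        List<String> fields = List.of("abc_title", "abc_price", "id", "xyz_title");
        String glob = "abc_*";
        int rounds = 100_000;

        long t0 = System.nanoTime();
        int perCallHits = 0;
        for (int i = 0; i < rounds; i++) {
            for (String f : fields) {
                if (matchPerCall(glob, f)) perCallHits++;
            }
        }
        long t1 = System.nanoTime();

        PathMatcher cached = compile(glob);
        int cachedHits = 0;
        for (int i = 0; i < rounds; i++) {
            for (String f : fields) {
                if (cached.matches(Paths.get(f))) cachedHits++;
            }
        }
        long t2 = System.nanoTime();

        int prefixHits = 0;
        for (int i = 0; i < rounds; i++) {
            for (String f : fields) {
                if (matchPrefix(glob, f)) prefixHits++;
            }
        }
        long t3 = System.nanoTime();

        System.out.printf("per-call glob: %d hits, %d ms%n", perCallHits, (t1 - t0) / 1_000_000);
        System.out.printf("cached glob:   %d hits, %d ms%n", cachedHits, (t2 - t1) / 1_000_000);
        System.out.printf("prefix check:  %d hits, %d ms%n", prefixHits, (t3 - t2) / 1_000_000);
    }
}
```

All three loops should report the same hit count. If the per-call variant is much slower on your machine, that would be consistent with the profiler screenshots above, but it does not prove that this is what Solr does.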


Re: Run solr in cloud mode and debug from intellij

2024-02-23 Thread Vincenzo D'Amore
Hi Christine, Mikhail, I have prepared a little change to Solr docs. I
have submitted the PR.

https://github.com/apache/solr/pull/2294

Please let me know if anything is wrong with it or missing.

Best regards,
Vincenzo



On Tue, Feb 20, 2024 at 11:38 AM Christine Poerschke (BLOOMBERG/ LONDON) <
cpoersc...@bloomberg.net> wrote:

> Hello Rajani and Vincenzo,
>
> Thank you for your valuable feedback on the setup experience!
>
> Absolutely, yes please to updating the docs. No JIRA ticket needed IMHO.
>
> Best wishes,
> Christine
>
> From: users@solr.apache.org At: 02/20/24 10:34:23 UTC
> To: users@solr.apache.org
> Cc:  solr-u...@lucene.apache.org
> Subject: Re: Run solr in cloud mode and debug from intellij
>
> Should I create an issue (improvement) on jira first?
>
>
> On Tue, Feb 20, 2024 at 11:29 AM Vincenzo D'Amore 
> wrote:
>
> > Thanks, I'm going to do it.
> >
> > On Tue, Feb 20, 2024 at 11:23 AM Mikhail Khludnev 
> wrote:
> >
> >> Yeah. Agree. A little bit cryptic. I'm up to approve your PR!
> >>
> >> On Tue, Feb 20, 2024 at 12:59 PM Vincenzo D'Amore 
> >> wrote:
> >>
> >> > Hi Mikhail, thanks for sharing.
> >> > Maybe I haven't read the documentation carefully, but I had the same
> >> > trouble, I spent a few hours struggling to understand how to build and
> >> > debug Solr locally.
> >> > What do you think if we mention this resource directly into
> >> > https://github.com/apache/solr/blob/main/dev-docs/README.adoc
> >> > or even better into
> >> > https://github.com/apache/solr/tree/main?tab=readme-ov-file#get-involved
> >> > ?
> >> >
> >> > On Tue, Feb 20, 2024 at 7:20 AM Mikhail Khludnev 
> >> wrote:
> >> >
> >> > > Hello Rajani
> >> > > This might be particularly useful
> >> > >
> >> https://github.com/apache/solr/blob/main/dev-docs/solr-source-code.adoc
> >> > >
> >> > > ./gradlew dev will create a Solr executable suitable for development.
> >> > > Change directories via cd ./solr/packaging/build/dev and run the
> >> > > bin/solr script to start Solr. It will also create a "slim" Solr
> >> > > executable based on the "slim" Solr distribution. You can find this
> >> > > environment at ./solr/packaging/build/dev-slim. Use either
> >> > > ./gradlew devSlim or ./gradlew devFull to create just one type of
> >> > > distribution.
> >> > > Do you need to debug some of the SolrCloudTestCase subclasses, or do
> >> > > you want to debug solr cloud instances?
> >> > >
> >> > > On Mon, Feb 19, 2024 at 8:02 PM rajani m 
> >> wrote:
> >> > >
> >> > > > Hi Solr Devs,
> >> > > >
> >> > > > Are there any docs to debug solr in cloud mode from intellij? I found
> >> > > > this [1] article which covers what I am looking for but it is from
> >> > > > 2015, could you take a look and verify if all still apply as of
> >> > > > today? In that article, are the "ant" steps still valid for the
> >> > > > current version?
> >> > > >
> >> > > > I started by following the steps from making-a-new-contribution
> >> > > > <https://github.com/apache/solr/blob/main/dev-docs/how-to-contribute.adoc#making-a-new-contribution>
> >> > > > article and finished the first 3 steps. The build was successful.  I
> >> > > > also have the solr setup on intellij and able to run individual tests
> >> > > > in debug mode. From the article[1], I tried the start command from
> >> > > > the bin directory(/Users/rajani/projects/solr/solr) and got a class
> >> > > > not found exception "Caused by: org.apache.solr.cli.SolrCLI"  Did I
> >> > > > miss any more build steps?  Could you please help me with this?
> >> > > >
> >> > > > Thank you for looking into this and appreciate your help.
> >> > > >
> >> > > > [1] http://visitamaresh.com/debug-solr-cloud-remote-and-local/
> >> > > >
> >> > >
> >> > >
> >> > > --
> >> > > Sincerely yours
> >> > > Mikhail Khludnev
> >> > >
> >> >
> >> >
> >> > --
> >> > Vincenzo D'Amore
> >> >
> >>
> >>
> >> --
> >> Sincerely yours
> >> Mikhail Khludnev
> >>
> >
> >
> > --
> > Vincenzo D'Amore
> >
> >
>
> --
> Vincenzo D'Amore
>
>
>

-- 
Vincenzo D'Amore


Is this list alive? I need help

2024-02-23 Thread Beale, Jim (US-KOP)
I have a SolrCloud installation of three servers on three r5.xlarge EC2 
instances with a shared disk drive using EFS and stunnel.

I have documents coming in about 2 per day and I am trying to perform 
indexing along with some regular queries and some special queries for some new 
functionality.

When I just restart Solr, these queries run very fast but over time become 
slower and slower.

These numbers are typical. At time1 the request took only 2.162 sec, but 
overnight the same request took 18.137 sec.

businessId, all count, reduced count, time1, time2
7016274253,8433,4769,2.162,18.137

The same query behaves very differently over time: overnight the Solr servers 
slow down and give terrible response times. I don't even know if this list is 
alive.


Jim Beale
Lead Software Engineer
hibu.com
2201 Renaissance Boulevard, King of Prussia, PA, 19406
Office: 610-879-3864
Mobile: 610-220-3067

The information contained in this email message, including any attachments, is 
intended solely for use by the individual or entity named above and may be 
confidential. If the reader of this message is not the intended recipient, you 
are hereby notified that you must not read, use, disclose, distribute or copy 
any part of this communication. If you have received this communication in 
error, please immediately notify me by email and destroy the original message, 
including any attachments. Thank you. **Hibu IT Code:141459300**


Re: Is this list alive? I need help

2024-02-23 Thread Stephen Boesch
The list is alive!  (play on old IMAX film *The Dream is Alive*).  But does
it breathe? Not sure, I have not done solr in over a decade.

On Fri, 23 Feb 2024 at 10:05, Beale, Jim (US-KOP)
 wrote:

> I have a Solrcloud installation of three servers on three r5.xlarge EC2
> with a shared disk drive using EFS and stunnel.
>
>
>
> I have documents coming in about 2 per day and I am trying to perform
> indexing along with some regular queries and some special queries for some
> new functionality.
>
>
>
> When I just restart Solr, these queries run very fast but over time become
> slower and slower.
>
>
>
> This is typical for the numbers. At time1, the request only took 2.16 sec
> but over night the response took 18.137 sec. That is just typical.
>
>
>
> businessId, all count, reduced count, time1, time2
>
> 7016274253,8433,4769,2.162,18.137
>
>
>
> The same query is so far different. Overnight the Solr servers slow down
> and give terrible response. I don’t even know if this list is alive.


Re: Is this list alive? I need help

2024-02-23 Thread Walter Underwood
First, a shared disk is not a good idea. Each node should have its own local 
disk. Solr makes heavy use of the disk.

If the indexes are shared, I’m surprised it works at all. Solr is not designed 
to share indexes.

Please share the full query string.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Feb 23, 2024, at 10:01 AM, Beale, Jim (US-KOP) 
>  wrote:
> 
> I have a Solrcloud installation of three servers on three r5.xlarge EC2 with 
> a shared disk drive using EFS and stunnel.
>  
> I have documents coming in about 2 per day and I am trying to perform 
> indexing along with some regular queries and some special queries for some 
> new functionality.
>  
> When I just restart Solr, these queries run very fast but over time become 
> slower and slower.
>  
> This is typical for the numbers. At time1, the request only took 2.16 sec but 
> over night the response took 18.137 sec. That is just typical.
>  
> businessId, all count, reduced count, time1, time2
> 7016274253,8433,4769,2.162,18.137
>  
> The same query is so far different. Overnight the Solr servers slow down and 
> give terrible response. I don’t even know if this list is alive.



Re: Run solr in cloud mode and debug from intellij

2024-02-23 Thread Jan Høydahl
Hi,

Thanks for this contrib.

My only concern is that it duplicates what is already explained in 
https://github.com/apache/solr/blob/main/dev-docs/solr-source-code.adoc 
However, that file is also too hard to find, so I agree the main README should 
at least provide a hint on how to build and run locally.
Is it possible to shorten the PR a tad and link directly to 
solr-source-code.adoc?

Jan

> On 23 Feb 2024, at 18:03, Vincenzo D'Amore wrote:
> 
> Hi Christine, Mikhail, I have prepared a little change to Solr docs. I
> have submitted the PR.
> 
> https://github.com/apache/solr/pull/2294
> 
> Please let me know if anything is wrong with it or missing.
> 
> Best regards,
> Vincenzo
> 



Re: Is this list alive? I need help

2024-02-23 Thread Jan Høydahl
I think EFS is a terribly slow file system to use for Solr. Who recommended it? :)
Better to use one EBS volume per node.
I'm not sure the gradually slower performance is due to EFS, though. We need to 
know more about your setup to get a clue. What role does stunnel play here? How 
are you indexing the content?

Jan

> On 23 Feb 2024, at 19:58, Walter Underwood wrote:
> 
> First, a shared disk is not a good idea. Each node should have its own local 
> disk. Solr makes heavy use of the disk.
> 
> If the indexes are shared, I’m surprised it works at all. Solr is not 
> designed to share indexes.
> 
> Please share the full query string.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On Feb 23, 2024, at 10:01 AM, Beale, Jim (US-KOP) 
>>  wrote:
>> 
>> I have a Solrcloud installation of three servers on three r5.xlarge EC2 with 
>> a shared disk drive using EFS and stunnel.
>> 
>> I have documents coming in about 2 per day and I am trying to perform 
>> indexing along with some regular queries and some special queries for some 
>> new functionality.
>> 
>> When I just restart Solr, these queries run very fast but over time become 
>> slower and slower.
>> 
>> This is typical for the numbers. At time1, the request only took 2.16 sec 
>> but over night the response took 18.137 sec. That is just typical.
>> 
>> businessId, all count, reduced count, time1, time2
>> 7016274253,8433,4769,2.162,18.137
>> 
>> The same query is so far different. Overnight the Solr servers slow down and 
>> give terrible response. I don’t even know if this list is alive.
> 



Re: firstSearcher listener replaying queries 3 times

2024-02-23 Thread Chris Hostetter


The obvious answer that comes to mind is that your collection has 3 shards 
and you have one replica of each shard on the node where you see this 
listener triggering 3 times on collection reload (or some other situation 
that puts 3 replicas on this one node).

firstSearcher and newSearcher events are processed on individual 
SolrIndexSearchers, and each replica has its own SolrIndexSearcher.
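
To make the counting concrete, here is a toy model of that explanation. It is not Solr code. The class name WarmupReplaySketch, the method replaysOnNode, and the node and replica names are all hypothetical; the only thing it encodes is the assumption that every replica hosted on a node runs the configured warm-up queries once when its searcher opens.

```java
import java.util.List;
import java.util.Map;

public class WarmupReplaySketch {

    // Each replica (core) on a node opens its own searcher, so each one
    // replays the configured warm-up queries once.
    public static int replaysOnNode(Map<String, List<String>> replicasByNode,
                                    String node, int warmupQueries) {
        List<String> replicas = replicasByNode.getOrDefault(node, List.of());
        return replicas.size() * warmupQueries;
    }

    public static void main(String[] args) {
        // Hypothetical layout: one node hosting one replica of each of 3 shards.
        Map<String, List<String>> layout = Map.of(
            "node1", List.of("shard1_replica", "shard2_replica", "shard3_replica"));

        // One configured warm-up query, three replicas on the node.
        System.out.println(replaysOnNode(layout, "node1", 1)); // prints 3
    }
}
```

So a single "q" in the listener showing up 3 times in one node's log is what you would expect if that node hosts 3 replicas of the collection.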

: Date: Tue, 13 Feb 2024 16:57:59 -0500
: From: rajani m 
: Reply-To: users@solr.apache.org
: To: solr-user 
: Subject: firstSearcher listener replaying queries 3 times
: 
: Hi Solr Users,
: 
:   The first searcher listener replays the list of queries under the
: listener list 3 times, wondering what could be the reason for it?
: 
: In the below example, when the collection is reloaded, the "q" is replayed
: 3 times, I expected it to be once.  Is it a bug or the first searcher
: triggers any other listener?
: 
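: [firstSearcher listener configuration; the XML markup was lost in the archive, only the query value "cats" survives]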

-Hoss
http://www.lucidworks.com/


Re: 3x+ performance reduction for the prefixed wildcard fl (like fl=abc_*) in 9.5.0 compared to 9.4.1

2024-02-23 Thread Ishan Chattopadhyaya
Awesome. Please feel free to open a JIRA issue about it.

On Fri, 23 Feb, 2024, 5:03 pm Oleksandr Tkachuk, 
wrote:

> We have ~17 dynamic fields like abc_xxx, and requests like
> /select?fl=abc_* took ~180ms with 9.4.1, but after upgrading to 9.5.0
> such requests now take ~620ms to execute.
>
> It seems in 9.5.0
> org.apache.solr.common.util.GlobPatternUtil.matches
> used instead of
> org.apache.commons.io.FilenameUtils.wildcardMatch
> Which leads to huge losses in performance.
>
> Here is the call tree from the profiler:
> 9.4.1
> https://i.imgur.com/2gubfDr.png
>
> 9.5.0
> https://i.imgur.com/JIZ1E9u.png
>