Re: Suggester index replication
Can anybody please answer this? Many thanks in advance!

On Wed, Feb 16, 2022 at 12:52 AM gnandre wrote:
> Is there a way to get suggester index replicated to all search nodes from
> index node? Do I need to build suggester index for each search node
> separately?
Donating to Solr
I find this open source project very useful. Is there any way to donate money for it?
Re: Suggester index replication
You need to send a build request to each node. I used to have some code to dig out the nodes from a cluster status, then send a build to each one, but I think that is marooned at my previous company. It isn’t super hard, just dig it out of the JSON.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On Mar 2, 2023, at 9:03 AM, gnandre wrote:
>
> Can anybody please answer this? Many thanks in advance!
>
> On Wed, Feb 16, 2022 at 12:52 AM gnandre wrote:
>
>> Is there a way to get suggester index replicated to all search nodes from
>> index node? Do I need to build suggester index for each search node
>> separately?
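A minimal sketch of the "dig it out of the JSON" step in SolrCloud, not Walter's original code. It assumes the cluster status was fetched from the Collections API (`/solr/admin/collections?action=CLUSTERSTATUS`), that the suggester is exposed on a `/suggest` request handler, and that the dictionary is called `mySuggester`. The handler path, dictionary name, collection name, and sample JSON below are all hypothetical.

```python
import json
from urllib.parse import urlencode


def build_urls(cluster_status, collection, dictionary, handler="/suggest"):
    """Return one suggester build URL per active replica of a collection."""
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    query = urlencode({
        "suggest": "true",
        "suggest.build": "true",
        "suggest.dictionary": dictionary,
    })
    urls = []
    for shard in shards.values():
        for replica in shard["replicas"].values():
            if replica.get("state") != "active":
                continue  # skip replicas that are down or recovering
            urls.append(f"{replica['base_url']}/{replica['core']}{handler}?{query}")
    return urls


# Trimmed-down, made-up CLUSTERSTATUS response, for illustration only.
SAMPLE = json.loads("""
{"cluster": {"collections": {"products": {"shards": {"shard1": {"replicas": {
  "core_node1": {"core": "products_shard1_replica_n1",
                 "base_url": "http://host1:8983/solr", "state": "active"},
  "core_node2": {"core": "products_shard1_replica_n2",
                 "base_url": "http://host2:8983/solr", "state": "down"}
}}}}}}}
""")

for url in build_urls(SAMPLE, "products", "mySuggester"):
    print(url)
```

Each printed URL can then be requested with any HTTP client, for example `urllib.request.urlopen(url)`.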
Re: Donating to Solr
Not sure about Solr, but you can donate to the Apache Software Foundation:
https://www.apache.org/foundation/contributing.html

On Thu, Mar 2, 2023 at 12:04 PM gnandre wrote:
> I find this open source project very useful. Is there any way to donate
> money for it?
Re: Donating to Solr
Thanks!

On Thu, Mar 2, 2023, 1:02 PM Doug Turnbull wrote:
> Not sure about Solr, but you can donate to the Apache Software Foundation:
>
> https://www.apache.org/foundation/contributing.html
>
> On Thu, Mar 2, 2023 at 12:04 PM gnandre wrote:
>
> > I find this open source project very useful. Is there any way to donate
> > money for it?
Re: Suggester index replication
Thanks! I am using non-cloud mode at the moment. So, there is no way to just index it to the index node and get it replicated to the search nodes? Do I have to index to each search node?

Do you know why the suggester indexing does not follow the usual search indexing model?

On Thu, Mar 2, 2023, 12:22 PM Walter Underwood wrote:
> You need to send a build request to each node. I used to have some code to
> dig out the nodes from a cluster status, then send a build to each one, but
> I think that is marooned at my previous company. It isn’t super hard, just
> dig it out of the JSON.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
>
> > On Mar 2, 2023, at 9:03 AM, gnandre wrote:
> >
> > Can anybody please answer this? Many thanks in advance!
> >
> > On Wed, Feb 16, 2022 at 12:52 AM gnandre wrote:
> >
> >> Is there a way to get suggester index replicated to all search nodes from
> >> index node? Do I need to build suggester index for each search node
> >> separately?
Re: Donating to Solr
I know the project has been looking for more hardware resources for running tests, and some companies are currently sponsoring benchmark hardware and Jenkins servers for other OS'es. So that is one non-money way to contribute. In any case you could start the dialogue with Apache centrally and they will contact the Solr project for coordination of how to channel solr-labeled donations.

Jan

> On Mar 2, 2023, at 19:36, gnandre wrote:
>
> Thanks!
>
> On Thu, Mar 2, 2023, 1:02 PM Doug Turnbull wrote:
>
>> Not sure about Solr, but you can donate to the Apache Software Foundation:
>>
>> https://www.apache.org/foundation/contributing.html
>>
>> On Thu, Mar 2, 2023 at 12:04 PM gnandre wrote:
>>
>>> I find this open source project very useful. Is there any way to donate
>>> money for it?
Re: Donating to Solr
Another way to donate if it's a significant sum is to fund Outreachy https://www.outreachy.org with some note that it's intended for sponsoring an Apache Solr based intern project. This basically pays stipends to an intern. Unfortunately, sending money to the ASF doesn't work to fund this sort of thing due to ASF's policies to avoid any appearance of pay-for-work. I'm wrapping up an Outreachy project as a mentor but it almost didn't happen due to lack of funding.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley

On Thu, Mar 2, 2023 at 1:47 PM Jan Høydahl wrote:
> I know the project has been looking for more hardware resources for
> running tests, and some companies are currently sponsoring benchmark
> hardware and Jenkins servers for other OS'es. So that is one non-money way
> to contribute. In any case you could start the dialogue with Apache
> centrally and they will contact the Solr project for coordination of how to
> channel solr-labeled donations.
>
> Jan
>
> > On Mar 2, 2023, at 19:36, gnandre wrote:
> >
> > Thanks!
> >
> > On Thu, Mar 2, 2023, 1:02 PM Doug Turnbull wrote:
> >
> >> Not sure about Solr, but you can donate to the Apache Software Foundation:
> >>
> >> https://www.apache.org/foundation/contributing.html
> >>
> >> On Thu, Mar 2, 2023 at 12:04 PM gnandre wrote:
> >>
> >>> I find this open source project very useful. Is there any way to donate
> >>> money for it?
Re: Number of Collections in a SolrCloud
I second Brian's experience. Specific version & numbers reached vary somewhat.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley

On Mon, Jun 28, 2021 at 7:23 PM Brian Lininger wrote:
> Hi Matt,
> We're currently running Solr 6.6.6 using Solr Cloud. Depending on the
> application and load, we've been able to stably run upwards of 1,000
> collections without a problem in a single SolrCloud. We try to keep the
> total replica count per Solr instance to less than 500, but have run
> 600-700 replicas per Solr instance without issue if the user load is
> light. Our Solr document sizes are pretty large, but we're able to handle
> 80-90M docs per instance with 700-800G of total index size. 300B docs does
> seem quite large, but if the size of your docs isn't huge and you've got
> enough shards in your collection then I wouldn't be surprised if it worked
> fine. The only thing we learned is that we had to change the number of
> threads Solr uses for loading replicas because of our high numbers; 8
> threads would take forever upon startup (look at 'coreLoadThreads'). At
> the very least, perf test out something on a similar scale of what you're
> thinking and see how it scales.
> Best of Luck,
> Brian
>
> On Mon, Jun 28, 2021 at 12:50 PM mtn search wrote:
>
> > I am guessing the consideration of hitting the limit of the number of
> > collections within a SolrCloud is not a common experience. I wanted to
> > raise this question again if perhaps anyone has any lessons learned or
> > things to consider. We are currently planning work to migrate 300 billion
> > plus docs on the master nodes of a legacy master/slave installation to
> > SolrCloud. I figure that we will push the limits of a single SolrCloud
> > instance.
> >
> > Thanks again,
> > Matt
> >
> > On Fri, Jun 25, 2021 at 10:15 AM mtn search wrote:
> >
> > > Hello,
> > >
> > > I am interested to learn what others have experienced in terms of hitting
> > > a limit for the number of collections supported by a SolrCloud instance.
> > >
> > > Also, does anyone have any tips/questions for evaluating when to create a
> > > new SolrCloud and begin adding new collections to it rather than grow the
> > > original SolrCloud instance?
> > >
> > > I realize there are likely a number of characteristics of a SolrCloud to
> > > evaluate. My guess is network resources will be the key factor. I am
> > > thinking of a SolrCloud with a 5, or 7 node Zookeeper ensemble. With
> > > Collections containing 10-30 million docs, small doc size, heavy indexing,
> > > small query load.
> > >
> > > Thanks,
> > > Matt
>
> --
> *Brian Lininger*
> Technical Architect, Infrastructure & Search
> *Veeva Systems*
> brian.linin...@veeva.com
> www.veeva.com
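A back-of-the-envelope sketch of the scale discussed in this thread, using only the figures quoted above. It assumes Brian's 80-90M docs per instance (measured with large documents) and Matt's 10-30M docs per collection can be applied directly to 300 billion docs, and it ignores replication. Matt describes small documents, so real per-instance capacity may be considerably higher; treat the output as a rough bound, not a sizing recommendation.

```python
def ceil_div(total, per_unit):
    """Integer ceiling division, avoiding floating-point rounding."""
    return (total + per_unit - 1) // per_unit


TOTAL_DOCS = 300_000_000_000  # "300 billion plus docs" from Matt's message

# Brian's reported range: 80-90M (large) docs per Solr instance.
instances_best = ceil_div(TOTAL_DOCS, 90_000_000)
instances_worst = ceil_div(TOTAL_DOCS, 80_000_000)

# Matt's planned collection size: 10-30M docs per collection.
collections_min = ceil_div(TOTAL_DOCS, 30_000_000)
collections_max = ceil_div(TOTAL_DOCS, 10_000_000)

print(f"instances: {instances_best}-{instances_worst}")
print(f"collections: {collections_min}-{collections_max}")
```

Under these assumptions the collection count lands an order of magnitude above the roughly 1,000 collections Brian reports running in one SolrCloud, which is why a perf test at similar scale is worth doing before committing to a single cluster.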
Hadoop Auth module / HadoopAuthPlugin; anyone using?
Is anyone using the hadoop-auth Solr module? It's called this in Solr 9; in Solr 8 and previously it was a part of Solr-core. HadoopAuthPlugin is the class name. Apparently it supports Kerberos auth; perhaps more things. There is some support burden to maintaining Solr as a whole with this included. With this email here, I'm trying to ascertain if anyone would care if it was outright removed.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley
Re: Reversed leftOuterJoin on clause returns incorrect results
No worries, I appreciate the response. I just got a Jira account set up, so I'll open up a bug around it and might take a stab at fixing.

Thanks,

On Tue, Feb 28, 2023 at 8:16 PM t sornin wrote:
> Hi Geren,
>
> Sorry about the initial response, I just looked over your expression and
> didn't read the full context. I've experienced similar behavior when I've
> deviated (unintentionally) from the documentation. I think it's worth
> raising a JIRA, SEs can definitely benefit from better error
> responses/handling.
>
> Mathew
>
> On Tue, Feb 28, 2023 at 9:37 AM Geren White wrote:
>
> > Yea sorry the question was if that's a bug and if it should throw an error?
> > The results right now are pretty confusing and I could see it leading to
> > some bugs.
> >
> > On Mon, Feb 27, 2023 at 5:23 PM t sornin wrote:
> >
> > > Your join key is reversed. It should be "on=item_id_2=item_id" which only
> > > returns the left stream (first stream param for leftOuterJoin) since there
> > > is no match.
> > >
> > > Hope this helps.
> > >
> > > Mathew
> > >
> > > On Mon, Feb 27, 2023, 4:25 PM Geren White wrote:
> > >
> > > > Hello,
> > > >
> > > > When testing out joins in solr streams we noticed that when the on clause
> > > > is reversed the results are incorrect and the join will return as if
> > > > everything matched.
> > > >
> > > > For example if you have streamA and streamB with the following tuples:
> > > >
> > > > streamA:
> > > > {
> > > >   item_id_1: "123",
> > > >   item_id_2: "456"
> > > > }
> > > >
> > > > streamB:
> > > > {
> > > >   item_id: "789",
> > > >   user_id: "0"
> > > > }
> > > >
> > > > Executing a stream like below:
> > > > leftOuterJoin(
> > > >   search(collection-a, q=*:*, fq="item_id_1:123", fl="item_id_1,item_id_2", qt="/export", sort="item_id_2 desc"),
> > > >   search(collection-b, fq="user_id:0",q="*:*",qt="/export",fl="item_id,user_id",sort="item_id desc"),
> > > >   on="item_id=item_id_2")
> > > >
> > > > This will return something like this where all tuples are joined even
> > > > though item_id doesn't match item_id_2:
> > > > {
> > > >   item_id_1: "123",
> > > >   item_id_2: "456",
> > > >   item_id: "789",
> > > >   user_id: "0"
> > > > }
> > > >
> > > > Note that the first column in the on clause is from the second table.
> > > >
> > > > Is this expected behavior? We're running solr 8.11.1 and noticed it
> > > > while setting up a new query. It's an easy fix to switch the on clause but
> > > > seems like it should throw an error or handle it properly. Happy to open up
> > > > a bug ticket if this isn't expected.
> > > >
> > > > Thanks,
> > > > --
> > > > *Geren White | Senior Director, Engineering*
> > > > *(e)* ge...@1stdibs.com
> >
> > --
> > *Geren White | Senior Director, Engineering*
> > *(e)* ge...@1stdibs.com

--
*Geren White | Senior Director, Engineering*
*(e)* ge...@1stdibs.com
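To make the expected result concrete, here is a small sketch of left outer join semantics applied to the two tuples from Geren's example. It is a plain in-memory hash join written for illustration, not Solr's implementation (the streaming `leftOuterJoin` works on sorted streams), and the function name and argument order are hypothetical. The key pairing follows Mathew's corrected clause, with the left stream's field first.

```python
def left_outer_join(left, right, left_key, right_key):
    """Join each left tuple with every right tuple whose key matches.

    Left tuples without a match are emitted unchanged.
    """
    by_key = {}
    for r in right:
        by_key.setdefault(r[right_key], []).append(r)

    joined = []
    for l in left:
        matches = by_key.get(l[left_key])
        if matches:
            joined.extend({**l, **r} for r in matches)
        else:
            joined.append(dict(l))  # no match: keep only the left fields
    return joined


stream_a = [{"item_id_1": "123", "item_id_2": "456"}]
stream_b = [{"item_id": "789", "user_id": "0"}]

# Equivalent of on="item_id_2=item_id": left field first, right field second.
result = left_outer_join(stream_a, stream_b, "item_id_2", "item_id")
print(result)
```

Because "456" never equals "789", the result holds only the left tuple, with no `item_id` or `user_id` fields. That is the output Mathew describes for the corrected clause, and it differs from the merged tuple reported for the reversed clause.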
Re: Suggester index replication
When we were using old style replication, I did have the suggester lexicon replicated along with other config files, and I think I triggered a suggester build on replication or maybe commit (which happens with every replication). I remember it being kind of fussy to set up. You might want to set up an extra downstream machine to play with until you get it right.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On Mar 2, 2023, at 10:42 AM, gnandre wrote:
>
> Thanks! I am using non-cloud mode at the moment. So, there is no way to
> just index it to the index node and get it replicated to the search nodes?
> Do I have to index to each search node?
>
> Do you know why the suggester indexing does not follow the usual search
> indexing model?
>
> On Thu, Mar 2, 2023, 12:22 PM Walter Underwood wrote:
>
>> You need to send a build request to each node. I used to have some code to
>> dig out the nodes from a cluster status, then send a build to each one, but
>> I think that is marooned at my previous company. It isn’t super hard, just
>> dig it out of the JSON.
>>
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/ (my blog)
>>
>>> On Mar 2, 2023, at 9:03 AM, gnandre wrote:
>>>
>>> Can anybody please answer this? Many thanks in advance!
>>>
>>> On Wed, Feb 16, 2022 at 12:52 AM gnandre wrote:
>>>
>>>> Is there a way to get suggester index replicated to all search nodes from
>>>> index node? Do I need to build suggester index for each search node
>>>> separately?
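For the non-cloud case, one simple option is to keep a list of search-node core URLs and send each one a build request after replication finishes. The sketch below assumes the suggester is reachable at a `/suggest` handler and that the dictionary is named `mySuggester`; the host names, the handler path, and the `fetch` parameter (there only so the function can be exercised without a network) are all hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def rebuild_suggesters(core_urls, dictionary, handler="/suggest", fetch=urlopen):
    """Send a suggester build request to every core URL.

    Returns the core URLs whose build request failed.
    """
    query = urlencode({
        "suggest": "true",
        "suggest.build": "true",
        "suggest.dictionary": dictionary,
    })
    failed = []
    for core_url in core_urls:
        url = f"{core_url.rstrip('/')}{handler}?{query}"
        try:
            with fetch(url) as response:
                if response.status != 200:
                    failed.append(core_url)
        except OSError:  # urllib's URLError and HTTPError are OSError subclasses
            failed.append(core_url)
    return failed


# Hypothetical search nodes; replace with the real core URLs.
SEARCH_NODES = [
    "http://search1:8983/solr/products",
    "http://search2:8983/solr/products",
]
```

Calling `rebuild_suggesters(SEARCH_NODES, "mySuggester")` from whatever job kicks off replication would return the nodes that still need attention. As Walter suggests, it is worth trying this against a spare downstream machine first.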
Result Grouping on alias collection
Hello everyone,

I hope this email finds you well. I am reaching out to discuss a strange situation we are facing with result grouping.

We currently have two collections, CollectionA and CollectionB, both of which contain an identical document, document1. We have created a new alias collection that includes both CollectionA and CollectionB.

However, when attempting to perform result grouping on this new alias collection, we are encountering an issue where two instances of document1 appear in the output.

http://10.144.10.36:8983/solr/aliasCollection/select?q=id:document1&rows=40&group=true&group.field=fieldA&group.limit=20

I have attempted to locate official documentation regarding this issue, but have been unsuccessful. The closest resource I found was this link:
https://markmail.org/message/2ykh7wyexbnquc6s?q=list:org.apache.lucene.solr-user

Please let me know if you have any insights or suggestions on how to resolve this issue.

Thank you for your time and attention.

Best regards,
Vinayak Hegde
Re: Result Grouping on alias collection
Hello Vinayak.

Please see the second caveat at
https://solr.apache.org/guide/solr/latest/query-guide/result-grouping.html#distributed-result-grouping-caveats

Two collections are equivalent to two or more shards. Sic. Grouping is not a full-fledged map-reduce engine.

On Fri, Mar 3, 2023 at 6:46 AM Vinayak Hegde wrote:
> Hello everyone,
> I hope this email finds you well. I am reaching out to discuss a strange
> situation we are facing with result grouping.
> We currently have two collections, CollectionA and CollectionB, both of
> which contain an identical document, document1. We have created a new alias
> collection that includes both CollectionA and CollectionB.
> However, when attempting to perform result grouping on this new alias
> collection, we are encountering an issue where two instances of document1
> appear in the output.
>
> http://10.144.10.36:8983/solr/aliasCollection/select?q=id:document1&rows=40&group=true&group.field=fieldA&group.limit=20
> I have attempted to locate official documentation regarding this issue, but
> have been unsuccessful. The closest resource I found was this link:
>
> https://markmail.org/message/2ykh7wyexbnquc6s?q=list:org.apache.lucene.solr-user
> .
> Please let me know if you have any insights or suggestions on how to
> resolve this issue.
> Thank you for your time and attention.
>
> Best regards,
> Vinayak Hegde

--
Sincerely yours
Mikhail Khludnev
https://t.me/MUST_SEARCH
A caveat: Cyrillic!
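If the same document has to live in both collections, one client-side workaround is to drop repeated ids after the grouped response comes back. This is only a sketch, not something proposed in the thread: it assumes the default grouped JSON response layout, a unique key field named `id`, and the `fieldA` grouping from Vinayak's query. The sample response and the group value `"x"` are made up.

```python
def dedupe_grouped(response, group_field, id_field="id"):
    """Drop documents whose id was already seen in a grouped Solr response."""
    seen = set()
    groups = []
    for group in response["grouped"][group_field]["groups"]:
        docs = []
        for doc in group["doclist"]["docs"]:
            if doc[id_field] in seen:
                continue  # same document returned by another collection
            seen.add(doc[id_field])
            docs.append(doc)
        groups.append({"groupValue": group["groupValue"], "docs": docs})
    return groups


# Made-up response: document1 comes back once from each aliased collection.
SAMPLE_RESPONSE = {
    "grouped": {
        "fieldA": {
            "matches": 2,
            "groups": [
                {
                    "groupValue": "x",
                    "doclist": {
                        "numFound": 2,
                        "start": 0,
                        "docs": [
                            {"id": "document1", "fieldA": "x"},
                            {"id": "document1", "fieldA": "x"},
                        ],
                    },
                }
            ],
        }
    }
}

deduped = dedupe_grouped(SAMPLE_RESPONSE, "fieldA")
print(deduped)
```

Note that counts such as `matches` and `numFound` in the raw response still include the duplicates, so anything derived from them would need the same adjustment.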