Re: Suggester index replication

2023-03-02 Thread gnandre
Can anybody please answer this? Many thanks in advance!

On Wed, Feb 16, 2022 at 12:52 AM gnandre  wrote:

> Is there a way to get suggester index replicated to all search nodes from
> index node? Do I need to build suggester index for each search node
> separately?
>


Donating to Solr

2023-03-02 Thread gnandre
I find this open source project very useful. Is there any way to donate
money for it?


Re: Suggester index replication

2023-03-02 Thread Walter Underwood
You need to send a build request to each node. I used to have some code to dig 
out the nodes from a cluster status, then send a build to each one, but I think 
that is marooned at my previous company. It isn’t super hard, just dig it out 
of the JSON.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 2, 2023, at 9:03 AM, gnandre  wrote:
> 
> Can anybody please answer this? Many thanks in advance!
> 
> On Wed, Feb 16, 2022 at 12:52 AM gnandre  wrote:
> 
>> Is there a way to get suggester index replicated to all search nodes from
>> index node? Do I need to build suggester index for each search node
>> separately?
>> 
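
A minimal sketch of the approach described in the reply above: pull replica locations out of a cluster status response, then send a suggester build request to each one. It assumes the usual CLUSTERSTATUS JSON layout (cluster / collections / shards / replicas, with "base_url", "core" and "state" on each replica). The "/suggest" handler path and the dictionary name "mySuggester" are assumptions that depend on your solrconfig.xml. This is not the code referred to in the message.

```python
import urllib.parse
import urllib.request


def core_urls_from_cluster_status(status, collection):
    """Return one core URL per active replica of `collection`."""
    shards = status["cluster"]["collections"][collection]["shards"]
    urls = []
    for shard in shards.values():
        for replica in shard["replicas"].values():
            if replica.get("state") == "active":
                base = replica["base_url"].rstrip("/")
                urls.append(f"{base}/{replica['core']}")
    return sorted(urls)


def build_url(core_url, dictionary, handler="/suggest"):
    """Compose a suggester build request (handler path is an assumption)."""
    params = urllib.parse.urlencode({
        "suggest": "true",
        "suggest.build": "true",
        "suggest.dictionary": dictionary,
    })
    return f"{core_url}{handler}?{params}"


def build_everywhere(core_urls, dictionary, timeout=600):
    """Send the build request to every core; record HTTP status or error text."""
    results = {}
    for core_url in core_urls:
        try:
            with urllib.request.urlopen(build_url(core_url, dictionary),
                                        timeout=timeout) as resp:
                results[core_url] = resp.status
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            results[core_url] = str(exc)
    return results
```

In a non-cloud setup there is no cluster status to read, so a hand-maintained list of search node core URLs could be passed to build_everywhere() instead.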



Re: Donating to Solr

2023-03-02 Thread Doug Turnbull
Not sure about Solr, but you can donate to the Apache Software Foundation:

https://www.apache.org/foundation/contributing.html

On Thu, Mar 2, 2023 at 12:04 PM gnandre  wrote:

> I find this open source project very useful. Is there any way to donate
> money for it?
>


Re: Donating to Solr

2023-03-02 Thread gnandre
Thanks!

On Thu, Mar 2, 2023, 1:02 PM Doug Turnbull
 wrote:

> Not sure about Solr, but you can donate to the Apache Software Foundation:
>
> https://www.apache.org/foundation/contributing.html
>
> On Thu, Mar 2, 2023 at 12:04 PM gnandre  wrote:
>
> > I find this open source project very useful. Is there any way to donate
> > money for it?
> >
>


Re: Suggester index replication

2023-03-02 Thread gnandre
Thanks! I am using non-cloud mode at the moment. So, there is no way to
just index it to the index node and get it replicated to the search nodes?
Do I have to index to each search node?

Do you know why the suggester indexing does not follow the usual search
indexing model?

On Thu, Mar 2, 2023, 12:22 PM Walter Underwood 
wrote:

> You need to send a build request to each node. I used to have some code to
> dig out the nodes from a cluster status, then send a build to each one, but
> I think that is marooned at my previous company. It isn’t super hard, just
> dig it out of the JSON.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> > On Mar 2, 2023, at 9:03 AM, gnandre  wrote:
> >
> > Can anybody please answer this? Many thanks in advance!
> >
> > On Wed, Feb 16, 2022 at 12:52 AM gnandre 
> wrote:
> >
> >> Is there a way to get suggester index replicated to all search nodes
> from
> >> index node? Do I need to build suggester index for each search node
> >> separately?
> >>
>
>


Re: Donating to Solr

2023-03-02 Thread Jan Høydahl
I know the project has been looking for more hardware resources for running 
tests, and some companies are currently sponsoring benchmark hardware and 
Jenkins servers for other OS'es. So that is one non-money way to contribute. In 
any case you could start the dialogue with Apache centrally and they will 
contact the Solr project for coordination of how to channel solr-labeled 
donations.

Jan

> 2. mar. 2023 kl. 19:36 skrev gnandre :
> 
> Thanks!
> 
> On Thu, Mar 2, 2023, 1:02 PM Doug Turnbull
>  wrote:
> 
>> Not sure about Solr, but you can donate to the Apache Software Foundation:
>> 
>> https://www.apache.org/foundation/contributing.html
>> 
>> On Thu, Mar 2, 2023 at 12:04 PM gnandre  wrote:
>> 
>>> I find this open source project very useful. Is there any way to donate
>>> money for it?
>>> 
>> 



Re: Donating to Solr

2023-03-02 Thread David Smiley
Another way to donate if it's a significant sum is to fund Outreachy
https://www.outreachy.org with some note that it's intended for sponsoring
an Apache Solr based intern project.  This basically pays stipends to an
intern.  Unfortunately, sending money to the ASF doesn't work to fund this
sort of thing due to ASF's policies to avoid any appearance of
pay-for-work.  I'm wrapping up an Outreachy project as a mentor but it
almost didn't happen due to lack of funding.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Mar 2, 2023 at 1:47 PM Jan Høydahl  wrote:

> I know the project has been looking for more hardware resources for
> running tests, and some companies are currently sponsoring benchmark
> hardware and Jenkins servers for other OS'es. So that is one non-money way
> to contribute. In any case you could start the dialogue with Apache
> centrally and they will contact the Solr project for coordination of how to
> channel solr-labeled donations.
>
> Jan
>
> > 2. mar. 2023 kl. 19:36 skrev gnandre :
> >
> > Thanks!
> >
> > On Thu, Mar 2, 2023, 1:02 PM Doug Turnbull
> >  wrote:
> >
> >> Not sure about Solr, but you can donate to the Apache Software
> Foundation:
> >>
> >> https://www.apache.org/foundation/contributing.html
> >>
> >> On Thu, Mar 2, 2023 at 12:04 PM gnandre 
> wrote:
> >>
> >>> I find this open source project very useful. Is there any way to donate
> >>> money for it?
> >>>
> >>
>
>


Re: Number of Collections in a SolrCloud

2023-03-02 Thread David Smiley
I second Brian's experience.  Specific version & numbers reached vary
somewhat.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, Jun 28, 2021 at 7:23 PM Brian Lininger 
wrote:

> Hi Matt,
> We're currently running Solr 6.6.6 using Solr Cloud.  Depending on the
> application and load, we've been able to stably run upwards of 1,000
> collections without a problem in a single SolrCloud.  We try to keep the
> total replica count per Solr instance to less than 500, but have run
> 600-700 replicas per Solr instance without issue if the user load is
> light.  Our Solr document sizes are pretty large, but we're able to handle
> 80-90M docs per instance with 700-800G of total index size.  300B docs does
> seem quite large, but if the size of your docs isn't huge and you've got
> enough shards in your collection then I wouldn't be surprised if it worked
> fine.  The only thing we learned is that we had to change the number of
> threads Solr uses for loading replicas: with our high numbers of replicas, 8
> threads would take forever upon startup (look at 'coreLoadThreads').  At
> the very least, perf test out something on a similar scale of what you're
> thinking and see how it scales.
> Best of Luck,
> Brian
>
> On Mon, Jun 28, 2021 at 12:50 PM mtn search  wrote:
>
> > I am guessing the consideration of hitting the limit of the number of
> > collections within a SolrCloud is not a common experience.  I wanted to
> > raise this question again if perhaps anyone has any lessons learned or
> > things to consider.  We are currently planning work to migrate 300
> billion
> > plus docs on the master nodes of a legacy master/slave installation to
> > SolrCloud.  I figure that we will push the limits of a single SolrCloud
> > instance.
> >
> > Thanks again,
> > Matt
> >
> > On Fri, Jun 25, 2021 at 10:15 AM mtn search  wrote:
> >
> > > Hello,
> > >
> > > I am interested to learn what others have experienced in terms of
> hitting
> > > a limit for the number of collections supported by a SolrCloud
> instance.
> > >
> > > Also, does anyone have any tips/questions for evaluating when to
> create a
> > > new SolrCloud and begin adding new collections to it rather than grow
> the
> > > original SolrCloud instance?
> > >
> > > I realize there are likely a number of characteristics of a SolrCloud
> to
> > > evaluate.  My guess is network resources will be the key factor.  I am
> > > thinking of a SolrCloud with a 5, or 7 node Zookeeper ensemble.  With
> > > Collections containing 10-30 million docs, small doc size, heavy
> > indexing,
> > > small query load.
> > >
> > > Thanks,
> > > Matt
> > >
> >
>
>
> --
>
>
> *Brian Lininger*
> Technical Architect, Infrastructure & Search
> *Veeva Systems *
> brian.linin...@veeva.com
>
> *Zoom:* https://veeva.zoom.us/j/8113896271
>
> www.veeva.com
>
>
>
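
A rough back-of-envelope sketch using only the figures quoted in this thread: 300 billion documents to migrate, and 80-90M documents per Solr instance in Brian's deployment. Brian's documents are described as large and Matt's as small, so the real per-instance capacity could differ a lot; treat the result as an order-of-magnitude estimate, not a sizing recommendation.

```python
def instances_needed(total_docs, docs_per_instance):
    """Ceiling division: instances required to hold total_docs."""
    return -(-total_docs // docs_per_instance)


TOTAL_DOCS = 300_000_000_000  # "300 billion plus docs" from the thread

low = instances_needed(TOTAL_DOCS, 90_000_000)   # 90M docs per instance
high = instances_needed(TOTAL_DOCS, 80_000_000)  # 80M docs per instance
print(f"{low} to {high} Solr instances")  # prints "3334 to 3750 Solr instances"
```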


Hadoop Auth module / HadoopAuthPlugin; anyone using?

2023-03-02 Thread David Smiley
Is anyone using the hadoop-auth Solr module?  It's called this in Solr 9;
in Solr 8 and previously it was a part of Solr-core.  HadoopAuthPlugin is
the class name.  Apparently it supports Kerberos auth; perhaps more things.

There is some support burden to maintaining Solr as a whole with this
included.
With this email here, I'm trying to ascertain if anyone would care if it
was outright removed.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


Re: Reversed leftOuterJoin on clause returns incorrect results

2023-03-02 Thread Geren White
No worries, I appreciate the response. I just got a Jira account set up, so
I'll open a bug for it and might take a stab at fixing it.

Thanks,

On Tue, Feb 28, 2023 at 8:16 PM t sornin  wrote:

> Hi Geren,
>
>
>
> Sorry about the initial response, I just looked over your expression and
> didn't read the full context.  I've experienced similar behavior when I've
> deviated (unintentionally) from the documentation.  I think it's worth
> raising a JIRA, SEs can definitely  benefit from better error
> responses/handling.
>
>
>
> Mathew
>
> On Tue, Feb 28, 2023 at 9:37 AM Geren White  wrote:
>
> > Yea sorry the question was if that's a bug and if it should throw an
> error?
> > The results right now are pretty confusing and I could see it leading to
> > some bugs.
> >
> > On Mon, Feb 27, 2023 at 5:23 PM t sornin  wrote:
> >
> > > Your join key is reversed.  It should be "on=item_id_2=item_id" which
> > only
> > > returns the left stream (first stream param for leftOuterJoin) since
> > there
> > > is no match.
> > >
> > > Hope this helps.
> > >
> > > Mathew
> > >
> > > On Mon, Feb 27, 2023, 4:25 PM Geren White  wrote:
> > >
> > > > Hello,
> > > >
> > > > When testing out joins in solr streams we noticed that when the on
> > clause
> > > > is reversed the results are incorrect and the join will return as if
> > > > everything matched.
> > > >
> > > > For example if you have streamA and streamB with the following tuples:
> > > >
> > > > streamA:
> > > > {
> > > >   item_id_1: "123",
> > > >   item_id_2: "456"
> > > > }
> > > >
> > > > streamB:
> > > > {
> > > >   item_id: "789",
> > > >   user_id: "0"
> > > > }
> > > >
> > > > Executing a stream like below:
> > > > leftOuterJoin(
> > > >   search(collection-a, q=*:*, fq="item_id_1:123",
> > > fl="item_id_1,item_id_2",
> > > > qt="/export", sort="item_id_2 desc"),
> > > >   search(collection-b,
> > > >
> fq="user_id:0",q="*:*",qt="/export",fl="item_id,user_id",sort="item_id
> > > > desc"),
> > > > on="item_id=item_id_2")
> > > >
> > > > This will return something like this where all tuples are joined even
> > > > though item_id doesn't match item_id_2:
> > > > {
> > > >   item_id_1: "123",
> > > >   item_id_2: "456",
> > > >   item_id: "789",
> > > >   user_id: "0"
> > > > }
> > > >
> > > > Note that the first column in the on clause is from the second table.
> > > >
> > > > Is this expected behavior? We're running solr 8.11.1 and noticed it
> > > > while setting up a new query. It's an easy fix to switch the on
> clause
> > > but
> > > > seems like it should throw an error or handle it properly. Happy to
> > open
> > > up
> > > > a bug ticket if this isn't expected.
> > > >
> > > > Thanks,
> > > > --
> > > > *Geren White | Senior Director, Engineering*
> > > > *(e)* ge...@1stdibs.com
> > > >
> > >
> >
> >
> > --
> > *Geren White | Senior Director, Engineering*
> > *(e)* ge...@1stdibs.com
> >
>


-- 
*Geren White | Senior Director, Engineering*
*(e)* ge...@1stdibs.com
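
To make the "on" clause orientation discussed in this thread concrete, here is a small hash-based illustration of left outer join semantics where the clause reads "leftField=rightField". It is not how Solr's leftOuterJoin is implemented (the streams in the thread are sorted exports), and the ValueError on a missing left field is the stricter behaviour the thread asks for, not what Solr 8.11.1 does according to the report. Letting left-stream values win on a field name clash is also an assumption.

```python
from collections import defaultdict


def left_outer_join(left, right, on):
    """Left outer join of two lists of dicts; `on` is "leftField=rightField"."""
    left_field, right_field = (part.strip() for part in on.split("="))

    # Index the right stream by its join field.
    index = defaultdict(list)
    for tup in right:
        if right_field in tup:
            index[tup[right_field]].append(tup)

    joined = []
    for tup in left:
        if left_field not in tup:
            raise ValueError(
                f"left stream has no field {left_field!r}; "
                "is the on clause reversed?"
            )
        matches = index.get(tup[left_field], [])
        if matches:
            joined.extend({**match, **tup} for match in matches)
        else:
            joined.append(dict(tup))  # unmatched left tuple passes through
    return joined


stream_a = [{"item_id_1": "123", "item_id_2": "456"}]
stream_b = [{"item_id": "789", "user_id": "0"}]

print(left_outer_join(stream_a, stream_b, on="item_id_2=item_id"))
# prints [{'item_id_1': '123', 'item_id_2': '456'}]
```

With the tuples from the original report, the correctly oriented clause returns only the left tuple because "456" does not match "789"; passing on="item_id=item_id_2" raises ValueError in this sketch instead of silently joining.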


Re: Suggester index replication

2023-03-02 Thread Walter Underwood
When we were using old style replication, I did have the suggester lexicon
replicated along with other config files, and I think I triggered a suggester
build on replication or maybe commit (which happens with every replication).
I remember it being kind of fussy to set up. You might want to set up an extra
downstream machine to play with until you get it right.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 2, 2023, at 10:42 AM, gnandre  wrote:
> 
> Thanks! I am using non-cloud mode at the moment. So, there is no way to
> just index it to the index node and get it replicated to the search nodes?
> Do I have to index to each search node?
> 
> Do you know why the suggester indexing does not follow the usual search
> indexing model?
> 
> On Thu, Mar 2, 2023, 12:22 PM Walter Underwood 
> wrote:
> 
>> You need to send a build request to each node. I used to have some code to
>> dig out the nodes from a cluster status, then send a build to each one, but
>> I think that is marooned at my previous company. It isn’t super hard, just
>> dig it out of the JSON.
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
>>> On Mar 2, 2023, at 9:03 AM, gnandre  wrote:
>>> 
>>> Can anybody please answer this? Many thanks in advance!
>>> 
>>> On Wed, Feb 16, 2022 at 12:52 AM gnandre 
>> wrote:
>>> 
>>>> Is there a way to get suggester index replicated to all search nodes from
>>>> index node? Do I need to build suggester index for each search node
>>>> separately?
>>>>
>> 
>> 



Result Grouping on alias collection

2023-03-02 Thread Vinayak Hegde
Hello everyone,
I hope this email finds you well. I am reaching out to discuss a strange
situation we are facing with result grouping.
We currently have two collections, CollectionA and CollectionB, both of
which contain an identical document, document1. We have created a new alias
collection that includes both CollectionA and CollectionB.
However, when attempting to perform result grouping on this new alias
collection, we are encountering an issue where two instances of document1
appear in the output.
http://10.144.10.36:8983/solr/aliasCollection/select?q=id:document1&rows=40&group=true&group.field=fieldA&group.limit=20
I have attempted to locate official documentation regarding this issue, but
have been unsuccessful. The closest resource I found was this link:
https://markmail.org/message/2ykh7wyexbnquc6s?q=list:org.apache.lucene.solr-user
.
Please let me know if you have any insights or suggestions on how to
resolve this issue.
Thank you for your time and attention.

Best regards,
Vinayak Hegde


Re: Result Grouping on alias collection

2023-03-02 Thread Mikhail Khludnev
Hello Vinayak.
Please see the second caveat here:
https://solr.apache.org/guide/solr/latest/query-guide/result-grouping.html#distributed-result-grouping-caveats
An alias over two collections is equivalent to a query over two or more
shards. Grouping is not a full-fledged map-reduce engine.

On Fri, Mar 3, 2023 at 6:46 AM Vinayak Hegde  wrote:

> Hello everyone,
> I hope this email finds you well. I am reaching out to discuss a strange
> situation we are facing with result grouping.
> We currently have two collections, CollectionA and CollectionB, both of
> which contain an identical document, document1. We have created a new alias
> collection that includes both CollectionA and CollectionB.
> However, when attempting to perform result grouping on this new alias
> collection, we are encountering an issue where two instances of document1
> appear in the output.
>
> http://10.144.10.36:8983/solr/aliasCollection/select?q=id:document1&rows=40&group=true&group.field=fieldA&group.limit=20
> I have attempted to locate official documentation regarding this issue, but
> have been unsuccessful. The closest resource I found was this link:
>
> https://markmail.org/message/2ykh7wyexbnquc6s?q=list:org.apache.lucene.solr-user
> .
> Please let me know if you have any insights or suggestions on how to
> resolve this issue.
> Thank you for your time and attention.
>
> Best regards,
> Vinayak Hegde
>


-- 
Sincerely yours
Mikhail Khludnev
https://t.me/MUST_SEARCH
A caveat: Cyrillic!
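
One possible client-side workaround for the duplicate described in this thread, sketched under assumptions: the response is the standard grouped JSON layout (grouped / <field> / groups / doclist / docs), and "id" is the unique key shared by both collections. It only removes repeated documents from the returned groups; counts such as "matches" and "numFound" reported by Solr would still include the duplicates. Nobody in the thread proposed this, so treat it as an illustration only.

```python
def dedupe_grouped(response, group_field, id_field="id"):
    """Drop repeated documents (same id) from a grouped Solr response dict."""
    seen = set()
    groups = []
    for group in response["grouped"][group_field]["groups"]:
        docs = []
        for doc in group["doclist"]["docs"]:
            if doc[id_field] not in seen:
                seen.add(doc[id_field])
                docs.append(doc)
        if docs:  # skip groups left empty after deduplication
            groups.append({"groupValue": group["groupValue"], "docs": docs})
    return groups
```

A cleaner long-term fix is usually to avoid indexing the same document into both collections behind the alias, since the server cannot merge them for you.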