Thanks Mikhail

Will get this checked

*Thanks & Regards,*
*Uday Kumar*
*Product Search Tech*


On Fri, Apr 4, 2025 at 3:51 PM Mikhail Khludnev <m...@apache.org> wrote:

> You may try to experiment with a query-time {!join score=max} to join
> from prods to supplier.
> Query time reduces the supplier reindex burden.
> And you'll get a scored and cropped page of suppliers, but completely lose
> the product hits.
> Then you can search for products by supplier again in [subquery] result
> transformer, it's going to be slow since it proceeds one by one.
> It seems like custom development is needed for either bulking subquery
> calls, maybe custom {!join} impl,
> or perhaps complete https://issues.apache.org/jira/browse/SOLR-7830 after
> all.
>
>
>
> On Fri, Apr 4, 2025 at 12:02 PM Uday Kumar <uday.p...@indiamart.com.invalid>
> wrote:
>
> > Hi all,
> > In our index, we have data of suppliers along with their products, which we
> > display on the front end in response to search requests.
> >
> >
> >
> > *Example: For a supplier with id: 678, we have 2 products in our index*
> > *product-id(unique)*
> > *document1:*
> > {
> > product-id: 123
> > product-price: 2000rs
> > product-name: Jute bags
> >
> > *supplier-id: 678*
> > *company-name: BagFactoryLimited*
> > }
> >
> > *document2:*
> > {
> > product-id: 863
> > product-price: 4500rs
> > product-name: trolley bags
> >
> > *supplier-id: 678*
> > *company-name: BagFactoryLimited*
> > }
> >
> >
> >
> > *As you can see from above, each document in our index contains product
> > details, i.e. product-id, product-price, product-name, and also supplier
> > details, i.e. supplier-id, company-name*
> >
> > *Problem1: (while indexing)*
> > Here, whenever there is a change in a supplier-specific detail/field, we are
> > re-indexing all the products of the supplier, although the supplier data
> > will be the same in all of their products.
> > *FYI*
> > We re-index ~5Cr documents per day
> >
> >
> > *We would like to know, if there is any better way to optimize this which
> > helps to avoid indexing of redundant data*
> > *Problem2: (while querying)*
> > Now, when the data in our current index is queried, we display the single
> > most relevant product of a supplier. [even if the query matches 1 or more
> > documents in our index]
> >
> > For this we are using a *collapse query* on the supplier-id field (as we don't
> > know the relationship between documents) [which is resource intensive]
> > *Ex:*
> > fq={!collapse field=supplier-id}
> >
> > *FYI*
> > We serve ~25 Lakh Queries per day
> >
> > *We would like to know if there is any better way to organize the index, so
> > that we can avoid such resource-intensive queries, thereby optimizing
> > search response*
> >
> > *Our Solr Infra Stats: FYI*
> > *Version:* v9.6.1
> > *No. of nodes:* 8
> > *No. of shards:* 62
> > *Heap per node: *12G
> > *RAM per node: *50G
> > *No. of cpu cores per node: *16
> > *Count of docs:* ~20Cr
> > *Size of Index: *~250G
> > *Routing used:* implicit
> >
> > *Thanks & Regards,*
> > *Uday Kumar*
> > *Product Search Tech*
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>
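
For illustration, here is a minimal sketch of the query-time join plus [subquery] approach suggested above, written as Python that only builds the request parameters. It assumes suppliers are indexed as their own documents keyed by `id`, and that products carry a `supplier_id` field. Those field names, along with `company_name`, `product_id`, `product_name`, `product_price` and the helper `build_supplier_query`, are hypothetical and not taken from the original index. Constraints on joins across a multi-shard collection are not addressed here.

```python
from urllib.parse import urlencode


def build_supplier_query(product_query: str, rows: int = 10) -> dict:
    """Build Solr request parameters for a products-to-suppliers join.

    All field names here are assumptions made for this sketch.
    """
    return {
        # Join from product documents to supplier documents. score=max gives
        # each supplier the score of its best-matching product.
        "q": "{!join from=supplier_id to=id score=max v=$pq}",
        # The product-level query, dereferenced by v=$pq above.
        "pq": product_query,
        "rows": str(rows),
        # Attach a per-supplier subquery result under the name "products".
        "fl": "id,company_name,score,products:[subquery]",
        # Runs once per returned supplier row, hence the slowness noted above.
        # This only fetches a product of that supplier; it does not re-rank
        # by the user's query.
        "products.q": "{!terms f=supplier_id v=$row.id}",
        "products.rows": "1",
        "products.fl": "product_id,product_name,product_price",
    }


params = build_supplier_query("product_name:bags")
query_string = urlencode(params)
```

The resulting `query_string` could be appended to a collection's `/select` endpoint. Since the subquery runs once per supplier row, keeping `rows` small limits the extra cost.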
