Thanks, Mikhail. Will get this checked.

*Thanks & Regards,*
*Uday Kumar*
*Product Search Tech*

On Fri, Apr 4, 2025 at 3:51 PM Mikhail Khludnev <m...@apache.org> wrote:

> You may try to experiment with query time {!join mode=score max} to join
> from prods to supplier.
> Query time reduces supplier reindex burden.
> And you'll get a scored and cropped page of suppliers, but completely lost
> product hits.
> Then you can search for products by supplier again in [subquery] result
> transformer, it's going to be slow since it proceeds one by one.
> It seems like custom development is needed for either bulking subquery
> calls, maybe custom {!join} impl,
> or perhaps complete https://issues.apache.org/jira/browse/SOLR-7830 after
> all.
>
> On Fri, Apr 4, 2025 at 12:02 PM Uday Kumar <uday.p...@indiamart.com.invalid>
> wrote:
>
> > Hi all,
> > In our index, we have data of suppliers along with their products which we
> > display on front-end, wrt search requests.
> >
> > *Example: For a supplier with id: 678, we have 2 products in our index*
> > *product-id(unique)*
> > *document1:*
> > {
> > product-id: 123
> > product-price: 2000rs
> > product-name: Jute bags
> > *supplier-id: 678*
> > *company-name: BagFactoryLimited*
> > }
> >
> > *document2:*
> > {
> > product-id: 863
> > product-price: 4500rs
> > product-name: trolley bags
> > *supplier-id: 678*
> > *company-name: BagFactoryLimited*
> > }
> >
> > *As you can see from above, each document in our index contains product
> > details i.e product-id, product-price, product-name and also supplier
> > details i.e supplier-id, company-name*
> >
> > *Problem1: (while indexing)*
> > Here, whenever there is a change in supplier specific details/field, we are
> > re-indexing all the products of the supplier although the supplier data
> > will be the same in all of his products.
> > *FYI*
> > We re-index ~5Cr documents per day
> >
> > *We would like to know if there is any better way to optimize this, which
> > helps to avoid indexing of redundant data*
> >
> > *Problem2: (while querying)*
> > Now, when the data in our current index is queried, we display the single
> > most relevant product of a supplier. [even if the query matches 1 or more
> > documents in our index]
> >
> > For this we are using a *collapse query* on supplier-id field (as we don't
> > know relationship between documents) [which is resource intensive]
> > *Ex:*
> > fq={!collapse field=supplier-id}
> >
> > *FYI*
> > We serve ~25 Lakh Queries per day
> >
> > *We would like to know if there is any better way to organize index, so
> > that we can avoid such resource intensive queries, thereby optimizing
> > search response*
> >
> > *Our Solr Infra Stats: FYI*
> > *Version:* v9.6.1
> > *No. of nodes:* 8
> > *No. of shards:* 62
> > *Heap per node:* 12G
> > *RAM per node:* 50G
> > *No. of cpu cores per node:* 16
> > *Count of docs:* ~20Cr
> > *Size of Index:* ~250G
> > *Routing used:* implicit
> >
> > *Thanks & Regards,*
> > *Uday Kumar*
> > *Product Search Tech*
>
> --
> Sincerely yours
> Mikhail Khludnev
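
For reference, the query-time join plus [subquery] idea from Mikhail's reply could look roughly like the request parameters below. This is only a sketch under assumptions that are not in the thread: supplier documents are indexed separately from product documents in the same index, both carry a `supplier_id` field, a hypothetical `doc_type` field tells them apart, the field names use underscores instead of the hyphenated names above, and the user query only matches product fields. As far as I know the Solr join parser spells the scoring option as `score=max`. Sharding and co-location constraints of joins are ignored here, and the helper name `build_supplier_join_params` is made up for illustration.

```python
from urllib.parse import urlencode, parse_qs


def build_supplier_join_params(user_query, rows=10):
    """Return Solr request params for a scored products-to-suppliers join.

    Assumption (hypothetical schema): supplier and product documents share
    one index, both store `supplier_id`, and `doc_type` separates them.
    """
    return {
        # Score each supplier by its best-matching product (score=max).
        "q": "{!join from=supplier_id to=supplier_id score=max v=$prodq}",
        # The product-side query; assumed to match product fields only.
        "prodq": user_query,
        # Keep only supplier documents in the main result page.
        "fq": "doc_type:supplier",
        "rows": rows,
        # Attach the top product per supplier via the [subquery] transformer.
        "fl": "supplier_id,company_name,score,top_product:[subquery]",
        "top_product.q": user_query,
        # Restrict each subquery to the supplier of the current result row.
        "top_product.fq": "{!terms f=supplier_id v=$row.supplier_id}",
        "top_product.rows": 1,
        "top_product.fl": "product_id,product_name,product_price,score",
    }


if __name__ == "__main__":
    # Print the encoded query string that would be sent to /select.
    print(urlencode(build_supplier_join_params("product_name:bags")))
```

With this layout a supplier-level change would touch one supplier document instead of all of its products, at the cost of one subquery per returned supplier row, which is the slowness Mikhail mentions.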