Re: Ranking of duplicate documents on solr

2024-11-20 Thread Saksham Gupta
ain a different version internally which can differ from the one visible?* *Is this the reason why the above hypothesis is failing?* Would appreciate any help regarding solr duplicity handling and my aforementioned doubts! On Thu, Aug 1, 2024 at 4:38 PM Saksham Gupta wrote: > Hi Deepak, >

Re: Filtering Based on _version_ Field not working as expected

2024-11-20 Thread Saksham Gupta
Hi All, Requesting some assistance with this problem! On Wed, Nov 20, 2024 at 2:48 PM Saksham Gupta wrote: > Hi All, > > *Understanding of Duplicity Handling by Solr* > > As per an older discussion on solr community [ref mail: *Ranking of > duplicate documents on solr*], solr

Filtering Based on _version_ Field not working as expected

2024-11-20 Thread Saksham Gupta
Hi All, *Understanding of Duplicity Handling by Solr* As per an older discussion on solr community [ref mail: *Ranking of duplicate documents on solr*], solr handles duplicate documents [documents present in multiple shards], by preferring the document which is oldest according to indexed date, a

Re: java.lang.IllegalStateException on using group.facet=true in solr query

2024-10-15 Thread Saksham Gupta
information <https://solr.apache.org/guide/solr/latest/indexing-guide/docvalues.html#enabling-docvalues> On Tue, Oct 15, 2024 at 2:08 PM Mikhail Khludnev wrote: > Hello. Please find below. > > On Tue, Oct 15, 2024 at 10:37 AM Saksham Gupta > wrote: > > > Hi All, > > >

java.lang.IllegalStateException on using group.facet=true in solr query

2024-10-15 Thread Saksham Gupta
Hi All, We are using solr 9.6.1 with a collection consisting of 59 shards [one document might be present on more than one shard]. We are trying to apply grouping based on suppliers and then apply facet based on one document of each supplier [although we are fetching more than one]. To achieve thi

Re: 5xx on Node Restart After Upgrading from Solr 8.10 to 9.6.1

2024-09-02 Thread Saksham Gupta
Hi there, Looking for a graceful way to restart a node in solr9, please help! On Mon, Sep 2, 2024 at 5:21 PM Saksham Gupta wrote: > Hi All, > We have encountered an issue while upgrading our solr cloud from v8.10 to > v9.6.1. We use a collection with 56 shards [each having a singl

5xx on Node Restart After Upgrading from Solr 8.10 to 9.6.1

2024-09-02 Thread Saksham Gupta
Hi All, We have encountered an issue while upgrading our solr cloud from v8.10 to v9.6.1. We use a collection with 56 shards [each having a single replica], hosted across a cluster of 8 nodes. Solr queries contain _route_ parameter to decide which shards/ replicas will be used for the respective qu
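For readers unfamiliar with the `_route_` parameter mentioned in this post, here is a minimal sketch of how a routed query URL might be assembled. The host, collection name, and shard names are hypothetical and do not come from the thread; the function only builds the URL and sends nothing.

```python
from urllib.parse import urlencode


def build_routed_query(base_url, collection, query, routes):
    """Build a Solr /select URL restricted to the given shards via _route_.

    base_url, collection and the shard names in `routes` are placeholders
    for illustration; they are not taken from the original post.
    """
    params = {
        "q": query,
        # With implicit routing, _route_ names the shard(s) to query;
        # several values are passed as a comma-separated list.
        "_route_": ",".join(routes),
    }
    return f"{base_url}/{collection}/select?{urlencode(params)}"


url = build_routed_query(
    "http://localhost:8983/solr", "products", "*:*", ["shard3", "shard7"]
)
print(url)
```

Only the shards named in `_route_` are consulted for such a request, so an unavailable replica of one of those shards affects exactly the queries routed to it.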

Re: Ranking of duplicate documents on solr

2024-08-01 Thread Saksham Gupta
ent wins the query race. > But remember, even in the digital cosmos, duplicates play by the > rules, mostly. > > > Deepak > "The greatness of a nation can be judged by the way its animals are treated > - Mahatma Gandhi" > > +91 73500 12833 > deic...@gmail.com >

Re: Unusually High Number of timeouts on 1 Solr Shard

2024-07-30 Thread Saksham Gupta
after full indexing. More details on [solr community mail]: *Split Shards in Solr Collection with Implicit Routing* On Fri, Jul 19, 2024 at 3:33 PM Saksham Gupta wrote: > Hi Aman, > Yes, I mean the shard is having a size of 10 gb. The index was created > from scratch, so no recov

Re: Split Shards in Solr Collection with Implicit Routing

2024-07-29 Thread Saksham Gupta
Hi All, Pinging again for some help to split shards in collection with implicit routing. Thanks! On Mon, Jul 29, 2024 at 10:41 PM Saksham Gupta wrote: > After fidgeting with the split shards API for a few hours, I stumbled > across the line in solr documentation which states that `*split

Split Shards in Solr Collection with Implicit Routing

2024-07-29 Thread Saksham Gupta
After fidgeting with the split shards API for a few hours, I stumbled across the line in solr documentation which states that `*split shard api can only be used for SolrCloud collections created with numShards parameter, meaning collections which rely on Solr’s hash-based routing mechanism.*` Neve

Ranking of duplicate documents on solr

2024-07-29 Thread Saksham Gupta
Hi Solr Developers, Which solr document will be displayed if a duplicate instance of the same document is present? In our current solr architecture, there is a possibility that a document can move from one solr shard to another shard. While the document will eventually be deleted from its old sha

Re: Limit IO while running solr backup

2024-07-23 Thread Saksham Gupta
> > As far as I know, there is no mechanism to specifically limit IOs, but I > achieved the same by limiting the number of snapshots concurrently done. > > > On Wed, Jul 10, 2024 at 08:27, Saksham Gupta > wrote: > > > Hi All, > > Pinging again for some assistance

Re: Unusually High Number of timeouts on 1 Solr Shard

2024-07-19 Thread Saksham Gupta
> there is any old recovery issue due to which the old logs or index still > exist. > > If this shard has 10 GB of data, please try to divide the data. I > hope you can try it in a development environment before applying it on > production clusters. > > Regards, > Aman

Re: Unusually High Number of timeouts on 1 Solr Shard

2024-07-17 Thread Saksham Gupta
Hi All, Pinging again for assistance. This is a very unusual case, which is ruining user experience for a particular type of search [searches mapped in the replica facing timeouts] as these requests are taking more than 3 seconds. On Wed, Jul 17, 2024 at 11:37 AM Saksham Gupta wrote: > Hi

Unusually High Number of timeouts on 1 Solr Shard

2024-07-16 Thread Saksham Gupta
Hi All, We are using a solr cloud cluster of 59 shards [1 replica for each shard] spread across 8 nodes. We have used implicit routing for indexing and searching data across these shards. Upon analyzing the timeouts on solr, we have found that more than 85% [3097/3693 timeouts on 9th July] of the

Re: Limit IO while running solr backup

2024-07-09 Thread Saksham Gupta
Hi All, Pinging again for some assistance! On Tue, Jul 9, 2024 at 4:02 PM Saksham Gupta wrote: > Hi All, > > As an effort to enhance disaster recovery for solr, we have started a solr > backup process on a daily basis. The backup runs for each replica one after > the other,

Limit IO while running solr backup

2024-07-09 Thread Saksham Gupta
Hi All, As an effort to enhance disaster recovery for solr, we have started a solr backup process on a daily basis. The backup runs for each replica one after the other, after which an integrity check is executed to check that the index has no faults. However, throughout the backup, we exper

Re: Connection Timeouts while connecting with Zookeeper for Indexing on Solr

2024-07-09 Thread Saksham Gupta
-quot-maxClientCnxns-quot-option-in-zookeeper/m-p/341370#M233508 On Wed, Jun 26, 2024 at 11:06 AM Saksham Gupta wrote: > Hi All, > Pinging again for some assistance. > > On Tue, Jun 25, 2024 at 12:32 PM Saksham Gupta < > saksham.gu...@indiamart.com> wrote: > >> Hi

Re: Connection Timeouts while connecting with Zookeeper for Indexing on Solr

2024-06-25 Thread Saksham Gupta
Hi All, Pinging again for some assistance. On Tue, Jun 25, 2024 at 12:32 PM Saksham Gupta wrote: > Hi All, > We are facing timeouts while connecting with zookeeper for indexing. > > *Exception Details:* org.apache.solr.common.SolrException: > java.util.concurrent.TimeoutExcep

Connection Timeouts while connecting with Zookeeper for Indexing on Solr

2024-06-25 Thread Saksham Gupta
Hi All, We are facing timeouts while connecting with zookeeper for indexing. *Exception Details:* org.apache.solr.common.SolrException: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper 10.128.193.20:2181,10.128.193.21:2181,10.128.193.22:2181 within 15000 ms On checking this f

Re: Load on Solr Nodes due to High GC

2024-06-23 Thread Saksham Gupta
ten > or in one big batch? > > > On Jun 20, 2024, at 12:26 AM, Saksham Gupta > > > wrote: > > > > Hi All, > > > > We have been facing extra load incidents due to higher gc count and gc > time > > causing higher response time and timeouts. >

Load on Solr Nodes due to High GC

2024-06-19 Thread Saksham Gupta
Hi All, We have been facing extra load incidents due to higher gc count and gc time causing higher response time and timeouts. Solr Cloud Cluster Details We use solr cloud v8.10 [with java 8 and G1 GC] with 8 shards where each shard is present on a single vm of 16 cores and 50 gb RAM. Size of ea

Re: Deletion on Solr Failing due to _version_ conflict

2023-12-19 Thread Saksham Gupta
y is delete request with _version_ is failing. On Tue, Dec 19, 2023 at 10:05 PM Shawn Heisey wrote: > On 12/18/23 21:42, Saksham Gupta wrote: > > I have written a small script in python to delete a product from solr > cloud > > collection on the basis of unique id, _route_ and _v

Re: Deletion on Solr Failing due to _version_ conflict

2023-12-19 Thread Saksham Gupta
Hi All, Pinging again for some assistance. On Tue, Dec 19, 2023 at 10:12 AM Saksham Gupta wrote: > Hi All, > > I have written a small script in python to delete a product from solr > cloud collection on the basis of unique id, _route_ and _version_. > > I have extracted t

Deletion on Solr Failing due to _version_ conflict

2023-12-18 Thread Saksham Gupta
Hi All, I have written a small script in python to delete a product from solr cloud collection on the basis of unique id, _route_ and _version_. I have extracted the exact values of unique id, _route_ and _version_ from solr index and used them to delete the product. But my script gives an error
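The script from this post is not shown in the archive, so the following is only a sketch of what a version-checked delete request could look like, not the author's code. The host, collection, document id, version value, and shard name are all hypothetical. It relies on Solr's optimistic concurrency behaviour: when `_version_` is greater than 1, the stored document must carry exactly that version, otherwise the update is rejected with a version conflict.

```python
import json
from urllib.parse import urlencode


def build_versioned_delete(base_url, collection, doc_id, version, route=None):
    """Return (url, json_body) for a delete-by-id guarded by _version_.

    All argument values used below are placeholders for illustration.
    Nothing is sent over the network; the caller posts the body to the
    URL with Content-Type: application/json.
    """
    params = {"wt": "json"}
    if route is not None:
        # With implicit routing, _route_ names the shard holding the document.
        params["_route_"] = route
    url = f"{base_url}/{collection}/update?{urlencode(params)}"
    # _version_ is a 64-bit long; keep it as an integer, never a float,
    # so that no digits are lost before the comparison on the Solr side.
    body = json.dumps({"delete": {"id": doc_id, "_version_": version}})
    return url, body


url, body = build_versioned_delete(
    "http://localhost:8983/solr", "products", "product-123",
    1785000000000000001, route="shard7",
)
print(url)
print(body)
```

If the version in the request differs from the indexed one in even a single digit, the delete fails, so it is worth comparing the value the script sends against the `_version_` returned by a fresh query on the same shard.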

Re: Optimal Way to Index a Document on Multiple Solr Shards

2023-12-01 Thread Saksham Gupta
Hi All, Pinging again for assistance. On Fri, Dec 1, 2023 at 4:58 PM Saksham Gupta wrote: > Hi Solr Developers, > > We have been using solr cloud with implicit sharding. The data of the > collection has been sharded into 56 shards. > > Due to business requirements, we have cr

Optimal Way to Index a Document on Multiple Solr Shards

2023-12-01 Thread Saksham Gupta
Hi Solr Developers, We have been using solr cloud with implicit sharding. The data of the collection has been sharded into 56 shards. Due to business requirements, we have created a collection such that there is a chance that a single document could be indexed on multiple shards. To implement thi

Re: Prevent Loss of Documents after Implicit Sharding

2023-11-29 Thread Saksham Gupta
Hi All, Pinging again for some assistance. On Wed, Nov 29, 2023 at 7:11 PM Saksham Gupta wrote: > Hi Solr Developers, > > Problem Statement > > We have been using solr cloud with implicit sharding. The data of the > collection was divided into 8 shards. In order to reduce

Prevent Loss of Documents after Implicit Sharding

2023-11-29 Thread Saksham Gupta
Hi Solr Developers, Problem Statement We have been using solr cloud with implicit sharding. The data of the collection was divided into 8 shards. In order to reduce the response time, we thought of sharding the data further. Therefore we planned on sharding the solr data into 56 shards to reduce

Re: Best Practices for Solr Cloud Testing Infra Setup

2023-11-06 Thread Saksham Gupta
at lets you confirm your smaller test environment will match > the larger prod environment…. Or you just have test == prod ;-). > > > > On Nov 3, 2023, at 12:26 AM, Saksham Gupta > > > wrote: > > > > Hi Solr Developers, > > > > Reaching out to inquire a

Re: Best Practices for Solr Cloud Testing Infra Setup

2023-11-03 Thread Saksham Gupta
Hi All, Pinging again for some assistance regarding the aforementioned problem. On Fri, Nov 3, 2023 at 9:56 AM Saksham Gupta wrote: > Hi Solr Developers, > > Reaching out to inquire about the best practices to set up staging and dev > environments for solr cloud. > > We ar

Best Practices for Solr Cloud Testing Infra Setup

2023-11-02 Thread Saksham Gupta
Hi Solr Developers, Reaching out to inquire about the best practices to set up staging and dev environments for solr cloud. We are using solr cloud with a cluster of 8 nodes, with ~25 GB of data (15-16 million docs) present on each shard. The resources used are optimized to serve heavy indexing/searchi

Optimal Sharding Strategy for Solr Cloud v8.10

2023-09-13 Thread Saksham Gupta
Hi All, I have been trying to reduce the response time of solr cloud (v8.10, 8 nodes). To achieve this, I have tried increasing the number of shards of solr cloud, which can help reduce data size on each shard, thereby reducing response time. I have encountered a few questions regarding sharding st

Re: AutoWarming Solr Core before Adding to the Solr Cloud Cluster

2023-07-03 Thread Saksham Gupta
lr cluster. > Approach for this problem described > > https://docs.cloudera.com/runtime/7.2.10/search-managing/topics/search-migrate-replicas.html > ? > > > On Wed, Jun 28, 2023 at 1:50 PM Saksham Gupta > wrote: > > > Hi, > > > > Is there a way of auto

Re: Solr Cloud Backup Strategy and Data Corruption Prevention

2023-06-29 Thread Saksham Gupta
cating the indexes to another data > directory, and then our organisation's backup scheduling backs up the data > each night for however long we set it to roll over for. As you can > imagine, if you have large indexes you could with rolling backups be > storing a huge amount of data so that ne

AutoWarming Solr Core before Adding to the Solr Cloud Cluster

2023-06-28 Thread Saksham Gupta
Hi, Is there a way of autowarming solr cores/shards before adding them to the solr cloud cluster? We are using solr cloud with a cluster of 8 nodes where each node consists of 1 shard of a collection. It is very troublesome to restart a solr node for some maintenance activity as it leads to mult

Re: Solr Cloud Backup Strategy and Data Corruption Prevention

2023-06-27 Thread Saksham Gupta
Hi All, Any help regarding this problem? What is the standard practice to create backup on solr cloud? On Tue, Jun 27, 2023 at 5:57 PM Saksham Gupta wrote: > Hi Solr Developers, > Reaching out to inquire about the best practices for implementing a backup > strategy in Solr Cloud. We

Solr Cloud Backup Strategy and Data Corruption Prevention

2023-06-27 Thread Saksham Gupta
Hi Solr Developers, Reaching out to inquire about the best practices for implementing a backup strategy in Solr Cloud. We recently migrated from Solr standalone (solr6.5) to Solr 8.10, where we have a collection with data divided among 8 shards using implicit routing. Until now, we have maintained

Re: Search Request Strategy on Solr Cloud

2023-06-19 Thread Saksham Gupta
Thanks Mikhail, will try these approaches. On Thu, Jun 15, 2023 at 5:40 PM Mikhail Khludnev wrote: > From the other POV, a node can be excluded from LB pool via balancer API > before restart and brought back then. > > On Wed, Jun 14, 2023 at 6:09 PM Saksham Gupta > wrote:

Re: Search Request Strategy on Solr Cloud

2023-06-14 Thread Saksham Gupta
olling restart/recycle scenarios executed? > > On Wed, Jun 14, 2023 at 8:52 AM Saksham Gupta > wrote: > > > @Ufuk We are using a load balancer to avoid a single point of failure > i.e. > > if all the requests have a single coordinator node then it would be a > ma

Re: Search Request Strategy on Solr Cloud

2023-06-13 Thread Saksham Gupta
Also, Google cloud has sophisticated Traffic Director, which can also > > suit > > > for node failover. > > > > > > On Tue, Jun 13, 2023 at 9:13 AM Saksham Gupta > > > wrote: > > > > > >> Hi team, > > >> We need help with the

Search Request Strategy on Solr Cloud

2023-06-12 Thread Saksham Gupta
Hi team, We need help with the strategy used to request data from solr cloud. *Current Searching Strategy:* We are using solr cloud 8.10 having 8 nodes with data sharded on the basis of an implicit route parameter. We send a search http request on google's network load balancer which divides reque

Re: Tuning Merge Settings for Solr Cloud

2023-05-18 Thread Saksham Gupta
Hi All, Any help regarding this problem? On Tue, May 16, 2023 at 12:05 PM Saksham Gupta wrote: > Hi team, > We use a solr cloud where more than 4 million search requests are served > and more than 50 million documents are updated daily. > We want to tune the merge configuratio

Tuning Merge Settings for Solr Cloud

2023-05-15 Thread Saksham Gupta
Hi team, We use a solr cloud where more than 4 million search requests are served and more than 50 million documents are updated daily. We want to tune the merge configuration of solr to improve searching and indexing performance. 1. Do we need to perform full indexing in order to bring the change

Upgrade to solr cloud 9 from solr cloud 8.10

2023-05-15 Thread Saksham Gupta
Hi team, We are planning to migrate our solr cloud from solr version 8.10 to solr 9. 1. Is it okay to plan a rolling upgrade from solr 8.10 to 9? 2. Is this a major update? If yes, what should be the upgrade procedure?