Re: tlog size issue- solr cloud 6.6

2021-03-22 Thread Ritvik Sharma
Yes, I did. With the use of CloudClollection Utility. On Tue, 23 Mar 2021 at 12:01 AM, Dominique Bejean wrote: > Hi, > > Did you try to force a hard commit in order to see the impact on tlog ? > http://localhost:8983/solr/[collection_name]/update?commit=true > > Did you read this article ? > >

Re: Distributed IDF for Solr using ExactStatsCache issue

2021-03-22 Thread Michael Gibney
Cameron, What is your cluster configuration? i.e., how many nodes, how many replicas per node, how many replicas in each collection, etc.? Do you observe consistent behavior for the same query if you always route that query via the same "entry node" (i.e., not load balanced over the cluster)? Micha

Re: Conflict between atomic update and highlighting constraints

2021-03-22 Thread NDK Reichenberg
I am looking into a similar feature and I also believe SOLR-1105 would potentially be the fix. The reason is that highlighting happens at query time, meaning that the stored field value is first run through the index analyzer and then the search goes through the query analyzer. On 2021/03/19 2

Re: tlog size issue- solr cloud 6.6

2021-03-22 Thread Dominique Bejean
Hi, Did you try to force a hard commit in order to see the impact on tlog ? http://localhost:8983/solr/[collection_name]/update?commit=true Did you read this article ? https://lucidworks.com/post/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/ Regards Dominique Le lun. 22 m

Re: REINDEXCOLLECTION unknown field

2021-03-22 Thread Alexandre Rafalovitch
There may also be a way to drop a bunch of fields on intake by crafting a custom update request processor chain in solrconfig.xml. Or by temporarily declaring them with stored=false, indexed=false in the target schema. As long as nothing actually ends up in Lucene segments, you can change schema

Re: tlog size issue- solr cloud 6.6

2021-03-22 Thread Ritvik Sharma
Hi Dominique Any suggestions? On Mon, 22 Mar 2021 at 10:23, Ritvik Sharma wrote: > HI Dominique > > softcommit=false is coming in logs, > > > INFO - 2021-03-22 07:40:58.129; [c:solrcollection s:shard2 r:core_node1 > x:solrcollection_shard2_replica1] > org.apache.solr.update.processor.LogUpdate

Re: REINDEXCOLLECTION unknown field

2021-03-22 Thread Karl Stoney
Interestingly enough the next issue we hit is the `fl=750 fields` == too big, so we switched to using POST/x-url-form-encoded for REINDEXCOLLECTION which accepts the request, but then it silently fails (no logs in solr, just doesn't work). I can only assume this is because behind the scenes the

Re: REINDEXCOLLECTION unknown field

2021-03-22 Thread Karl Stoney
So for context we have 900x fields on Collection one and have removed some 250 fields from the schema and want to reindex into collection2. We're trying to have a process where we can easily remove fields and reindex without too much coding overhead. Therefore, we were simply using the default

The Split: Introduction and Discussion about Apache Lucene and Solr as Separate Top-Level Projects

2021-03-22 Thread Lisa Biella
Hello folks, I'm really excited to announce our next London Information Retrieval Meetup (ONLINE), that will take place the 30th of March, starting at 6:15 PM (London Time). This time, we are going to have two amazing talks: Talk 1 "*Explainability for learning to Rank*" Ilaria Petreti, IR/ML E

Re: Deprecation of QueryElevationComponent's support for elevate.xml in data directories

2021-03-22 Thread David Smiley
A commit is required or else there might be dirty cache problems if there were cached queries affected by QEC. Generally, Solr should always return the same result if there hasn't been a commit. I know there are short-circuit optimizations for commits to no-op if there's no new data so I could s

Re: REINDEXCOLLECTION unknown field

2021-03-22 Thread David Hastings
>Surely this field should simply just be ignored? why would solr ignore this field if you're trying to index to it? can't you change your indexer to remove these fields as well? solr will try to do what its told, and if its told to do something bad it will simply fail, you dont want it to ignore

Re: REINDEXCOLLECTION unknown field

2021-03-22 Thread David Smiley
https://solr.apache.org/guide/8_8/collection-management.html#reindexcollection See the "fl" param ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Mon, Mar 22, 2021 at 9:01 AM Karl Stoney wrote: > Hi, > Sorry for all the questions recently… > > So

REINDEXCOLLECTION unknown field

2021-03-22 Thread Karl Stoney
Hi, Sorry for all the questions recently… So as per https://solr.apache.org/guide/8_0/reindexing.html; we’re trying to remove a load of fields. Subsequently we’ve created a new collection with the new schema and we’re attempting to reindex from old to new. There’s about 216 fields in total bein

Re: Problem with Backup - Standalone Mode

2021-03-22 Thread Jason Gerlowski
Hi Adam, Solr's backup functionality integrated into the /replication handler is relatively simple - it just iterates over index files and copies them to the requested location. The only time that replication handler backup deletes files is when the backup fails and Solr tries to clean up after it

Re: Distributed IDF for Solr using ExactStatsCache issue

2021-03-22 Thread Bernd Fehling
Hello, I have a SolrCloud with 5 shards 2 Replicas. I tried everything back and forth with LocalStatsCache, ExactStatsCache and ExactSharedStatsCache. I saw some minor advantage between LocalStatsCache and the Exact... pieces. But as a matter of fact while showing 10 search results per page, as

Re: Deprecation of QueryElevationComponent's support for elevate.xml in data directories

2021-03-22 Thread Mónica Marrero
It seems to be a cleaner way to manage this. Just a comment: what about a mechanism to just update this file in memory? I know it will also involve opening a new searcher, but at least it does not depend on commits in the data (we may need to force changes in the data when what we really want to do

Re: Solr complains about unknown field during atomic indexing

2021-03-22 Thread Andreas Hubold
Hi, You could then add the following to take care of any and all unknown fields: Or you could name individual fields like that, which I think would be a better option than the wildcard dynamic field. Just a small addition, in case you're also using nested documents: You should really pr

Re: Replication and Score Issue

2021-03-22 Thread Dominique Bejean
Hi, If your replicas are all NRT, they both index documents. Their commit and segment merge cycles are independent, so yes, seeing different MaxDoc and DeletedDoc values for each replica is normal. We can expect BM25 doesn't care about deleted docs, but I can't answer with certainty. Regards. Dominiq