Re: Help Needed: Distributed update Async Exception solr 8.8.2 - Update
On 8/24/2021 10:08 PM, Reej Nayagam wrote:
> Okay, got your point. But we cannot modify the Java code to stop commits for now. So my manager suggests we comment out the autoCommit in solrconfig.xml instead. We are not sure if that is correct. His point is: let us commit every time we index (that is, through Java passing the commit & optimise params) and remove the autoCommit config in solrconfig.xml that commits every 60000 milliseconds. Will that be the right approach?

Don't remove the autoCommit. Frequent hard commits are vital for good operation -- they flush data to disk and start a new transaction log. Doing it with openSearcher set to false makes it VERY fast. Solr ships with autoCommit at 15000 -- up to four times more frequently than you have it configured ... and it doesn't cause problems for users. I like to increase that to 60000 just so things are a little bit less busy, but 15000 would work too.

The commits that were causing problems for you are the ones sent by your indexing software, and those commits DO open a new searcher. Opening a new searcher is the expensive part of a commit ... so your autoCommit is not a problem.

https://lucidworks.com/post/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

(the article says SolrCloud ... but it applies just as much when Solr is NOT in Cloud mode)

Thanks,
Shawn
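For reference, the kind of autoCommit block being discussed can be sketched as follows. This is only an illustration, assuming a 60000 ms interval (four times the 15000 ms default mentioned above) and openSearcher set to false; the Python wrapper and the name `autocommit_settings` are hypothetical and exist only to show that the fragment is well-formed and what it contains. In a real solrconfig.xml the element sits inside the update handler section.

```python
import xml.etree.ElementTree as ET

# Illustrative autoCommit fragment: hard commit every 60 seconds,
# without opening a new searcher (the cheap kind of commit).
AUTOCOMMIT_XML = """
<autoCommit>
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
"""

def autocommit_settings(fragment):
    """Parse the fragment and return its settings as a dict of strings."""
    root = ET.fromstring(fragment.strip())
    return {child.tag: child.text for child in root}

print(autocommit_settings(AUTOCOMMIT_XML))
```

The point of the sketch is that the expensive part (opening a searcher) is switched off here, so this setting is independent of the explicit commits sent by the indexing client.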
ICUFoldingFilter with preserveOriginal option?
Hi,

I'm looking at using ICUFoldingFilter for a customer, to fold e.g. Genéve to Geneve and thus get better recall. However, for some common Norwegian words, the folding makes them clash with super-common words, so it becomes impossible to find exactly what you want.

I imagined that if ICUFoldingFilter had a preserveOriginal=true option, it could leave the original word in the index at the same position, and an exact match for "Genéve" would score better than the normalized one. But this filter does not support this.

Has anyone found a workaround for this, other than duplicating all content in different fields with different analysis and searching across them with different weights?

Jan
Suggester only returning one suggestion
Dear Solr user community,

I have started to use the Solr 8.9 suggester. The definition is as follows:

    {
      "searchComponent":{
        "suggest_test":{
          "name":"suggest_test",
          "class":"solr.SuggestComponent",
          "suggester":{
            "name":"combinedSuggester",
            "lookupImpl":"BlendedInfixLookupFactory",
            "indexPath":"/var/solr/sprint2/suggester",
            "dictionaryImpl":"DocumentDictionaryFactory",
            "field":"COMBINED",
            "suggestAnalyzerFieldType":"text_general",
            "buildOnStartup":"false",
            "buildOnCommit":"false"
          }
        }
      },
      "requestHandler":{
        "/suggest":{
          "name":"/suggest",
          "class":"solr.SearchHandler",
          "startup":"lazy",
          "defaults":{
            "suggest":"true",
            "suggest.count":"10",
            "suggest.dictionary":"combinedSuggester"
          },
          "components":[
            "suggest_test"
          ]
        }
      }
    }

When I tested this on a relatively small set of documents, it was working as expected. For example, when requesting "suggest?q=south" it would return "south africa" and "south korea" as suggestions.

Now, on a different core with more documents, the same query returns only "south africa", but "suggest?q=south k" still returns "south korea".

Does anybody have an idea why this is the case and how I can debug the behaviour of the suggester?

Thanks in advance.

Best regards
Theo Kien
Re: ICUFoldingFilter with preserveOriginal option?
Not with ICUFoldingFilter, but with the MappingCharFilter.

There you can supply a mapping file and skip base-letter mappings for the users' native language, because in their own language, they know the correct spelling ... most of the time ... sometimes.

This does not really help with multiple languages, and you lose the convenience of ICUFoldingFilter.

André

From: Jan Høydahl
Sent: Wednesday, 25 August 2021 15:43
To: users@solr.apache.org
Subject: ICUFoldingFilter with preserveOriginal option?

Hi,

I'm looking at using ICUFoldingFilter for a customer, to fold e.g. Genéve to Geneve and thus get better recall. However, for some common Norwegian words, the folding makes them clash with super-common words, so it becomes impossible to find exactly what you want.

I imagined that if ICUFoldingFilter had a preserveOriginal=true option, it could leave the original word in the index at the same position, and an exact match for "Genéve" would score better than the normalized one. But this filter does not support this.

Has anyone found a workaround for this, other than duplicating all content in different fields with different analysis and searching across them with different weights?

Jan
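To make the mapping-file idea concrete, here is a minimal Python sketch of what such a character mapping does. It is not Solr code: the table `FOLD_MAP` and the function `apply_char_mapping` are hypothetical, the choice of which accents to fold is only an assumption for a Norwegian-speaking audience, and only lowercase letters are covered. The native letters æ, ø and å are deliberately left out of the table, so they pass through unchanged.

```python
import unicodedata

# Hypothetical mapping table: fold accents treated here as "foreign",
# but leave the native Norwegian letters (ae-ligature, o-slash, a-ring)
# unmapped so they survive analysis as-is.
FOLD_MAP = {
    "\u00e9": "e",  # e with acute
    "\u00e8": "e",  # e with grave
    "\u00ea": "e",  # e with circumflex
    "\u00e1": "a",  # a with acute
    "\u00e0": "a",  # a with grave
    "\u00fc": "u",  # u with diaeresis
    "\u00f6": "o",  # o with diaeresis
    "\u00e4": "a",  # a with diaeresis
}

def apply_char_mapping(text, mapping=FOLD_MAP):
    """Replace mapped characters one by one; everything else is kept."""
    # Normalize to NFC first so accented letters are single code points.
    composed = unicodedata.normalize("NFC", text)
    return "".join(mapping.get(ch, ch) for ch in composed)

print(apply_char_mapping("Gen\u00e9ve"))   # accent is folded
print(apply_char_mapping("Troms\u00f8"))   # native letter is kept
```

As noted above, the trade-off is that the table has to be maintained by hand and is tied to one native language.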
Re: ICUFoldingFilter with preserveOriginal option?
Hi Jan,

ICUFoldingFilter and ASCIIFoldingFilter, I think, did not respect the keyword=true attribute when I last checked. If you use KeywordRepeatFilter and modify those TokenFilters to respect the keyword attribute, the problem seems solved.

Regards,
Markus

2021-08-25 16:32 GMT+02:00, André Widhani:
> Not with ICUFoldingFilter, but with the MappingCharFilter.
>
> There you can supply a mapping file and skip base-letter mappings for the
> users' native language, because in their own language, they know the correct
> spelling ... most of the time ... sometimes.
>
> This does not really help with multiple languages, and you lose the
> convenience of ICUFoldingFilter.
>
> André
>
> From: Jan Høydahl
> Sent: Wednesday, 25 August 2021 15:43
> To: users@solr.apache.org
> Subject: ICUFoldingFilter with preserveOriginal option?
>
> Hi,
>
> I'm looking at using ICUFoldingFilter for a customer, to fold e.g. Genéve to
> Geneve and thus get better recall.
> However, for some common Norwegian words, the folding makes them clash with
> super-common words, so it becomes impossible to find exactly what you want.
> I imagined that if ICUFoldingFilter had a preserveOriginal=true option, it
> could leave the original word in the index at the same position, and an
> exact match for "Genéve" would score better than the normalized one. But
> this filter does not support this.
>
> Has anyone found a workaround for this, other than duplicating all content
> in different fields with different analysis and searching across them with
> different weights?
>
> Jan
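The keyword-repeat approach can be simulated in a few lines of Python to show the intended token stream. This is a sketch of the idea only, not Lucene code: all function names are hypothetical, `fold` is a crude stand-in for ICU folding (Unicode decomposition with combining marks dropped), and `fold_respecting_keyword` models the modified filter described above, not the behaviour of the stock filters.

```python
import unicodedata

def fold(token):
    """Crude accent folding: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def keyword_repeat(tokens):
    """Emit every token twice at the same position, keyword copy first."""
    for position, term in enumerate(tokens):
        yield (term, position, True)    # marked as keyword: must not be folded
        yield (term, position, False)   # regular copy: may be folded

def fold_respecting_keyword(stream):
    """Fold only the tokens that are not marked as keyword."""
    for term, position, is_keyword in stream:
        yield (term if is_keyword else fold(term), position, is_keyword)

def remove_duplicates(stream):
    """Drop a token if the same term was already emitted at that position."""
    seen = set()
    for term, position, _ in stream:
        if (term, position) not in seen:
            seen.add((term, position))
            yield (term, position)

def analyze(text):
    """Whitespace-tokenize, then run the three stages in order."""
    stream = keyword_repeat(text.split())
    return list(remove_duplicates(fold_respecting_keyword(stream)))

print(analyze("Gen\u00e9ve centre"))
```

With this chain, a word containing an accent ends up in the index twice at the same position (original and folded), while a word without accents is indexed once because the duplicate is removed.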