Re: Improper Solr Search results

Markus Jelsma Wed, 23 Nov 2022 04:20:13 -0800

Hello,

It is unclear what you are looking for. Do you have a problem with the
highlighted excerpts, or a problem with the sorting of the top search
results?
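
The two cases are controlled by different request parameters. If the problem is ordering, one possible approach is the phrase boost of Solr's edismax query parser. The sketch below only builds the request parameters; the field name `body` and the boost factor 10 are placeholders for illustration and are not taken from the configuration quoted below.

```python
from urllib.parse import urlencode


def ranking_params(user_query: str, field: str = "body") -> dict:
    """Build edismax parameters that require all terms and boost phrase matches.

    `field` is a hypothetical field name; use the real fulltext field of the index.
    """
    return {
        "defType": "edismax",   # extended DisMax query parser
        "q": user_query,        # the unquoted keywords, e.g. ABC DEF
        "q.op": "AND",          # every term must match ("contains all of these words")
        "qf": field,            # field searched for the individual terms
        "pf": f"{field}^10",    # extra score when the terms occur as a phrase (boost is arbitrary)
        "ps": "0",              # phrase slop 0: terms must be adjacent and in order
    }


params = ranking_params("ABC DEF")
query_string = urlencode(params)
print(query_string)
```

With such a request, documents containing both words still match, but those containing the adjacent phrase should score higher. Excerpts are a separate matter and depend on the hl.* highlighting parameters.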


Also, everything below 'Here are the settings what I have used.' is not
really helpful.

Regards,
Markus

On Wed, 23 Nov 2022 at 12:35, Raj Krishna <rkris...@sandvine.com> wrote:

> Hi Team,
> Do we have any leads on this issue?
>
> Thanks
> Raj
>
> From: Raj Krishna
> Sent: Monday, November 21, 2022 2:56 PM
> To: users@solr.apache.org
> Subject: Improper Solr Search results
>
> Hi solr team,
>
> The Solr search is not showing the proper results.
>
> Here is what I am looking for:
>
> Scenario 1
> Let's say I searched for "ABC DEF" with the "Contains all of these words"
> configuration.
> Result I get:
> .......ABC........................DEF.........
> .......DEF...........ABC.............
> .......DEF......................
> .......ABC............
>
> Expected Result:
> ..........ABC DEF.......
>
> In Scenario 1, in some cases when I go to the actual page of one of the
> partial search results (let's say the 3rd one), I find the exact match on a
> different line, not in the excerpt that is displayed in the result.
>
> Scenario 2
> Let's say I searched for "ABC DEF" with the "Contains all of these words"
> configuration.
> Result I get:
> .......DEF......................
> .......ABC............
>
> Expected Result:
> ..........ABC DEF.......
>
> In Scenario 2, I don't even get the exact match.
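
The difference between the results shown and the result expected in both scenarios can be reproduced with a toy example. This is a simplified whitespace-token sketch in plain Python, not how Solr analyzes or scores text; the sample documents and function names are made up for illustration.

```python
def contains_all_words(text: str, query: str) -> bool:
    """True if every query word occurs somewhere in the text (any order, any distance)."""
    words = text.lower().split()
    return all(term in words for term in query.lower().split())


def contains_phrase(text: str, query: str) -> bool:
    """True only if the query words occur adjacent to each other and in order."""
    words = text.lower().split()
    terms = query.lower().split()
    n = len(terms)
    return any(words[i:i + n] == terms for i in range(len(words) - n + 1))


docs = [
    "xx ABC xx xx DEF xx",  # both words, far apart
    "xx DEF xx ABC xx",     # both words, reversed order
    "xx ABC DEF xx",        # the exact phrase
]

all_words = [d for d in docs if contains_all_words(d, "ABC DEF")]
phrase_only = [d for d in docs if contains_phrase(d, "ABC DEF")]

# Keep every all-words match, but list phrase matches first (stable sort).
ranked = sorted(all_words, key=lambda d: not contains_phrase(d, "ABC DEF"))
```

Here all three documents satisfy "contains all of these words", but only the last one contains the phrase, so a ranking that favors phrase matches puts it first.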
>
>
>
>
> Here are the settings what I have used.
>
>
>
> Index name Machine name: solr_index
> Enter the displayed name for the index.
> Machine-readable name
> A unique machine-readable name. Can only contain lowercase letters,
> numbers, and underscores.
> Datasources
>  Comment
> Provides Comment entities for indexing and searching.
>  Contact message
> Provides Contact message entities for indexing and searching.
>  Content
> Provides Content entities for indexing and searching.
>  Content moderation state
> Provides Content moderation state entities for indexing and searching.
>  Custom block
> Provides Custom block entities for indexing and searching.
>  Custom menu link
> Provides Custom menu link entities for indexing and searching.
>  File
> Provides File entities for indexing and searching.
>  Media
> Provides Media entities for indexing and searching.
>  Search task
> Provides Search task entities for indexing and searching.
>  Shortcut link
> Provides Shortcut link entities for indexing and searching.
>  Simplenews subscriber
> Provides Simplenews subscriber entities for indexing and searching.
>  Solr Document
> Search through external Solr content. (Only works if this index is
> attached to a Solr-based server.)
>  Solr Multisite Document
> Search through a different site's content. (Only works if this index is
> attached to a Solr-based server.)
>  Taxonomy term
> Provides Taxonomy term entities for indexing and searching.
>  URL alias
> Provides URL alias entities for indexing and searching.
>  User
> Provides User entities for indexing and searching.
>  Webform submission
> Provides Webform submission entities for indexing and searching.
>  Workflow scheduled transition
> Provides Workflow scheduled transition entities for indexing and searching.
>  Workflow transition
> Provides Workflow transition entities for indexing and searching.
> Select one or more datasources of items that will be stored in this index.
> CONFIGURE THE CONTENT DATASOURCE
> BUNDLES / LANGUAGES
> CONFIGURE THE DEFAULT TRACKER
> Default index tracker which uses a simple database table for tracking
> items.
> Indexing order
>  Index items in the same order in which they were saved
>  Index the most recent items first
> The order in which items will be indexed.
> Server
>  - No server -
>  solr index server
> Select the server this index should use. Indexes cannot be enabled without
> a connection to a valid, enabled server.
>  Enabled
> Only enabled indexes can be used for indexing and searching. This setting
> will only take effect if the selected server is also enabled.
> Description
>
> Enter a description for the index.
> INDEX OPTIONS
>  Read only
> Do not write to this index or track the status of items in this index.
>  Index items immediately
> Immediately index new or updated items instead of waiting for the next
> cron run. This might have serious performance drawbacks and is generally
> not advised for larger sites.
>  Track changes in referenced entities
> Automatically queue items for re-indexing if one of the field values
> indexed from entities they reference is changed. (For instance, when
> indexing the name of a taxonomy term in a Content index, this would lead to
> re-indexing when the term's name changes.) Enabling this setting can lead
> to performance problems on large sites when saving some types of entities
> (an often-used taxonomy term in our example). However, when the setting is
> disabled, fields from referenced entities can go stale in the search index
> and other steps should be taken to prevent this.
> Cron batch size
> Set how many items will be indexed at once when indexing items during a
> cron run. "0" means that no items will be indexed by cron for this index,
> "-1" means that cron should index all items at once.
> SOLR SPECIFIC INDEX OPTIONS
>  Finalize index before first search
> If enabled, other modules could hook in to apply "finalizations" to the
> index after updates or deletions happened to index items.
> MULTILINGUAL
>  Limit to current content language.
> Limit all search results for custom queries or search pages not managed by
> Views to current content language if no language is specified in the query.
>  Include language independent content in search results.
> This option will include content without a language assigned in the
> results of custom queries or search pages not managed by Views. For
> example, if you search for English content, but have an article with
> language of "undefined", you will see those results as well. If you
> disable this option, you will only see content that matches the language.
> HIGHLIGHTER
> If "Retrieve result data from Solr" and "Highlight retrieved data" are
> selected for the Solr backend on the server edit page, these highlighting
> settings will be used.
> maxAnalyzedChars
> Specifies the number of characters into a document that Solr should look
> for suitable snippets.
> fragmenter
> Specifies a text snippet generator for highlighted text. The standard
> fragmenter is gap, which creates fixed-sized fragments with gaps for
> multi-valued fields. Another option is regex, which tries to create
> fragments that resemble a specified regular expression. This parameter
> accepts per-field overrides.
> REGEX
> regex.slop
> When using the regex fragmenter, this parameter defines the factor by
> which the fragmenter can stray from the ideal fragment size (given by
> fragsize) to accommodate a regular expression. For instance, a slop of 0.2
> with fragsize=100 should yield fragments between 80 and 120 characters in
> length. It is usually good to provide a slightly smaller fragsize value
> when using the regex fragmenter.
> regex.pattern
> Specifies the regular expression for fragmenting. This could be used to
> extract sentences.
> regex.maxAnalyzedChars
> Instructs Solr to analyze only this many characters from a field when
> using the regex fragmenter (after which, the fragmenter produces
> fixed-sized fragments). Applying a complicated regex to a huge field is
> computationally expensive.
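
The regex.slop arithmetic quoted above (a slop of 0.2 with fragsize=100 giving fragments of 80 to 120 characters) can be written out as a small calculation. This is only an illustration of the documented range; the helper name is made up and it does not reproduce Solr's fragmenter.

```python
from typing import Tuple


def fragment_size_bounds(fragsize: int, slop: float) -> Tuple[int, int]:
    """Return the (min, max) fragment length allowed for a given fragsize and slop."""
    low = round(fragsize * (1 - slop))   # the fragmenter may undershoot by this factor
    high = round(fragsize * (1 + slop))  # ... or overshoot by the same factor
    return low, high


bounds = fragment_size_bounds(100, 0.2)
print(bounds)
```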
>  usePhraseHighlighter
> If set, Solr will highlight phrase queries (and other advanced
> position-sensitive queries) accurately. If false, the parts of the phrase
> will be highlighted everywhere instead of only when it forms the given
> phrase.
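
What usePhraseHighlighter changes can be shown with a toy highlighter. This is a plain-Python sketch of the behavior described above, assuming whitespace tokens and case-sensitive matching; it is not Solr's implementation, and the function name and markers are illustrative only.

```python
def highlight(text: str, phrase: str, phrase_aware: bool,
              pre: str = "<em>", post: str = "</em>") -> str:
    """Wrap matching words in pre/post markers.

    phrase_aware=True  : mark words only where they form the whole phrase.
    phrase_aware=False : mark every occurrence of any word of the phrase.
    """
    words = text.split()
    terms = phrase.split()
    n = len(terms)
    if phrase_aware:
        marked = [False] * len(words)
        for i in range(len(words) - n + 1):
            if words[i:i + n] == terms:
                for j in range(i, i + n):
                    marked[j] = True
    else:
        marked = [w in terms for w in words]
    return " ".join(f"{pre}{w}{post}" if m else w for w, m in zip(words, marked))


sample = "ABC x DEF y ABC DEF"
with_phrase = highlight(sample, "ABC DEF", phrase_aware=True)
without_phrase = highlight(sample, "ABC DEF", phrase_aware=False)
```

Note that this setting only matters when the query actually contains a phrase. If the keywords are sent as separate terms, every occurrence is a legitimate match for the highlighter.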
>  highlightMultiTerm
> If set, Solr will highlight wildcard queries (and other MultiTermQuery
> subclasses). If false, they won't be highlighted at all.
>  preserveMulti
> If set, multi-valued fields will return all values in the order they were
> saved in the index. If false, only values that match the highlight request
> will be returned.
>  mergeContiguous
> Instructs Solr to collapse contiguous fragments into a single fragment. A
> value of true indicates contiguous fragments will be collapsed into single
> fragment. This parameter accepts per-field overrides. The default value,
> false, is also the backward-compatible setting.
>  requireFieldMatch
> If set, highlights terms only if they appear in the specified field. If
> not set, terms are highlighted in all requested fields regardless of which
> field matched the query.
> snippets
> Specifies maximum number of highlighted snippets to generate per field. It
> is possible for any number of snippets from zero to this value to be
> generated. This parameter accepts per-field overrides.
> fragsize
> Specifies the size, in characters, of fragments to consider for
> highlighting. 0 indicates that no fragmenting should be considered and the
> whole field value should be used. This parameter accepts per-field
> overrides.
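
For reference, the highlighter settings listed above correspond to hl.* request parameters in Solr. The sketch below builds such a parameter set; the field names and the values are placeholders for illustration, not recommendations and not the values from this configuration.

```python
from typing import Dict, List
from urllib.parse import urlencode


def highlight_params(fields: List[str], snippets: int = 3, fragsize: int = 100,
                     phrase: bool = True) -> Dict[str, str]:
    """Build Solr highlighting parameters for the given (hypothetical) fields."""
    return {
        "hl": "true",                          # switch highlighting on
        "hl.fl": ",".join(fields),             # fields to create snippets from
        "hl.snippets": str(snippets),          # maximum snippets per field
        "hl.fragsize": str(fragsize),          # approximate snippet size in characters
        "hl.usePhraseHighlighter": "true" if phrase else "false",
        "hl.requireFieldMatch": "true",        # only highlight in fields that matched
        "hl.mergeContiguous": "false",         # do not merge adjacent fragments
    }


hl = highlight_params(["body", "title"])
hl_query_string = urlencode(hl)
print(hl_query_string)
```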
> MLT (MORELIKETHIS) / TERM MODIFIERS / ADVANCED
>
>
>
> Manage processors for search index Solr index
>
> Configure processors which will pre- and post-process data at index and
> search time. Find more information on the processors documentation page.
> ENABLED
>  Boost more recent dates
> Boost more recent documents and penalize older documents.
>  Content access
> Adds content access checks for nodes and comments.
>  Double Quote Workaround
> Replaces double quotes in field values and query to work around a bug in
> Solr streaming expressions.
>  Entity status
> Exclude inactive users and unpublished entities (which have a "Published"
> state) from being indexed.
>  Highlight
> Adds a highlighted excerpt to results and highlights returned fields.
>  HTML filter
> Strips HTML tags from fulltext fields and decodes HTML entities. Use this
> processor when indexing HTML data - for example, node bodies for certain
> text formats. The processor also allows to boost (or ignore) the contents
> of specific elements.
>  Ignore case
> Makes searches case-insensitive on selected fields.
> It is recommended not to use this processor with the selected server.
>  Ignore characters
> Configure types of characters which should be ignored for searches.
>  Index hierarchy
> Allows the indexing of values along with all their ancestors for
> hierarchical fields (like taxonomy term references)
>  Number field-based boosting
> Adds a boost to indexed items based on the value of a numeric field.
>  Regular expression based replacements
> Regular expression based replacements.
>  Reverse entity references
> Allows indexing of entities that link to the indexed entity.
>  Role-based access
> Adds an access check based on a user's roles. This may be sufficient for
> sites where access is primarily granted or denied based on roles and
> permissions. For grants-based access checks on "Content" or "Comment"
> entities the "Content access" processor may be a suitable alternative.
>  Solr dummy fields
> Adds dummy fields to all datasources to register pseudo field names that
> get their values via API, for example
> hook_search_api_solr_documents_alter().
>  Stemmer
> Stems search terms (for example, talking to talk). Currently, this only
> acts on English language content. It uses the Porter 2 stemmer algorithm
> (More information). For best results, use after tokenizing.
> It is recommended not to use this processor with the selected server.
>  Stopwords
> Allows you to define stopwords which will be ignored in searches. Caution:
> Only use after both 'Ignore case' and 'Tokenizer' have run.
> It is recommended not to use this processor with the selected server.
>  Tokenizer
> Splits text into individual words for searching.
> It is recommended not to use this processor with the selected server.
>  Transliteration
> Makes searches insensitive to accents and other non-ASCII characters.
> It is recommended not to use this processor with the selected server.
>  Type-specific boosting
> Adds a boost to indexed items based on their datasource and/or bundle.
> PROCESSOR ORDER
> PREPROCESS INDEX
> HTML filter
>
> PREPROCESS QUERY
> HTML filter
>
> Content access
>
> Boost more recent dates
>
> POSTPROCESS QUERY
> Show row weights
>  <
> https://docs.support.sandvine.com/admin/config/search/search-api/index/solr_index/processors
> >
> Highlight
>
> Processor settings
> *       Boost more recent dates: Enabled
> *       Highlight: Enabled
> *       HTML filter: Enabled
> Highlight returned field data
> Select whether returned fields should be highlighted.
>  Highlight partial matches
> When enabled, matches in parts of words will be highlighted as well.
>  Create excerpt
> When enabled, an excerpt will be created for searches with keywords,
> containing all occurrences of keywords in a fulltext field.
>  Create excerpt even if no search keys are available
> When enabled, an excerpt will be created even with an empty query string.
> Excerpt length
> The requested length of the excerpt, in characters
> Exclude fields from excerpt
>  Body (body)
>  Title (title)
> Exclude certain fulltext fields from being included in the excerpt.
> Highlighting prefix
> Text/HTML that will be prepended to all occurrences of search keywords in
> highlighted text
> Highlighting suffix
> Text/HTML that will be appended to all occurrences of search keywords in
> highlighted text
>
>
>
>
>
>
>
> Please triage this issue.
> Feel free to ask me for clarification or more details.
>
> Thanks
> Raj
>
