Actually, it's the Porter Stemmer that is turning "ace" into "ac".

Try making a copy of text_en_splitting and deleting the PorterStemFilterFactory filter from both the query and index analyzers.
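
To see why "ace" becomes "ac", here is a minimal Python sketch of just step 5a of the published Porter algorithm: drop a final "e" when the measure m of the remaining stem is greater than 1, or when m is 1 and the stem does not end consonant-vowel-consonant. This is not Lucene's code, and the helper names are made up for illustration. For "ace" none of the earlier Porter steps apply, so this one rule is enough to reproduce the result.

```python
VOWELS = set("aeiou")


def is_consonant(word: str, i: int) -> bool:
    """Porter's definition: 'y' counts as a vowel only when it follows a consonant."""
    ch = word[i]
    if ch in VOWELS:
        return False
    if ch == "y":
        return i == 0 or not is_consonant(word, i - 1)
    return True


def measure(stem: str) -> int:
    """Count the vowel-to-consonant transitions, i.e. m in [C](VC)^m[V]."""
    m = 0
    prev_vowel = False
    for i in range(len(stem)):
        cons = is_consonant(stem, i)
        if cons and prev_vowel:
            m += 1
        prev_vowel = not cons
    return m


def ends_cvc(stem: str) -> bool:
    """Porter's *o condition: ends consonant-vowel-consonant, last letter not w, x or y."""
    if len(stem) < 3:
        return False
    n = len(stem)
    return (
        is_consonant(stem, n - 3)
        and not is_consonant(stem, n - 2)
        and is_consonant(stem, n - 1)
        and stem[-1] not in "wxy"
    )


def porter_step_5a(word: str) -> str:
    """Apply only step 5a of the Porter algorithm to a lowercase word."""
    if not word.endswith("e"):
        return word
    stem = word[:-1]
    m = measure(stem)
    if m > 1 or (m == 1 and not ends_cvc(stem)):
        return stem
    return word


print(porter_step_5a("ace"))  # the stem "ac" has m == 1 and is too short to end c-v-c
```

Under the same rule a word such as "rate" keeps its "e", because the stem "rat" ends consonant-vowel-consonant.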

-- Jack Krupansky

-----Original Message-----
From: Rohan Thakur
Sent: Wednesday, March 20, 2013 8:39 AM
To: solr-user@lucene.apache.org
Subject: Re: had query regarding the indexing and analysers

Hi Jack,

I was using text_en_splitting initially, but it changes my query as well.
For example, if I search for the term "ace", it is taken as "ac", thus
giving "split ac" a higher score.
See the debug output:

"debug":{
   "rawquerystring":"ace",
   "querystring":"ace",
   "parsedquery":"(+DisjunctionMaxQuery((title:ac^30.0)))/no_coord",
   "parsedquery_toString":"+(title:ac^30.0)",
   "explain":{
     "":"\n1.8650155 = (MATCH) weight(title:ac^30.0 in 469) [DefaultSimilarity], result of:\n  1.8650155 = fieldWeight in 469, product of:\n    1.0 = tf(freq=1.0), with freq of:\n      1.0 = termFreq=1.0\n    4.2628927 = idf(docFreq=39, maxDocs=1045)\n    0.4375 = fieldNorm(doc=469)\n",
     "":"\n1.8650155 = (MATCH) weight(title:ac^30.0 in 470) [DefaultSimilarity], result of:\n  1.8650155 = fieldWeight in 470, product of:\n    1.0 = tf(freq=1.0), with freq of:\n      1.0 = termFreq=1.0\n    4.2628927 = idf(docFreq=39, maxDocs=1045)\n    0.4375 = fieldNorm(doc=470)\n",
     "":"\n1.8650155 = (MATCH) weight(title:ac^30.0 in 471) [DefaultSimilarity], result of:\n  1.8650155 = fieldWeight in 471, product of:\n    1.0 = tf(freq=1.0), with freq of:\n      1.0 = termFreq=1.0\n    4.2628927 = idf(docFreq=39, maxDocs=1045)\n    0.4375 = fieldNorm(doc=471)\n",
     "":"\n1.8650155 = (MATCH) weight(title:ac^30.0 in 472) [DefaultSimilarity], result of:\n  1.8650155 = fieldWeight in 472, product of:\n    1.0 = tf(freq=1.0), with freq of:\n      1.0 = termFreq=1.0\n    4.2628927 = idf(docFreq=39, maxDocs=1045)\n    0.4375 = fieldNorm(doc=472)\n",
     "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 331) [DefaultSimilarity], result of:\n  1.5985848 = fieldWeight in 331, product of:\n    1.0 = tf(freq=1.0), with freq of:\n      1.0 = termFreq=1.0\n    4.2628927 = idf(docFreq=39, maxDocs=1045)\n    0.375 = fieldNorm(doc=331)\n",
     "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 332) [DefaultSimilarity], result of:\n  1.5985848 = fieldWeight in 332, product of:\n    1.0 = tf(freq=1.0), with freq of:\n      1.0 = termFreq=1.0\n    4.2628927 = idf(docFreq=39, maxDocs=1045)\n    0.375 = fieldNorm(doc=332)\n",
     "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 335) [DefaultSimilarity], result of:\n  1.5985848 = fieldWeight in 335, product of:\n    1.0 = tf(freq=1.0), with freq of:\n      1.0 = termFreq=1.0\n    4.2628927 = idf(docFreq=39, maxDocs=1045)\n    0.375 = fieldNorm(doc=335)\n",
     "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 336) [DefaultSimilarity], result of:\n  1.5985848 = fieldWeight in 336, product of:\n    1.0 = tf(freq=1.0), with freq of:\n      1.0 = termFreq=1.0\n    4.2628927 = idf(docFreq=39, maxDocs=1045)\n    0.375 = fieldNorm(doc=336)\n",
     "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 337) [DefaultSimilarity], result of:\n  1.5985848 = fieldWeight in 337, product of:\n    1.0 = tf(freq=1.0), with freq of:\n      1.0 = termFreq=1.0\n    4.2628927 = idf(docFreq=39, maxDocs=1045)\n    0.375 = fieldNorm(doc=337)\n",
     "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 393) [DefaultSimilarity], result of:\n  1.5985848 = fieldWeight in 393, product of:\n    1.0 = tf(freq=1.0), with freq of:\n      1.0 = termFreq=1.0\n    4.2628927 = idf(docFreq=39, maxDocs=1045)\n    0.375 = fieldNorm(doc=393)\n",
     "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 425) [DefaultSimilarity], result of:\n  1.5985848 = fieldWeight in 425, product of:\n    1.0 = tf(freq=1.0), with freq of:\n      1.0 = termFreq=1.0\n    4.2628927 = idf(docFreq=39, maxDocs=1045)\n    0.375 = fieldNorm(doc=425)\n",
     "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 426) [DefaultSimilarity], result of:\n  1.5985848 = fieldWeight in 426, product of:\n    1.0 = tf(freq=1.0), with freq of:\n      1.0 = termFreq=1.0\n    4.2628927 = idf(docFreq=39, maxDocs=1045)\n    0.375 = fieldNorm(doc=426)\n",
     "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 429) [DefaultSimilarity], result of:\n  1.5985848 = fieldWeight in 429, product of:\n    1.0 = tf(freq=1.0), with freq of:\n      1.0 = termFreq=1.0\n    4.2628927 = idf(docFreq=39, maxDocs=1045)\n    0.375 = fieldNorm(doc=429)\n",
     "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 430) [DefaultSimilarity], result of:\n  1.5985848 = fieldWeight in 430, product of:\n    1.0 = tf(freq=1.0), with freq of:\n      1.0 = termFreq=1.0\n    4.2628927 = idf(docFreq=39, maxDocs=1045)\n    0.375 = fieldNorm(doc=430)\n",
     "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 431) [DefaultSimilarity], result of:\n  1.5985848 = fieldWeight in 431, product of:\n    1.0 = tf(freq=1.0), with freq of:\n      1.0 = termFreq=1.0\n    4.2628927 = idf(docFreq=39, maxDocs=1045)\n    0.375 = fieldNorm(doc=431)\n",
     "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 433) [DefaultSimilarity], result of:\n  1.5985848 = fieldWeight in 433, product of:\n    1.0 = tf(freq=1.0), with freq of:\n      1.0 = termFreq=1.0\n    4.2628927 = idf(docFreq=39, maxDocs=1045)\n    0.375 = fieldNorm(doc=433)\n",
     "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 434) [DefaultSimilarity], result of:\n  1.5985848 = fieldWeight in 434, product of:\n    1.0 = tf(freq=1.0), with freq of:\n      1.0 = termFreq=1.0\n    4.2628927 = idf(docFreq=39, maxDocs=1045)\n    0.375 = fieldNorm(doc=434)\n",
     "":"\n1.5985848 = (MATCH) weight(title:ac^30.0 in 502) [DefaultSimilarity], result of:\n  1.5985848 = fieldWeight in 502, product of:\n    1.0 = tf(freq=1.0), with freq of:\n      1.0 = termFreq=1.0\n    4.2628927 = idf(docFreq=39, maxDocs=1045)\n    0.375 = fieldNorm(doc=502)\n",
     "":"\n1.332154 = (MATCH) weight(title:ac^30.0 in 411) [DefaultSimilarity], result of:\n  1.332154 = fieldWeight in 411, product of:\n    1.0 = tf(freq=1.0), with freq of:\n      1.0 = termFreq=1.0\n    4.2628927 = idf(docFreq=39, maxDocs=1045)\n    0.3125 = fieldNorm(doc=411)\n",
     "":"\n1.332154 = (MATCH) weight(title:ac^30.0 in 424) [DefaultSimilarity], result of:\n  1.332154 = fieldWeight in 424, product of:\n    1.0 = tf(freq=1.0), with freq of:\n      1.0 = termFreq=1.0\n    4.2628927 = idf(docFreq=39, maxDocs=1045)\n    0.3125 = fieldNorm(doc=424)\n"},
   "QParser":"ExtendedDismaxQParser",

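The explain entries above all follow the same arithmetic, so the ranking differences come entirely from fieldNorm. A rough Python sketch, assuming the classic DefaultSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))) and using only numbers taken from the debug output. The function names are hypothetical, and Python's double precision will differ from Lucene's floats in the last digits.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as in classic DefaultSimilarity (assumed here)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, the product shown in each explain entry."""
    tf = math.sqrt(freq)
    return tf * idf(doc_freq, max_docs) * field_norm


# The three fieldNorm values that occur in the debug output.
for norm in (0.4375, 0.375, 0.3125):
    print(norm, field_weight(1.0, 39, 1045, norm))
```

Since fieldNorm mainly reflects field length, the shorter titles containing "ac" score higher here.
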


On Tue, Mar 19, 2013 at 7:37 PM, Jack Krupansky <j...@basetechnology.com> wrote:

Yeah, one ambiguity in typography is whether a hyphen is internal to a
compound term (e.g., "CD-ROM") or a phrase separator as in your case. Some
people are careful to put spaces around the hyphen for a phrase delimiter,
but plenty of people still just drop it in directly adjacent to two words.

In your case, text_en_splitting_tight is SPECIFICALLY trying to keep
"Laptop-DUAL" together as a single term, so that "wi fi" is kept distinct
from "Wi-Fi".

Try text_en_splitting, which specifically is NOT trying to keep them
together.

The key clue here is that text_en_splitting_tight does not have
generateWordParts="1". That is the option needed so that "Laptop-DUAL"
will be indexed as "laptop dual".

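As a rough illustration of what generateWordParts changes, here is a toy Python imitation of the splitting step. It is not the real WordDelimiterFilterFactory, which has many more options and rules (number parts, case changes and so on). The function and its two flags are hypothetical stand-ins for generateWordParts and catenateWords, and lowercasing is folded in for brevity.

```python
import re
from typing import List


def split_on_delimiters(token: str, generate_word_parts: bool, catenate_words: bool) -> List[str]:
    """Toy model: split a token on non-alphanumeric characters, then lowercase."""
    parts = [p for p in re.split(r"[^A-Za-z0-9]+", token) if p]
    if not parts:
        return []
    if len(parts) == 1:
        # Nothing to split or join, so the token passes through unchanged.
        return [parts[0].lower()]
    out = []
    if generate_word_parts:
        out.extend(parts)            # emit each part as its own term
    if catenate_words:
        out.append("".join(parts))   # emit the parts glued together
    return [t.lower() for t in out]


# With word parts generated, "dual" becomes a searchable term of its own.
print(split_on_delimiters("Laptop-DUAL", generate_word_parts=True, catenate_words=False))
# With only catenation, the hyphenated token stays a single term.
print(split_on_delimiters("Laptop-DUAL", generate_word_parts=False, catenate_words=True))
```

In the second case a query for "dual" has nothing to match, which fits the behavior described in this thread.
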
-- Jack Krupansky

-----Original Message-----
From: Rohan Thakur
Sent: Tuesday, March 19, 2013 3:35 AM
To: solr-user@lucene.apache.org
Subject: Re: had query regarding the indexing and analysers


My default field is title only. I have used debug as well; it shows that
Solr divides the query into "dual" and "core" and then searches for both
separately. When calculating the scores, it ranks the documents in which
both terms appear higher. In my case, for the document containing this
title:

Wipro  7710U Laptop-DUAL CORE 1.4 Ghz-120GB HDD

Solr has found only the "core" term, not "dual". I guess that is because
"dual" is attached to the "laptop" term, since this document does not show
up even when I search for "dual" alone. That is why this document appears
low in the search results. So I am not able to search for partial terms.
To do that I have to use *dual in the query, which does find this
document, but putting * in the query terms affects the scoring of other
searches. I think I have to remove the "-" characters from the strings
before indexing them. Point out if I am wrong anywhere.

thanks
regards
Rohan


On Sat, Mar 16, 2013 at 7:02 PM, Erick Erickson <erickerick...@gmail.com> wrote:

See admin/analysis, it's invaluable.

Probably the terms are being searched against your default text field,
which I'd guess is not "title".

Also, try adding &debug=all to your query and look in the debug info at
the parsed form of the query to see what's actually being searched.
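
For example, a small Python helper that builds such a request URL. The host, port and core name below are placeholders to be replaced with your own, and wt=json is added only to get JSON output; the helper name is made up for illustration.

```python
from urllib.parse import urlencode


def build_debug_url(base_url: str, query: str) -> str:
    """Return a select URL for the query with debug output switched on."""
    params = {"q": query, "debug": "all", "wt": "json"}
    return base_url + "?" + urlencode(params)


# Placeholder Solr location; adjust to your installation.
print(build_debug_url("http://localhost:8983/solr/collection1/select", "dual core"))
```

In the response, the "parsedquery" entry of the debug section shows which field and which terms are actually being searched.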

Best
Erick


On Fri, Mar 15, 2013 at 2:52 AM, Rohan Thakur <rohan.i...@gmail.com> wrote:

> Hi all,
>
> I have this string in the title field:
>
> Wipro  7710U Laptop-DUAL CORE 1.4 Ghz-120GB HDD
>
> I have indexed it using text_en_splitting_tight,
> and now I am searching for a term like q=dual core.
>
> In the relevance ranking, this title comes lower in the order because
> Solr is not matching "dual" in this string; it matches only the "core"
> term from the query, thus multiplying the score for this field by 1/2
> and decreasing the score.
>
> How can I correct this? Can anyone help?
>
> thanks
> regards
> Rohan
>



