It doesn't work with two words! :-(

If I search for "jakartd apache lucene"~10, "jakarta apache lucene" is not found.

But if I search for "jakarte apache lucene"~10, "jakarta apache lucene" IS found.

WHY?!?!?!

Mirko Mancin

Software Developer


Ubiq srl
stradello Conrad Marca-Relli, 9
43122 Parma (PR)
t. +39 0521 781601
cell. +39 346 4137577
follow us on Linkedin<https://www.linkedin.com/company/ubiq-srl>

This email and any files transmitted with it are confidential and intended 
solely for the use of the individual or entity to whom they are addressed. If 
you have received this email in error please notify the system manager. This 
message contains confidential information and is intended only for the 
individual named. If you are not the named addressee you should not 
disseminate, distribute or copy this e-mail. Please notify the sender 
immediately by e-mail if you have received this e-mail by mistake and delete 
this e-mail from your system. If you are not the intended recipient you are 
notified that disclosing, copying, distributing or taking any action in 
reliance on the contents of this information is strictly prohibited.

From: Mostafa Gomaa <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, 1 April 2015 15:54
To: "[email protected]" <[email protected]>
Subject: Re: Problem with NGram

Hello Mirko,

Try using fuzzy queries. You can do that by adding a tilde at the end of the 
term you're searching for, like PRIN3ER~. It uses the edit distance algorithm 
to find similar words. You can also specify the number of edits by adding the 
number after the tilde, for example, PRIN3ER~2 will match similar words up to 
two edits. Hope this helps.
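
To make the "number of edits" concrete, here is a minimal Python sketch of plain Levenshtein edit distance. It is only an illustration of the idea, not Solr's implementation, which may differ in detail (for example in how transposed characters are counted). The function names `edit_distance` and `fuzzy_matches` are made up for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance: insertions, deletions, substitutions."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def fuzzy_matches(query: str, term: str, max_edits: int = 2) -> bool:
    """Hypothetical helper: would `query~max_edits` accept `term`?"""
    return edit_distance(query.lower(), term.lower()) <= max_edits


if __name__ == "__main__":
    for query in ("PRIN3ER", "PRIN3R"):
        print(query, "->", edit_distance(query, "PRINTER"))
```

Under this definition PRIN3ER is one edit away from PRINTER (one substitution), while PRIN3R needs two (one substitution plus one insertion), so the second spelling would only match with an allowance of two edits.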

Regards,

Mostafa Gomaa.

On Wed, Apr 1, 2015 at 2:37 PM, Mirko Mancin <[email protected]> wrote:
Hi,

    I have a problem with n-grams. I am trying to find the word "PRINTER".

I have these fields:


<field name="bestExternalDescriptionStandard" type="text_general"
       indexed="true" stored="true" multiValued="true"
       termVectors="true" termPositions="true" termOffsets="true"/>

<field name="bestExternalDescriptionGram" type="text_ngram"
       indexed="true" stored="true" multiValued="true"
       termVectors="true" termPositions="true" termOffsets="true"/>

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="Italian"/>
  </analyzer>
</fieldType>

<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.NGramTokenizerFactory" minGramSize="2" maxGramSize="4"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="Italian"/>
  </analyzer>
</fieldType>
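
As a rough illustration of what a tokenizer with minGramSize="2" and maxGramSize="4" produces, the following Python sketch lists the character n-grams of a single word and compares the grams of the indexed and the misspelled term. It is an approximation under stated assumptions: `char_ngrams` is a made-up helper, the order of the grams may differ from what Solr emits, and the Snowball filter that follows in the analyzer chain is ignored.

```python
def char_ngrams(text: str, min_size: int = 2, max_size: int = 4) -> list[str]:
    """Return all character n-grams of text with min_size <= n <= max_size."""
    text = text.lower()  # mirrors the LowerCaseFilter in the analyzer chain
    grams = []
    for size in range(min_size, max_size + 1):
        for start in range(len(text) - size + 1):
            grams.append(text[start:start + size])
    return grams


if __name__ == "__main__":
    indexed = set(char_ngrams("PRINTER"))
    queried = set(char_ngrams("PRIN3R"))
    # Grams that the misspelled query has in common with the indexed word.
    print(sorted(indexed & queried))
```

The two words share some grams but not all of them, so whether the query matches depends on how the query parser combines the query-side grams (all required, or any sufficient).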



The search correctly finds:

"BROTHER PRINTER", "SAMSUNG PRINTER", etc.

But if I search for "PRIN3R" (with an error in the string), Solr does not return 
anything!!

How can I do this? How should I set up my schema.xml to find documents with a 
certain similarity?

Thanks


Mirko Mancin

Software Developer

