RE: Spanquery problem

axel.reymonet Mon, 30 Apr 2007 06:47:18 -0700

Hello,

Thank you for your piece of advice. Indeed, my mistake was to use HashSet
instead of an ArrayList (for instance). I must have been really distracted
when I wrote my code, even more when I checked it! Anyway, thank you again,


Axel Reymonet

-----Message d'origine-----
De : Erick Erickson [mailto:[EMAIL PROTECTED]
Envoyé : lundi 30 avril 2007 14:52
À : java-user@lucene.apache.org
Objet : Re: Spanquery problem


The first thing I'd do is not use a HashSet when you collect your
SpanTermQuerys since the iteration order is not guaranteed. That is,
the order when putting them in is not necessarily the same as
when getting them out. So you may be searching for
"automatique climatisation" rather then "climatisation automatique".

You can easily test this by looking at snQ.toString() to see what query
is actually constructed.....

If that isn't it, post the results and we'll have another go at it....

Erick

On 4/30/07, axel.reymonet <[EMAIL PROTECTED]> wrote:
>
> Hello,
>
> I am having some issues with the SpanQuery functionality. As a matter of
> fact, I index a single french file containing for instance "climatisation
> automatique" (which means automatic air-conditioning) with the classical
> FrenchAnalyzer, and when I search in this index with SpanQuery, I have the
> following situation :
> - I have 1 span for the "climatisation" request
> - I have 1 span for the "automatique" request
> - I have 0 span for the "climatisation automatique" request
> Maybe I am doing something wrong, but I cannot spot my mistake. I have
> given
> the problematic portion of code below. Does anyone have an idea ? Thanks
> in
> advance,
>
> Axel Reymonet
>
>
> My code is as follows :
> - FOR INDEXING
> public void fileProcess(String filepath)
> {
> File index_dir = new File(this.currentProjectPath+"index");
> if (index_dir.exists())
>     {
>     String[] files = index_dir.list();
>     for (String f:files)
>         {
>         File file = new File (this.currentProjectPath+"index/"+f);
>         file.delete();
>         }
>     index_dir.delete();
>     }
> File toBeIndexed = new File(filepath);
> if (!toBeIndexed.exists() || !toBeIndexed.canRead())
>     {
>     System.out.println("Document directory '"
> +toBeIndexed.getAbsolutePath()+ "' does not exist or is not readable,
> please
> check the path");
>     System.exit(1);
>     }
> try {
> IndexWriter writer = new IndexWriter(index_dir, new
> org.apache.lucene.analysis.fr.FrenchAnalyzer(),true);
> writer.addDocument(FileDocument.Document(toBeIndexed));
> writer.close();
> } catch (IOException e) {System.out.println(" caught a " + e.getClass()
> +"\n
> with message: " + e.getMessage());}
> }
>
> -----------------------------------------
> - FOR SEARCHING
> public void testSF(String searched,String indexPath)
> {
> try{
> IndexReader reader = IndexReader.open(indexPath);
> Analyzer analyzer = new org.apache.lucene.analysis.fr.FrenchAnalyzer();
> TokenStream requestStream = analyzer.tokenStream("contents",new
> StringReader(searched));
> HashSet<SpanTermQuery> qSet = new HashSet<SpanTermQuery>();
> Token currentToken = requestStream.next();
> while (currentToken!=null)
>     {
>     qSet.add(new SpanTermQuery(new
> Term("contents",currentToken.termText())));
>     currentToken = requestStream.next();
>     }
> SpanQuery[] sQ = new SpanQuery [qSet.size()];
> int k = 0;
> for (SpanTermQuery stq:qSet)
>     {
>     sQ[k]=stq;
>     k++;
>     }
> SpanNearQuery snQ;
> snQ = new SpanNearQuery(sQ,0,true);
> Spans spans = snQ.getSpans(reader);
> int resultsCpt = 0;
> while (spans.next())
>     resultsCpt++;
> System.out.println("Number of results: "+resultsCpt);
> }
> catch(IOException ioe){ioe.printStackTrace();}
> }
> --------------------------------------------
> - OUTPUT
> Query: climatisation
> Number of results: 1
> Query: automatique
> Number of results: 1
> Query: climatisation automatique
> Number of results: 0
>


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

RE: Spanquery problem

Reply via email to