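The first thing I'd do is not use a HashSet when you collect your SpanTermQuerys, since its iteration order is not guaranteed. That is, the order in which you put them in is not necessarily the order in which you get them out. Since you build the SpanNearQuery with slop 0 and inOrder set to true, you may actually be searching for "automatique climatisation" rather than "climatisation automatique". Collecting the clauses in a List (an ArrayList, say) keeps them in token order.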
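
To make the ordering point concrete, here is a minimal sketch with no Lucene dependency. Plain strings stand in for the SpanTermQuery objects, and the class and method names (TermOrderSketch, collectInOrder, collectInHashSet) are made up for illustration. It only shows that a List hands elements back in insertion order, while a HashSet makes no such promise.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TermOrderSketch {

    // Collect tokens in a List: iteration order is the insertion order.
    static List<String> collectInOrder(List<String> tokens) {
        List<String> ordered = new ArrayList<String>();
        for (String t : tokens) {
            ordered.add(t);
        }
        return ordered;
    }

    // Collect tokens in a HashSet: iteration order is unspecified,
    // and repeated tokens are silently dropped.
    static Set<String> collectInHashSet(List<String> tokens) {
        return new HashSet<String>(tokens);
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("climatisation", "automatique");
        // The List always prints the tokens in the order they were added.
        System.out.println("List order:    " + collectInOrder(tokens));
        // The HashSet may or may not print them in that order.
        System.out.println("HashSet order: " + collectInHashSet(tokens));
    }
}
```

Applied to the code quoted below, the same idea would mean declaring qSet as an ArrayList<SpanTermQuery> and filling the SpanQuery[] from it, so the clauses reach SpanNearQuery in the order the analyzer produced them.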

You can easily check whether the order is the problem by printing snQ.toString() to see what query is actually constructed.

If that isn't it, post the results and we'll have another go at it.

Erick

On 4/30/07, axel.reymonet <[EMAIL PROTECTED]> wrote:

Hello,

I am having some issues with the SpanQuery functionality. As a matter of fact, I index a single French file containing for instance "climatisation automatique" (which means automatic air-conditioning) with the classical FrenchAnalyzer, and when I search in this index with SpanQuery, I have the following situation:

- I have 1 span for the "climatisation" request
- I have 1 span for the "automatique" request
- I have 0 spans for the "climatisation automatique" request

Maybe I am doing something wrong, but I cannot spot my mistake. I have given the problematic portion of code below. Does anyone have an idea?

Thanks in advance,

Axel Reymonet

My code is as follows:

- FOR INDEXING

    public void fileProcess(String filepath) {
        File index_dir = new File(this.currentProjectPath+"index");
        if (index_dir.exists()) {
            String[] files = index_dir.list();
            for (String f:files) {
                File file = new File (this.currentProjectPath+"index/"+f);
                file.delete();
            }
            index_dir.delete();
        }
        File toBeIndexed = new File(filepath);
        if (!toBeIndexed.exists() || !toBeIndexed.canRead()) {
            System.out.println("Document directory '" +toBeIndexed.getAbsolutePath()+ "' does not exist or is not readable, please check the path");
            System.exit(1);
        }
        try {
            IndexWriter writer = new IndexWriter(index_dir, new org.apache.lucene.analysis.fr.FrenchAnalyzer(),true);
            writer.addDocument(FileDocument.Document(toBeIndexed));
            writer.close();
        } catch (IOException e) {
            System.out.println(" caught a " + e.getClass() +"\n with message: " + e.getMessage());
        }
    }

-----------------------------------------

- FOR SEARCHING

    public void testSF(String searched,String indexPath) {
        try {
            IndexReader reader = IndexReader.open(indexPath);
            Analyzer analyzer = new org.apache.lucene.analysis.fr.FrenchAnalyzer();
            TokenStream requestStream = analyzer.tokenStream("contents",new StringReader(searched));
            HashSet<SpanTermQuery> qSet = new HashSet<SpanTermQuery>();
            Token currentToken = requestStream.next();
            while (currentToken!=null) {
                qSet.add(new SpanTermQuery(new Term("contents",currentToken.termText())));
                currentToken = requestStream.next();
            }
            SpanQuery[] sQ = new SpanQuery [qSet.size()];
            int k = 0;
            for (SpanTermQuery stq:qSet) {
                sQ[k]=stq;
                k++;
            }
            SpanNearQuery snQ;
            snQ = new SpanNearQuery(sQ,0,true);
            Spans spans = snQ.getSpans(reader);
            int resultsCpt = 0;
            while (spans.next())
                resultsCpt++;
            System.out.println("Number of results: "+resultsCpt);
        } catch(IOException ioe) {
            ioe.printStackTrace();
        }
    }

--------------------------------------------

- OUTPUT

    Query: climatisation
    Number of results: 1
    Query: automatique
    Number of results: 1
    Query: climatisation automatique
    Number of results: 0