The TermFrequencyVector works perfectly for normal query strings. But if I add a wild card (*) onto words to search for different forms of the word I get an ArrayIndexOutOfBoundsException because the index is -1. Why does this happen? And is there anyway to avoid it?
Thanks, James jnance wrote: > > Yes, the term frequency vector is exactly what I needed. Thanks! > > -James > > > Ajay Lakhani wrote: >> >> Hi James, >> >> Try this: >> >> Searcher searcher = new IndexSearcher(dir); >> QueryParser parser = new QueryParser("content", new >> StandardAnalyzer()); >> Query query = parser.parse(queryString); >> >> HashSet queryTerms = new HashSet(); >> query.extractTerms(queryTerms); >> >> Hits hits = searcher.search(query); >> >> IndexReader reader = IndexReader.open(dir); >> >> for (int i =0; i < hits.length() ; i ++){ >> Document d = hits.doc(i); >> Field fid = d.getField("id"); >> Field ftitle = d.getField("title"); >> System.out.println("id is " + fid.stringValue()); >> System.out.println("title is " + ftitle.stringValue()); >> >> TermFreqVector tfv = reader.getTermFreqVector(hits.id(i), >> "content"); >> String[] terms = tfv.getTerms(); >> int [] freqs = tfv.getTermFrequencies();//get the frequencies >> >> // for each term in the query >> for (Iterator iter = queryTerms.iterator(); iter.hasNext();) { >> Term term = (Term) iter.next(); >> >> // for each term in the vector >> for (int j = 0; j < terms.length; j++) { >> if (terms[j].equals(term.text())) { >> System.out.println("frequency of term ["+ term.text() +"] is >> " + >> freqs[j] ); >> } >> } >> } >> } >> >> Let me know if this helps. >> Cheers >> AJ >> >> 2008/7/10 Karl Wettin <[EMAIL PROTECTED]>: >> >>> Maybe you are looking for the document TermFreqVector? >>> >>> >>> karl >>> >>> 9 jul 2008 kl. 15.49 skrev jnance: >>> >>> >>>> Hi, >>>> >>>> I am indexing lots of text files and need to see how many times a >>>> certain >>>> word comes up in each text file. Right now I have this constructor for >>>> "search": >>>> >>>> static void search(Searcher searcher, String queryString) throws >>>> ParseException, IOException { >>>> QueryParser parser = new QueryParser("content", new >>>> StandardAnalyzer()); >>>> Query query = parser.parse(queryString); >>>> Hits hits = searcher.search(query); >>>> >>>> int hitCount = hits.length(); >>>> if (hitCount == 0) { >>>> System.out.println("0 documents contain the >>>> word >>>> \"" + queryString + >>>> ".\""); >>>> } >>>> else { >>>> System.out.println(hitCount + " documents >>>> contain >>>> the word \"" + >>>> queryString + ".\""); >>>> } >>>> } >>>> >>>> This tells me how many documents contain the word I'm looking for... >>>> but >>>> how >>>> do I get it to tell me how many times the word occurs within that >>>> document? >>>> >>>> Thanks, >>>> >>>> James >>>> -- >>>> View this message in context: >>>> http://www.nabble.com/Searching-for-instances-within-a-document-tp18362075p18362075.html >>>> Sent from the Lucene - Java Users mailing list archive at Nabble.com. >>>> >>>> >>>> --------------------------------------------------------------------- >>>> To unsubscribe, e-mail: [EMAIL PROTECTED] >>>> For additional commands, e-mail: [EMAIL PROTECTED] >>>> >>>> >>> >>> --------------------------------------------------------------------- >>> To unsubscribe, e-mail: [EMAIL PROTECTED] >>> For additional commands, e-mail: [EMAIL PROTECTED] >>> >>> >> >> > > -- View this message in context: http://www.nabble.com/Searching-for-instances-within-a-document-tp18362075p18403878.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]