Re: Terms not being found in query
On Feb 4, 2006, at 1:09 AM, kate wrote:
> i have an index with documents containing n-grams, in fields such as
> "3gram", "4gram", etc. one 5-gram found in the text is "oswax". using
> Luke, i can see that a field with this value exists for a particular
> document. however, searching for "5gram:oswax" produces no results
> (either using a query constructed by the query parser, or manually).
> the n-gram fields are indexed and stored, but not tokenised. i have
> tried setting maxFieldLength to Integer.MAX_VALUE with no change.
>
> why do i receive no results?

It looks like you've got all the troubleshooting bases covered, so I'm not sure what to suggest other than for you to post a simple test case that demonstrates the issue.

If you see the term in Luke, and it is indexed, then it most definitely can be used to find the document using a TermQuery (I hope that is what you meant as "manually"). If you're using QueryParser "manually", then perhaps your analyzer is causing an issue? What is the .toString of your Query?

Setting maxFieldLength isn't the issue, otherwise you wouldn't have seen the term in Luke.

Erik

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
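Erik's distinction between an exact term lookup and a parsed, analyzed query can be illustrated with a small plain-Java sketch. This is not Lucene code: `ExactTermIndex`, `analyze`, and `analyzedLookup` are hypothetical names, and the "analyzer" here (lowercase, split on non-alphanumerics) is only an assumption about what an analyzer might do. Note that "oswax" passes through this toy analyzer unchanged, so the sketch does not explain kate's case by itself; it only shows why inspecting the parsed query is a useful next step.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy stand-in for an index over untokenised fields. Not Lucene code.
class ExactTermIndex {
    // "field:value" -> ids of the documents holding that exact value
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Untokenised field: the whole value is stored verbatim as one term.
    void add(int docId, String field, String value) {
        postings.computeIfAbsent(field + ":" + value, k -> new HashSet<>()).add(docId);
    }

    // Exact lookup, similar in spirit to querying for a single term.
    Set<Integer> lookup(String field, String term) {
        return new HashSet<>(postings.getOrDefault(field + ":" + term, new HashSet<>()));
    }

    // Hypothetical analyzer: lowercases, then splits on non-alphanumerics.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Parser-style lookup: analyze the text first, then require every
    // resulting token to match as an exact term.
    Set<Integer> analyzedLookup(String field, String text) {
        Set<Integer> result = null;
        for (String token : analyze(text)) {
            Set<Integer> hits = lookup(field, token);
            if (result == null) {
                result = hits;
            } else {
                result.retainAll(hits);
            }
        }
        return result == null ? new HashSet<>() : result;
    }
}
```

With a 5-gram such as "s wax" stored verbatim, the exact lookup finds the document, while the analyzed lookup splits the text into "s" and "wax" and finds nothing. Whether anything comparable happens in kate's setup depends on the analyzer actually in use.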
Two problems of lucene.
Hi,

I have two questions about Lucene.

1. How does Lucene calculate each term's weight in the query? Is it a simple boolean value?
2. Can I change the similarity measure in Lucene? For instance, I would like to use only the term frequency, instead of the tf/idf value, to weight each term in the document.

--
Regards
Jiang Xing
Re: Frequency Matrix
Hi Chris,

Thanks a zillion for providing this quick solution. It worked! It would not have been possible without your help in such a short time. Did you learn Lucene through dedicated effort, or is there some technique to it?

Thanks,
Varun

On 2/3/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
> take a look at the TermEnum and TermDocs classes. they should give you all
> the info you need, using pseudo code something like this...
>
>    foreach Term in TermEnum
>       foreach doc in TermDocs
>          record Term, TermDocs.doc, TermDocs.freq
>
> : Date: Fri, 3 Feb 2006 13:31:49 -0500
> : From: varun sood <[EMAIL PROTECTED]>
> : Reply-To: java-user@lucene.apache.org
> : To: java-user@lucene.apache.org
> : Subject: Frequency Matrix
> :
> : Hi,
> : I am implementing Lucene to index my website. I would like to know if it's
> : possible to generate a simple frequency matrix?
> :
> : By frequency matrix I mean, document name on top X-Axis and keywords on left
> : Y-Axis. and the cells of the matrix will contain the frequency of the
> : keyword in a particular document.
> :
> : I know it's very much possible, but it's just time which is limited to dig
> : more in Lucene.
> :
> : Thanks in advance.
> : Varun
>
> -Hoss
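The shape of the matrix Varun describes, and of the term-major walk in Hoss's pseudo code, can be sketched in plain Java. This is only an illustration under stated assumptions: it does not call Lucene at all, the documents are assumed to be already split into terms, and `FrequencyMatrix`, `build`, and `records` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory term/document frequency matrix. Not Lucene code.
class FrequencyMatrix {
    // Builds term -> (document name -> frequency) from pre-tokenised documents.
    static Map<String, Map<String, Integer>> build(Map<String, List<String>> docs) {
        Map<String, Map<String, Integer>> matrix = new TreeMap<>();
        for (Map.Entry<String, List<String>> doc : docs.entrySet()) {
            for (String term : doc.getValue()) {
                matrix.computeIfAbsent(term, k -> new TreeMap<>())
                      .merge(doc.getKey(), 1, Integer::sum);
            }
        }
        return matrix;
    }

    // Walks the matrix term by term, then document by document, and
    // records "term,doc,freq" for every non-zero cell.
    static List<String> records(Map<String, Map<String, Integer>> matrix) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Map<String, Integer>> row : matrix.entrySet()) {
            for (Map.Entry<String, Integer> cell : row.getValue().entrySet()) {
                out.add(row.getKey() + "," + cell.getKey() + "," + cell.getValue());
            }
        }
        return out;
    }
}
```

Only non-zero cells are stored, so a keyword that never occurs in a document simply has no entry for it; a caller printing the full grid would treat a missing entry as 0.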
two problems of using the lucene.
Hi,

I have two questions about using Lucene and would appreciate your help.

1. How does Lucene calculate the weight of each word? All I know is that each word in a document is weighted by its tf/idf value.
2. Can I modify Lucene so that it uses the term frequency, instead of the tf/idf value, to calculate the similarity between documents and queries?

--
Regards
Jiang Xing
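To make the difference between the two weightings in the question concrete, here is a minimal plain-Java sketch. It uses the textbook definitions (raw term count for tf, ln(N/df) for idf), which is an assumption for illustration and not necessarily the formula Lucene applies; `TermWeights` and its methods are hypothetical names.

```java
import java.util.List;

// Textbook tf and tf-idf scoring over pre-tokenised documents. Not Lucene code.
class TermWeights {
    // Term frequency: how often the term occurs in one document.
    static int tf(List<String> doc, String term) {
        int count = 0;
        for (String t : doc) {
            if (t.equals(term)) {
                count++;
            }
        }
        return count;
    }

    // Inverse document frequency: ln(N / df), or 0 if no document has the term.
    static double idf(List<List<String>> corpus, String term) {
        int df = 0;
        for (List<String> doc : corpus) {
            if (doc.contains(term)) {
                df++;
            }
        }
        return df == 0 ? 0.0 : Math.log((double) corpus.size() / df);
    }

    // Score using term frequency only.
    static double scoreTf(List<String> doc, List<String> query) {
        double score = 0.0;
        for (String term : query) {
            score += tf(doc, term);
        }
        return score;
    }

    // Score using tf * idf for each query term.
    static double scoreTfIdf(List<List<String>> corpus, List<String> doc, List<String> query) {
        double score = 0.0;
        for (String term : query) {
            score += tf(doc, term) * idf(corpus, term);
        }
        return score;
    }
}
```

Under the tf-only score every occurrence of a query term counts equally, whereas under tf-idf a term that appears in every document contributes nothing, since ln(N/N) = 0.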
Field search problem(only single word query works)
Hi,

I have two libraries, A and B, indexed from database tables. A has about 10 fields and B has about 30 fields (with a couple of hundred records). Both A and B have a TEXT type field "headline" that reads data from the same database table column.

However, when the headline field value is "fire and water", the field query "headline: fire water" works for library A but NOT for library B (it returns 0 results without any error). The query "headline:fire headline:water" does work for library B.

Is there any possible reason why library B only accepts a single-word fielded query? I am running Lucene 1.4.3 on Java 5/JBoss 4.0.3 in an XP/Linux environment.

Thanks.
-Xin
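One thing worth checking here is how the field prefix is scoped. In Lucene's query syntax a field name applies only to the term directly following it, and any later bare terms go to the default field. The toy parser below mimics just that rule in plain Java. It is not Lucene's QueryParser: `FieldScoping` and `parse` are hypothetical names, "contents" is an assumed default field, and treating "headline: fire" (with a space) the same as "headline:fire" is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of field-prefix scoping. Not Lucene's QueryParser.
class FieldScoping {
    // Splits the query on whitespace and returns one "field:term" clause per term.
    static List<String> parse(String query, String defaultField) {
        List<String> clauses = new ArrayList<>();
        String pendingField = null;
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            int colon = token.indexOf(':');
            if (colon > 0 && colon == token.length() - 1) {
                // Bare "field:" token: applies to the next term only.
                pendingField = token.substring(0, colon);
            } else if (colon > 0) {
                // Already a complete "field:term" clause.
                clauses.add(token);
                pendingField = null;
            } else {
                // Bare term: use the pending field if any, else the default field.
                String field = pendingField != null ? pendingField : defaultField;
                clauses.add(field + ":" + token);
                pendingField = null;
            }
        }
        return clauses;
    }
}
```

Under these assumptions, "headline: fire water" yields the clauses headline:fire and contents:water, while "headline:fire headline:water" keeps both terms on headline. The post does not say which default field each library is searched with, so this is only one possible explanation; printing the parsed query for both libraries would confirm or rule it out.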