Re: Terms not being found in query

2006-02-04 Thread Erik Hatcher


On Feb 4, 2006, at 1:09 AM, kate wrote:

> i have an index with documents containing n-grams, in fields such as
> "3gram", "4gram", etc.  one 5-gram found in the text is "oswax".  using
> Luke, i can see that a field with this value exists for a particular
> document.  however, searching for "5gram:oswax" produces no results (either
> using a query constructed by the query parser, or manually).  the n-gram
> fields are indexed and stored, but not tokenised.
>
> i have tried setting maxFieldLength to Integer.MAX_VALUE with no change.
>
> why do i receive no results?


It looks like you've got all the troubleshooting bases covered, so
I'm not sure what to suggest other than for you to post a simple test
case that demonstrates the issue.  If you see the term in Luke, and
it is indexed, then it most definitely can be used to find the
document using a TermQuery (I hope that is what you meant by
"manually").  If you're using QueryParser "manually", then perhaps
your analyzer is causing an issue?  What does toString() on your
Query return?
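
To make the analyzer point concrete, here is a minimal stdlib-only sketch. It is a toy model, not Lucene code: the class, its methods, the lowercasing "analyzer" and the mixed-case value "OsWax" are all invented for illustration. It only shows that an untokenized field is matched by the exact stored text, so a query term that has been rewritten by analysis can miss a document that an exact term lookup finds.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy model only: none of these names are Lucene classes.
class UntokenizedFieldSketch {
    // Term text -> ids of documents whose field holds exactly that value.
    private final Map<String, List<Integer>> postings = new HashMap<>();

    // Store the value verbatim, as an indexed-but-not-tokenized field would.
    void addUntokenized(int docId, String value) {
        List<Integer> docs = postings.get(value);
        if (docs == null) {
            docs = new ArrayList<>();
            postings.put(value, docs);
        }
        docs.add(docId);
    }

    // Exact lookup, playing the role of a hand-built term query.
    List<Integer> lookup(String term) {
        List<Integer> hits = postings.get(term);
        return hits == null ? new ArrayList<Integer>() : hits;
    }

    // Hypothetical analyzer: lowercases and splits on whitespace.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        UntokenizedFieldSketch index = new UntokenizedFieldSketch();
        index.addUntokenized(0, "oswax");
        index.addUntokenized(1, "OsWax"); // invented value, for contrast

        // Exact term lookup finds document 1.
        System.out.println("exact:    " + index.lookup("OsWax"));

        // The analyzed form of the same text is a different term, so the
        // lookup no longer reaches document 1.
        String analyzed = analyze("OsWax").get(0);
        System.out.println("analyzed: " + analyzed + " -> " + index.lookup(analyzed));
    }
}
```

Comparing the printed form of the parsed query with the term shown in Luke is the equivalent check against a real index.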


Setting maxFieldLength isn't the issue, otherwise you wouldn't have  
seen the term in Luke.


Erik


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Two problems of lucene.

2006-02-04 Thread xing jiang
Hi,

I have two questions about Lucene.

1. How does Lucene calculate each term's weight in the query? Is it a
simple boolean value?

2. Can I change the similarity measure in Lucene? For instance, I would like
to use only the term frequency, instead of the tf/idf value, to weight each
term in the document.



--
Regards

Jiang Xing


Re: Frequency Matrix

2006-02-04 Thread varun sood
Hi Chris,
  Thanks a zillion for providing me this quick solution. It worked! It
would not have been possible without your help in such a short time. Is it
your dedicated effort to learn Lucene or some technique?

Thanks,
Varun


On 2/3/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
>
>
> take a look at the TermEnum and TermDocs classes. they should give you all
> the info you need, using pseudo code something like this...
>
>   foreach Term in TermEnum
>     foreach doc in TermDocs
>       record Term, TermDocs.doc, TermDocs.freq
>
>
> : Date: Fri, 3 Feb 2006 13:31:49 -0500
> : From: varun sood <[EMAIL PROTECTED]>
> : Reply-To: java-user@lucene.apache.org
> : To: java-user@lucene.apache.org
> : Subject: Frequency Matrix
> :
> : Hi,
> :  I am implementing Lucene to index my website. I would like to know if
> : it's possible to generate a simple frequency matrix?
> :
> : By frequency matrix I mean, document name on top X-Axis and keywords on
> : left Y-Axis. and the cells of the matrix will contain the frequency of the
> : keyword in a particular document.
> :
> : I know it's very much possible, but it's just time which is limited to dig
> : more in Lucene.
> :
> : Thanks in advance.
> : Varun
> :
>
>
>
> -Hoss
>
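
For anyone following the quoted pseudo code, here is a rough stdlib-only sketch of the same nested loop. It is not Lucene code: the `Posting` class, the `build` method and the sample data are hypothetical stand-ins, with a sorted map playing the part of the term enumeration and each posting list playing the part of the per-term document enumeration. Rows are keywords and columns are documents, as in the original request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy model only: none of these names are Lucene classes.
class FrequencyMatrixSketch {
    // Hypothetical stand-in for one (document id, frequency) pair of a term.
    static final class Posting {
        final int doc;
        final int freq;

        Posting(int doc, int freq) {
            this.doc = doc;
            this.freq = freq;
        }
    }

    // Rows follow the sorted term order, columns are document ids 0..numDocs-1.
    static int[][] build(SortedMap<String, List<Posting>> postings, int numDocs) {
        int[][] matrix = new int[postings.size()][numDocs];
        int row = 0;
        for (Map.Entry<String, List<Posting>> entry : postings.entrySet()) { // foreach Term
            for (Posting p : entry.getValue()) {                             // foreach doc
                matrix[row][p.doc] = p.freq;                                 // record term, doc, freq
            }
            row++;
        }
        return matrix;
    }

    public static void main(String[] args) {
        // Invented sample data: two terms over two documents.
        SortedMap<String, List<Posting>> postings = new TreeMap<>();
        List<Posting> fire = new ArrayList<>();
        fire.add(new Posting(0, 2));
        fire.add(new Posting(1, 1));
        postings.put("fire", fire);
        List<Posting> water = new ArrayList<>();
        water.add(new Posting(1, 3));
        postings.put("water", water);

        int[][] matrix = build(postings, 2);
        int row = 0;
        for (String term : postings.keySet()) {
            StringBuilder line = new StringBuilder(term);
            for (int freq : matrix[row]) {
                line.append('\t').append(freq);
            }
            System.out.println(line);
            row++;
        }
    }
}
```

Cells for documents that do not contain a term simply stay at zero, since only the postings that exist are visited.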


two problems of using the lucene.

2006-02-04 Thread xing jiang
Hi,

I have two questions about using Lucene and may need your help.

1. For each word, how does Lucene calculate its weight? I only know that each
word in the document is weighted by its tf/idf value.

2. Can I modify Lucene so that it uses the term frequency instead of the
tf/idf value to calculate the similarity between documents and queries?
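
To illustrate the difference between the two weightings, here is a small stdlib-only sketch. It assumes the tf and idf factors of Lucene's default similarity are sqrt(freq) and 1 + ln(numDocs / (docFreq + 1)); that is my understanding of the defaults, so please check it against the Similarity javadoc for your version. The class and method names are invented, and the full scoring formula has further factors (normalization, boosts, coord) that are left out here. The tf-only variant simply holds the idf factor at 1.

```java
// Toy model only: not Lucene code, and not the full scoring formula.
class TfOnlySketch {
    // Assumed default tf factor: square root of the within-document frequency.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Assumed default idf factor: 1 + ln(numDocs / (docFreq + 1)).
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    // Term weight using both factors.
    static double tfIdfWeight(int freq, int docFreq, int numDocs) {
        return tf(freq) * idf(docFreq, numDocs);
    }

    // Term weight with the idf factor held constant at 1.
    static double tfOnlyWeight(int freq) {
        return tf(freq);
    }

    public static void main(String[] args) {
        int numDocs = 1000; // invented collection size
        int freq = 4;       // same within-document frequency for both terms
        // A rare term and a common term get different tf/idf weights...
        System.out.println("tf/idf, rare:   " + tfIdfWeight(freq, 3, numDocs));
        System.out.println("tf/idf, common: " + tfIdfWeight(freq, 800, numDocs));
        // ...but the same weight once idf is ignored.
        System.out.println("tf only:        " + tfOnlyWeight(freq));
    }
}
```

With the tf-only variant, how rare a term is across the collection no longer affects its weight; only its count within the document does.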

--
Regards

Jiang Xing


Field search problem(only single word query works)

2006-02-04 Thread Xin Herbert Wu
Hi,

I have two libraries, A and B, indexed from database tables. A has about
10 fields and B has about 30 fields (with about a couple of hundred records).
A and B both have a TEXT type field "headline" that reads data from the same
database table column.

However, the field query "headline: fire water" works for library A but NOT
for library B (it returns 0 results without any error) when the headline field
value is "fire and water". The query "headline:fire headline:water" does
work for library B.

Is there any possible reason why library B only accepts a single-word fielded
query?

I am running Lucene 1.4.3 on Java 5/JBoss4.0.3 in an XP/Linux environment.

Thanks.

-Xin