Re: similarity function

Grant Ingersoll Thu, 05 Mar 2009 12:05:49 -0800

Hi Seid,

Do you have a reference for the article? I've done some QA in my day, but don't recall reading that one.

At any rate, I do think it is possible to do what you are after. See below.


On Mar 5, 2009, at 9:49 AM, Seid Mohammed wrote:

For my work, I have read an article stating that "Answer type can be
automatically constructed by Indexing Different Questions and Answer
types. Later, when an unseen question appears, answer type for this
question will be found with the help of 'similarity function'
computation"

So I am clear on the argument above. My problem is:

1. How can I index individual questions and answer types as is (not tokenized)?

I'm not sure you want this, but when constructing your Field, just use the NOT_ANALYZED option.


2. How can I calculate the similarity between indexed questions
and unseen questions (questions of any type that can be asked later)?

In line with #1, I think you might be better off actually tokenizing the question as one field, and the answer type as a second field. Then, you can let Lucene calculate similarity via its normal query mechanisms. In this case, I would likely try experimenting with things like: exact match, phrase queries with slop, etc. That way, not only can you match "Who is the president of UN" but you might also match on things that are a bit fuzzier. To do this, you might need to have several fields per document with variations. I could also see using Lucene's payload mechanism as well.
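
To make the two-field idea concrete, here is a rough sketch in plain Java. It is not Lucene code and does not reproduce Lucene's scoring: it stands in a simple cosine similarity over term counts for the query mechanism, just to show how a tokenized question field can carry an unseen question to a stored answer type. The class name AnswerTypeSketch, its methods, and the labels PERSON, LOCATION and DATE are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy stand-in for an index of (question, answer type) pairs. Hypothetical; not Lucene. */
final class AnswerTypeSketch {
    // Parallel lists: one term-count vector and one answer type per stored question.
    private final List<Map<String, Integer>> questions = new ArrayList<>();
    private final List<String> answerTypes = new ArrayList<>();

    /** Tokenizes by lowercasing and splitting on non-alphanumerics, then counts terms. */
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    /** Cosine similarity between two term-count vectors; 0.0 if either is empty. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            int count = e.getValue();
            normA += count * count;
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += count * other;
            }
        }
        for (int count : b.values()) {
            normB += count * count;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / Math.sqrt(normA * normB);
    }

    /** Stores a question (tokenized) together with its answer type (kept as is). */
    void add(String question, String answerType) {
        questions.add(termFrequencies(question));
        answerTypes.add(answerType);
    }

    /** Returns the answer type of the most similar stored question, or null if nothing overlaps. */
    String bestAnswerType(String unseenQuestion) {
        Map<String, Integer> query = termFrequencies(unseenQuestion);
        String best = null;
        double bestScore = 0.0;
        for (int i = 0; i < questions.size(); i++) {
            double score = cosine(query, questions.get(i));
            if (score > bestScore) {
                bestScore = score;
                best = answerTypes.get(i);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        AnswerTypeSketch index = new AnswerTypeSketch();
        index.add("Who is the president of UN", "PERSON");
        index.add("Where is the headquarters of UN", "LOCATION");
        index.add("When was the UN founded", "DATE");
        System.out.println(index.bestAnswerType("Who is the president of France"));
    }
}
```

With an untokenized question field, only a character-for-character identical question would match; tokenizing is what lets "Who is the president of France" land near "Who is the president of UN". In Lucene itself you would get the ranking from a query against the question field rather than from a hand-rolled cosine like this one.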


But, as Vasu said, you will likely need other parts too, like OpenNLP.

HTH,
Grant

