Hi Seid,
Do you have a reference for the article? I've done some QA in my day,
but don't recall reading that one.
At any rate, I do think it is possible to do what you are after. See
below.
On Mar 5, 2009, at 9:49 AM, Seid Mohammed wrote:
For my work, I have read an article stating that " Answer type can be
automatically constructed by Indexing Different Questions and Answer
types. Later, when an unseen question apears, answer type for this
question will be found with the help of 'similarity function'
computation"
so I am clear with the arguement above. my problem is,
1. how can I index individual questions and Answer types as is ( not
tokenized
I'm not sure you want this, but when constructing your Field, just use
the NOT_ANALYZED option.
2. how can I calculate the similarity between indexed questions and
and unseen questions (question of any type that can be asked latter)
In line with #1, I think you might be better off to actually tokenize
the question as one one field, and the answer type as a second field.
Then, you can let Lucene calculate similarity via it's normal query
mechanisms. In this case, I would like try experimenting with things
like: exact match, phrase queries with slop, etc. That way, not only
can you match "Who is the president of UN" but you might also match on
things that are a bit fuzzier. To do this, you might need to have
several fields per document with variations. I could also see using
Lucene's payload mechanism as well.
But, as Vasu said, you will likely need other parts too, like OpenNLP.
HTH,
Grant
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org