Hi,

I have an assignment in my Text Analytics class: I am supposed to create an index and search it. The corpus is a PubMed-like XML file. The program can be queried for terms (programcall a few terms) and for phrases (programcall "a phrase"). When a phrase is queried, the program should report how often the phrase occurs.

The problem is that for certain queries the IndexSearcher returns documents that do not contain the query in any of their fields. I'd be delighted if someone could tell me what I am doing wrong. The source code is in my GitHub repo: https://github.com/jangingnicht/TextAnalytics2/tree/master/src/textanalytics2/
Thanks in advance,
Jan

PS: I use Lucene 3.0.2 and the OpenJDK Runtime Environment (IcedTea6 1.8.2) on a 64-bit Linux machine.