OK, thanks for the link. I will have a look and see... but if this is really
as slow as you describe, I will probably have to accept it as it is and leave
it at that.
--
View this message in context:
http://www.nabble.com/fuzzy-sentence-search-t1516604.html#a4118600
Sent from the Lucene - Java Users forum a
Is it possible to use fuzzy search for phrases or whole sentences, i.e. more
than one word at a time?
I have implemented fuzzy search. If I search for a single word it works
fine, but as soon as I search for more than one word or a sentence it does
not find anything... strange, when I set the relevance
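A minimal sketch of one possible workaround for the multi-word case. In Lucene's query syntax a tilde after a single term requests a fuzzy match, while a tilde after a quoted phrase is read as a proximity search, so one option is to give every word its own fuzzy operator. The class and method names below are hypothetical, and the sketch assumes the words contain no query-syntax special characters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene.
final class FuzzyPhraseQuery {

    /**
     * Builds a query string in which every word carries its own fuzzy
     * operator, e.g. "fuzzy search" with 0.7 becomes "fuzzy~0.7 search~0.7".
     * Assumption: the words contain no query-syntax special characters.
     */
    static String toFuzzyQueryString(String phrase, float minSimilarity) {
        List<String> parts = new ArrayList<>();
        for (String word : phrase.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                parts.add(word + "~" + minSimilarity);
            }
        }
        return String.join(" ", parts);
    }
}
```

The resulting string would then be handed to the query parser in place of the raw sentence. Whether the words are combined with OR or AND depends on the parser's default operator.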
I'm trying to build a plain-text parser for different file formats such as MS
Word, Excel, PowerPoint, Rich Text Format, plain text, HTML, PDF, etc.
I use the well-known libraries PDFBox, POI and some parts of AtLeap... and now I
should also support the OpenOffice formats and, more importantly, the msg format (
Yes, this might be a way, but in my case it would not work:
The problem is that I have to return an excerpt (snippet) and the words to
be highlighted as two separate strings. So now I use the Highlighter and
getBestFragment to extract the excerpt, then I remove the inserted HTML tags
and return the
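The tag-stripping step described above could be sketched roughly as follows. This assumes the fragment marks each hit with <B>...</B>, which is the default of the highlighter's SimpleHTMLFormatter as far as I know. The class and method names are made up for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene or its highlighter.
final class ExcerptSplitter {

    // One highlighted term wrapped in the assumed <B>...</B> marker tags.
    private static final Pattern HIGHLIGHT = Pattern.compile("<B>(.*?)</B>");

    /** Returns the fragment with the marker tags removed. */
    static String plainExcerpt(String fragment) {
        return HIGHLIGHT.matcher(fragment).replaceAll("$1");
    }

    /** Returns the distinct highlighted words in order of first appearance, separated by spaces. */
    static String highlightedWords(String fragment) {
        Set<String> words = new LinkedHashSet<>();
        Matcher m = HIGHLIGHT.matcher(fragment);
        while (m.find()) {
            words.add(m.group(1));
        }
        return String.join(" ", words);
    }
}
```

Because the words are taken from the highlighted fragment itself, they are the forms that actually occur in the text, which may differ from the word that was typed into the fuzzy query.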
HashSet terms = new HashSet();
query.rewrite(reader).extractTerms(terms);
OK, but this delivers every term, not just the list of words that the
Levenshtein algorithm found to be similar. Judging by the posts in the thread
I opened, you guys seem to be experienced programmers so
OK, thanks Erik, now it works :-)
Do you happen to know whether it is possible to get the similar words
generated by the algorithm when doing a fuzzy search?
Cheers
Simon Dietschi
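To illustrate what "similar words" means here, a rough standalone sketch: it filters a list of candidate terms by Levenshtein similarity to the query word. This is not Lucene's own implementation. The class name, the list of index terms (assumed to have been read from the index beforehand) and the normalisation by the longer word length are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not Lucene's fuzzy matching code.
final class SimilarTerms {

    /** Classic dynamic-programming Levenshtein edit distance using two rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Similarity in [0, 1]: 1 minus distance divided by the longer length (assumed normalisation). */
    static float similarity(String a, String b) {
        int longer = Math.max(a.length(), b.length());
        return longer == 0 ? 1.0f : 1.0f - (float) editDistance(a, b) / longer;
    }

    /** Keeps the candidate terms whose similarity to queryTerm reaches minSimilarity. */
    static List<String> similarTerms(String queryTerm, List<String> indexTerms, float minSimilarity) {
        List<String> result = new ArrayList<>();
        for (String term : indexTerms) {
            if (similarity(queryTerm, term) >= minSimilarity) {
                result.add(term);
            }
        }
        return result;
    }
}
```

Scanning every term like this is slow on a large index, so it is meant only to show the idea.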
--
View this message in context:
http://www.nabble.com/highlighting---fuzzy-search-t1392775.html#a3746483
OK, thanks Erik. Perhaps my code will explain it:
---
public void searchQuery(String q, float rel, String indexDir){
String excerpt = "";
Is it possible to get back a highlighted text "snippet" when using fuzzy
search? I mean, where does Lucene store the words similar to the search
query? If I know where these words are, I can use one of them to
highlight.
thx
Simon Dietschi
I want a simple hit score for every document in which the query was
found. E.g. if the query word was found 3 times in a document, this doc
should get a 100% score, the next document with 2 occurrences should get 90%,
and so on...
The normal hit score used by Lucene seems strange to me, so I only want
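A rough sketch of such a rank-based score, independent of Lucene's own scoring. It assumes the occurrence count per document is already known, and it reads "100%, 90% and so on" as a fixed step of 10 points per distinct count, which is only a guess. The class and method names are hypothetical.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper, unrelated to Lucene's Similarity classes.
final class SimpleHitScore {

    /**
     * Maps occurrence counts to percentages: the highest count gets 100,
     * the next lower distinct count 90, and so on, never below 0.
     * Documents with equal counts share a score; documents with a count
     * of 0 are left out. The 10-point step is an assumption.
     */
    static Map<String, Integer> score(Map<String, Integer> countsByDoc) {
        // Distinct positive counts, highest first.
        TreeSet<Integer> distinct = new TreeSet<>(Comparator.reverseOrder());
        for (int count : countsByDoc.values()) {
            if (count > 0) {
                distinct.add(count);
            }
        }
        Map<Integer, Integer> scoreByCount = new HashMap<>();
        int next = 100;
        for (int count : distinct) {
            scoreByCount.put(count, Math.max(next, 0));
            next -= 10;
        }
        // Keep the caller's document order in the result.
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : countsByDoc.entrySet()) {
            if (e.getValue() > 0) {
                result.put(e.getKey(), scoreByCount.get(e.getValue()));
            }
        }
        return result;
    }
}
```

With counts of 3, 2, 2, 0 and 1 for five documents, this would give 100, 90, 90, no entry, and 80.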