Yes, have a look at this:
http://lucene.apache.org/java/2_4_1/api/core/org/apache/lucene/search/Similarity.html
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
- Original Message
> From: starz10de
> T
Hello,
Here is a class you can use for that:
./contrib/miscellaneous/src/java/org/apache/lucene/misc/HighFreqTerms.java
Otis
- Original Message
> From: star
: : The same here, even with trunk from yesterday. If you create a field, it
: : stays there forever, even after deleting *all* documents from index,
: : reindexing without the field and optimizing.
:
: Uwe: if you have a quick test case already written can you try it against
: 2.4 (and maybe 2.
How can I get the most frequent terms in the index, in descending order?
Thanks
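To make the question concrete, here is a minimal sketch of "most frequent terms in descending order" in plain Java. It does not use Lucene at all: the class name TopTermsSketch and its methods are hypothetical, and the counts come from an in-memory token list, whereas a real index would supply them through Lucene's own term enumeration (as the HighFreqTerms contrib class does).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Lucene.
final class TopTermsSketch {

    /** Counts how often each token occurs in the given list. */
    static Map<String, Integer> countTerms(List<String> tokens) {
        Map<String, Integer> freq = new HashMap<>();
        for (String token : tokens) {
            freq.merge(token, 1, Integer::sum);
        }
        return freq;
    }

    /** Returns up to n entries, highest count first; ties are ordered alphabetically. */
    static List<Map.Entry<String, Integer>> topTerms(Map<String, Integer> freq, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(freq.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(entries.subList(0, Math.min(n, entries.size())));
    }

    public static void main(String[] args) {
        // Assumed sample data standing in for the terms of an index.
        List<String> tokens = Arrays.asList("lucene", "index", "lucene", "term", "index", "lucene");
        for (Map.Entry<String, Integer> e : topTerms(countTerms(tokens), 2)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Sorting the whole map is fine for small vocabularies; for a large index, a bounded priority queue that keeps only the top n entries would avoid holding and sorting every term.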
--
View this message in context:
http://www.nabble.com/most-frquent-term-in-the-index-tp24651807p24651807.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---
Does Lucene use the cosine similarity measure to score the similarity
between the query and the indexed documents?
Thanks
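For reference, the textbook cosine measure the question refers to can be sketched as below. This is plain Java, not Lucene's scoring code: the class CosineSketch and the sparse term-weight maps are assumptions made for illustration, and what Lucene actually computes is described in the Similarity javadoc.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Lucene.
final class CosineSketch {

    /** Euclidean length of a sparse term-weight vector. */
    static double norm(Map<String, Double> v) {
        double sumOfSquares = 0.0;
        for (double w : v.values()) {
            sumOfSquares += w * w;
        }
        return Math.sqrt(sumOfSquares);
    }

    /** Cosine of the angle between two sparse vectors; 0 if either vector is empty or all zeros. */
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        double na = norm(a);
        double nb = norm(b);
        if (na == 0.0 || nb == 0.0) {
            return 0.0;
        }
        return dot / (na * nb);
    }

    public static void main(String[] args) {
        // Assumed toy weights for a query and a document.
        Map<String, Double> query = new HashMap<>();
        query.put("cosine", 1.0);
        query.put("similarity", 1.0);
        Map<String, Double> doc = new HashMap<>();
        doc.put("cosine", 2.0);
        doc.put("lucene", 1.0);
        System.out.println(cosine(query, doc));
    }
}
```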
--
View this message in context:
http://www.nabble.com/Cosine-similarity-tp24651759p24651759.html
-
There are a couple of things I can think of:
1) From IndexReader's javadoc (
http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/index/IndexReader.html#deleteDocument%28int%29):
"An IndexReader can be opened on a directory for which an IndexWriter is
opened already, but it cannot be used to
Walid, it is true that some of what you mentioned (from aramorph) works
in the light stemming version, and some does not.
The problem is that it's not clear to me that what aramorph is doing is
really the best.
From the paper I sent you:
The best stemmer in our experiments, light8-s was very simple and did
no
Yes, Ahmet Arslan, this works!
I've just tested it and it works nicely.
Thanks a lot.
Ahmet Arslan wrote:
>
>
> Or alternatively:
>
> String test = "HÄllo HÄllo HÄllo HÄllo HÄllo";
>
> ISOLatin1AccentFilter filter = new ISOLatin1AccentFilter(new
> KeywordTokenizer(new St
I'm trying to index all the words without accents.
I do the same when I'm querying: I remove the accents and lowercase the
search term.
Why should I pass the string through the analyzer?
What goes wrong if I don't pass it through the analyzer,
and what are the benefits of doing so?
I'm just a newbie with Lucene
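One way to see the point behind the question: a term only matches if the query text is normalized exactly the way the indexed text was. The sketch below illustrates that with plain Java and java.text.Normalizer instead of a Lucene analyzer. The class AccentFoldSketch and its fold method are hypothetical, and this folding is an assumption that may map some characters differently from ISOLatin1AccentFilter.

```java
import java.text.Normalizer;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Lucene.
final class AccentFoldSketch {

    /** Decomposes accented characters, drops the combining marks, then lowercases. */
    static String fold(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "").toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // "Index" side: terms are stored in folded form.
        Set<String> indexedTerms = new HashSet<>();
        indexedTerms.add(fold("H\u00C4llo"));

        // "Query" side: the lookup succeeds only if the same folding is applied.
        String rawQuery = "H\u00C4LLO";
        System.out.println(indexedTerms.contains(rawQuery));       // false
        System.out.println(indexedTerms.contains(fold(rawQuery))); // true
    }
}
```

Running both index-time and query-time text through one shared routine (in Lucene, the same analyzer) keeps the two sides from drifting apart when the normalization rules change.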
Or alternatively:
String test = "HÄllo HÄllo HÄllo HÄllo HÄllo";
ISOLatin1AccentFilter filter = new ISOLatin1AccentFilter(new
KeywordTokenizer(new StringReader(test)));
final Token reusableToken = new Token();
Token nextToken;
if ((nextToken = filter.next(re
On Fri, Jul 24, 2009 at 11:41 AM, luther blisset wrote:
>
> Hi folks,
> I just upgraded the Hibernate Search library of my app, so I had to
> upgrade Lucene too, moving from version 2.2 to 2.4.
> In Lucene 2.4 the ISOLatin1AccentFilter class has changed and I can't
> figure out how it works.
> I use a
Hi folks,
I just upgraded the Hibernate Search library of my app, so I had to
upgrade Lucene too, moving from version 2.2 to 2.4.
In Lucene 2.4 the ISOLatin1AccentFilter class has changed and I can't
figure out how it works.
I use a TwoWayFieldBridge to index the data and this is my set method:
publ
We have a centralized Lucene index on an NFS share. This index is
updated every 30 minutes. The LuceneServer nodes notice the update and
copy the index (about 2.5 GB) to a local tmpfs directory. In our case,
searching from tmpfs is much faster than searching from a local disk.
However eks' concerns are valid an
We used the aramorph library for some time, so we mapped out the set of
features it provides. They are as follows:
The ء and ~