Cool! Only one question: if we have
class RelevanceAndDistanceCollector extends HitCollector
{
    public ScoreDoc[] getMatches(int start, int size)
    {
        ...
    }
}
and a call of getMatches(1, 25); would not cache
as many as 1+ docs, would it? Remember this is the
whole point o
Hi
I am using lucene to index all my data, and it is working just great.
I will now add search to a web application, so the index can actually be
used, not just sit there.
I know how to do this, but I have been wondering what the best
practice is. Speed is essential for me.
1. Ca
Hi Edward,
We have indexed the MedLine data. We used the default StopAnalyzer on
the full text fields (fields that are more than just dates or ids) and
the default Keyword for the other fields. So the index stores the short
fields and only indexes the larger ones. In our
a
I'm investigating possible alternatives for indexing/searching a very
large dataset (2TB) of xml data from the pubmed database[1]. Does
anyone have any experience working with indexes of this size? Granted
the actual index size would be smaller than the source files, but I'm
just curious h
Hi,
I am playing with Lucene source code and have this somewhat stupid question,
so please bear with me ;-)
Basically, I want to implement a custom ranking algorithm. That is,
iterating through the documents that contain all the search keywords, for
each document, retrieve its inverted docum
Here's an example I put together to illustrate the point.
package distance;
import java.io.IOException;
import java.util.ArrayList;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lu
I'm trying to run a Lucene (1.4.3) index through an RMI server on a Windows
machine, but I'm getting the following error when I try to read some (but
not all) documents from the Hits object:
SEVERE: java.io.IOException: The handle is invalid
java.io.IOException: The handle is invalid
Does anyone have experience with relevance feedback and Lucene, or just
know some good websites?
thx
stefan
Hi,
I need to merge two indexes into one which is accessed by a Searcher in
Tomcat. Tomcat keeps the searcher (or reader) open for good performance.
However, on Windows you cannot delete a file when it's opened for reading,
so I cannot do the merge while Tomcat is running and the reader is open
On Monday 19 September 2005 18:24, Erik Hatcher wrote:
> So what's the deal with this? It looks like something is wrong with
> your environment if it cannot resolve java.io.Reader.
There once was a problem that the import statement for this was missing in
the .jj file and thus it's missing in
This is interesting, one I had not considered.
Mark - are there any code samples that implement this approach? Or maybe
something similar in approach?
thanks,
jeff
On 9/19/05, mark harwood <[EMAIL PROTECTED]> wrote:
>
> I think the HitCollector approach was fine but needed
> a couple of changes
On Sep 19, 2005, at 11:03 AM, tirupathi reddy wrote:
C:\LUCENE-CURRENT\SOURCE\lucene-1.4.3>ant -Djavacc.home=c:/javacc javacc
Buildfile: build.xml
init:
javacc-check:
javacc-StandardAnalyzer:
invoke-javacc:
[java] Java Compiler Compiler Version 3.2 (Parser Generator)
[java] (type "ja
I think this is probably the closest to what I would like to do and am able
to do now. If I ever get to do this, I'll share the idea/code and seek review
and suggestions.
Thank you very much, Mark, and all others that have helped!
-James
mark harwood <[EMAIL PROTECTED]> wrote:
I think the HitCollector appro
Hello Erik,
The output from ant command is :
C:\LUCENE-CURRENT\SOURCE\lucene-1.4.3>ant
Buildfile: build.xml
init:
[mkdir] Created dir: C:\LUCENE-CURRENT\SOURCE\lucene-1.4.3\build
[mkdir] Created dir: C:\LUCENE-CURRENT\SOURCE\lucene-1.4.3\dist
compile-core:
[mkdir] Created dir: C:
I believe there are several ways of doing it. You can use the
MoreLikeThis contribution at
http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/similarity
or you can roll your own using the TermVector implementation.
Basically, do your first search, get the term vector from the document
you ar
Hi
I was wondering how you would search for documents similar to a
specified document using Lucene?
The context would be that I categorise document A manually, and then
search for documents with similar terms. Hopefully the documents
returned would be in the same category/theme as document A.
The
I think the HitCollector approach was fine but needed
a couple of changes:
1) use a PriorityQueue subclass in place of the
SortedSet to keep only the top n scoring docs
2) multiply lucene score by a distance measurement
based on the current doc's location (doc location
being read from a cached arra
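
The two changes listed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the thread: it uses java.util.PriorityQueue rather than a subclass of Lucene's own PriorityQueue, the class name TopNDistanceCollector is made up, and the distanceFactor array stands in for the cached per-document location data (how that cache is built is not shown). The collect(int doc, float score) method mirrors the shape of the HitCollector callback.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Keeps only the n best hits, ranked by (text score * distance factor). */
class TopNDistanceCollector {
    /** One retained hit: a document id and its combined score. */
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    private final int n;
    // Hypothetical cache: one precomputed distance multiplier per document id.
    private final float[] distanceFactor;
    // Min-heap: the weakest retained hit sits at the head, so it is cheap to evict.
    private final PriorityQueue<Hit> queue;

    TopNDistanceCollector(int n, float[] distanceFactor) {
        if (n < 1) throw new IllegalArgumentException("n must be positive");
        this.n = n;
        this.distanceFactor = distanceFactor;
        this.queue = new PriorityQueue<>(n, Comparator.comparingDouble((Hit h) -> h.score));
    }

    /** Called once per matching document, like a hit-collector callback. */
    void collect(int doc, float score) {
        float combined = score * distanceFactor[doc];
        if (queue.size() < n) {
            queue.add(new Hit(doc, combined));
        } else if (combined > queue.peek().score) {
            queue.poll();                      // drop the current weakest hit
            queue.add(new Hit(doc, combined)); // keep the better one
        }
    }

    /** Returns the retained hits, best first. */
    List<Hit> topHits() {
        List<Hit> hits = new ArrayList<>(queue);
        hits.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return hits;
    }
}
```

Because the queue never grows beyond n entries, memory stays bounded no matter how many documents match, which is the point of replacing the SortedSet.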
On Sep 19, 2005, at 4:41 AM, tirupathi reddy wrote:
Hello,
I am using Lucene for searching in my application.
My application needs prefix wildcard search also.
But Lucene doesn't support this. So I changed the QueryParser.jj
file
FROM:
|
(<_TERM_CHAR> | ( [ "*", "?"
>>does it deal w/ aggregate functions and group by
>> clauses?
Yes, it is basically *all* the normal SQL
functionality but with the added option to mix in
scores from lucene queries to the criteria.
From the example code:
select top 10 count(*) as numAds,pricePounds from ads
where pricePounds
Hello,
I am using Lucene for searching in my application.
My application needs prefix wildcard search also.
But Lucene doesn't support this. So I changed the QueryParser.jj file
FROM:
|
(<_TERM_CHAR> | ( [ "*", "?" ] ))* >
To:
| | ( [ "*", "?" ] ))* >
And then I build
On Sep 18, 2005, at 3:39 PM, James Huang wrote:
> So the question is, is there a way to override score
> calculation at runtime? In the lucene/search package,
> I see interfaces like Scorer, Weight and methods like
> Query.createWeight(). This looks promising.
You indeed need to override the fol