: IMHO it would be nice if Lucene's Similarity formula took the
: indexed-date of the document into account. Ideally as an optional
: setting, where the user can provide a date field as well.
It really wouldn't make sense to incorporate this into the Similarity
class.
: Some of the other searc
I'm trying to index information related to OLAP cubes.
I'm trying to model each cube as a document.
Each cube has the following information:
ID - Unique identifier for the cube
Name - Name of the cube
Description - Description of the cube
(There can be many dimensions per cube)
Dimensi
Thanks for all this. We're doing warmup searching also, but just for
some common date searches. The warmup would be a good place to add some
pre-caching capability. I'll plan for this eventually and start with the
partial cache for now.
Thanks,
Brian Beard
-Original Message-
From: Antony
Thanks
I guess I should have looked in the code before asking those silly questions
:-)
I wonder why there isn't a specific API for that though ...
On Jan 11, 2008 7:36 PM, Steven A Rowe <[EMAIL PROTECTED]> wrote:
> Hi Shai,
>
> On 01/11/2008 at 7:42 AM, Shai Erera wrote:
> > Will IndexReader.max
I've often stored a special sort field that's lower-cased.
On Jan 11, 2008 11:40 AM, Alex Wang <[EMAIL PROTECTED]> wrote:
> Hi All,
>
>
>
> I was searching my index with sorting on a field called "Label" which is
> not tokenized, here is what came back:
>
>
>
> Extended Sites Catalog Asset Store
Hi Shai,
On 01/11/2008 at 7:42 AM, Shai Erera wrote:
> Will IndexReader.maxDoc() - IndexReader.numDocs() give the
> correct result? Or is this just a heuristic?
I think your expression gives the correct result - the abstract
IndexReader.numDocs() method is implemented in SegmentReader as:
pu
Hi Lucene Users,
good news: we are planning to release Lucene 2.3 in about ten days from
now! Lucene 2.3 will have significant performance improvements and
various other new features. (see
http://people.apache.org/~buschmi/staging_area/lucene_2_3/CHANGES.txt
for a full list of new features and API
String fields are sorted using natural (lexicographic) order. For characters
in the ASCII range this means uppercase letters sort before lowercase
letters (e.g., 'A' U+0041 sorts before 'a' U+0061). This behaviour is
documented in the JavaDocs for org.apache.lucene.search.Sort.
-tree
On
Hi All,
I was searching my index with sorting on a field called "Label" which is
not tokenized, here is what came back:
Extended Sites Catalog Asset Store
Extended Sites Catalog Asset Store SALES
Print Catalog 2
Print catalog test
Test Print Catalog
Test refresh catalog
print test 3
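
The result order above is what plain Java String ordering produces, which can be checked without Lucene. The sketch below sorts the same labels twice: once with String.compareTo (uppercase before lowercase in the ASCII range), and once by a lower-cased key, which mimics the idea of indexing a separate lower-cased sort field. The class and method names are made up for illustration; nothing here is a Lucene API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

public class LabelSortDemo {
    // The labels from the message above, deliberately shuffled.
    static final List<String> LABELS = Arrays.asList(
            "print test 3",
            "Test refresh catalog",
            "Print catalog test",
            "Extended Sites Catalog Asset Store SALES",
            "Test Print Catalog",
            "Print Catalog 2",
            "Extended Sites Catalog Asset Store");

    // Natural String order: compares UTF-16 code units, so 'A'..'Z' sort before 'a'..'z'.
    static List<String> naturalOrder(List<String> labels) {
        List<String> copy = new ArrayList<>(labels);
        copy.sort(Comparator.naturalOrder());
        return copy;
    }

    // Order by a lower-cased key, as a separate lower-cased sort field would give.
    static List<String> lowerCasedKeyOrder(List<String> labels) {
        List<String> copy = new ArrayList<>(labels);
        copy.sort(Comparator.comparing((String s) -> s.toLowerCase(Locale.ROOT)));
        return copy;
    }

    public static void main(String[] args) {
        System.out.println("Natural order:");
        naturalOrder(LABELS).forEach(s -> System.out.println("  " + s));
        System.out.println("Lower-cased key order:");
        lowerCasedKeyOrder(LABELS).forEach(s -> System.out.println("  " + s));
    }
}
```

With the lower-cased key, "print test 3" moves up next to the other "Print ..." labels instead of sorting last.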
You can utilize the CustomScoreQuery introduced in Lucene 2.2 to provide
this type of functionality. This is quite straightforward to do and works
really well. Since "recentness" is a function of the time the search was
made, we store the appropriate date in an index field and use a
CustomScoreQue
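
The preview above is cut off before it shows how the date is folded into the score, so the following is only one possible sketch of the general idea, not the poster's method. It multiplies a text relevance score by an exponential recency factor with a configurable half-life. The formula, the half-life parameter, and all names are assumptions made for illustration, and the code is plain Java rather than a CustomScoreQuery subclass.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

public class RecencyBoostDemo {
    // Hypothetical decay: 1.0 for a document dated on the search date,
    // 0.5 once it is halfLifeDays old, 0.25 at twice that age, and so on.
    static double recencyFactor(LocalDate docDate, LocalDate searchDate, double halfLifeDays) {
        // Documents dated in the future are treated as age 0.
        long ageDays = Math.max(0L, ChronoUnit.DAYS.between(docDate, searchDate));
        return Math.pow(0.5, ageDays / halfLifeDays);
    }

    // Combine the text score with the recency factor by multiplication.
    static double combinedScore(double textScore, LocalDate docDate,
                                LocalDate searchDate, double halfLifeDays) {
        return textScore * recencyFactor(docDate, searchDate, halfLifeDays);
    }

    public static void main(String[] args) {
        LocalDate searchDate = LocalDate.of(2008, 1, 11);
        LocalDate fresh = LocalDate.of(2008, 1, 10);
        LocalDate old = LocalDate.of(2007, 1, 11);
        // Same text score, different dates: the fresher document ranks higher.
        System.out.println("fresh: " + combinedScore(2.0, fresh, searchDate, 30.0));
        System.out.println("old:   " + combinedScore(2.0, old, searchDate, 30.0));
    }
}
```

Because the factor depends on the search date, it has to be computed at query time rather than baked into the index, which matches the point made in the message about "recentness".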
Have a look at the Similarity class and also the Scoring section of
the website (Documentation -> Scoring on the left-hand side). This is a
classic problem of dealing with TF/IDF and length normalization.
Lucene makes general assumptions about what is best, but does allow
you to tune as wel
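
To make the length-normalization effect concrete, here is a small plain-Java sketch modelled on two factors of the classic default similarity: tf = sqrt(freq) and lengthNorm = 1/sqrt(numTerms). It leaves out idf, boosts, query normalization, and the lossy one-byte encoding of norms, and the document sizes are invented numbers. A flattened variant that returns 1.0 for every length stands in for the kind of tuning a custom Similarity could do.

```java
public class LengthNormDemo {
    // Term-frequency factor of the classic default similarity.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Default length normalization: shorter fields get a larger factor.
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // Hypothetical tuned variant: ignore field length entirely.
    static double flatLengthNorm(int numTerms) {
        return 1.0;
    }

    // Simplified per-term weight: tf * lengthNorm (idf and boosts omitted).
    static double weight(int freq, int numTerms, boolean flat) {
        double norm = flat ? flatLengthNorm(numTerms) : lengthNorm(numTerms);
        return tf(freq) * norm;
    }

    public static void main(String[] args) {
        // Invented sizes: a 4-word slide with 1 hit, a 25000-word document with 100 hits.
        System.out.println("default slide: " + weight(1, 4, false));
        System.out.println("default doc:   " + weight(100, 25000, false));
        System.out.println("flat slide:    " + weight(1, 4, true));
        System.out.println("flat doc:      " + weight(100, 25000, true));
    }
}
```

Under the default factors the short slide outweighs the long document; with the flat norm the long document wins because only term frequency counts.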
See below
On Jan 11, 2008 9:36 AM, <[EMAIL PROTECTED]> wrote:
> Hi,
>
>
> > You could even store all of the page offsets in your
> > meta-data document
> > in a special field if you wanted, then lazy-load that field
> > rather than
> > dynamically counting.
>
> How can I lazy load a field?
>
See
Hi,
When I am searching with Lucene, the scoring formula takes into account the
total number of words in the document.
For example, an indexed one-slide PowerPoint file containing the term "JAVA"
is more relevant than a 50-page Word document on JAVA.
It is a problem for me, the Word document on Java should be most r
Hi,
> You could even store all of the page offsets in your
> meta-data document
> in a special field if you wanted, then lazy-load that field
> rather than
> dynamically counting.
How can I lazy load a field?
> You'd have to be careful that your offsets
> corresponded to the data *after* it
Hi
I didn't find a proper API on IndexWriter or IndexReader to retrieve the
total number of deleted documents.
Will IndexReader.maxDoc() - IndexReader.numDocs() give the correct result?
Or is this just a heuristic?
Thanks,
Shai
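
The arithmetic behind the question can be illustrated with a toy model. This is not Lucene code: ToySegment is a made-up class that keeps a bit set of deleted document ids, assuming that maxDoc counts every assigned document id (deleted or not) and numDocs counts only live documents. In Lucene itself the corresponding IndexReader methods are maxDoc() and numDocs().

```java
import java.util.BitSet;

public class DeletedDocsDemo {
    // Hypothetical stand-in for a segment: ids 0..maxDoc-1 plus a deletion bit set.
    static final class ToySegment {
        private final int maxDoc;
        private final BitSet deleted = new BitSet();

        ToySegment(int maxDoc) {
            this.maxDoc = maxDoc;
        }

        // Mark a document as deleted; its id stays allocated.
        void delete(int docId) {
            if (docId < 0 || docId >= maxDoc) {
                throw new IllegalArgumentException("docId out of range: " + docId);
            }
            deleted.set(docId);
        }

        // One greater than the largest document id ever assigned.
        int maxDoc() {
            return maxDoc;
        }

        // Live documents only.
        int numDocs() {
            return maxDoc - deleted.cardinality();
        }

        // The expression from the message: exact in this model, not a heuristic.
        int numDeleted() {
            return maxDoc() - numDocs();
        }
    }

    public static void main(String[] args) {
        ToySegment segment = new ToySegment(10);
        segment.delete(2);
        segment.delete(7);
        segment.delete(7); // deleting twice does not double-count
        System.out.println("maxDoc=" + segment.maxDoc()
                + " numDocs=" + segment.numDocs()
                + " deleted=" + segment.numDeleted());
    }
}
```

In this model the difference is exact until deleted ids are reclaimed; once segments are merged or the index is optimized, the deleted ids disappear and the difference drops back to zero.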
On Tue, 2008-01-01 at 15:06 -0500, Mark Miller wrote:
> Perhaps, in some esoteric case, multiple readers is the right idea
> (monster, monster, super IO system, static index?? maybe...)...but
> unless you have run into this case and have some data to show it, I
> would stick with what the commun
16 matches