Thanks, all.
The field cache and the bitsets both seem like good options until the
collection grows too large, provided that the index does not need to
be updated very frequently. Then for large collections, there's
statistical sampling. Any of those options seems preferable to
retriev
: There is a "TermPositions pos = reader.termPositions();" [reader is an
: instance of IndexReader] - but I have no clue how to get the position of
: a hit in a document. What can I do with TermPositions?
:
: So, I have all hits of my query with "Hits hits =
: searcher.search(query);" - with the hel
http://lucenebook.com
http://www.amazon.com/exec/obidos/asin/1932394281
:)
- Original Message -
From: "Andreas Harth" <[EMAIL PROTECTED]>
To:
Sent: Tuesday, May 16, 2006 10:51 PM
Subject: Theoretical Lucene Performance
Hello,
I'd like to learn a bit more about the index organizati
But how can I retrieve this information during my search process???
I retrieve an object of the type Document ... but this object doesn't
have a "getPosition()" or "getTermVector()" method?!
IndexReader has the appropriate get... methods.
There is a "TermPositions pos = reader.termPosit
Hello,
I'd like to learn a bit more about the index organization of
Lucene (ideally without sifting through source code).
Are there any publications that explain the Lucene indexing
structure in detail? Or is it possible to say in a few sentences
how Lucene works and I can look up the details in
All,
I've just released Zilverline version 1.5.0.
This version adds security and upload functionality, as well as some
minor fixes and enhancements.
The source will be made available as well very soon.
Zilverline is protected by a Collaborative Source License. You can read
more on this type
On Tuesday 16 May 2006 18:42, Franz Coriand wrote:
> "private boolean storeTermVector = true;"
> "private boolean storePositionWithTermVector = true;"
Use the optional Field.TermVector parameter in the Field constructor.
> But how can I retrieve this information during my search process???
> I
On Tue, 2006-05-16 at 17:51 +0200, David Trattnig wrote:
> Is it possible to set more than one default field in the
> QueryParser's constructor? Actually I've set it to "contents" but I'd
> like to search "contents" AND "title", and matches in title should have
> a higher rating.
I've posted a pat
Hello,
I'm working on a very large implementation of a search engine based on the
Lucene API (1.4.3). We have also been investigating enterprise search companies
such as FAST and Verity but have come to the conclusion that we might as well
save ourselves 1 million dollars by doing our own implem
Daniel Naber wrote:
On Monday 15 May 2006 14:54, Franz Coriand wrote:
Is it possible not only to get the document which contains the words of
a query, but also to get the position of the query word in the text?
Yes, by using the term vectors with positions that were added in Lucene 1.9
Hi Mike, Hi Eks-Dev,
first of all: Thank you so much! Both of you helped me a lot & it works fine!
> Additionally: If I submit no area
>
> query-string: "hello"
>
> the query should be applied as if it had a matching area.
I'm not sure exactly what you mean. This simple query will only re
try:
1. query-string: "hello +area:home" to get a filtering effect
2. to minimize the effect on scoring, use boosts: "(hello)^HIGH_BOOST +(area:home)^LOW_BOOST"
3. If scoring via boosts does not work well enough for you, or is slow, use
the Filter interface from your code... search this list for Filter
- Or
Hello LuceneList,
I've got at least the following fields in my index:
AREA = "home news business"
CONTENTS = "... hello world ..."
If I submit the query
query-string: "hello area:home"
Lucene should only search those documents which have the matching area.
Actually Lucene searches the area, but
David Trattnig wrote:
Hello LuceneList,
I've got at least the following fields in my index:
AREA = "home news business"
CONTENTS = "... hello world ..."
If I submit the query
query-string: "hello area:home"
Lucene should only search those documents which have the matching area.
Actually Lucene s
Sharad Agarwal wrote:
I am a newbie in the Lucene space and am trying to understand Lucene search
result caching; I am facing a weird issue.
After creating the IndexReader from a file system directory, I
rename/remove the index directory; but still I am able to search the
index and able to get the
I am a newbie in the Lucene space and am trying to understand Lucene search
result caching; I am facing a weird issue.
After creating the IndexReader from a file system directory, I
rename/remove the index directory; but still I am able to search the
index and able to get the documents from Hits. Th
Hello everybody,
I have a question. It's not related to Lucene, I
know, but I post it here because many of you have
excellent knowledge in computer science and I hope
that you can help me.
My question is how I can extract the citation graph of the ACM
digital library (or any important digital library i
Hello LuceneList,
I've got at least the following fields in my index:
AREA = "home news business"
CONTENTS = "... hello world ..."
If I submit the query
query-string: "hello area:home"
Lucene should only search those documents which have the matching area.
Actually Lucene searches the area, but it
Thanks a lot Jelda.
I'll try this and get back with the performance comparison chart.
Regards,
kapilChhabra
Ramana Jelda wrote:
Hi Kapil,
As I remember, FieldCache has been in the Lucene API since 1.4.
OK. Anyhow, here is pseudocode that can help.
//1. initialize reader on opening documentId to the category
Hi Kapil,
As I remember, FieldCache has been in the Lucene API since 1.4.
OK. Anyhow, here is pseudocode that can help.
//1. On opening the reader, initialize the documentId-to-categoryId relation as
// below. Depending on your requirement you can either getStringIndex().. I get
// the StringIndex in my project.
String
Hi Jelda,
I have not yet migrated to Lucene 1.9 and I guess FieldCache has been
introduced in this release.
Can you please give me a pointer to your strategy of FieldCache?
Thanks & Regards,
Kapil Chhabra
Ramana Jelda wrote:
But this BitSet strategy consumes more memory, mainly if you hav
On May 16, 2006, at 3:02 AM, Mathias Keilbach wrote:
I'm going to create a small application with Lucene, which analyzes
different strings. While analyzing the strings, patterns (like
emails or urls) shall be sorted out and saved in a separate index field.
I'm not sure if I can handle this with
On May 16, 2006, at 1:37 AM, Kapil Chhabra wrote:
I am doing the same in my application too.
Once a day, all the filters [for different categories] are
initialized. Each time a query is fired, the Query BitSet is ANDed
with the BitSet of each filter. The cardinality obtained is the
des
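
[Editorial sketch] The AND-and-count step described above can be illustrated with plain java.util.BitSet, independent of Lucene. The class and method names below are hypothetical; in a real application the bit sets would presumably come from the cached filters and from the query, which this sketch simply takes as inputs.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: counts query hits per category
// by intersecting the query's bit set with each category's bit set.
class CategoryBitSetCounter {

    // queryBits:    bit i is set if document i matches the query
    // categoryBits: for each category name, bit i is set if document i
    //               belongs to that category
    static Map<String, Integer> countPerCategory(BitSet queryBits,
                                                 Map<String, BitSet> categoryBits) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, BitSet> entry : categoryBits.entrySet()) {
            // and() modifies its receiver, so work on a copy of the query bits.
            BitSet intersection = (BitSet) queryBits.clone();
            intersection.and(entry.getValue());
            // The cardinality is the number of matching documents in this category.
            counts.put(entry.getKey(), intersection.cardinality());
        }
        return counts;
    }
}
```

Each category needs its own bit set covering every document in the index, so memory grows with the number of documents times the number of categories.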
But this BitSet strategy consumes more memory, mainly if you have
millions of documents and thousands of categories.
So I preferred the FieldCache strategy in my project.
Jelda
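
[Editorial sketch] The memory argument above can be illustrated without Lucene: instead of one bit set per category, keep a single array that maps each document id to a category ordinal and count hits in one pass. This is only a sketch of the idea, not the FieldCache API itself. The class name, the int-array layout, and the assumption of exactly one category per document are all hypothetical.

```java
// Hypothetical sketch, not Lucene code: one category ordinal per document,
// held in a single int array, similar in spirit to a cached field lookup.
class OrdinalCategoryCounter {
    private final int[] categoryOfDoc; // categoryOfDoc[docId] = category ordinal
    private final int categoryCount;   // number of distinct categories

    OrdinalCategoryCounter(int[] categoryOfDoc, int categoryCount) {
        this.categoryOfDoc = categoryOfDoc;
        this.categoryCount = categoryCount;
    }

    // Counts hits per category in one pass over the matching document ids.
    int[] count(int[] hitDocIds) {
        int[] counts = new int[categoryCount];
        for (int docId : hitDocIds) {
            counts[categoryOfDoc[docId]]++;
        }
        return counts;
    }
}
```

As a rough illustration with assumed numbers: for 1,000,000 documents and 1,000 categories, one bit per document per category comes to about 125 MB of bit sets, while one int per document is about 4 MB, regardless of how many categories exist.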
> -Original Message-
> From: Kapil Chhabra [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, May 16, 2006 7:38 A
Hi!
I'm going to create a small application with Lucene, which analyzes different
strings. While analyzing the strings, patterns (like emails or urls) shall be
sorted out and saved in a separate index field.
I'm not sure if I can handle this with a self-implemented Analyzer class. AFAIK
you can't
25 matches