You can write your own StoredFieldVisitor that excludes the document fields you
don't want to have (and pass it to IndexReader.document()). But keep in
mind that the underlying data structures do not support lazy loading at all,
so whenever you want to load a single document field, the whole stored document
still has to be read.
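A minimal sketch of such a visitor against the Lucene 4 API; the class name, the "wanted" set and the string handling below are illustrative assumptions rather than code from this thread, and Lucene 4 also ships DocumentStoredFieldVisitor, which can be constructed with the set of field names to load:

import java.io.IOException;
import java.util.Set;

import org.apache.lucene.index.FieldInfo;
import org.apache.lucene.index.StoredFieldVisitor;

// Loads only the stored string fields whose names are in 'wanted';
// everything else is skipped instead of being materialized into a Document.
public class SelectiveFieldVisitor extends StoredFieldVisitor {

    private final Set<String> wanted;
    private final StringBuilder collected = new StringBuilder();

    public SelectiveFieldVisitor(Set<String> wanted) {
        this.wanted = wanted;
    }

    @Override
    public Status needsField(FieldInfo fieldInfo) throws IOException {
        // YES = decode this field, NO = skip it (STOP would abort the rest of the document)
        return wanted.contains(fieldInfo.name) ? Status.YES : Status.NO;
    }

    @Override
    public void stringField(FieldInfo fieldInfo, String value) throws IOException {
        // Called only for the fields that needsField() accepted
        collected.append(fieldInfo.name).append('=').append(value).append('\n');
    }

    public String getCollected() {
        return collected.toString();
    }
}

It would then be passed to something like reader.document(docID, new SelectiveFieldVisitor(wantedFields)), which runs needsField() for each stored field of that document before deciding whether to decode its value.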
Thanks Uwe,
Is there any way I can implement this using Lucene 4?
Regards
Geet.
On Wed, Jan 1, 2014 at 3:55 PM, Uwe Schindler wrote:
> Hi,
>
> Lazy stored field loading is no longer available with Lucene 4. There is only
> an emulation layer in the misc module, which does not actually do lazy loading
Hi,
Lazy stored field loading is no longer available with Lucene 4. There is only
an emulation layer in the misc module, which does not actually do lazy loading
(it just emulates the old API for backwards compatibility).
Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http
Hi,
I was trying to search for lazy loading using the Lucene API, but I am not able
to figure out how to implement it. Please help me.
Regards
Geet
So this is just the old problem of avoiding reading large, less
frequently accessed fields when you are trying to read just the smaller,
more frequently accessed fields, e.g. titles.
You can achieve this by:
a) Modifying Lucene using something like the code I originally posted,
which stops reading
> ... change the loop in my last post to this:
>
> for(int i=0;i<...;i++) {
>     String fieldA=reader.document(i).get("fieldA");
Which brings me back full-circle, because reader.document(i) loads the entire
document with all its fields, hence the request for document lazy-loading...
>>> "to be able" != "able to be"
OK, I thought you wanted to count terms within the
title field. If you want to group counts on the whole
field value, change the loop in my last post to this:

for(int i=0;i<...;i++) {
    String fieldA=reader.document(i).get("fieldA");
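A rough sketch of the grouping loop being described here, counting by the whole stored value of "fieldA": the helper class and the loop bound (reader.maxDoc() over every document) are assumptions, and if the counts should only cover the current search results the same idea applies to the hit documents instead.

import java.util.HashMap;
import java.util.Map;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;

// Counts how many documents carry each distinct value of the stored field "fieldA".
public class FieldValueCounter {

    public static Map<String, Integer> countByFieldValue(IndexReader reader) throws Exception {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (int i = 0; i < reader.maxDoc(); i++) {
            if (reader.isDeleted(i)) {
                continue;                          // skip deleted slots
            }
            Document doc = reader.document(i);     // loads the stored fields of this doc
            String fieldA = doc.get("fieldA");     // the whole field value, not its terms
            Integer previous = counts.get(fieldA);
            counts.put(fieldA, previous == null ? Integer.valueOf(1)
                                                : Integer.valueOf(previous.intValue() + 1));
        }
        return counts;
    }
}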
-
Hey Mark, thanks for the code sample. I did look into this, but for a book's
title field, for example,
"to be able" != "able to be"
and
"java programmer" != "programmer (java)" - the tokenizer will remove the
parentheses
so in my use case at least, a field value isn't simply an array of its terms.
Your requirement was clear but I guess my suggested
solution wasn't.
Here it is in detail:
public class CountTest
{
    public static void main(String[] args) throws Exception
    {
        RAMDirectory tempDir = new RAMDirectory();
        Analyzer analyzer = new WhitespaceAnalyzer();
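Purely as a guess at how such a test might continue under the same old API (the sample above is truncated), the next steps would presumably build a small in-memory index; the field names follow the title/keyword/contents example used elsewhere in the thread, and none of this is the original code:

        // Hypothetical continuation (needs org.apache.lucene.document.* and
        // org.apache.lucene.index.* imports); not from the original post.
        IndexWriter writer = new IndexWriter(tempDir, analyzer, true); // true = create a new index
        Document doc = new Document();
        doc.add(Field.Keyword("title", "1"));                   // stored, indexed, not tokenized
        doc.add(Field.Keyword("keyword", "a"));
        doc.add(Field.Text("contents", "somelongmemoryhoggingstring")); // stored, indexed, tokenized
        writer.addDocument(doc);
        writer.close();

        IndexReader reader = IndexReader.open(tempDir);
        // ... counting logic over the stored fields goes here ...
        reader.close();
    }
}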
Ah, I apologize. My use of the word "frequency" was misleading. By that I meant
the number of hits/documents whose fields have that value. Once again:
doc a=title:1,keyword:a,contents:somelongmemoryhoggingstring
doc b=title:1,keyword:a,contents:somelongmemoryhoggingstring
doc c=title:1,keyword:b,contents:somelongmemoryhoggingstring
The new TermFreqVector code sounds like what you need
here. This gives you fast access to precomputed totals
of term frequencies for each document.
See IndexReader.getTermFreqVector().
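A rough usage sketch of that API as it existed at the time; the field name "title" and the wrapper class are assumptions, and the field must have been indexed with term vectors enabled or getTermFreqVector() returns null:

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.TermFreqVector;

// Prints the precomputed per-document term counts for one field.
public class TermVectorDump {

    public static void dump(IndexReader reader, int docNumber) throws Exception {
        TermFreqVector vector = reader.getTermFreqVector(docNumber, "title");
        if (vector == null) {
            return; // nothing stored for this document/field
        }
        String[] terms = vector.getTerms();            // distinct terms in the field
        int[] freqs = vector.getTermFrequencies();     // parallel per-document counts
        for (int i = 0; i < terms.length; i++) {
            System.out.println(terms[i] + " -> " + freqs[i]);
        }
    }
}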
Neither. :-)
4) Top 10 fieldvalues (for some fields) returned in search results
So, let's say the results of a search were:
doc a=title:1,keyword:a,contents:somelongmemoryhoggingstring
doc b=title:1,keyword:a,contents:somelongmemoryhoggingstring
doc c=title:1,keyword:b,contents:somelongmemoryhoggingstring
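For requirement 4, once the number of matching documents per field value has been collected (along the lines of the counting sketch earlier on this page), picking the top 10 is plain Java; this helper is an assumption, not code from the thread:

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Returns the (up to) ten field values with the highest document counts.
public class TopTen {

    public static List<String> topTen(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries =
                new ArrayList<Map.Entry<String, Integer>>(counts.entrySet());
        Collections.sort(entries, new Comparator<Map.Entry<String, Integer>>() {
            public int compare(Map.Entry<String, Integer> a, Map.Entry<String, Integer> b) {
                return b.getValue().intValue() - a.getValue().intValue(); // highest count first
            }
        });
        List<String> top = new ArrayList<String>();
        for (int i = 0; i < entries.size() && i < 10; i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }
}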
Not sure I get what the requirement is yet:
>> Here's my requirement, ..I need to perform a simple
>> "Top 10 most frequent occurring " from a search.
Does this mean:
1) Top 10 fieldnames present in each of your matching documents?
2) Top 10 most frequent terms found in a choice of field?
3) Top 10
That way, the iteration through all fields can be avoided.
>
> There's a price to pay for allowing clients too much
> freedom and I think lazy loading of field values is an
> example of something which is too costly.
> I personally prefer a search interface which requires
> clients to