Hi,
And is it not possible to sort on the result we get instead of on all the
values, like
Hits hits = searcher.search(query);
It would be good if we got sorting on the hits, i.e. on the result.
Thanks for the reply.
markrmiller wrote:
>
> To sort on 13mil docs will take like at least 400 mb fo
I'm not sure how the current Highlighter works - haven't had the time to
look into it yet - but I thought about the following implementation. Judging
by your question, this works in a slightly different way than the current
Highlighter:
1. Build a Radix tree (PATRICIA) and populate it with all se
Hi All,
I need help on retrieving results based on relevance + freshness. As of
now, I get results based on either of the fields, either on relevance or freshness.
How can I achieve this? Lucene retrieves results on relevance but also
fetches old results too. I need more relevant results with freshness.
luceneuser wrote:
Hi All,
I need help on retrieving results based on relevance + freshness. As of
now, I get results based on either of the fields, either on relevance or freshness.
How can I achieve this? Lucene retrieves results on relevance but also
fetches old results too. I need more relevant
Have a look at the FunctionQuery capabilities in Lucene in
org.apache.lucene.search.function
You can use this to have field values factor into the score.
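For illustration only, a rough sketch of one way to do that with
CustomScoreQuery from that package. It assumes a Lucene 2.x index with an
un-tokenized numeric field (called "freshness" here; the field name and the
0.3f weight are just examples, not anything from this thread):

import org.apache.lucene.search.Hits;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.search.function.CustomScoreQuery;
import org.apache.lucene.search.function.FieldScoreQuery;

public class FreshnessSearch {
    /** Blend the text query's relevance score with a numeric "freshness" field. */
    public static Hits search(Searcher searcher, Query relevance) throws java.io.IOException {
        // "freshness" must be indexed and un-tokenized; higher value = newer document
        FieldScoreQuery freshness =
            new FieldScoreQuery("freshness", FieldScoreQuery.Type.FLOAT);
        CustomScoreQuery combined = new CustomScoreQuery(relevance, freshness) {
            public float customScore(int doc, float subQueryScore, float valSrcScore) {
                // 0.3f is an arbitrary weight for freshness relative to relevance
                return subQueryScore + 0.3f * valSrcScore;
            }
        };
        return searcher.search(combined);
    }
}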
-Grant
On Mar 19, 2008, at 3:43 AM, luceneuser wrote:
Hi All,
I need help on retrieving results based on relevance + freshness. As
o
Hi All,
Can someone please guide me on how to use IndexReader's
getFieldNames() method properly?
I want to get all the field names in the index. Currently I am getting
them via the Document object, but that's not what I want.
I am implementing the code below and what I get is a very long string
of character
Can you give an example of the output?
What does out.print() do? Does it print spaces between records on new-lines?
On Wed, Mar 19, 2008 at 3:17 PM, varun sood <[EMAIL PROTECTED]> wrote:
> Hi All,
> Can someone please guide me on how to use IndexReader's
> getFieldNames() method properly?
> I wa
Hi,
I'm trying to write to a specific index from several different processes and
encounter problems with locked files (deletable for example).
I don't perform any specific locking because, as I understand it, there should
be a file-based locking mechanism used by the Lucene API. This doesn't seem to
You'll get more meaningful answers if you provide some details:
Things that come to mind:
op system (windows? *nix?)
file system (NFS? local? NTFS?)
An example of the error you receive (a stack trace would be good).
The code you're executing when you get the error.
Imagine you're trying to adv
Hi,
I'm trying to write to a specific index from several different processes and
encounter problems with locked files (deletable for example).
I don't perform any specific locking because, as I understand it, there should
be a file-based locking mechanism used by the Lucene API. This doesn't seem to
be
Are you using multiple computers?
Probably what's happening is: because older versions of Lucene store
the lock file in the /tmp directory by default, multiple computers
sharing an index will be able to open multiple writers because they
have their own /tmp directories. They don't see eac
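If it helps, here is a sketch of how each process might open the index so
that the write lock lives next to the index files themselves (the path is
made up, and this assumes the Java API circa Lucene 2.x with
NativeFSLockFactory; only one IndexWriter can hold the lock at a time):

import java.io.File;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.NativeFSLockFactory;

public class SharedIndexWriterExample {
    public static void main(String[] args) throws Exception {
        File indexDir = new File("/path/to/shared/index");
        // Keep the lock inside the index directory so every process, on every
        // machine that mounts the share, sees the same lock file. The native
        // lock is also released by the OS if a process crashes.
        Directory dir = FSDirectory.getDirectory(indexDir, new NativeFSLockFactory(indexDir));
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), false);
        try {
            // add or delete documents here
        } finally {
            writer.close(); // releases the write lock for the next process
        }
    }
}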
Sorry for any duplicate posts.
Actually I'm using the latest "final" Lucene.Net and I hope this problem is
not unique to this version.
The OS is Windows and the file system is NTFS.
Here's an example of what I do in each process (which may reside on a
different computer):
writer = new IndexWriter(
Hello there! I have just started with Lucene. I bought the Lucene in Action
book [right now I'm at chapter 4, plus the 10th chapter, great explanation by
Terence from jGuru, really nice stuff], and I'm also reading as much as I can on
the wiki :)
Still a bit lost with some stuff, mostly with clusters :)
Our
We went through this a couple of years ago. I couldn't find the thread in
the archive, but the gist of it is as follows:
1. We have a singleton thread that does all of the writing (see the sketch
below). New documents and deletions are queued to the writer via a database table.
2. Since searchers are "point in time
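To make point 1 concrete, a very rough sketch of the single-writer shape
(all class names are made up, and an in-JVM BlockingQueue stands in for the
database table the original setup uses):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;

public class SingletonWriter implements Runnable {
    // In the real setup the queue is a database table polled by this thread;
    // a BlockingQueue is used here only to keep the example self-contained.
    private final BlockingQueue<Document> queue = new LinkedBlockingQueue<Document>();
    private final IndexWriter writer;

    public SingletonWriter(IndexWriter writer) {
        this.writer = writer;
    }

    public void enqueue(Document doc) {
        queue.add(doc); // any thread (or node) submits work; only run() writes
    }

    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                Document doc = queue.take(); // block until work arrives
                writer.addDocument(doc);     // single writer: no lock contention
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (java.io.IOException e) {
            throw new RuntimeException(e);
        }
    }
}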
Thanks a lot for sharing this :)
I'll try to follow your guidelines
Regards
-- Forwarded message --
From: <[EMAIL PROTECTED]>
Date: Wed, Mar 19, 2008 at 12:16 PM
Subject: Re: Lucene on a cluster environment
To: java-user@lucene.apache.org
We went through this a couple of years a
Hello there! Since I've just begun with Lucene, some concepts are kinda new
to me :). One of them is the whole indexing process. Well, AFAIK, indexing
should happen in a batch process, right, to make the most of the time spent on this
operation. One issue, though, is that our client wants "instant search
res
Hi Shai,
The code I pasted is not working.. sorry about that..
The code I am actually running is:
Collection c = ir.getFieldNames(IndexReader.FieldOption.ALL);
int i = 0;
while (c.iterator().hasNext()) {
    out.print(c.iterator().next());
    i++;
}
This hangs my machine for minutes on
Try this :
for (Iterator iter = reader.getFieldNames(FieldOption.ALL).iterator();
iter.hasNext();) {
String fieldName = (String)iter.next();
}
Your code creates a new iterator each time you call next().
Also, if your out.print() method takes a String parameter, the casting is
redundant.
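For completeness, the corrected loop with the printing put back in (this
assumes "out" is a PrintWriter or PrintStream and "reader" is an open
IndexReader, as in the earlier mails):

Collection fields = reader.getFieldNames(IndexReader.FieldOption.ALL);
// Create the iterator once, outside the loop, and advance it with next().
for (Iterator iter = fields.iterator(); iter.hasNext();) {
    String fieldName = (String) iter.next();
    out.println(fieldName);
}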
Hi Robert,
Did you run into any performance issues (because multiple searchers accessed a
single index on a shared directory)? Also, did you employ some redundancy
scheme to ensure that the shared directory is always "available"? Thank you.
> To: java-user@lucene.apache.org
> Subject: Re: Lucen
You are asking for a new iterator each time around the loop - you'll just be
printing the first field forever.
- Original Message
From: varun sood <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Wednesday, 19 March, 2008 4:26:39 PM
Subject: Re: IndexReader getFieldNames()
Hi Constantin and others,
Thanks very much for the reply.
The code fragment works.
thanks
Varun
On Wed, Mar 19, 2008 at 12:42 PM, Constantin Radchenko
<[EMAIL PROTECTED]> wrote:
> Try this :
>
> for (Iterator iter = reader.getFieldNames(FieldOption.ALL).iterator();
> iter.hasNext();) {
>
No noticeable performance hit, searches are not a bottleneck in our
system. We don't have disk redundancy.
On 03/19/2008 11:47 AM, Dragon Fly <[EMAIL PROTECTED]> wrote to
java-user@lucene.apache.org (Subject: RE: Lucene on a cluster environment):
Hi Robert,
Did you run into
Here's what happens: in order to sort all of the hits you get back on a
field, you need to get the value of that field for comparisons, right?
Well, it turns out that reading a field value from the index is pretty
slow (it's on the disk, after all)... so Lucene will read all of the terms
in the field
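(For reference, the sort being described is the ordinary field sort, along
the lines of the sketch below; the "date" field name is only an example.)

// The first search that sorts on a field pays the cost of loading every
// value of that field into the FieldCache; later searches reuse the cache.
Sort byDate = new Sort(new SortField("date", SortField.STRING, true)); // true = descending
Hits hits = searcher.search(query, byDate);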
Hi,
I emailed a question earlier about the difference between OR and AND in a
Boolean query. So in what I am trying to do, I need AND to behave like an
OR (or what I like to call a "soft AND"), and I need OR to behave like a
logical OR, meaning that I don't want to reward documents that have more
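(Not necessarily the answer given in that earlier thread, but two knobs worth
knowing about are the BooleanQuery coord factor and minimumNumberShouldMatch;
the field and terms below are made up:)

// Disabling coord means documents are not rewarded just for matching more
// clauses; minimumNumberShouldMatch turns the SHOULD clauses into a "soft AND".
BooleanQuery bq = new BooleanQuery(true);  // true = disable the coord factor
bq.add(new TermQuery(new Term("body", "apache")), BooleanClause.Occur.SHOULD);
bq.add(new TermQuery(new Term("body", "lucene")), BooleanClause.Occur.SHOULD);
bq.setMinimumNumberShouldMatch(1);         // require at least one clause to match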
Hello there! This is really a dumb question, but I just need to get things
started :( I'm just trying to get things working here, and I'm not
able to index :(. Here's my code:
public abstract class AbstractLuceneIndexer implements LuceneIndexer {
protected String INDEX_DIR = "";
pu
Doh! Sorry, never mind, I was returning different IndexWriter instances :P
On Wed, Mar 19, 2008 at 7:21 PM, Vinicius Carvalho <
[EMAIL PROTECTED]> wrote:
> Hello there! This is really a dumb question, but I just need to get things
> started :( I'm just trying to get things working here, and I'm not
On Thursday 20 March 2008 07:22:27 Mark Miller wrote:
> You might think, if I only ask for the top 10 docs, don't I only read 10
> field values? But of course you don't know what docs will be returned as
> each search comes in...so you have to cache them all.
If it lazily cached one field at a tim
Hi,
Adding new documents to the index is not very costly even when the
index is large. The newly added documents result in additional
segment(s). If you then optimize, that is extremely costly for a
large index. Test your search performance to see if you really need
to optimize.
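In code terms (a trivial sketch; "writer" is an open IndexWriter and "doc" a
populated Document):

writer.addDocument(doc);  // cheap: buffered in memory, flushed as a small new segment
// writer.optimize();     // merges every segment into one: very costly on a large index
writer.close();           // makes the additions visible to newly opened readers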
On Wednesday 19 March 2008 18:28:15 Itamar Syn-Hershko wrote:
> 1. Build a Radix tree (PATRICIA) and populate it with all search terms.
> Phrase queries will be considered as one big string, regardless of their
> spaces.
>
> 2. Iterate through your text ignoring spaces and punctuation marks, and for
>
I'm building this for my application, which uses (or will, at least) query
inflation - no stemming, just basic tokenizing.
I'm using Radix and letter-based lookup (which is how Radix works) since I
want to execute the highlighting on large documents too, possibly with a lot
of terms (since I'm inflatin
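To make the idea a bit more concrete, here is a much-simplified sketch: a
plain character trie rather than a compressed PATRICIA tree, with all names
made up and no ties to the existing Highlighter API. Terms (and phrases with
their spaces stripped) go into the trie, and the text is scanned while
skipping whitespace and punctuation:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TrieHighlighterSketch {

    private static class Node {
        final Map<Character, Node> children = new HashMap<Character, Node>();
        boolean terminal; // true if a complete search term ends at this node
    }

    private final Node root = new Node();

    /** Add a term; phrases are stored as one big string, spaces and punctuation removed. */
    public void addTerm(String term) {
        Node node = root;
        for (char c : term.toLowerCase().toCharArray()) {
            if (!Character.isLetterOrDigit(c)) {
                continue;
            }
            Node next = node.children.get(c);
            if (next == null) {
                next = new Node();
                node.children.put(c, next);
            }
            node = next;
        }
        node.terminal = true;
    }

    /** Return [start, end) offsets in the original text where a term matches. */
    public List<int[]> findMatches(String text) {
        List<int[]> matches = new ArrayList<int[]>();
        for (int start = 0; start < text.length(); start++) {
            if (!Character.isLetterOrDigit(text.charAt(start))) {
                continue; // matches only start on a letter or digit
            }
            Node node = root;
            for (int i = start; i < text.length(); i++) {
                char c = Character.toLowerCase(text.charAt(i));
                if (!Character.isLetterOrDigit(c)) {
                    continue; // ignore spaces and punctuation mid-match
                }
                node = node.children.get(c);
                if (node == null) {
                    break; // no search term continues with this character
                }
                if (node.terminal) {
                    matches.add(new int[] { start, i + 1 });
                }
            }
        }
        return matches;
    }
}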
: You might think, if I only ask for the top 10 docs, don't I only read 10 field
: values? But of course you don't know what docs will be returned as each search
: comes in...so you have to cache them all.
Arguments have been made in the past that when you have an index
large enough that the Fi
: I emailed a question earlier about the difference between OR and AND in a
: Boolean query. So in what I am trying to do, I need AND to behave like an OR (
: or what I like to call a "soft AND"), and I need OR to behave like a logical OR,
: meaning that I don't want to reward documents that have more
luceneuser wrote:
Hi All,
I need help on retrieving results based on relevance + freshness. As of
now, I get results based on either of the fields, either on relevance or freshness.
How can I achieve this? Lucene retrieves results on relevance but also
fetches old results too. I need more relevant r