On 6/30/06, Dominik Bruhn <[EMAIL PROTECTED]> wrote:
SELECT id,addfield FROM table WHERE id IN ([LUCENERESULT]);
Where LUCENERESULT is like 2,3,19,3,5.
This works fine, but there is one problem: the search result from Lucene is ordered by
relevance and so the id-list is also sorted by relevance. But the
ex on the
indexing server and push it to the search server.
Otis
- Original Message
From: Monsur Hossain <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Friday, June 30, 2006 3:39:27 PM
Subject: Changing the MergeFactor - should I reindex?
I have a system of 2 servers, one to index and one to search. The
index server updates the Lucene index and then copies the 200 meg
index over to the search server. Originally, the index server would
optimize the index before copying. To improve performance, I stopped
optimizing, dropped the me
When Lucene first issues a query, it caches a hash of sort values (one
value per document, plus a bit more if you are sorting on strings),
which takes a while. Therefore, when our application first starts up,
we issue one query per sort type. As I understand, it doesn't matter
what the query is
How often are you updating your index? Are you closing your old
IndexSearchers after switching over to the new index? You'll need to
close the searchers in order to release the file handle. This was the
same issue I was experiencing:
http://mail-archives.apache.org/mod_mbox/lucene-java-user/2
Anyone have any ballpark stats about sorting a single field versus sorting
multiple fields? I understand every implementation is different, but I'm
just trying to get a sense of what to expect before I revamp my index.
We need fairly fine-grained sorting of items, so I have a field with the
dat
Ah, I got it. retArray is an array of ints; in order to return the string
value, it needs the mterms array to do the mapping. Thanks, Yonik!
Monsur
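To make the int-to-string mapping described above concrete, here is a minimal sketch in plain Java. It does not use Lucene itself. The class name StringIndexSketch and the fields lookup and order are assumptions made for illustration: lookup plays the role of the mterms array and order plays the role of retArray. The real cache may differ in detail, for example in how it represents documents with no term.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical stand-in for a string sort cache; not Lucene's actual class.
public class StringIndexSketch {
    public final String[] lookup; // sorted unique terms (the role of mterms)
    public final int[] order;     // one ordinal per document (the role of retArray)

    public StringIndexSketch(String[] valuePerDoc) {
        // Each unique term is stored once, in sorted order.
        lookup = new TreeSet<>(Arrays.asList(valuePerDoc)).toArray(new String[0]);
        order = new int[valuePerDoc.length];
        for (int doc = 0; doc < valuePerDoc.length; doc++) {
            // Each document stores only the position of its term in lookup.
            order[doc] = Arrays.binarySearch(lookup, valuePerDoc[doc]);
        }
    }

    // Sorting can compare the ints alone; the string is only needed on return.
    public String valueOf(int doc) {
        return lookup[order[doc]];
    }

    public static void main(String[] args) {
        StringIndexSketch idx =
            new StringIndexSketch(new String[] {"pear", "apple", "pear", "fig"});
        System.out.println(Arrays.toString(idx.order)); // [2, 0, 2, 1]
        System.out.println(idx.valueOf(2));             // pear
    }
}
```

Because the ordinals follow the sorted order of the terms, comparing two documents only needs an int comparison, and the term array is consulted only to hand back the string value.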
> -Original Message-
> From: Yonik Seeley [mailto:[EMAIL PROTECTED]
> Sent: Thursday, November 10, 2005 1:33 PM
> To: java-user@lucen
caching a
> String[] (or StringIndex) and int sorting will involve caching an
> int[]. Unique string values are shared in the array, but the String
> values plus the String[] will always take up more room than the int[].
>
> -Yonik
Hi all. I have a question about sorting. Lucene in Action says: "For
numeric types, each field being sorted for each document in the index
requires that four bytes be cached. For String types, each unique term is
also cached for each document."
I want to make sure I'm understanding this correct
>
> Nicely done, looks pretty and seems fast.
>
> How much data is being searched there?
>
> Otis
>
>
> --- Monsur Hossain <[EMAIL PROTECTED]> wrote:
>
> > Hey all. We just relaunched our search feature over here at
> > Xanga.com; the Blogs, Me
Hey all. We just relaunched our search feature over here at Xanga.com; the
Blogs, Metros and Blogrings sections are powered by Lucene.NET! You can
check it out here:
http://search.xanga.com/
This is only the beginning of what we want to do with search and Lucene. I
want to thank everyone on th
We've actually been running into this sort of issue a lot, since we take a
user generated query from a web page and then push it into a QueryParser.
In general we've learned that escaping special characters is not enough to
create a well formed query. Since our users aren't running complicated
que
I'm a little late to this thread. But is there any performance difference
between the compound index format and the multifile index format when
*searching*? The Lucene book mentions a performance difference when
*indexing*, but not when searching.
Monsur
> -Original Message-
> From:
> -Original Message-
> From: Ian Soboroff [mailto:[EMAIL PROTECTED]
>
> Grossman and Frieder's book, "Information Retrieval, Algorithms and
> Heuristics", is out in a second (and much cheaper, too!) edition,
> probably the most up-to-date textbook.
Much along the same lines, I'm curio
Ramya. I don't have an answer to your specific lock file question, but
a couple thoughts.
You say you're using multiple threads to index 50,000 documents. Have
you tried a single thread version first? I'd try that, and then scale
out to multiple threads as needed. We index over ten times that
> -Original Message-
> From: Chris Lamprecht [mailto:[EMAIL PROTECTED]
> Sent: Thursday, April 28, 2005 7:53 PM
>
> Since the "stored fields" index would basically just be a
> database, perhaps this is better served using a traditional
> relational database (or even use the OS's file s
I'm using Lucene.NET, but I had a similar issue with Visual Studio. With
Visual Studio open, my application would randomly crash with the same error
when I tried to run it from the command line. I'd recommend shutting down
all running apps and then see if the error happens in Ant. You could als
> I ran this test a little differently than letting the
> IndexSearcher get garbage collected. Instead, I explicitly closed the
> searcher (reader) and reopened it periodically.
Thanks Chuck, this is all really helpful. That explicit close() is what
allows the files stored up in "deletable" to
> Just tried this on my linux laptop - with IndexSearcher uncommented, I
> still get a single .cfs file.
Hmmm, rereading this, I'm curious to know how/why this works in Linux.
Consider this scenario:
1) Create a new index
2) Create a new IndexSearcher pointing to that index.
3) Run an incremen
> Just tried this on my linux laptop - with IndexSearcher uncommented, I
> still get a single .cfs file. It's one of those problems where Windows
> doesn't let you erase the file. I'd start this SortTest in the
> debugger and step through it until you find a spot where you see that
> some inde
> Do you get 2 .cfs files even if you add isearcher.close() right after
> you open the IndexSearcher?
Nope! Adding the close() right after the open gives me one .cfs file.
Monsur
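The pattern discussed here, closing the old searcher explicitly once a new one is opened, could be sketched roughly as below. This is plain Java with no Lucene dependency. SearcherHolder is a hypothetical helper, and any Closeable stands in for an IndexSearcher. It assumes no search is still running against the old instance at the moment of the swap; a real web application would need to wait for in-flight searches first.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical holder; T stands in for an IndexSearcher-like object.
public class SearcherHolder<T extends Closeable> {
    private final AtomicReference<T> current = new AtomicReference<>();

    public T get() {
        return current.get();
    }

    // Installs a freshly opened searcher and closes the one it replaces.
    public void swap(T fresh) {
        T old = current.getAndSet(fresh);
        if (old != null) {
            try {
                old.close(); // releases the file handles on the old index files
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    public static void main(String[] args) {
        SearcherHolder<Closeable> holder = new SearcherHolder<>();
        holder.swap(() -> System.out.println("closed searcher 1"));
        // The second swap closes the first searcher: prints "closed searcher 1"
        holder.swap(() -> System.out.println("closed searcher 2"));
    }
}
```

Relying on garbage collection instead of the explicit close() leaves the old files open for an unpredictable time, which is the situation on Windows where they cannot be deleted.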
oo late to delete the old
files.
Thanks,
Monsur
> -Original Message-
> From: Chuck Williams [mailto:[EMAIL PROTECTED]
> Sent: Thursday, April 28, 2005 10:09 PM
> To: java-user@lucene.apache.org
> Subject: Re: IndexSearcher hanging on to old index files in Windows
>
Ok, I've written up a Java test with Lucene 1.4.3, the code is pasted below.
The code creates a new index, creates an IndexSearcher object, and then does
an incremental index/optimize. The IndexSearcher line is commented out.
When I run this code, I end up with a single "segments", "deletable" and
ow running for 40 days since we launched it in production.
> No problem at all! We have two index directories between
> which we switch back and forth though?
>
> Frank
>
> >-Original Message-
> >From: Monsur Hossain [mailto:[EMAIL PROTECTED]
> >Sent: Frida
Hi all. I'm running Lucene.NET in a Windows/ASP.NET environment. We are
searching a 300meg index in a web environment, where the IndexSearcher is
cached. Every 10-30 minutes, a separate process updates the index. When
ASP.NET's cache detects a changed index, it drops the current IndexSearcher
The setup: Using Lucene.NET in a web environment on Win2k3 servers. One
process runs every 5 minutes, grabbing new rows from the database, and
adding them to a Lucene index. Only additions are made to the index, no
deletions. The mergeFactor is set to 2 to minimize the number of segments.
This
26 matches