Re: Sorting & SQL-Database

2006-07-03 Thread Monsur Hossain
On 6/30/06, Dominik Bruhn <[EMAIL PROTECTED]> wrote: SELECT id,addfield FROM table WHERE id IN ([LUCENERESULT]); Where LUCENERESULT is like 2,3,19,3,5. This works fine but got one problem: The Search-Result of Lucene is order by relevance and so the id-list is also sorted by relevance. But the
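One way to restore relevance order after the SQL `IN (...)` lookup is to re-sort the fetched rows in application code, using each id's position in the Lucene result list. A minimal sketch in plain Java, assuming the rows have already been loaded into a map keyed by id; the class and method names are hypothetical and not part of Lucene or JDBC:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

public class RelevanceOrder {
    /**
     * Returns the fetched rows in the order given by rankedIds
     * (the id list as Lucene returned it, most relevant first).
     * An id repeated in rankedIds is emitted once, at its first position;
     * ids with no matching row are skipped.
     */
    public static <V> List<V> inRelevanceOrder(List<Integer> rankedIds, Map<Integer, V> rowsById) {
        List<V> ordered = new ArrayList<>();
        // LinkedHashSet keeps first-seen order while dropping duplicate ids.
        for (Integer id : new LinkedHashSet<>(rankedIds)) {
            V row = rowsById.get(id);
            if (row != null) {
                ordered.add(row);
            }
        }
        return ordered;
    }
}
```

With the id list from the message (2,3,19,3,5), the rows would come back in the order 2, 3, 19, 5, whatever order the database returned them in.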

Re: Changing the MergeFactor - should I reindex?

2006-06-30 Thread Monsur Hossain
ex on the indexing server and push it to the search server. Otis - Original Message From: Monsur Hossain <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Friday, June 30, 2006 3:39:27 PM Subject: Changing the MergeFactor - should I reindex? I have a system of 2 servers

Changing the MergeFactor - should I reindex?

2006-06-30 Thread Monsur Hossain
I have a system of 2 servers, one to index and one to search. The index server updates the Lucene index and then copies the 200 meg index over to the search server. Originally, the index server would optimize the index before copying. To improve performance, I stopped optimizing, dropped the me

Re: preloading / "warming up" the index

2006-05-31 Thread Monsur Hossain
When Lucene first issues a query, it caches a hash of sort values (one value per document, plus a bit more if you are sorting on strings), which takes a while. Therefore, when our application first starts up, we issue one query per sort type. As I understand, it doesn't matter what the query is

RE: References to deleted file handles in long-running server application

2005-11-17 Thread Monsur Hossain
How often are you updating your index? Are you closing your old IndexSearchers after switching over to the new index? You'll need to close the searchers in order to release the file handle. This was the same issue I was experiencing: http://mail-archives.apache.org/mod_mbox/lucene-java-user/2

Sorting: single field vs multiple fields

2005-11-17 Thread Monsur Hossain
Anyone have any ballpark stats about sorting a single field versus sorting multiple fields? I understand every implementation is different, but I'm just trying to get a sense of what to expect before I revamp my index. We need fairly fine-grained sorting of items, so I have a field with the dat

RE: Sorting: string vs int

2005-11-10 Thread Monsur Hossain
Ah, I got it. retArray is an array of ints; in order to return the string value, it needs the mterms array to do the mapping. Thanks, Yonik! Monsur > -Original Message- > From: Yonik Seeley [mailto:[EMAIL PROTECTED] > Sent: Thursday, November 10, 2005 1:33 PM > To: java-user@lucen

RE: Sorting: string vs int

2005-11-10 Thread Monsur Hossain
caching a > String[] (or StringIndex) and int sorting will involve caching an > int[]. Unique string values are shared in the array, but the String > values plus the String[] will always take up more room than the int[]. > > -Yonik

Sorting: string vs int

2005-11-09 Thread Monsur Hossain
Hi all. I have a question about sorting. Lucene in Action says: "For numeric types, each field being sorted for each document in the index requires that four bytes be cached. For String types, each unique term is also cached for each document." I want to make sure I'm understanding this correct
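The numeric half of that quote translates into simple arithmetic: roughly four bytes per document for each numeric sort field. A back-of-the-envelope sketch in Java; the class and method names are hypothetical, and JVM array overhead is ignored. The String case is left out because it depends on the number and length of unique terms, which the quote does not quantify.

```java
public class SortCacheEstimate {
    /**
     * Approximate bytes cached for numeric sort fields:
     * 4 bytes per document, per field being sorted on.
     */
    public static long numericSortCacheBytes(long numDocs, int numericSortFields) {
        return 4L * numDocs * numericSortFields;
    }
}
```

For example, `SortCacheEstimate.numericSortCacheBytes(1_000_000L, 3)` estimates the cache for three numeric sort fields over a hypothetical index of one million documents.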

RE: Lucene and Xanga.com

2005-08-24 Thread Monsur Hossain
> > Nicely done, looks pretty and seems fast. > > How much data is being searched there? > > Otis > > > --- Monsur Hossain <[EMAIL PROTECTED]> wrote: > > > Hey all. We just relaunched our search feature over here at > > Xanga.com; the Blogs, Me

Lucene and Xanga.com

2005-08-23 Thread Monsur Hossain
Hey all. We just relaunched our search feature over here at Xanga.com; the Blogs, Metros and Blogrings sections are powered by Lucene.NET! You can check it out here: http://search.xanga.com/ This is only the beginning of what we want to do with search and Lucene. I want to thank everyone on th

RE: QueryParser exception on escaped backslash preceding ) character

2005-08-15 Thread Monsur Hossain
We've actually been running into this sort of issue a lot, since we take a user generated query from a web page and then push it into a QueryParser. In general we've learned that escaping special characters is not enough to create a well formed query. Since our users aren't running complicated que

RE: Hardware Question

2005-08-02 Thread Monsur Hossain
I'm a little late to this thread. But is there any performance difference between the compound index format and the multifile index format when *searching*? The Lucene book mentions a performance difference when *indexing*, but not when searching. Monsur > -Original Message- > From:

RE: Search Theory Book

2005-05-13 Thread Monsur Hossain
> -Original Message- > From: Ian Soboroff [mailto:[EMAIL PROTECTED] > > Grossman and Frieder's book, "Information Retrieval, Algorithms and > Heuristics", is out in a second (and much cheaper, too!) edition, > probably the most up-to-date textbook. Much along the same lines, I'm curio

RE: indexdir/segments (No such file or directory) lock file present..

2005-05-12 Thread Monsur Hossain
Ramya. I don't have an answer to your specific lock file question, but a couple thoughts. You say you're using multiple threads to index 50,000 documents. Have you tried a single thread version first? I'd try that, and then scale out to multiple threads as needed. We index over ten times that

RE: Splitting index into indexed fields and stored fields for performance

2005-05-10 Thread Monsur Hossain
> -Original Message- > From: Chris Lamprecht [mailto:[EMAIL PROTECTED] > Sent: Thursday, April 28, 2005 7:53 PM > > Since the "stored fields" index would basically just be a > database, perhaps this is better served using a traditional > relational database (or even use the OS's file s

RE: Intermittent exception on optimize(): IOException: Cannot delete _5.cfs

2005-05-09 Thread Monsur Hossain
I'm using Lucene.NET, but I had a similar issue with Visual Studio. With Visual Studio open, my application would randomly crash with the same error when I tried to run it from the command line. I'd recommend shutting down all running apps and then see if the error happens in Ant. You could als

RE: IndexSearcher hanging on to old index files in Windows

2005-04-29 Thread Monsur Hossain
> I ran this test a little differently than letting the > IndexSearcher get garbage collected. Instead, I explicitly closed the > searcher (reader) and reopened it periodically. Thanks Chuck, this is all really helpful. That explicit close() is what allows the files stored up in "deletable" to
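The close-then-reopen pattern discussed in this thread can be sketched without any Lucene classes. In the sketch below, `java.io.Closeable` stands in for the searcher, and `SearcherHolder` is a hypothetical name, not part of Lucene or Lucene.NET. It does not wait for in-flight searches to finish, which a real server would need to handle before closing.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

public class SearcherHolder<S extends Closeable> {
    // The instance currently handed out to callers; null until the first swap.
    private final AtomicReference<S> current = new AtomicReference<>();

    public S get() {
        return current.get();
    }

    /**
     * Installs the replacement, then explicitly closes the previous instance
     * so that its file handles are released instead of waiting for garbage collection.
     */
    public void swap(S replacement) throws IOException {
        S old = current.getAndSet(replacement);
        if (old != null) {
            old.close();
        }
    }
}
```

The design choice here is to close the old instance only after the new one is installed, so callers of `get()` never see an already-closed instance as the current one.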

RE: IndexSearcher hanging on to old index files in Windows

2005-04-29 Thread Monsur Hossain
> Just tried this on my linux laptop - with IndexSearcher uncommented, I > still get a single .cfs file. Hmmm, rereading this, I'm curious to know how/why this works in Linux. Consider this scenario: 1) Create a new index 2) Create a new IndexSearcher pointing to that index. 3) Run an incremen

RE: IndexSearcher hanging on to old index files in Windows

2005-04-29 Thread Monsur Hossain
> Just tried this on my linux laptop - with IndexSearcher uncommented, I > still get a single .cfs file. It's one of those problems > where Windows > doesn't let you erase the file. I'd start this SortTest in the > debugger and step through it until you find a spot where you see that > some inde

RE: IndexSearcher hanging on to old index files in Windows

2005-04-28 Thread Monsur Hossain
> Do you get 2 .cfs files even if you add isearcher.close() right after > you open the IndexSearcher? Nope! Adding the close() right after the open gives me one .cfs file. Monsur

RE: IndexSearcher hanging on to old index files in Windows

2005-04-28 Thread Monsur Hossain
oo late to delete the old files. Thanks, Monsur > -Original Message- > From: Chuck Williams [mailto:[EMAIL PROTECTED] > Sent: Thursday, April 28, 2005 10:09 PM > To: java-user@lucene.apache.org > Subject: Re: IndexSearcher hanging on to old index files in Windows >

RE: IndexSearcher hanging on to old index files in Windows

2005-04-28 Thread Monsur Hossain
Ok, I've written up a Java test with Lucene 1.4.3, the code is pasted below. The code creates a new index, creates an IndexSearcher object, and then does an incremental index/optimize. The IndexSearcher line is commented out. When I run this code, I end up with a single "segments", "deletable" and

RE: IndexSearcher hanging on to old index files in Windows

2005-04-28 Thread Monsur Hossain
ow running for 40 days since we launched it productively. > No problem at all! We have two index directories between > which we switch back and forth though? > > Frank > > >-Original Message- > >From: Monsur Hossain [mailto:[EMAIL PROTECTED] > >Sent: Frida

IndexSearcher hanging on to old index files in Windows

2005-04-28 Thread Monsur Hossain
Hi all. I'm running Lucene.NET in a Windows/ASP.NET environment. We are searching a 300meg index in a web environment, where the IndexSearcher is cached. Every 10-30 minutes, a separate process updates the index. When ASP.NET's cache detects a changed index, it drops the current IndexSearcher

Deploying index to multiple webservers

2005-03-23 Thread Monsur Hossain
The setup: Using Lucene.NET in a web environment on Win2k3 servers. One process runs every 5 minutes, grabbing new rows from the database, and adding them to a Lucene index. Only additions are made to the index, no deletions. The mergeFactor is set to 2 to minimize the number of segments. This