date:20060130

Reindexing

2006-01-30 Thread revati joshi

Hello Lucene members. I tried to do reindexing using the Lifecycle interface of Hibernate, but I'm stuck on the implementation part of this interface. I wrote the code for it, but I'm now stuck on the concept of Hibernate. It uses methods like Onsa

RE: Help with indexing and query strategy

2006-01-30 Thread Colin Young

I have thought about that. I couldn't figure out a way to make it work. Fortunately, I have managed to solve the problem (excepting prefix or wildcard searches) which is very close to what Rajesh suggested (also see my response to his response). Thanks for taking a look. Colin -Original Mes

RE: Help with indexing and query strategy

2006-01-30 Thread Colin Young

Actually, I arrived at a very similar solution for indexing as you did, but I've been away from a connection, so I haven't been able to post it here. Essentially I'm adding the items as you suggest, but I've built a synonym injector (actually I'm just using the one from "Lucene in Action") to prod

Re: Help with indexing and query strategy

2006-01-30 Thread Jeff Rodenburg

Have you considered evaluating doc-score thresholds for limiting your results? Since the perfect answers to these situations lie in the constant tweaking and twiddling of analysis and tokenization, one way I've found to help is to evaluate result scores. In your "Ontario CA" example, limiting res

RE: grouping results by fields

2006-01-30 Thread zzzzz shalev

hey chris, i was using the hits.doc method while iterating,,, you've given me some hope!! i will look into the FieldCache Chris Hostetter <[EMAIL PROTECTED]> wrote: : currently , i am iterating through about 200-300 of the top docs and : creating the groups (so, as of now, the groups

Re: Number Searches vs Character

2006-01-30 Thread Chris Hostetter

PrefixQuery is implemented as a BooleanQuery using term expansion. What that means is that a prefix query on a common prefix is much more expensive than a prefix query on a less common prefix, not just in terms of the number of documents that match, but because of the number of terms that match
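To illustrate the term-expansion point, here is a minimal sketch in plain Java. It does not use the Lucene API; the class name, the `expand` helper, and the sample terms are all hypothetical, and a sorted set stands in for the index's term dictionary. It only shows why a prefix shared by many distinct terms would expand into many more clauses than a prefix shared by few.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration only: a TreeSet stands in for the term dictionary.
class PrefixExpansionSketch {

    /** Returns every term starting with the prefix, i.e. one clause per term in the expanded query. */
    static SortedSet<String> expand(SortedSet<String> termDictionary, String prefix) {
        // Terms are sorted, so the matches form one contiguous range.
        // Assumes no term contains the character '\uffff' right after the prefix.
        return termDictionary.subSet(prefix, prefix + Character.MAX_VALUE);
    }

    public static void main(String[] args) {
        // Made-up sample terms: many distinct numbers share "12", few words share "de".
        SortedSet<String> terms = new TreeSet<>(Arrays.asList(
                "120045", "121337", "123123123", "124900", "129999",
                "catalog", "description", "my"));

        System.out.println("12* expands to " + expand(terms, "12").size() + " terms");
        System.out.println("de* expands to " + expand(terms, "de").size() + " terms");
    }
}
```

In an index where a numeric field holds mostly unique values, the number of terms under a short numeric prefix can be very large, which would be consistent with the slowdown described in the original question.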

RE: grouping results by fields

2006-01-30 Thread Chris Hostetter

: currently , i am iterating through about 200-300 of the top docs and : creating the groups (so, as of now, the groups are partial) , my : response time HAS to be at most 500-600 milli (query + groupings) or my : company will probably go with a commercial search engine such as FAST or : somethin

RE: grouping results by fields

2006-01-30 Thread zzzzz shalev

thanks for the advice guys! currently , i am iterating through about 200-300 of the top docs and creating the groups (so, as of now, the groups are partial) , my response time HAS to be at most 500-600 milli (query + groupings) or my company will probably go with a commercial search engine

Number Searches vs Character

2006-01-30 Thread Aigner, Thomas

I am curious what would be the difference between searching for a number versus a character. I have a large index consisting of a few fields (so the index would look something like: " 123123123 my description my catalog"). Searching for 12* is much slower than searching for de*. I don't have a

[OT] Unidecode?

2006-01-30 Thread petite_abeille

Hello, Does anyone know of a Java port of Sean M. Burke's Unidecode? http://interglacial.com/~sburke/tpj/as_html/tpj22.html http://search.cpan.org/~sburke/Text-Unidecode-0.04/lib/Text/Unidecode.pm TIA. Cheers -- PA, Onnay Equitursay http://alt.textdrive.com/

Re: Help with indexing and query strategy

2006-01-30 Thread Rajesh Munavalli

For now, the best I could come up with is the following scheme. SAMPLE DOCUMENTS: Let's say there are four documents: Doc1: st louis, missouri, usa Doc2: st louis du ha ha, quebec, canada Doc3: new york, NY, united states of america Doc4: ny, usa INDEX PHASE: -

RE: grouping results by fields

2006-01-30 Thread Chris Hostetter

An approach like the one Mark is describing should be a lot more space efficient than the BitSet intersection approach I described before, but depending on how many groupings you want, I can imagine that it might be slower in some cases. Unfortunately, it also only works if the groupings you want are

RE: grouping results by fields

2006-01-30 Thread mark harwood

> A simple solution if you only have 20,000 docs is > just to iterate > through the hits and count them up against each > color etc, The one thing to avoid is reader.document() calls in such a tight loop. This is always a killer. The best way I've found is to create one bitset for all the matchin

Re: Why do we use BitSet class ?

2006-01-30 Thread Chris Hostetter

: 1] Why do use BitSet Class ? : 2] Is it required in Filtering / Sorting of results or to Index ? BitSet is a useful class for lots of things. The only time (that I know of) where it is part of the public Lucene API is in the interaction between a Filter and the IndexSearcher ... so unless you

RE: grouping results by fields

2006-01-30 Thread Mike Streeton

A simple solution if you only have 20,000 docs is just to iterate through the hits and count them up against each color etc.; this could be done in a HitCollector. The balance here is performance vs. memory usage: if you have a lot of users, I would go for a solution that was less efficient but used a lot
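The counting loop described above could look roughly like the following sketch. It is plain Java rather than Lucene code: the class, the method, and the sample data are hypothetical, and the `valueByDocId` array is an assumed stand-in for a per-document value lookup held in memory (in the spirit of the FieldCache mentioned elsewhere in this thread), so that no stored document has to be loaded inside the loop.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: counts hits per group value in one pass.
class HitCountSketch {

    /**
     * Counts matching documents per field value.
     * valueByDocId[docId] is assumed to hold the grouping value (e.g. the color) of each document.
     */
    static Map<String, Integer> countByValue(int[] hitDocIds, String[] valueByDocId) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int docId : hitDocIds) {
            // One array lookup per hit; increments the counter for that value.
            counts.merge(valueByDocId[docId], 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up data: six documents, four of which matched the query.
        String[] colorByDocId = {"red", "blue", "red", "green", "blue", "red"};
        int[] hits = {0, 2, 3, 5};
        System.out.println(countByValue(hits, colorByDocId));
    }
}
```

The trade-off is the one named in the message: the lookup array costs memory proportional to the number of documents, while the loop itself costs time proportional to the number of hits.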

Re: grouping results by fields

2006-01-30 Thread zzzzz shalev

hey Jim, thanks a lot for the quick reply! much appreciated. i will look a little closer into what is done in C|Net, seems more cost efficient than what i'm currently doing ;) however i am not sure how scalable the solution is if, for example, i received 20,000 results and i ha

Unable to optimize index: cannot delete deletable.new

2006-01-30 Thread Dalton, Jeffery

I have a periodic process that runs as a timer task that periodically optimizes my search index. However, I am having difficulties with this process failing: java.io.IOException: Cannot overwrite: C:\04950_04959\deleteable.new at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory

Re: How to find "function()" - ?

2006-01-30 Thread Michael D. Curtin

Dmitry Goldenberg wrote: a) if I index "function()" as "function()" rather than "function", does that mean that if I search for "function", then it won't be found? -- the problem is that in some cases, the user will want to find function(), and in some cases just function -- can I accommodate

RE: How to find "function()" - ?

2006-01-30 Thread Dmitry Goldenberg

Michael, Yes, you're describing pretty much what I was thinking of but -- a) if I index "function()" as "function()" rather than "function", does that mean that if I search for "function", then it won't be found? -- the problem is that in some cases, the user will want to find function(), and

Re: Why do we use BitSet class ?

2006-01-30 Thread Erik Hatcher

Please do not cross-post your questions. Your questions are best asked solely to java-user. Erik On Jan 30, 2006, at 9:23 AM, Vikas Khengare wrote: Hi Friends I am very New to Lucene World !!! As this world is interesting to me So I want to go in deep level of it to

Why do we use BitSet class ?

2006-01-30 Thread Vikas Khengare

Hi Friends I am very New to Lucene World !!! As this world is interesting to me So I want to go in deep level of it to realize the beauty of it. So can you help me to realize that beauty ? I have question 1] Why do use BitSet Class ? 2] Is it required in Filtering / Sorting o

RE: Searching over more than one Fields

2006-01-30 Thread Mike Streeton

There are a number of ways of doing this. One way I would suggest is simply to store the CONTENTS fields and prefix it with the field name. So instead of storing a single CONTENTS field for a document, store a CONTENTS field for each other field with the field name prefixing each field value. E.

RE: searching specific documents

2006-01-30 Thread Mike Streeton

Use BitSets to intersect the two queries. First knock up a HitCollector that generates a bit set for the document set you want to search (A,B,C,X,Y,Z). Then do another query generating a bit set for the criteria on (C,X,Y). Then just intersect the two bit sets using the "and" method. Mike www.ard
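The intersection step can be sketched with `java.util.BitSet` alone. This is a minimal illustration, not Lucene code: the class and helper names are hypothetical, and the mapping of documents A,B,C,X,Y,Z to ids 0 through 5 is an assumption made for the example. How each bit set gets filled (for instance from a HitCollector) is left out.

```java
import java.util.BitSet;

// Hypothetical illustration only: restricts query matches to a known set of documents.
class DocRestrictionSketch {

    /** Returns the documents present in both sets, leaving the inputs unchanged. */
    static BitSet restrict(BitSet queryMatches, BitSet allowedDocs) {
        BitSet result = (BitSet) queryMatches.clone(); // and() mutates its receiver, so copy first
        result.and(allowedDocs);
        return result;
    }

    /** Builds a bit set with one bit set per given document id. */
    static BitSet bitsOf(int... docIds) {
        BitSet bits = new BitSet();
        for (int id : docIds) {
            bits.set(id);
        }
        return bits;
    }

    public static void main(String[] args) {
        // Assumed ids: A=0, B=1, C=2, X=3, Y=4, Z=5.
        BitSet queryMatches = bitsOf(0, 2, 4, 5); // documents the text query matched
        BitSet allowedDocs = bitsOf(2, 3, 4);     // C, X, Y chosen by the non-Lucene criteria
        BitSet result = restrict(queryMatches, allowedDocs);
        System.out.println("matching allowed docs: " + result);
    }
}
```

Cloning before calling `and` keeps the original match set reusable, which matters if the same query result is intersected with several different document sets.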

searching specific documents

2006-01-30 Thread Jebus

How do you search only certain documents? In the app I am writing, before I start searching with Lucene I know all the documents that I want to search. For example, I have documents A,B,C,X,Y,Z, so before I start the search I know that I only want to search docs C,X,Y due to other non-Lucene criteria.

Re: Throughput doesn't increase when using more concurrent threads

2006-01-30 Thread Peter Keegan

I cranked up the dial on my query tester and was able to get the rate up to 325 qps. Unfortunately, the machine died shortly thereafter (memory errors :-( ) Hopefully, it was just a coincidence. I haven't measured 64-bit indexing speed, yet. Peter On 1/29/06, Daniel Noll <[EMAIL PROTECTED]> wrote

Re: deleting duplicate documents from my index

2006-01-30 Thread gekkokid

hi, thats exactly what i did :) works perfectly thanks _gk - Original Message - From: "Chris Hostetter" <[EMAIL PROTECTED]> To: Sent: Monday, January 30, 2006 5:56 AM Subject: Re: deleting duplicate documents from my index : Hi, im trying to delete duplicate documents from my inde

RE: Searching over more than one Fields

2006-01-30 Thread Gwyn Carwardine

I was happy to take the hit of storing the text twice. I have created an aggregate field called "CONTENTS" that has all the other fields concatenated together. I also created a list of the other fields (because they can vary from doc to doc) in another field "FIELDLIST" I search this field and fo

28 matches
