RE: intra-word delimiters

2005-08-15 Thread Rajesh Munavalli
How about this: (1) Follow the first three steps mentioned by Marvin. (2) For step 4, instead of trying to come up with all concatenations, create concatenations of 3 words. For example, an entity of length n, W1 W2 W3 ... Wn, will have the index positions as follows: Index Position Value 0

Re: intra-word delimiters

2005-08-15 Thread Marvin Humphrey
On Aug 15, 2005, at 8:53 PM, Marvin Humphrey wrote: Create a phrase query that when it encounters ab => { tokenlength => 2 } knows to look for something at position 3. Fencepost error! That should have been "position 2". Not that correcting the error makes the algo any more practical. ;)

Re: intra-word delimiters

2005-08-15 Thread Marvin Humphrey
On Aug 15, 2005, at 7:47 PM, Yonik Seeley wrote: That was the plan, but step (4) really seems problematic. - term expansion this way can lead to a lot of false matches - phrase queries with many bordering words break - setting term positions such that phrase queries work on all combos of subw

Re: intra-word delimiters

2005-08-15 Thread Yonik Seeley
That was the plan, but step (4) really seems problematic. - term expansion this way can lead to a lot of false matches - phrase queries with many bordering words break - setting term positions such that phrase queries work on all combos of subwords is non-trivial. It seems like a better approach

Re: intra-word delimiters

2005-08-15 Thread Marvin Humphrey
On Aug 15, 2005, at 3:16 PM, Yonik Seeley wrote: Another example: Source Text contains "Canon Powershot SD500 7MP Digital Elph" And I want to be able to match the following user queries: Power Shot SD 500 CanonPowerShotSD500 SD 500 7 MP digitalelph Canon-Powershot-SD 500 Any ideas? How abou

Re: Integrating lucene search with adobe search

2005-08-15 Thread Ben Litchfield
Andrew, There are a couple of different open parameters that can be passed in when opening a PDF. See http://partners.adobe.com/public/developer/en/acrobat/PDFOpenParameters.pdf for the complete specification, but an example for a URL would be http://www.pdfbox.org/index.pdf#search="pdfbox" It al

intra-word delimiters

2005-08-15 Thread Yonik Seeley
Does anyone have solutions for handling intraword delimiters (case changes, non-alphanumeric chars, and alpha-numeric transitions)? If the source text is Wi-Fi, we want to be able to match the following user queries: wi fi wifi wi-fi wi+fi WiFi One way is to index "wi", "fi", and "wifi". However
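A minimal sketch of the "index wi, fi, and wifi" idea from the message above, in plain Java. The class and method names (SubwordSplitter, split, indexTerms) are hypothetical and are not part of Lucene; the sketch only shows splitting on the three delimiter types the message lists (non-alphanumeric characters, lower-to-upper case changes, and letter/digit transitions) and adding the concatenated form. It does not address the caveat the message goes on to raise, which is cut off here.

```java
import java.util.ArrayList;
import java.util.List;

public class SubwordSplitter {

    /** Splits a token into lowercased subwords at intra-word delimiters. */
    public static List<String> split(String token) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char prev = 0;
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (!Character.isLetterOrDigit(c)) {
                // Non-alphanumeric character: ends the current subword and is dropped.
                flush(current, parts);
            } else {
                boolean caseChange = Character.isLowerCase(prev) && Character.isUpperCase(c);
                boolean alphaNumChange = (Character.isLetter(prev) && Character.isDigit(c))
                        || (Character.isDigit(prev) && Character.isLetter(c));
                if (caseChange || alphaNumChange) {
                    flush(current, parts);
                }
                current.append(Character.toLowerCase(c));
            }
            prev = c;
        }
        flush(current, parts);
        return parts;
    }

    /** Returns the subwords plus their concatenation (assumed indexing scheme). */
    public static List<String> indexTerms(String token) {
        List<String> terms = new ArrayList<>(split(token));
        if (terms.size() > 1) {
            terms.add(String.join("", terms));
        }
        return terms;
    }

    // Moves any pending characters into the parts list and clears the buffer.
    private static void flush(StringBuilder current, List<String> parts) {
        if (current.length() > 0) {
            parts.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        for (String s : new String[] {"Wi-Fi", "WiFi", "wi+fi", "wifi"}) {
            System.out.println(s + " -> " + indexTerms(s));
        }
    }
}
```

Running the same splitter over user queries at search time means "wi fi", "wi-fi", "wi+fi" and "WiFi" all reduce to the same subwords, while "wifi" matches the concatenated term.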

coord factor and MultiFieldSearch

2005-08-15 Thread Malay Desai
Hi, I have a question about the way the coord factor affects multi-field searches. It looks like the sum of the individual field hit scores is multiplied by a coord factor of (x/y), where x = number of fields matched and y = total number of fields. This seems to penalize some results, where we get a very good quality
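The arithmetic described in the message above can be sketched as follows. This is a simplified model of the behaviour as the poster describes it, not Lucene's actual scoring code; the class and method names (CoordSketch, combinedScore) are hypothetical, and treating any score above zero as a "matched" field is an assumption.

```java
public class CoordSketch {

    /**
     * Sums the per-field scores and multiplies by (matched fields / total fields).
     * A field counts as matched when its score is greater than zero (assumption).
     */
    public static double combinedScore(double[] fieldScores) {
        if (fieldScores.length == 0) {
            return 0.0; // no fields: avoid dividing by zero
        }
        double sum = 0.0;
        int matched = 0;
        for (double score : fieldScores) {
            if (score > 0.0) {
                sum += score;
                matched++;
            }
        }
        return sum * matched / fieldScores.length;
    }

    public static void main(String[] args) {
        // One strong hit in a single field versus weak hits in all three fields.
        System.out.println(combinedScore(new double[] {0.9, 0.0, 0.0}));
        System.out.println(combinedScore(new double[] {0.3, 0.3, 0.3}));
    }
}
```

Under this model a single strong hit of 0.9 in one of three fields comes out at roughly 0.3, while three weak hits of 0.3 each come out at roughly 0.9, which is the kind of penalty the message describes.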

Re: QueryParser exception on escaped backslash preceding ) character

2005-08-15 Thread Erik Hatcher
On Aug 15, 2005, at 3:05 PM, Monsur Hossain wrote: We've actually been running into this sort of issue a lot, since we take a user-generated query from a web page and then push it into a QueryParser. In general we've learned that escaping special characters is not enough to create a well fo

RE: QueryParser exception on escaped backslash preceding ) character

2005-08-15 Thread Monsur Hossain
We've actually been running into this sort of issue a lot, since we take a user-generated query from a web page and then push it into a QueryParser. In general we've learned that escaping special characters is not enough to create a well-formed query. Since our users aren't running complicated que

Re: Indexing document instances and retrieving instance attributes

2005-08-15 Thread Chris D
I'm still having problems finding a clean way of doing this. Currently my index has "_" filling in for empty fields in instances: DOC1 FILEID 123 MIME test/html CONTENT blam blam blam etc. FILENAME File1 DATE 090909 AUTH _ SESSION 11 FILEN

RE: QueryParser Exceptions only under load?

2005-08-15 Thread Andrew Boyd
Thanks for the reply. I believe your initial thought is probably the correct one! Thanks, Andrew -Original Message- From: "Palmer, Andrew MMI Woking" <[EMAIL PROTECTED]> Sent: Aug 15, 2005 12:03 PM To: java-user@lucene.apache.org, Andrew Boyd <[EMAIL PROTECTED]> Subject: RE: QueryParse

RE: QueryParser Exceptions only under load?

2005-08-15 Thread Palmer, Andrew MMI Woking
Andrew, My initial thought is that you are reusing the QueryParser for each of the requests. It is not a thread safe object. I was getting similar problems and changing the way that I used the QueryParser fixed it. This was on 1.4.3 so it might be different. Andrew -Original Message-
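The message above does not say how the QueryParser usage was changed. One common way to avoid sharing a non-thread-safe object between requests is to give each thread its own instance, sketched below in plain Java. StatefulParser is a hypothetical stand-in for any parser that keeps mutable state between calls; it is not the Lucene QueryParser API, and PerThreadParser is likewise an illustrative name.

```java
import java.util.Locale;

public class PerThreadParser {

    /** Hypothetical stand-in for a parser that keeps mutable state between calls. */
    static class StatefulParser {
        // Scratch buffer reused across calls; unsafe if two threads share one instance.
        private final StringBuilder buffer = new StringBuilder();

        String parse(String query) {
            buffer.setLength(0);
            buffer.append(query.trim().toLowerCase(Locale.ROOT));
            return buffer.toString();
        }
    }

    // Each thread lazily gets its own parser, so no instance is shared.
    private static final ThreadLocal<StatefulParser> PARSER =
            ThreadLocal.withInitial(StatefulParser::new);

    public static String parse(String query) {
        return PARSER.get().parse(query);
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulate 5 concurrent users, as in the load test described in the thread.
        Thread[] users = new Thread[5];
        for (int i = 0; i < users.length; i++) {
            final int id = i;
            users[i] = new Thread(() -> System.out.println(parse("  Query " + id + " ")));
            users[i].start();
        }
        for (Thread t : users) {
            t.join();
        }
    }
}
```

Creating a fresh parser per request achieves the same isolation with less machinery; the thread-local variant only saves repeated construction.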

QueryParser Exceptions only under load?

2005-08-15 Thread Andrew Boyd
Hi all, I'm running Lucene 1.9-rc with JDK 1.5/5.0 on JBoss 3.6 with Tomcat 5.0. I'm using JMeter to do my load testing. I'm getting several different exceptions (NullPointer, ArrayIndexOutOfBounds and ParseException) from QueryParser when I simulate 5 users (threads in JMeter) with no pausing

Integrating lucene search with adobe search

2005-08-15 Thread Andrew Boyd
Hello all, After I do my search and display the hits I get back, I would like to pass the search string that I used with Lucene to Acrobat Reader when it opens. Has anyone done this, or has anyone seen any documents on how to do it? Thanks, Andrew Andrew Boyd Software Architect Sun Certified