How about this
(1) Follow the first three steps mentioned by Marvin
(2) For step 4, instead of trying to come up with all concatenations, create
concatenations of 3 words
For example, an entity of length n, W1 W2 W3 ... Wn, will have the index positions
as follows
Index Position Value
0
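One possible reading of step (2), sketched in Python: slide a window of three adjacent sub-words over the entity and concatenate each window. The helper name word_windows and the sample words are hypothetical, and the index-position table above is cut off, so positions are deliberately left out.

```python
def word_windows(words, size=3):
    """Concatenate each run of `size` adjacent words (hypothetical helper).

    If the entity has `size` words or fewer, a single concatenation
    of the whole entity is returned.
    """
    if len(words) <= size:
        return ["".join(words)]
    # One window per starting offset, so n words give n - size + 1 terms.
    return ["".join(words[i:i + size]) for i in range(len(words) - size + 1)]


if __name__ == "__main__":
    for term in word_windows(["canon", "powershot", "sd", "500"]):
        print(term)
```

The point of the fixed window is that the number of extra terms grows linearly with the entity length rather than with the number of all possible concatenations.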
On Aug 15, 2005, at 8:53 PM, Marvin Humphrey wrote:
Create a phrase query that, when it encounters ab => { tokenlength => 2 },
knows to look for something at position 3.
Fencepost error! That should have been "position 2".
Not that correcting the error makes the algo any more practical. ;)
On Aug 15, 2005, at 7:47 PM, Yonik Seeley wrote:
That was the plan, but step (4) really seems problematic.
- term expansion this way can lead to a lot of false matches
- phrase queries with many bordering words break
- setting term positions such that phrase queries work on all combos
of subw
That was the plan, but step (4) really seems problematic.
- term expansion this way can lead to a lot of false matches
- phrase queries with many bordering words break
- setting term positions such that phrase queries work on all combos
of subwords is non-trivial.
It seems like a better approach
On Aug 15, 2005, at 3:16 PM, Yonik Seeley wrote:
Another example:
Source Text contains "Canon Powershot SD500 7MP Digital Elph"
And I want to be able to match the following user queries:
Power Shot SD 500
CanonPowerShotSD500
SD 500 7 MP digitalelph
Canon-Powershot-SD 500
Any ideas?
How abou
Andrew,
There are a couple different open parameters that can be passed in when
opening a PDF. See
http://partners.adobe.com/public/developer/en/acrobat/PDFOpenParameters.pdf
for the complete specification, but an example URL would be
http://www.pdfbox.org/index.pdf#search="pdfbox"
It al
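As a rough sketch of how the search string could be handed over, the URL can be built by appending the search open parameter to the PDF's address. The helper name pdf_search_url is hypothetical, and rejecting the characters =, #, & and " is a conservative assumption here rather than something stated in the message.

```python
# Characters this sketch refuses to put inside the search parameter.
# Treating them as unusable is an assumption made to stay on the safe side.
RESERVED = set('=#&"')


def pdf_search_url(pdf_url, words):
    """Append a search="..." open parameter to a PDF URL (hypothetical helper)."""
    for word in words:
        if not word or RESERVED & set(word):
            raise ValueError("unsupported search word: %r" % (word,))
    # The word list goes inside double quotes, words separated by spaces.
    word_list = " ".join(words)
    return pdf_url + '#search="' + word_list + '"'


if __name__ == "__main__":
    print(pdf_search_url("http://www.pdfbox.org/index.pdf", ["pdfbox"]))
```

In a web application, the words would come from the same query string that was passed to the search, and the resulting URL would be used as the link target for each hit.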
Does anyone have solutions for handling intraword delimiters (case
changes, non-alphanumeric chars, and alpha-numeric transitions)?
If the source text is Wi-Fi, we want to be able to match the following
user queries:
wi fi
wifi
wi-fi
wi+fi
WiFi
One way is to index "wi", "fi", and "wifi".
However
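A minimal sketch of the "index the parts plus the concatenation" idea, in Python rather than as a Lucene analyzer: it splits a token on case changes, non-alphanumeric characters and letter/digit transitions, then adds the joined form. The name subword_terms is hypothetical, the pattern handles ASCII only, and term positions are not addressed at all.

```python
import re

# A piece is a run of lowercase letters (optionally led by one capital),
# a run of capitals, or a run of digits. Anything else acts as a delimiter.
_PIECE = re.compile(r"[A-Z]?[a-z]+|[A-Z]+|[0-9]+")


def subword_terms(token):
    """Return the lowercased sub-words of a token plus their concatenation."""
    parts = [piece.lower() for piece in _PIECE.findall(token)]
    terms = list(parts)
    if len(parts) > 1:
        # Also emit the joined form so that "wifi" matches "Wi-Fi".
        terms.append("".join(parts))
    return terms


if __name__ == "__main__":
    for text in ["Wi-Fi", "WiFi", "wi+fi", "wifi", "SD500"]:
        print(text, subword_terms(text))
```

With this, Wi-Fi, WiFi, wi-fi and wi+fi all produce the terms wi, fi and wifi, while the query wifi produces only wifi, which is among the indexed terms.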
Hi,
I have a question on the way coord factors affect multi-field searches. It
looks like the sum of individual field hit scores is multiplied by a coord
factor of (x/y) where x = no. of fields matched and y = total fields. This
seems to penalize some results, where we get a very good quality
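To make the arithmetic concrete, here is a small sketch of the scoring model as described in this message (sum of per-field scores times x/y). It is not Lucene's scoring code; the function name, field names and sample scores are made up for illustration.

```python
def multi_field_score(field_scores, total_fields):
    """Sum of per-field hit scores multiplied by coord = matched / total.

    `field_scores` maps a field name to its hit score; a score of 0
    means the field did not match.
    """
    matched = [score for score in field_scores.values() if score > 0]
    coord = len(matched) / total_fields
    return sum(matched) * coord


if __name__ == "__main__":
    # One strong hit in a single field out of three.
    strong_single = multi_field_score({"title": 0.9, "body": 0.0, "author": 0.0}, 3)
    # Three weak hits, one in every field.
    weak_everywhere = multi_field_score({"title": 0.2, "body": 0.2, "author": 0.2}, 3)
    print(strong_single, weak_everywhere)
```

Under this model the single strong hit (0.9 scaled by 1/3) ends up with roughly half the score of the three weak hits (0.6 scaled by 3/3), which is the kind of penalty the question is about.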
On Aug 15, 2005, at 3:05 PM, Monsur Hossain wrote:
We've actually been running into this sort of issue a lot, since we take a
user generated query from a web page and then push it into a QueryParser.
In general we've learned that escaping special characters is not enough to
create a well fo
We've actually been running into this sort of issue a lot, since we take a
user generated query from a web page and then push it into a QueryParser.
In general we've learned that escaping special characters is not enough to
create a well formed query. Since our users aren't running complicated
que
I'm still having problems finding a clean way of doing this. Currently
my index has "_" filling in for empty fields in instances
DOC1
FILEID 123
MIME text/html
CONTENT blam blam blam etc.
FILENAME File1
DATE 090909
AUTH _
SESSION 11
FILEN
Thanks for the reply. I believe your initial thought is probably the correct
one!
Thanks,
Andrew
-Original Message-
From: "Palmer, Andrew MMI Woking" <[EMAIL PROTECTED]>
Sent: Aug 15, 2005 12:03 PM
To: java-user@lucene.apache.org, Andrew Boyd <[EMAIL PROTECTED]>
Subject: RE: QueryParse
Andrew,
My initial thought is that you are reusing the QueryParser for each of
the requests. It is not a thread safe object.
I was getting similar problems and changing the way that I used the
QueryParser fixed it. This was on 1.4.3 so it might be different.
Andrew
-Original Message-
Hi all,
I'm running lucene 1.9-rc with jdk 1.5/5.0 on JBoss 3.6 with tomcat 5.0.
I'm using JMeter to do my load testing. I'm getting several different
exceptions (NullPointer, ArrayIndexOutofBounds and ParseException) from
QueryParser when I simulate 5 users (threads in JMeter) with no pausing
Hi all,
I'm running lucene 1.9-rc with jdk 1.5/5.0 on JBoss 3.6 with tomcat 5.0.
I'm using JMeter to do my load testing. I'm getting several different
exceptions (NullPointer, ArrayIndexOutofBounds and ParseException) from
QueryParser when I simulate 5 users (threads in JMeter) with no pausing
Hello all,
After I do my search and display the hits I get back, I would like to pass the
search string that I used with Lucene to Acrobat Reader when it opens. Has
anyone done this, or has anyone seen any documents on how to do it?
Thanks,
Andrew
Andrew Boyd
Software Architect
Sun Certified