try your query like ((ducted^1000 duct~2) +tape)
Or maybe (duct* +tape)
or even better you could try some stemming (the Porter stemmer should get rid
of these ed-suffixes) and some of the above
if this does not help, have a look at the LingPipe spell checker class as this looks
like exactly what yo
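A crude, hypothetical illustration of the stemming suggestion above (class and method names are assumptions; the real Porter stemmer, exposed in Lucene via PorterStemFilter, handles far more cases than this):

```java
// Naive "-ed" stripping, just to show why stemming maps "ducted" to "duct".
// The real Porter algorithm has many more rules and guards.
public class EdStripper {
    public static String stripEd(String word) {
        // only strip when enough of a stem remains
        if (word.endsWith("ed") && word.length() > 4) {
            return word.substring(0, word.length() - 2);
        }
        return word;
    }
}
```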
Chris Nokleberg wrote:
I am using the QueryParser with a StandardAnalyzer. I would like to avoid
or auto-correct anything that would lead to a ParseException. For example,
I don't think you can get a parse exception from Google--even if you omit
a closing quote it looks like it just closes it for
On Tue, 06 Jun 2006 14:57:06 -0700, Chris Hostetter wrote:
> I took an approach similar to that, by escaping all of the "special"
> characters except '+', '-', and '"', and then stripping out all quotes if
> there was a non-even count ... this gave me a simplified version of the
> Lucene syntax th
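The escape-and-strip approach described above can be sketched with a hypothetical helper (the class name, method name, and exact character set are assumptions; Lucene also ships a QueryParser.escape() that escapes everything, including '+', '-', and '"'):

```java
// Escape Lucene's special characters except '+', '-', and '"', then strip
// all quotes when their count is odd (an unbalanced phrase).
public class QuerySanitizer {
    // Lucene special characters, minus the three we keep.
    private static final String ESCAPE_CHARS = "\\&|!(){}[]^~*?:";

    public static String sanitize(String query) {
        StringBuilder sb = new StringBuilder();
        for (char c : query.toCharArray()) {
            if (ESCAPE_CHARS.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        String escaped = sb.toString();
        // Count quotes; if the total is odd, strip them all.
        int quotes = 0;
        for (char c : escaped.toCharArray()) {
            if (c == '"') quotes++;
        }
        if (quotes % 2 != 0) {
            escaped = escaped.replace("\"", "");
        }
        return escaped;
    }
}
```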
On Jun 6, 2006, at 4:34 PM, Erick Erickson wrote:
Great Googly-Moogly Otis! .. how many blogs do you have?
Are you as old as I am or do you just like old Rock-n-Roll? If you're
getting this from the Apostrophe/Overnight Sensation album (Frank
Zappa),
Well, to be petty, it's from Nanook Ru
I'm implementing a spellchecker in my search and have a question.
After creating the index and spellchecker index, I pass in the word
"ducted tape" to search (I am expecting "duct tape" back).
I've played around with boosting the prefixes and suffixes, setting the
accuracy, passing in an Inde
There are about a zillion things that just go whoosh, right over my
head in my TV-less state
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
This isn't really a "learning search" issue as much as an issue of session
tracking and finding patterns. Eliminate search from the discussion and
the same questions could be applied to generic product/document viewing...
"users who looked at products AAA and BBB also looked at product CCC"
T
Hi Everyone!
Working on a project that requires a search query similar to what is
seen on "amazon.com" in that after searching for and displaying an item,
the system shows:
"Users that have searched for "A" AND "B" have also searched
for "".
Where "B" and "" are other r
: > Great Googly-Moogly Otis! .. how many blogs do you have?
: Are you as old as I am or do you just like old Rock-n-Roll? If you're
: getting this from the Apostrophe/Overnight Sensation album (Frank Zappa),
: you can also listen to my all-time favorite song "I'm the slime".
Sorry, I was referenc
Generally the best approach for restricting what results a person sees is
with a Filter ... if you do this, then you can get a BitSet from the
Filter which tells you everything they are allowed to see, if you then
also build a BitSet for each of the Terms you want to "browse" by (again:
a Filter ca
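A minimal, Lucene-free sketch of the BitSet intersection idea, using java.util.BitSet (in Lucene the "allowed" bits would come from the Filter and the "termDocs" bits from the documents matching one browse Term; the class and method names are assumptions):

```java
import java.util.BitSet;

// Count how many docs the user may see that also match a browse term.
public class BrowseCount {
    public static int visibleMatches(BitSet allowed, BitSet termDocs) {
        BitSet both = (BitSet) allowed.clone(); // keep the filter bits intact
        both.and(termDocs);                     // intersect: visible AND matching
        return both.cardinality();
    }
}
```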
It really depends on what syntax you want to support ... if you just want
basic term matching and don't want to let the user specify field names, or
boosts or phrases, or ranges, or wildcards -- then just escape the
entire string, that should make it impossible to get a parse exception.
I took an a
That way madness lies..
I suspect that you'll find that there are a few rules you can apply that
will allow you to "fix" a lot of queries, but... is that really what you
want to do? For instance, a user types
"a and or not b"
Whatever you do, it isn't what the *next* user who types somethin
Basically you need to pre-process the query and rewrite it into the form you
think it should take. Then catch the ParseException if you failed to
rewrite the query and display an error message on the screen (something
like - This kind of query is not supported, please rephrase your query).
HTH
Aviran
h
Great Googly-Moogly Otis! .. how many blogs do you have?
Are you as old as I am or do you just like old Rock-n-Roll? If you're
getting this from the Apostrophe/Overnight Sensation album (Frank Zappa),
you can also listen to my all-time favorite song "I'm the slime".
Erick
P.S. Haven't owned
Hi all,
I am using the QueryParser with a StandardAnalyzer. I would like to avoid
or auto-correct anything that would lead to a ParseException. For example,
I don't think you can get a parse exception from Google--even if you omit
a closing quote it looks like it just closes it for you (please cor
I find it very useful. I hope you will too.
On 6/6/06, digby <[EMAIL PROTECTED]> wrote:
Does everyone recommend getting this book? I'm just starting out with
Lucene and like to have a book beside me as well as the web / this
mailing list, but the book looks quite old now, has a 1-2 month del
: v2.0 might be a little while in coming (checkout Otis' blog
: http://www.jroller.com/page/otis?catname=%2FLucene).
Great Googly-Moogly Otis! .. how many blogs do you have?
http://lucenebook.com/blog/
http://www.jroller.com/page/otis
http://blog.simpy.com/
-Hoss
---
1) have you tried forcing a threaddump of the JVM when it hangs to see
what it's doing? (I don't remember which signal it is off the top of my
head, but even if it's not responding to SIGTERM it might respond to that)
: SIGTERM. I guess I'd feel more confident about using SIGKILL, if I knew that
Hi
I'm currently doing just that: using the php-java bridge. Here the
goal is to integrate Java-Lucene with a php4 based CMS (eZ publish),
so the Zend framework is not an answer (and premature imho). The code
we've written is a bit CMS specific, but you should be able to do
the same quite fast
On 6/6/06, Alexander MASHTAKOV <[EMAIL PROTECTED]> wrote:
The other thing - performance. In order
to run faster it's necessary to keep the
index open, rather than open and close it for each request.
Index updates have to be serialized somehow, and after
the update the index has to be re-opened.
Has anyone tried to solve this task ?
--- Alexander MASHTAKOV <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Thank you for reply.
> I've also had a look at the Zend framework. But at
> this moment it does not support Unicode,
> which is a mandatory requirement in my case.
>
> The other thing - performanc
Hi,
Thank you for reply.
I've also had a look at the Zend framework. But at
this moment it does not support Unicode,
which is a mandatory requirement in my case.
The other thing - performance. In order
to run faster it's necessary to keep the
index open, rather than open and close it for each req
All sorted now. Of course, if I can loop through the properties of a
bean to add them as fields to the document, then I can certainly do the
same at query time to build the MultiFieldQueryParser. All done and
working great.
Thanks for all your comments.
digby wrote:
Basically, I've got a smal
Hi,
Zend Search Framework can help you. Take a look at
http://framework.zend.com/manual/en/zend.search.html
-
Zend_Search_Lucene is a general purpose text search engine written
entirely in PHP 5. Since it stores its index on the files
Other replies mention SOLR. I'm fairly new to SOLR, but have used
Lucene quite a bit. Based on your situation, it certainly sounds like
SOLR is worth looking into. I was able to convert a portion of one of
my sites from being SQL powered to SOLR powered in about a day's work,
which includes lear
Basically, I've got a small app which allows me to update fields in a
bunch of MySQL tables using Hibernate. As I save each bean, I want to
add it to the Lucene index as well. However, I want the app to be as
generic as possible and at the moment it doesn't care what the bean is,
as long as ther
For querying, we have PHP talking to our Java application through sockets
and XML. Queries are set up in PHP, creating an XML document which
corresponds to a subset of the subclasses of
http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Query.html.
If we'd had the PHP skill set at the
LOL. I gotta get a Lucene License Plate Frame though.
Rob Staveley (Tom) wrote:
It is better value than the tee shirt http://www.cafepress.com/lucene/
I am also working on interfacing Lucene with PHP. Here are a couple
options that I have found useful:
Call Java directly from PHP:
http://php-java-bridge.sourceforge.net/
Solr - Interacts w/ Lucene via XML requests
http://incubator.apache.org/solr/index.html
There is mention of a PHP interface
Hi Folks,
I'm working on project that is going to have free-text
search mechanism. The project is completely based on
open source technologies, such as MySQL and PHP.
I'm reading about Lucene and think that this is
probably the first candidate.
BTW, the (obvious) question is: "How to integrate P
In my application I was queuing the IDs of the appropriate database records,
not whole documents. Each Document was created right before adding it to
the index. All this work was done in a separate thread, so other threads
responded very quickly.
It depends on your application and at what speed your new data co
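The queue-the-IDs pattern described above can be sketched with plain java.util.concurrent (the class and method names here are hypothetical, and the Lucene addDocument() call is stubbed out with a list so the sketch is self-contained):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class IndexQueue {
    private final BlockingQueue<Integer> ids = new LinkedBlockingQueue<Integer>();
    // Stand-in for the Lucene index; a real version would hold an IndexWriter.
    private final List<Integer> indexed = new ArrayList<Integer>();
    private final Thread worker;

    public IndexQueue() {
        worker = new Thread(new Runnable() {
            public void run() {
                try {
                    while (true) {
                        int id = ids.take();       // block until work arrives
                        if (id < 0) return;        // poison pill: shut down
                        // A real version would load record "id" from the
                        // database, build a Document, and addDocument() it.
                        synchronized (indexed) { indexed.add(id); }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        worker.start();
    }

    // Called from the request threads: cheap and non-blocking.
    public void enqueue(int id) { ids.add(id); }

    // Stop the worker and return what was "indexed", for inspection.
    public List<Integer> shutdownAndDrain() {
        ids.add(-1);
        try { worker.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        synchronized (indexed) { return new ArrayList<Integer>(indexed); }
    }
}
```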
You could always purchase the PDF from www.manning.com.
This book is essential in my view. It's also one of the clearest, most engaging
IT books I've ever read.
- Original Message
From: digby <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Tuesday, 6 June, 2006 11:55:26 AM
Subje
have a look at Spring Modules 0.3. It has a Lucene module which contains many
interesting classes (LuceneIndexTemplate, LuceneSearchTemplate) and all kinds of
factories following Spring concepts.
here is the url to the documentation: http://www.springframework.org/node/270
-Original Message---
Not sure if I understand exactly what you want to do, but would the
"field:term" syntax that QueryParser understands work for you? That
is, you could send query text like
f1:foo f2:foo f3:foo
to search for "foo" in any of the 3 fields. If you need boolean capabilities
you can use parentheses, li
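A hypothetical helper for building that per-field query text (the class and method names are assumptions, not part of Lucene):

```java
// Expand one term into "f1:term f2:term f3:term" for QueryParser.
public class FieldExpander {
    public static String acrossFields(String term, String[] fields) {
        StringBuilder sb = new StringBuilder();
        for (String field : fields) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(field).append(':').append(term);
        }
        return sb.toString();
    }
}
```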
It is better value than the tee shirt http://www.cafepress.com/lucene/
Grab it now, it is worth all this money.
- Original Message
From: digby <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Tuesday, 6 June, 2006 11:59:53 AM
Subject: Lucene in Action
Does everyone recommend getting this book? I'm just starting out with
Lucene and like to have a b
Hi,
Or simply grab it online (paper or pdf eBook ) here :
http://www.manning.com/hatcher2/
--- Sven
Le mardi 6 juin 2006 à 13:05:45, vous écriviez :
MC> Try here..
MC> http://www.abebooks.co.uk
MC> Maybe they have one cheaper.
MC> Malcolm
Thanks, Karl. It would be good if maxBufferedDocs could respond dynamically
to available heap. It seems a shame to set <10 for the sake of sporadic
large documents. Failing that, it would be nice if we could explicitly
pre-flush buffers when we encounter a big field.
I'm increasingly thinking that
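The pre-flush wish above can be illustrated in plain Java (no Lucene types; all names here are hypothetical): flush when either a document count or a buffered-size threshold is crossed, so one sporadic big document triggers an early flush instead of exhausting the heap:

```java
import java.util.ArrayList;
import java.util.List;

public class SizeAwareBuffer {
    private final int maxDocs;
    private final long maxBytes;
    private final List<String> buffer = new ArrayList<String>();
    private long bufferedBytes = 0;
    private int flushes = 0;

    public SizeAwareBuffer(int maxDocs, long maxBytes) {
        this.maxDocs = maxDocs;
        this.maxBytes = maxBytes;
    }

    public void add(String doc) {
        buffer.add(doc);
        bufferedBytes += doc.length();
        // flush on count OR size, not count alone
        if (buffer.size() >= maxDocs || bufferedBytes >= maxBytes) {
            flush();
        }
    }

    public void flush() {
        // Here an IndexWriter would write the buffered docs to disk.
        buffer.clear();
        bufferedBytes = 0;
        flushes++;
    }

    public int getFlushCount() { return flushes; }
}
```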
This is a good idea. I had been worried about the additional heap
requirement maintaining a queue, without being able to serialize/deserialize
Documents (i.e. a build up of Lucene Documents in RAM). I have been
marshalling addDocument() calls using a synchronized object; the same
threads have been
For what it's worth, Blackwell's uses Lucene for biblio searching.
Developed with the help of Lucene In Action.
--
Ian.
On 6/6/06, digby <[EMAIL PROTECTED]> wrote:
Thanks everyone, although now I'm not sure what to do! Blackwell's is quicker
but more expensive, but is a new edition due...???
Think
Try here..
http://www.abebooks.co.uk
Maybe they have one cheaper.
Malcolm
- Original Message -
From: "digby" <[EMAIL PROTECTED]>
To:
Sent: Tuesday, June 06, 2006 11:55 AM
Subject: Re: Lucene in Action
Thanks everyone, although now I'm not sure what to do! Blackwell's is quicker but
Thanks everyone, although now I'm not sure what to do! Blackwell's is quicker
but more expensive, but is a new edition due...???
Think I'll blow the moths off my wallet and get on with it...
[EMAIL PROTECTED] wrote:
It's an invaluable book if you're new to Lucene. There have been some
changes to
IT'S A REALLY GOOD BOOK TO START OFF WITH
Regards and Thanks
Kinnar Kumar Sen
HCL Technologies Ltd.
Sec-60, Noida-201301
Ph: - 09313297423
TO SUCCEED BE DIFFERENT BE DARING AND BE THERE FIRST
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]
Sent: Tuesda
It's an invaluable book if you're new to Lucene. There have been some
changes to the Lucene API since the book was published but you shouldn't
let this put you off - they're relatively minor. I think Lucene In Action
v2.0 might be a little while in coming (checkout Otis' blog
http://www.jrolle
It's a nice book!!
On 6/6/06, Irving, Dave <[EMAIL PROTECTED]> wrote:
It really helped me out loads and I would recommend it to anyone.
I gave up trying to obtain it from amazon - but got it in 2 days from
Blackwell Online
(http://bookshop.blackwell.co.uk)
> -Original Message-
> From
It really helped me out loads and I would recommend it to anyone.
I gave up trying to obtain it from amazon - but got it in 2 days from
Blackwell Online
(http://bookshop.blackwell.co.uk)
> -Original Message-
> From: news [mailto:[EMAIL PROTECTED] On Behalf Of digby
> Sent: 06 June 2006 11
On Tue, 2006-06-06 at 10:59 +0100, digby wrote:
>
> Does everyone recommend getting this book?
If you want to learn Lucene then this is definitely a book to get.
> I'm just starting out with Lucene and like to have a book beside me as
> well as the web / this mailing list, but the book looks qui
On Tue, 2006-06-06 at 10:47 +0100, digby wrote:
> I was wondering this exact question, but MultiFieldQueryParser still
> requires you to specify the field names. In my application I don't know
> the field names (they're automatically generated from beans using
> BeanUtils.getProperties()), so I'
Does everyone recommend getting this book? I'm just starting out with
Lucene and like to have a book beside me as well as the web / this
mailing list, but the book looks quite old now, has a 1-2 month delivery
wait time here in the UK and is quite expensive. Is it worth waiting for
a new editio
I was wondering this exact question, but MultiFieldQueryParser still
requires you to specify the field names. In my application I don't know
the field names (they're automatically generated from beans using
BeanUtils.getProperties()), so I've resorted to concatenating all the
fields into a sing
If your content handlers need to respond quickly, you should move the
indexing process to a separate thread and maintain items in a queue.
Rob Staveley (Tom) wrote:
This is a real eye-opener, Volodymyr. Many thanks. I guess that means that
my orphan-producing hangs must be addDocument() calls, and n
On Tue, 2006-06-06 at 10:43 +0100, Rob Staveley (Tom) wrote:
> You are right, there are going to be a lot of tokens. The entire body of a
> text document is getting indexed in an unstored field, but I don't see how I
> can flush a partially loaded field.
Check these out:
http://lucene.apache.org/
You are right, there are going to be a lot of tokens. The entire body of a
text document is getting indexed in an unstored field, but I don't see how I
can flush a partially loaded field.
-Original Message-
From: karl wettin [mailto:[EMAIL PROTECTED]
Sent: 06 June 2006 10:33
To: java-user
You can use MultiFieldQueryParser
Something like this
Query query = MultiFieldQueryParser.parse(new String[]{queryString,
queryString, queryString}, new String[]{ASSET_TITLE, ASSET_ARTICLE,
ASSET_DIRECTOR_NAMES}, new BooleanClause.Occur[]{
BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD,
BooleanClause.Occur.SHOULD}, analyzer);
I answered too quickly too :-)
The QA folk seem to reckon that a 132MB plain text file with no white space
is where it falls over. There are some accountancy e-mails with attachments
of ~170Mb like this, which we need to be able to field.
How would I go about flushing the IndexWriter?
-Orig
On Tue, 2006-06-06 at 10:22 +0100, Rob Staveley (Tom) wrote:
>
> Thanks for the response, Karl. I am using FSDirectory.
> -X:AggressiveHeap might reduce the number of times I get bitten by the
> problem, but I'm really looking for a streaming/serialised approach [I
> think!], which allows me to ha
Thanks for the response, Karl. I am using FSDirectory. -X:AggressiveHeap
might reduce the number of times I get bitten by the problem, but I'm really
looking for a streaming/serialised approach [I think!], which allows me to
handle objects which are larger than available memory. Using the
java.io.R
On Tue, 2006-06-06 at 10:11 +0100, Rob Staveley (Tom) wrote:
> Sometimes I need to index large documents. I've got just about as much
> heap
> as my application is allowed (-Xmx512m) and I'm using the unstored
> org.apache.lucene.document.Field constructed with a java.io.Reader,
> but I'm
> still s
On Tue, 2006-06-06 at 10:11 +0100, Rob Staveley (Tom) wrote:
> Sometimes I need to index large documents. I've got just about as much heap
> as my application is allowed (-Xmx512m) and I'm using the unstored
> org.apache.lucene.document.Field constructed with a java.io.Reader, but I'm
> still suffe
On Tue, 2006-06-06 at 14:38 +0530, Amaresh Kumar Yadav wrote:
> My document has six fields and I want to search on three fields.
>
> Presently I am able to search on only the TITLE field..
>
> query = QueryParser.parse(queryString, "TITLE", analyzer);
You want to use the MultiFieldQueryParser.
-
Sometimes I need to index large documents. I've got just about as much heap
as my application is allowed (-Xmx512m) and I'm using the unstored
org.apache.lucene.document.Field constructed with a java.io.Reader, but I'm
still suffering from java.lang.OutOfMemoryError when I index some large
document
Hi All,
Will you please give me some clue about searching on more than one field
of a document.
My document has six fields and I want to search on three of them.
Presently I am able to search on only the TITLE field..
query = QueryParser.parse(queryString, "TITLE", analyzer);
Regards..
Amares
We wrote ours for NetSearch to handle this specific issue. I suggest you
create a holder class to hold the IndexReader and IndexSearcher, this
can close them in the finalizer. Clients keep the holder until they are
finished and then discard it. When it is completely de-referenced it
will be closed.
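A self-contained sketch of that holder class (the class name is an assumption; Lucene's IndexReader and IndexSearcher are stood in for by java.io.Closeable so the sketch compiles without Lucene):

```java
import java.io.Closeable;
import java.io.IOException;

// Clients keep the holder while searching; when the last reference is
// dropped, the finalizer closes both objects.
public class SearcherHolder {
    private final Closeable reader;    // would be the IndexReader
    private final Closeable searcher;  // would be the IndexSearcher

    public SearcherHolder(Closeable reader, Closeable searcher) {
        this.reader = reader;
        this.searcher = searcher;
    }

    public Closeable getSearcher() {
        return searcher;
    }

    // Explicit close for clients that know they are done.
    public void close() {
        try {
            searcher.close();
            reader.close();
        } catch (IOException e) {
            // ignore: nothing useful to do while tearing down
        }
    }

    // Safety net: close when the holder is garbage collected.
    protected void finalize() throws Throwable {
        try {
            close();
        } finally {
            super.finalize();
        }
    }
}
```

Note that finalizers run at unpredictable times, so the explicit close() path is the one to prefer; the finalizer only catches clients that forget.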
Hi,
when working with Spring, the best option is to use Compass:
http://www.opensymphony.com/compass/ (if you can).
Regards,
Sami Dalouche
On Tue, 2006-06-06 at 00:27 -0400, Rajiv Roopan wrote:
> Hello,
> I'm using the Spring framework to define my IndexSearcher and
> IndexWriter. They are defined