Erik Hatcher writes (4/5/2005 5:57 PM):
I have a need to implement wildcarded phrase queries, such as this:
"apach? luc*"
which would match "apache lucene", for example. This needs to also
support ordered and unordered proximity like SpanNearQuery does:
"apach? luc*"~10
I presume I'm going to have to key off of SpanQue
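One rough way to get there with the span queries in Lucene 1.4, as a sketch rather than a finished solution: expand each wildcarded term against the index with WildcardTermEnum, wrap every matching term in a SpanTermQuery, OR those together with a SpanOrQuery, and then join the parts with a SpanNearQuery carrying the slop and ordering. The field name "contents" and the index path below are placeholders.

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.WildcardTermEnum;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanOrQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class WildcardPhrase {
    // Expand one wildcarded term (e.g. "luc*") into a SpanOrQuery over all
    // matching index terms, so it can participate in a SpanNearQuery.
    // (A real implementation would handle the no-matches case and perhaps
    // cap the number of expanded terms.)
    static SpanQuery expand(IndexReader reader, String field, String pattern)
            throws IOException {
        List clauses = new ArrayList();
        WildcardTermEnum terms = new WildcardTermEnum(reader, new Term(field, pattern));
        try {
            do {
                Term t = terms.term();
                if (t != null) {
                    clauses.add(new SpanTermQuery(t));
                }
            } while (terms.next());
        } finally {
            terms.close();
        }
        return new SpanOrQuery(
            (SpanQuery[]) clauses.toArray(new SpanQuery[clauses.size()]));
    }

    public static void main(String[] args) throws IOException {
        IndexReader reader = IndexReader.open("/path/to/index"); // placeholder path
        SpanQuery[] parts = new SpanQuery[] {
            expand(reader, "contents", "apach?"),
            expand(reader, "contents", "luc*")
        };
        // slop=10, inOrder=false gives the unordered "~10" behaviour;
        // pass inOrder=true for an ordered near query.
        SpanNearQuery query = new SpanNearQuery(parts, 10, false);
        System.out.println(query);
        reader.close();
    }
}

Expanding wildcards against a large index can produce a lot of clauses, so in practice some limit on the expansion is probably needed.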
Hi folks. As promised, here is the first beta access to the PHP/Lucene
work we were discussing earlier.
The URL to the PHP front-end to the SFI working papers Lucene search is:
http://webdev.santafe.edu/research/publications/redfish/wpSearch.php
This provides a fairly simple search dialog, re
I suppose this should be addressed to Leo... anything we can do about
the issue mentioned below regarding wiki formatting?
Thanks,
Erik
Begin forwarded message:
From: Chris Hostetter <[EMAIL PROTECTED]>
Date: April 5, 2005 5:56:28 PM EDT
To: java-user@lucene.apache.org
Subject: Wiki form
Otis Gospodnetic wrote:
If you take this approach, keep in mind that you will also need to
handle regular application shutdowns, and also try to catch some
crashes/errors, in order to flush your in-memory queue of items
scheduled for indexing, and write them to disk.
Feel free to post the code, if
The wiki appears to have undergone some style changes recently. The layout
is a lot different now (and, in my opinion, cleaner), but a side effect
seems to be that some page formatting which used to work no
longer does.
Specifically, subsection headings that have leading whitespace, i.e....
== Utilit
: >> I have documents with a tokenized, indexed, and stored field. This field
: >> usually contains one or two words. I need to be able to search for exact
: >> matches on two words.
: >> For example, a search for "John" should return documents with the field
: >> containing "John" only, not "John Doe" or "John Foo".
:
:
: Is it possible to filter the hits returned from a certain query? For
: example, if I have a search like this:
: Query searchQuery = queryParser.parse( query );
: Hits results = m_searcher.search( searchQuery );
: is there a way to use the results and find out how many of the return
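Two common ways to handle this with the 1.4 API, sketched below under the assumption of a hypothetical "category" field, a "contents" field, and a placeholder index path: either constrain the search itself with a QueryFilter built from a second query, or run the plain query and walk the Hits afterwards, counting the documents that meet the extra criterion.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.QueryFilter;
import org.apache.lucene.search.TermQuery;

public class FilteredSearch {
    public static void main(String[] args) throws Exception {
        IndexSearcher searcher = new IndexSearcher("/path/to/index"); // placeholder path

        Query searchQuery = QueryParser.parse("lucene", "contents", new StandardAnalyzer());

        // Option 1: restrict the search itself with a filter built from another
        // query, e.g. only documents whose "category" field is "books".
        QueryFilter filter = new QueryFilter(new TermQuery(new Term("category", "books")));
        Hits filtered = searcher.search(searchQuery, filter);
        System.out.println("filtered hits: " + filtered.length());

        // Option 2: run the plain query and inspect the results afterwards,
        // e.g. count how many of the returned documents are in that category.
        Hits results = searcher.search(searchQuery);
        int count = 0;
        for (int i = 0; i < results.length(); i++) {
            Document doc = results.doc(i);
            if ("books".equals(doc.get("category"))) {
                count++;
            }
        }
        System.out.println(count + " of " + results.length() + " hits are in 'books'");

        searcher.close();
    }
}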
For numeric fields, this will never happen.
For text fields, I could either
1) just use the first token generated (yuck)
2) not run it through the analyzer (v1.0)
3) run it through an analyzer specific to range and prefix queries (post v1.0)
Since I know the schema, I can pick and choose di
On Apr 5, 2005, at 2:49 PM, Yonik Seeley wrote:
Just curious. I plan on overriding the current getRangeQuery() anyway
since it currently doesn't run the endpoints through the analyzer.
What will you do when multiple tokens are returned from the analyzer?
Erik
--
Was there any later thread on the QueryParser supporting open-ended
range queries after this:
http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg07973.html
Just curious. I plan on overriding the current getRangeQuery() anyway
since it currently doesn't run the endpoints through the ana
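A minimal sketch of that override, assuming the 1.4-era protected hook getRangeQuery(String field, String part1, String part2, boolean inclusive): run each endpoint through the analyzer and, if more than one token comes back, just keep the first (option 1 from the other message). The class and method names here are illustrative, not an official API.

import java.io.IOException;
import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.RangeQuery;

// Sketch of a QueryParser subclass that analyzes range endpoints before
// building the RangeQuery, keeping only the first token each endpoint yields.
public class AnalyzingRangeQueryParser extends QueryParser {
    private final Analyzer analyzer;

    public AnalyzingRangeQueryParser(String field, Analyzer analyzer) {
        super(field, analyzer);
        this.analyzer = analyzer;
    }

    protected Query getRangeQuery(String field, String part1, String part2,
                                  boolean inclusive) {
        return new RangeQuery(new Term(field, analyze(field, part1)),
                              new Term(field, analyze(field, part2)),
                              inclusive);
    }

    // Run one endpoint through the analyzer and keep the first token;
    // if analysis produces nothing (or fails), fall back to the raw text.
    private String analyze(String field, String text) {
        try {
            TokenStream stream = analyzer.tokenStream(field, new StringReader(text));
            Token token = stream.next();
            stream.close();
            return token == null ? text : token.termText();
        } catch (IOException e) {
            return text;
        }
    }
}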
Gusenbauer Stefan wrote:
>Erik Hatcher wrote:
>
>
>
>>On Apr 3, 2005, at 3:33 PM, Gusenbauer Stefan wrote:
>>
>>
>>
>>>Sorry for being late!
>>>Only the test code wouldn't be very useful for understanding because
>>>there are a lot of dependencies in the other code. I can explain what
>>>I
As an alternative, you could also take the approach taken for PyLucene:
compile the Java code with GCJ and generate bindings for Python with SWIG.
SWIG supports a number of languages in addition to Python, such as Ruby, PHP,
Perl, and a bunch more.
For more information, see:
http://pylucene.osa
As Lucene's native language is Java, it would be more natural to access its
functionality through JSP; still, the idea of accessing Lucene from PHP seems
interesting, as PHP is perhaps the most widely deployed server-side scripting
language.
I think that the way to provide access to the Lucene AP
Optimize performance update (with tons of indexed fields):
We had a timing bug... ignore the hour I first reported. Here are the
current numbers:
indexed_fields=6791 index_size=3.9GB optimize_time=21min
indexed_fields=3216 index_size=2.0GB optimize_time=9min
indexed_fields=2080 index_size=1
If you take this approach, keep in mind that you will also need to
handle regular application shutdowns, and also try to catch some
crashes/errors, in order to flush your in-memory queue of items
scheduled for indexing, and write them to disk.
Feel free to post the code, if you want and can, so pe
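A minimal sketch of that idea, assuming a simple in-memory list of pending Documents and a placeholder index path: request threads enqueue documents, flush() writes them out in a batch, and a JVM shutdown hook flushes whatever is still queued on a regular shutdown (hard crashes, of course, cannot be caught this way).

import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;

public class IndexingQueue {
    private final List pending = new ArrayList();   // queued Documents
    private final String indexPath;

    public IndexingQueue(String indexPath) {
        this.indexPath = indexPath;
        // Flush whatever is still queued when the application shuts down cleanly.
        Runtime.getRuntime().addShutdownHook(new Thread() {
            public void run() {
                try {
                    flush();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        });
    }

    public synchronized void add(Document doc) {
        pending.add(doc);
    }

    // Write all queued documents to disk in one batch, then clear the queue.
    public synchronized void flush() throws Exception {
        if (pending.isEmpty()) {
            return;
        }
        IndexWriter writer = new IndexWriter(indexPath, new StandardAnalyzer(), false);
        try {
            for (int i = 0; i < pending.size(); i++) {
                writer.addDocument((Document) pending.get(i));
            }
        } finally {
            writer.close();
        }
        pending.clear();
    }
}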
I would recommend not optimizing your index that often. Another solution is to
use the multisearcher and keep one fully optimized primary index, and an
unoptimized secondary index that you add to. Then search against both. During
off peak hours you could merge the secondary index onto your pr
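Roughly, with the 1.4 API and placeholder paths, that could look like the sketch below: a MultiSearcher over the optimized primary and the small secondary, and an off-peak job that merges the secondary into the primary with addIndexes(), optimizes, and then recreates the secondary empty.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MultiSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Searchable;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class TwoIndexSearch {
    // Search the optimized primary index and the small unoptimized
    // secondary index together.
    public static Hits search(Query query) throws Exception {
        Searchable[] searchables = new Searchable[] {
            new IndexSearcher("/indexes/primary"),    // placeholder paths
            new IndexSearcher("/indexes/secondary")
        };
        return new MultiSearcher(searchables).search(query);
    }

    // During off-peak hours, merge the secondary index into the primary,
    // optimize, then recreate an empty secondary index.
    public static void mergeSecondaryIntoPrimary() throws Exception {
        Directory secondary = FSDirectory.getDirectory("/indexes/secondary", false);
        IndexWriter writer =
            new IndexWriter("/indexes/primary", new StandardAnalyzer(), false);
        try {
            writer.addIndexes(new Directory[] { secondary });
            writer.optimize();
        } finally {
            writer.close();
        }
        // Recreate the secondary index empty (create=true wipes it).
        new IndexWriter("/indexes/secondary", new StandardAnalyzer(), true).close();
    }
}

In a real setup you would cache the searchers and reopen them after the merge rather than create new ones per search.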
The compound index structure is meant for indexes with a large number of fields.
I was watching the files in the index directory of my compound index while
it was being optimized. The IndexWriter that I used was set to use the
compound file format.
It looks to me that Lucene first combined all existing segme
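For reference, this is the kind of setup being described, with a placeholder path: an IndexWriter opened on an existing index with the compound format enabled, so the merged segment produced by optimize() ends up as a single .cfs file rather than many per-field files.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class OptimizeCompound {
    public static void main(String[] args) throws Exception {
        // Open an existing index (create=false) and keep the compound format on.
        IndexWriter writer =
            new IndexWriter("/path/to/index", new StandardAnalyzer(), false);
        writer.setUseCompoundFile(true);
        writer.optimize();
        writer.close();
    }
}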
On Apr 5, 2005, at 5:44 AM, Yura Smolsky wrote:
EH> On Apr 4, 2005, at 4:34 PM, Yura Smolsky wrote:
Hello, java-user.
I have documents with a tokenized, indexed, and stored field. This field
usually contains one or two words. I need to be able to search for exact
matches on two words.
For example, a search for "Joh
Hello, Erik.
EH> On Apr 4, 2005, at 4:34 PM, Yura Smolsky wrote:
>> Hello, java-user.
>>
>> I have documents with a tokenized, indexed, and stored field. This field
>> usually contains one or two words. I need to be able to search for exact
>> matches on two words.
>> For example, a search for "John" should retur
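One common way to get exact whole-value matches, sketched with the 1.4 field API and hypothetical field names: index the value twice, tokenized for ordinary searches and untokenized (Field.Keyword) for exact matches, then query the keyword field with a TermQuery.

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;

public class ExactMatch {
    // At indexing time, store the value twice: tokenized for normal searches
    // and untokenized (Field.Keyword) for exact whole-value matches.
    public static Document makeDoc(String name) {
        Document doc = new Document();
        doc.add(Field.Text("name", name));          // analyzed: matches "john", "doe", ...
        doc.add(Field.Keyword("name_exact", name)); // unanalyzed: the whole value is one term
        return doc;
    }

    // At search time, an exact match is just a TermQuery on the keyword field,
    // so "John" matches only documents whose field is exactly "John".
    public static Hits exactSearch(IndexSearcher searcher, String value)
            throws Exception {
        return searcher.search(new TermQuery(new Term("name_exact", value)));
    }
}

Note that Field.Keyword is not analyzed, so case and whitespace have to match exactly; lowercasing the value on both the indexing and query side is a simple way around that.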
Hi,
we are using a very cautious method for batch updating.
We have long (hours) running updates on our index, but
complete reindexing would take even longer (days). But I
guess our strategy could be scaled down to hours or even
less.
So what we do is keep two instances of the index. There is
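A stripped-down sketch of that two-copy idea, with placeholder directory names: searches go against the live copy while the long update rebuilds the offline copy, and when the rebuild finishes the searcher is swapped and the two directories trade roles.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.search.IndexSearcher;

// Two-copy strategy: searches run against the "live" copy while the long
// update rebuilds the "offline" copy; when the rebuild finishes, the
// searcher is swapped over and the roles of the two directories flip.
public class DualIndex {
    private String livePath = "/indexes/copyA";     // placeholder paths
    private String offlinePath = "/indexes/copyB";
    private IndexSearcher searcher;

    public synchronized IndexSearcher getSearcher() throws Exception {
        if (searcher == null) {
            searcher = new IndexSearcher(livePath);
        }
        return searcher;
    }

    // Run the long update against the offline copy, then swap.
    public void rebuildOffline() throws Exception {
        IndexWriter writer =
            new IndexWriter(offlinePath, new StandardAnalyzer(), true);
        try {
            // ... add/update documents here; searches keep using livePath ...
        } finally {
            writer.close();
        }
        swap();
    }

    private synchronized void swap() throws Exception {
        String tmp = livePath;
        livePath = offlinePath;
        offlinePath = tmp;
        if (searcher != null) {
            searcher.close();
        }
        searcher = new IndexSearcher(livePath);
    }
}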
Hi
Thank you for replying so quickly. I am very pleased, as I have just started
down the road of implementing a solution which is very nearly identical to the
one you describe below. It is good to know that I am not heading down a dead
end. I hadn't thought about the re-indexing thread pausin
Hi,
please see comments below.
On Tue, Apr 05, 2005 at 08:38:04AM +0100, Lee Turner wrote:
> Hi
>
> I was wondering whether anyone has any experience of multithreaded
> updates to indexes. In the web app I am working on there are additions,
> updates and deletes that need to happen to the index t
Hi
I was wondering whether anyone has any experience of multithreaded
updates to indexes. In the web app I am working on there are additions,
updates and deletes that need to happen to the index throughout the
runtime of the application. Also, the application is run in a cluster
with each app
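Since only one IndexWriter may modify an index at a time, a common pattern is to funnel all changes from the web-app threads through a single queue that one background thread drains. Below is a rough sketch of that pattern with pre-1.5 Java primitives and a placeholder index path; deletes would be queued in the same way and applied through an IndexReader before the documents are re-added.

import java.util.LinkedList;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;

// Web-app threads never touch the IndexWriter directly; they enqueue
// documents, and a single background thread applies them, so only one
// writer is ever open on the index.
public class IndexUpdateThread extends Thread {
    private final LinkedList queue = new LinkedList();
    private final String indexPath;
    private volatile boolean running = true;

    public IndexUpdateThread(String indexPath) {
        this.indexPath = indexPath;
    }

    // Called by request threads to schedule an addition or update.
    public void enqueue(Document doc) {
        synchronized (queue) {
            queue.addLast(doc);
            queue.notify();
        }
    }

    public void run() {
        while (running) {
            Document doc;
            synchronized (queue) {
                while (queue.isEmpty() && running) {
                    try { queue.wait(1000); } catch (InterruptedException e) { return; }
                }
                if (queue.isEmpty()) continue;
                doc = (Document) queue.removeFirst();
            }
            try {
                IndexWriter writer =
                    new IndexWriter(indexPath, new StandardAnalyzer(), false);
                writer.addDocument(doc);
                writer.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    public void shutdown() {
        running = false;
        synchronized (queue) { queue.notify(); }
    }
}

Opening a writer per document is wasteful; in practice the thread would drain several queued documents per IndexWriter, and it could be paused while a re-indexing or merge job needs the index.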