Many thanks for confirming the principles should work fine. It is a
load off my mind! :)
On index update, a small Event is pushed into a Buffer that is
processed periodically (every 30 seconds) to coalesce the events and
then ensure that any open IndexSearcher in the cache is closed.
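
The buffering described above could be sketched roughly as follows. This is only an illustration, not the poster's actual code: the names SearcherInvalidator, CachedSearcher, onIndexUpdated and flush are all hypothetical, and CachedSearcher stands in for a cached IndexSearcher so the sketch runs without Lucene on the classpath.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: buffer index-update events, coalesce them, and
// close the matching cached searcher once per flush.
class SearcherInvalidator {
    // Stand-in for a cached IndexSearcher; only close() matters here.
    public interface CachedSearcher { void close(); }

    // Pending update events, keyed by index name. Using a Set coalesces
    // repeated events for the same index into a single entry.
    private final Set<String> pending = new HashSet<>();
    private final Map<String, CachedSearcher> cache = new ConcurrentHashMap<>();

    public void put(String indexName, CachedSearcher searcher) {
        cache.put(indexName, searcher);
    }

    // Called on every index update; cheap, does no I/O.
    public synchronized void onIndexUpdated(String indexName) {
        pending.add(indexName);
    }

    // Called periodically; returns how many cached searchers were closed.
    public int flush() {
        Set<String> batch;
        synchronized (this) {
            batch = new HashSet<>(pending);
            pending.clear();
        }
        int closed = 0;
        for (String name : batch) {
            CachedSearcher searcher = cache.remove(name);
            if (searcher != null) {
                searcher.close();
                closed++;
            }
        }
        return closed;
    }
}
```

A scheduler such as ScheduledExecutorService.scheduleAtFixedRate could call flush() every 30 seconds; the next search for an evicted index would then open a fresh searcher.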
On 12/07
Thanks for the advice. That ought to reduce contention a bit in that
particular method.
I've been reviewing a large amount of thread dumps today and I was wondering
if it's common to see many threads that look like this:
"tcpConnection-8080-20" daemon prio=5 tid=0x081ba000 nid=0x810ac00 waiting
f
Would that show up in the TermVectors?
Yes, but you would need a scheme for identifying "original, unstemmed" terms vs
stems. For example, you could use another field and analyzer for the unstemmed forms.
Andrew Boyd wrote:
What about storing the unstemmed word with the same position as the
If you want really real-time updates of search results, then yes.
However, maybe you can live with near-real-time results, in which case
you can add some logic to your application to check for index version
only every N requests/minutes/hours.
Otis
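
A minimal sketch of the "check only every N requests" idea, under stated assumptions: ThrottledReopener and its parameters are hypothetical names, the version reader stands for however the application reads the index version, and the opener stands for whatever creates a new searcher.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical sketch: re-check the index version only on every N-th
// request, and reopen the searcher only if the version has changed.
class ThrottledReopener<S> {
    private final LongSupplier versionReader; // assumed: returns current index version
    private final Supplier<S> opener;         // assumed: opens a fresh searcher
    private final int checkEvery;
    private int requestsSinceCheck = 0;
    private long knownVersion;
    private S searcher;

    ThrottledReopener(LongSupplier versionReader, Supplier<S> opener, int checkEvery) {
        this.versionReader = versionReader;
        this.opener = opener;
        this.checkEvery = checkEvery;
        this.knownVersion = versionReader.getAsLong();
        this.searcher = opener.get();
    }

    // Returns the searcher to use for this request. Results can be stale
    // for up to checkEvery - 1 requests after an index update.
    synchronized S acquire() {
        if (++requestsSinceCheck >= checkEvery) {
            requestsSinceCheck = 0;
            long current = versionReader.getAsLong();
            if (current != knownVersion) {
                knownVersion = current;
                // A real implementation would also close the old searcher
                // once in-flight searches have finished with it.
                searcher = opener.get();
            }
        }
        return searcher;
    }
}
```

The same structure works for a time-based interval by comparing timestamps instead of counting requests.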
--- Aalap Parikh <[EMAIL PROTECTED]> wrote:
What about storing the unstemmed word with the same position as the stemmed
word? Would that show up in the TermVectors?
-Original Message-
From: mark harwood <[EMAIL PROTECTED]>
Sent: Jul 8, 2005 10:44 AM
To: java-user@lucene.apache.org, Andrew Boyd <[EMAIL PROTECTED]>
Subject: Re: How t
2500 vs 84. Wow. That's quite a few OR statements I would be saving
following your guide of just indexing the parts of the datetime I plan
to search on. Every ms counts.
Now I have a clear picture of how range query works. Great stuff. Thanks.
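
To illustrate why granularity matters: a range query over a date field expands into roughly one OR clause per distinct term in the range, so the number of possible terms bounds the clause count. The sketch below only counts those possible terms for a hypothetical date range; RangeTermCount and its methods are made-up names, and the 2500 and 84 figures above come from the poster's own data and are not reproduced here.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.time.temporal.ChronoUnit;

// Hypothetical helper: upper bound on distinct terms (and so on OR
// clauses) for an inclusive date range at two indexing granularities.
class RangeTermCount {
    // Field indexed per day (e.g. yyyyMMdd): one possible term per day.
    static long dayTerms(LocalDate from, LocalDate to) {
        return ChronoUnit.DAYS.between(from, to) + 1;
    }

    // Field indexed per month (e.g. yyyyMM): one possible term per month.
    static long monthTerms(LocalDate from, LocalDate to) {
        return ChronoUnit.MONTHS.between(YearMonth.from(from), YearMonth.from(to)) + 1;
    }
}
```

For the range 1998-01-01 to 2005-12-31, dayTerms gives 2922 and monthTerms gives 96, so indexing only the month keeps the expanded query far smaller.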
Btw, coming from a db background I'm so used to wri
>I don't really know a lot about what gets loaded into memory when you
>make/use a new searcher, but the one thing I've learned from experience
>is that the FieldCache (which gets used when you sort on a field) contains
>every term in the field you are sorting on, and an instance of
>FieldCache
Paul - I'm doing the same (smaller indices) for Simpy.com for similar
reasons (fast, independent and faster reindexing, etc.). Each index
has its own IndexSearcher, and they are kept in a LRU data structure.
Before each search the index version is checked, and new IndexSearcher
created in case th
SearchBlox Software has released Version 3.0 of its J2EE Content Search
Software.
SearchBlox delivers out-of-the-box search functionality for quick and easy
integration with websites, applications, intranets and portals. SearchBlox
uses the Lucene Search API and incorporates integrated HTTP/HTTPS
Hi Nick,
Without looking at the source of that method, I'd suggest first trying
the multifile index format (you can easily convert to it by setting the
new format on IndexWriter and optimizing it). I'd be interested to
know if this eliminates the problem, or at least makes it harder to
hit.
Otis
Hi,
I am not sure if I have to index using Field.Text or Field.Keyword. I know that:
Keyword-Isn't analyzed, but is indexed and stored in the index verbatim.
This type is suitable for fields whose original value should be preserved in
its entirety, such as URLs, file system paths, dates, person
Hey Otis,
Thanks for the hasty response and apologies for my delayed response. It was
Friday and time to go :)
The queries we're running are very varied (wildcard, phrase, normal). The
index is only about a 1/2 gig in size (maybe 250,000 documents). The machine
is running FreeBSD 5.3 with ~2 gig
On Jul 11, 2005, at 1:45 AM, [EMAIL PROTECTED] wrote:
Did a Google search on the problem when using the range search
phrase of "+datefield:[199801 TO 200512]" (date stored as
"MMDD") which returns 1 million hits.
error: org.apache.lucene.search.BooleanQuery$TooManyClauses
Adding "-Do
Hello,
We are already using this design in production for an email job application
system.
Each client (company) has an account and may have multiple users.
When a new client is created, a new lucene index is automatically created when
new job-applications arrive for this account.
Job applicati
14 matches