Are there specific queries that cause the out of memory problem? Or will any
query do it?
How large is the index?
MultiSearcher allows you to search over multiple indexes, and is well
supported throughout the API. How you split your indexes depends on what
you want to achieve. There are many
The latest binary "stable" release is 1.4.3. Though not officially
released, Lucene 1.9 is available from the source code repository, and,
IMHO, is more than ready for day-to-day use. You will need to check the
code out with Subversion or CVS via the Apache code repository and build it
yourself.
That is certainly the behaviour I would expect. The "+" means the term or
phrase is required - you are requiring words that are not stored in your
index.
Why not remove the "+"? Alternatively, you could run the search, and if no
matches are found, run it again without the second argument. I've fo
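The "search again without the required term" idea can be sketched roughly as follows. This is only an illustration, not Lucene API: the FallbackSearch class and the searcher function are hypothetical stand-ins for whatever code runs a query string against your index and returns the matching document ids.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper: try a strict query first, then a relaxed one. */
final class FallbackSearch {

    private FallbackSearch() {}

    /**
     * Runs strictQuery; if it returns no hits, runs relaxedQuery instead.
     * The searcher is an assumed stand-in for the code that executes a
     * query against the index and returns matching document ids.
     */
    static List<String> search(Function<String, List<String>> searcher,
                               String strictQuery,
                               String relaxedQuery) {
        List<String> hits = searcher.apply(strictQuery);
        if (hits != null && !hits.isEmpty()) {
            return hits;
        }
        // Nothing matched the strict form, so retry with the relaxed form.
        List<String> relaxed = searcher.apply(relaxedQuery);
        return relaxed == null ? Collections.<String>emptyList() : relaxed;
    }

    public static void main(String[] args) {
        // Fake searcher for demonstration only: queries with a "+" match nothing.
        Function<String, List<String>> fake = q ->
                q.contains("+") ? Collections.<String>emptyList()
                                : Arrays.asList("doc-1", "doc-7");
        System.out.println(search(fake, "+lucene +zaurus", "lucene"));
    }
}
```

The second search only runs when the first comes back empty, so the common case still costs a single query.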
'll be around 5,000 database records with three indexed fields:
> id, title(1 line) and description(around three lines). I was even
> considering using the in memory feature for faster access but I'm new to
> lucene and I'don't know if that I'll cause my problems in the f
If this is a small index and it won't change after install (you are just
using it to search, not to index), place it in a sub-directory of WEB-INF.
If it is a larger index (something you don't want to copy frequently), or it
will change after install, then you shouldn't keep it inside your web
app
We build indexes, then share those indexes (along with files and database
records) with our client installations.
We now have multiple clients, and they are beginning to say things like,
"I'd like this group of documents here, and this little bit over here, and
ah yea that document there too
In the sandbox at
http://lucene.apache.org/java/docs/lucene-sandbox/
There is a link to the WordNet repository:
http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/WordNet
it should be:
http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/wordnet
Where "wordnet" is not capitalized.
J
I've run through exactly the same train of thought. PHP is an efficient and
effective web development language - Java provides excellent libraries for
developing a powerful business logic layer. Wouldn't it be nice to couple the
two together? The answer is no, it would suck. You end up with some clus
Thanks Erik, seems Otis keeps a very nice blog about Simpy
(http://blog.simpy.com/blojsom/blog/) that's full of helpful advice on the
topic.
On 10/4/05, Erik Hatcher <[EMAIL PROTECTED]> wrote:
>
>
> On Oct 4, 2005, at 11:52 AM, mark harwood wrote:
>
> >> Is anyone out there incorporating folksonom
I've been reading about Folksonomies (
http://en.wikipedia.org/wiki/Folksonomy), and I would like to incorporate
them into a project I'm developing with Lucene.
The concept is pretty simple: a targeted community of users add labels of
their choosing (just off the top of their head, not from a list
aka more simplistic) and therefore probably more scalable?
This is probably a question for the Nutch user list, but why doesn't
Nutch use the Lucene Summarizer?
Thoughts, comments?
- Jeff
-Original Message-
From: Dan Funk [mailto:[EMAIL PROTECTED]
Sent: Friday, September 23, 2005
What you are doing is a good, scalable practice. You need to store
those email messages somewhere outside of Lucene, and use a unique id to
correlate the two. When you want to display relevant text for a search
result, find the file on disk, and pass it through the Lucene
Highlighter (see th
Yep, runs great on the Zaurus and we got Lucene running on an iPAQ 3970
as well (we used the Creme JVM). Not sure what you would need to do
for the BlackBerry; PDAs are so different, but I'd love to hear if you
get it working.
christopher may wrote:
Well it is being run on the Sharp Zaurus
r any help.
-Tom
--
Dan Funk
Software Engineer
Information Technology Solutions
Battelle Charlottesville Operations
1000 Research Park Boulevard, Suite 105
Charlottesville, Virginia 22911
434.984.0951 x244
434.984.0947 (fax)
[EMAIL PROT
People indexing XML documents tend to deal with the same kind of problem.
There is an excellent article at the URL below showing how they handled
some fairly complex hierarchical queries.
http://www.idealliance.org/papers/xmle02/dx_xmle02/papers/03-02-08/03-02-08.html
Rohit Lodha wrote:
Hi A
Currently I'm working with a single index where content is indexed by
its original printed page. I have to show the total number of matching
documents, so I end up running through all the hits and taking an order
of magnitude hit on performance as I calculate the number of unique
documents. I
We deliver HTML web sites to our clients on a CD. It often remains on
that CD, and they pass the CD around, and use it when they need to do
research on some topic.
We would like to offer them the ability to search the contents of the CD.
We cannot install any software on their Windows mach
You could have a parentId field in each document - which will give you a
nice hierarchy. You could also create a topicId (Linux, Microsoft,
etc...) and a storyId. At that point you can quickly identify the topic
and story for the message - and you can also search within a specific
thread (AND
Lucene uses a lock file to prevent simultaneous writes to the index. You
can just delete the file at
C:\DOCUME~1\tom\LOCALS~1\Temp\Lucene-81022e186820264e5b78801c219b8e8b-commit.lock
and be on your way.
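If stale locks turn up often, the clean-up could be scripted along these lines. This is a rough sketch under assumptions: StaleLockCleaner is a hypothetical helper, it matches any file whose name starts with "lucene-" and ends with ".lock" (the pattern seen in the path above), and it is only safe to run when no process is reading or writing the index.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper that removes leftover Lucene lock files. */
final class StaleLockCleaner {

    private StaleLockCleaner() {}

    /** True for names such as "lucene-<hash>-commit.lock" (case-insensitive). */
    static boolean looksLikeLuceneLock(String fileName) {
        String name = fileName.toLowerCase(Locale.ROOT);
        return name.startsWith("lucene-") && name.endsWith(".lock");
    }

    /** Deletes matching lock files in dir and returns the paths it removed. */
    static List<Path> deleteStaleLocks(Path dir) throws IOException {
        // Collect candidates first, then delete once the listing is closed.
        List<Path> candidates = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)
                        && looksLikeLuceneLock(entry.getFileName().toString())) {
                    candidates.add(entry);
                }
            }
        }
        List<Path> removed = new ArrayList<>();
        for (Path lock : candidates) {
            if (Files.deleteIfExists(lock)) {
                removed.add(lock);
            }
        }
        return removed;
    }

    public static void main(String[] args) throws IOException {
        // The directory must be given explicitly so nothing is deleted by accident.
        if (args.length != 1) {
            System.err.println("usage: StaleLockCleaner <lock-directory>");
            return;
        }
        for (Path p : deleteStaleLocks(Paths.get(args[0]))) {
            System.out.println("Removed " + p);
        }
    }
}
```

Run it only after making sure the indexing process has really stopped; deleting a lock that is still in use can corrupt the index.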
avrootshell wrote:
Hi,
I'm using Lucene for full text search.
It worked gr8.
But now
difficult to port - all I had was a web service- and
that moved over without a hitch.
christopher may wrote:
What are you running as far as the OS? And thanks for the response.
From: Dan Funk <[EMAIL PROTECTED]>
Reply-To: java-user@lucene.apache.org
To: java-user@lucene.apache
code in the J2ME Wireless Toolkit? Any
help would be appreciated, Thanks
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
r this, it seems that the term positions could still be
useful?
Any suggestions would be appreciated.
Thanks,
Fred Toth
2660
http://www.taluskie.com
I don't understand - this is all happening in the background right?
Why not just add the document to the index, then execute all the queries
(with an extra clause to restrict results to that document) and see what
hits?
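One way to picture the "extra clause" is as plain query-string rewriting. The sketch below is an assumption-laden illustration: RestrictedQueries is a hypothetical helper, the field name "id" is made up, and it assumes each document carries a unique id indexed as a single term that the query parser's analyzer leaves unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that narrows stored queries to one document. */
final class RestrictedQueries {

    private RestrictedQueries() {}

    /** Wraps the query and adds a required clause on the id field. */
    static String restrictTo(String query, String idField, String docId) {
        // Assumption: ids are plain alphanumeric tokens, so no escaping is needed.
        if (!docId.matches("[A-Za-z0-9]+")) {
            throw new IllegalArgumentException("unsupported document id: " + docId);
        }
        return "+(" + query + ") +" + idField + ":" + docId;
    }

    /** Applies restrictTo to every stored query, preserving order. */
    static List<String> restrictAll(List<String> queries, String idField, String docId) {
        List<String> restricted = new ArrayList<>();
        for (String query : queries) {
            restricted.add(restrictTo(query, idField, docId));
        }
        return restricted;
    }

    public static void main(String[] args) {
        List<String> stored = Arrays.asList("title:lucene", "body:zaurus OR body:ipaq");
        for (String q : restrictAll(stored, "id", "doc42")) {
            System.out.println(q);
        }
    }
}
```

Each rewritten string can then be parsed and run as usual; any query that returns a hit matched the newly added document.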
Robert Watkins wrote:
Okay, I only bought your book a few days ago, so I ha