Just use 1.5 if you can. I've been using it for months on simpy.com
along with Lucene 1.9 and haven't had any problems so far.
Otis
--- "Sharma, Siddharth" <[EMAIL PROTECTED]> wrote:
> I have downloaded Lucene 1.4.3
> I am trying to narrow down on the JRE version to use.
> We have the flexibili
Xin,
Look for a Lucene-based spell checker in Lucene's contrib directory (in
SVN).
Otis
--- Xin Herbert Wu <[EMAIL PROTECTED]> wrote:
> Has anyone plugged a spell checker into Lucene to implement a Google-like
> "do you mean .?" function for misspelled words or phrases?
>
> Also, which spell ch
Thanks. I will take a look at those classes.
I do need to support search queries like:
- Find all files that are named foo.doc.
- Find all the files that have not been accessed in the last 6 months (atime).
- Find all PDF files with size > 2 MB
The HW requirements are flexible in terms of memory and CP
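To make the three example queries above concrete, here is a minimal sketch that expresses them as plain filters over in-memory metadata records. It does not use Lucene at all; the `FileMeta` class, the helper names, the sample files, the 182-day reading of "6 months" and the 2 * 1024 * 1024 reading of "2 MB" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileMeta:
    name: str
    size: int         # size in bytes
    atime: datetime   # last access time


def named(records, name):
    """Exact match on the file name."""
    return [r for r in records if r.name == name]


def not_accessed_since(records, cutoff):
    """Files whose last access is older than the cutoff."""
    return [r for r in records if r.atime < cutoff]


def larger_than(records, extension, min_bytes):
    """Files with the given extension and a size above min_bytes."""
    return [
        r for r in records
        if r.name.lower().endswith(extension) and r.size > min_bytes
    ]


# Hypothetical sample data, with a fixed "now" so the result is repeatable.
now = datetime(2005, 11, 1)
records = [
    FileMeta("foo.doc", 40_000, datetime(2005, 10, 20)),
    FileMeta("report.pdf", 3 * 1024 * 1024, datetime(2005, 2, 1)),
    FileMeta("notes.pdf", 500_000, datetime(2005, 9, 15)),
]
cutoff = now - timedelta(days=182)  # assumption: "6 months" ~ 182 days

by_name = named(records, "foo.doc")
stale = not_accessed_since(records, cutoff)
big_pdfs = larger_than(records, ".pdf", 2 * 1024 * 1024)
```

The first query is an exact match on one field, while the other two are range conditions on a date and a number, so whatever index is chosen needs to handle both kinds.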
Hi,
> From: [EMAIL PROTECTED]
>
> I am looking at Lucene to index and search file metadata -
> filename, size, permissions, mtime, ctime, atime, etc.
>
> I do not need to index and search the contents of the file. I
> was wondering if Lucene is the right choice for such an
> application. Th
Hi,
I am looking at Lucene to index and search file metadata - filename,
size, permissions, mtime, ctime, atime, etc.
I do not need to index and search the contents of the file. I was
wondering if Lucene is the right choice for such an application. This
will be at enterprise level so there could
Place the lucene jar file in the WEB-INF/lib directory of your web
application prior to creating its war.
If your ISP inspects the war and removes all jar files within it, then I
suppose you might just have to place all the lucene classes under
WEB-INF/classes of your web application as 'loose cla
Has anyone plugged a spell checker into Lucene to implement a Google-like
"do you mean .?" function for misspelled words or phrases?
Also, which spell checker product is good?
Thanks!
-Xin
Hello,
My provider only allows uploading war files. My problem is that I make a war
archive out of the lucene-1.4.3.jar file and my jsp webpages based on
lucene, and this does not work. I have one solution to solve my problem:
I have to unpack the lucene-1.4.3.jar file and pack it again with my
.j
I have downloaded Lucene 1.4.3
I am trying to narrow down on the JRE version to use.
We have the flexibility to use 1.3.1 and up.
Which JVM will be the best for running Lucene?
I saw a note on the FAQ that said that Lucene will run on 1.3.1 but will
require 1.4 to compile.
Why would anyone want to com
> Ah, so the fact that "1" actually appears many times in the string you
> give Lucene is important. Neat application!
>
> Sounds like the custom Analyzer (really a custom TokenStream) approach
> suggested by others may be the way for you to go. If the information
> you get from the MySQL profile
Richard Jones wrote:
If you're willing to continue subsetting / summarizing the data out into
Lucene, how about subsetting it out into a dedicated MySQL instance for
this purpose? 100 artists * 1M profiles * 2 ints * 4 bytes/int =
roughly 1 GB of data, which would easily fit into RAM. Queries
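The size estimate above can be checked with a few lines of arithmetic. This is only a sketch of the raw payload; the variable names are illustrative, and any per-row or index overhead in MySQL is left out.

```python
artists_per_profile = 100   # "100 artists"
profiles = 1_000_000        # "1M profiles"
ints_per_entry = 2          # "2 ints"
bytes_per_int = 4           # "4 bytes/int"

total_bytes = artists_per_profile * profiles * ints_per_entry * bytes_per_int
total_gb = total_bytes / 1e9  # decimal gigabytes

print(f"{total_bytes:,} bytes is about {total_gb:.1f} GB")
```

The product is 800,000,000 bytes, so "roughly 1 GB" is a slightly generous rounding of about 0.8 GB of raw integers.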
> If you're willing to continue subsetting / summarizing the data out into
> Lucene, how about subsetting it out into a dedicated MySQL instance for
> this purpose? 100 artists * 1M profiles * 2 ints * 4 bytes/int =
> roughly 1 GB of data, which would easily fit into RAM. Queries should
> be pret
> since Lucene doesn't
> currently support negative boosts
See here for an approach to negative boosts:
http://wiki.apache.org/jakarta-lucene/CommunityContributions
Cheers
Mark
Richard Jones wrote:
The data i'm dealing with is stored over a few mysql dbs on different
machines, horizontally partitioned so each user is assigned to a single db.
The queries i'm doing can be done in SQL in parallel over all machines then
combined, which i've tested - it's unacceptably slo
Not sure if this is feasible, but is there some way you could use a
"fake" analyzer that you constructed using your hashtable/termvector and
then have it output the tokens directly from the hashtable via the
TokenStream? Maybe you would have to pass in an empty/dummy string to
the field constru
Hi Erik
Our lucene-powered music search went live this week, so your search should
work now: http://www.last.fm/explore/search.php?q=Michael+Hedges
Before we discovered lucene our search sucked *really* badly ;)
Adding multiple fields like this is similar to what i'm doing now (i am using
whites
Others can correct me if I am wrong, but I don't think a "pure" Rocchio
feedback loop is possible in the current state, since Lucene doesn't
currently support negative boosts
(http://lucene.apache.org/java/docs/queryparsersyntax.html). Having
said that, what we do, in a nutshell is similar to w
> I can think of a few ways. If elegance is your goal, then a little
> relational database theory might help. Specifically, instead of having
> one record per listener, have one record per listener-artist
> combination, with three fields: listenerid, artistid, and count. Your
> example above wo
On 2 Nov 2005, at 08:10, Richard Jones wrote:
If i've listened to Radiohead (id 1) 10 times, Coldplay (id 2) 5 times and
Beck (id 3) 2 times, the field would look like this:
"1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 3 3"
I use this index for quickly finding "top fans" of an artist or
combination of
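The field encoding described in this message can be sketched in a few lines. This is an illustration in Python, not the poster's actual code; the function name `encode_play_counts` is hypothetical, and ordering the ids in ascending order is an assumption taken from the example.

```python
def encode_play_counts(play_counts):
    """Repeat each artist id once per play, in ascending id order."""
    tokens = []
    for artist_id in sorted(play_counts):
        tokens.extend([str(artist_id)] * play_counts[artist_id])
    return " ".join(tokens)


# Radiohead (id 1) 10 plays, Coldplay (id 2) 5 plays, Beck (id 3) 2 plays.
field_text = encode_play_counts({1: 10, 2: 5, 3: 2})
print(field_text)
```

Splitting the resulting string on whitespace gives each id as a token repeated once per play, so the number of occurrences of a token carries the play count.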
Stefan Gusenbauer <[EMAIL PROTECTED]> writes:
> Is there an add on for lucene to get a real vector representation?
> Does anyone have experience with this issue?
No code, but some small thinking. You can do hacks with boosts and
whatnot, but I think in the end you really want a new Query subclas
I've some thoughts about Lucene and Relevance Feedback. I want to
implement some variation of the Rocchio formula, and there is the problem.
The formula is like this:
Query(new) = alpha * Query(old)
             + beta * Sum(Relevant Documents)
             - gamma * Sum(Non Relevant Documents)
The relevant documents in
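As a rough illustration of the formula as written above, here is a sketch that treats the query and each document as a dictionary mapping terms to weights. It is independent of Lucene; the function name, the dictionary representation and the default values for alpha, beta and gamma are assumptions, and the sums are plain (unnormalized) sums because that is how the formula is stated.

```python
from collections import defaultdict


def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.5, gamma=0.25):
    """Query(new) = alpha*Query(old) + beta*Sum(rel) - gamma*Sum(non-rel)."""
    new_query = defaultdict(float)
    for term, weight in query.items():
        new_query[term] += alpha * weight
    for doc in relevant:
        for term, weight in doc.items():
            new_query[term] += beta * weight
    for doc in non_relevant:
        for term, weight in doc.items():
            new_query[term] -= gamma * weight
    return dict(new_query)


# Hypothetical term weights, chosen only to make the arithmetic easy to follow.
updated = rocchio(
    query={"lucene": 1.0},
    relevant=[{"lucene": 1.0, "search": 2.0}],
    non_relevant=[{"database": 4.0}],
)
print(updated)
```

Terms that occur only in non-relevant documents end up with negative weights, which is the part that is awkward to express if the search engine only accepts non-negative boosts.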
Richard Jones wrote:
Hi,
I'm using lucene (which rocks, btw ;) behind the scenes at www.last.fm for
various things, and i've run into a situation that seems somewhat inelegant
regarding populating fields which i already know the termvector for.
I'm creating a document for each user (last.fm t
Dear fellow users,
I was wondering if anyone is using Lucene right now to index data derived from
business object models. My general problem is to index data which may be the
result of an expensive computation involving a graph of objects (for example
computing which customer has which items in
Hi,
I'm using lucene (which rocks, btw ;) behind the scenes at www.last.fm for
various things, and i've run into a situation that seems somewhat inelegant
regarding populating fields which i already know the termvector for.
I'm creating a document for each user (last.fm tracks music taste for pe
Hello,
You could try looking at
http://www.nabble.com/Hierarchical-Documents-t242604.html#a677841
where this has been discussed a little before.
Regards
Paul I.
Urvashi Gadi