Hi
My site has a large database of television and movie titles in English and
Spanish. The movie data runs from 1928 to date for selected studios such as
MGM and Disney. Site users should be able to search for a movie or TV series
by title, description, actors, or characters. The
You can just assign the field B some weight when creating the index?
--
Chris Lu
Lucene Search RAD on Any Database
http://www.dbsight.net
On 8/29/05, raymondcreel (sent by Nabble.com) <[EMAIL PROTECTED]> wrote:
>
> Is it possible to write a custom sort for a query such that the fir
What seems to be working for me is a punctuation filter that removes
characters such as / - _ and emits the token without them. Then, most of
the time, the word XYZZZY_DE_SA0001 will be tokenized as XYZZZYDESA0001.
For this to work, you will have to apply the same punctuation filter to
the strings before you sear
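A minimal sketch of the idea above, in plain Java rather than as a Lucene TokenFilter. The class name PunctuationStripper and the exact set of dropped characters are assumptions made for illustration. The point is only that the same function runs over text at index time and over query strings at search time.

```java
/** Toy normalizer (hypothetical helper, not part of Lucene). */
final class PunctuationStripper {
    // Characters to drop; this set is an assumption, adjust it to your data.
    private static final String DROP = "/-_";

    /** Returns the token with every character listed in DROP removed. */
    static String strip(String token) {
        StringBuilder out = new StringBuilder(token.length());
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (DROP.indexOf(c) < 0) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Apply the same call to indexed text and to the user's query.
        System.out.println(strip("XYZZZY_DE_SA0001")); // prints XYZZZYDESA0001
    }
}
```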
I am running Lucene 1.4.3 and I have a situation where after adding and
removing some objects, my index becomes corrupt. I have been careful to
make sure that all adds and removes happen in a single thread (although
I know that isn't necessarily needed) and it still occurs. I am not sure
how to go
Why not build a self-extracting jar file and extract the contents of the
index to a temp directory?
http://www.javaworld.com/javaworld/javatips/jw-javatip120.html
Thomas Lepkowski wrote:
Hello,
I have a set of index files that I'd like to distribute with my Java
application. The only way
Is it possible to write a custom sort for a query such that the first N
documents that match a certain additional criteria get pushed to the top of the
sort? For instance say you sort your query based on field A, but you want to
tweak the results such that the first 10 documents in the result
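One way to read the question is as a post-processing step on an already retrieved hit list. The sketch below works under that assumption and does not use Lucene's sorting API. The names PromoteTopN and sortAndPromote are hypothetical, and the extra criterion is passed in as a predicate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical helper: sort hits, then move the first n matching hits to the top. */
final class PromoteTopN {
    static <T> List<T> sortAndPromote(List<T> hits,
                                      Comparator<? super T> byFieldA,
                                      Predicate<? super T> extra,
                                      int n) {
        List<T> sorted = new ArrayList<>(hits);
        sorted.sort(byFieldA);                 // primary order, e.g. by field A
        List<T> promoted = new ArrayList<>();
        List<T> rest = new ArrayList<>();
        for (T hit : sorted) {
            if (promoted.size() < n && extra.test(hit)) {
                promoted.add(hit);             // matches the extra criterion
            } else {
                rest.add(hit);                 // keeps its sorted position
            }
        }
        promoted.addAll(rest);
        return promoted;
    }

    public static void main(String[] args) {
        List<String> hits = Arrays.asList("d", "a", "c", "b");
        List<String> result = sortAndPromote(
                hits, Comparator.<String>naturalOrder(),
                s -> s.equals("c") || s.equals("d"), 1);
        System.out.println(result); // prints [c, a, b, d]
    }
}
```

Both groups keep the order given by the primary comparator, so only the promoted hits change position.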
Perhaps because you are not iterating over all the documents?
numDocs() == maxDoc() - number_of_deleted_docs
So first try replacing numDocs() with maxDoc()
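To see why the loop bound matters, here is a toy model of document numbering with deletions. It is not the Lucene API: ToyIndex and all of its methods are made up for illustration, under the assumption that deleted documents keep their slot in the numbering.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of an index with deletion marks; NOT the Lucene API. */
final class ToyIndex {
    private final boolean[] deleted;

    ToyIndex(int size) { deleted = new boolean[size]; }

    void delete(int docId) { deleted[docId] = true; }

    boolean isDeleted(int docId) { return deleted[docId]; }

    /** One greater than the largest document number in use. */
    int maxDoc() { return deleted.length; }

    /** Number of documents that are not marked deleted. */
    int numDocs() {
        int live = 0;
        for (boolean d : deleted) {
            if (!d) live++;
        }
        return live;
    }

    /** Collects live document ids while looping from 0 up to the given bound. */
    List<Integer> liveIdsUpTo(int bound) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < bound; i++) {
            if (!isDeleted(i)) ids.add(i);
        }
        return ids;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex(5);
        index.delete(1);
        index.delete(2);
        // Bounding the loop by the live count stops early and misses ids 3 and 4.
        System.out.println(index.liveIdsUpTo(index.numDocs())); // prints [0]
        // Bounding the loop by the highest document number visits everything.
        System.out.println(index.liveIdsUpTo(index.maxDoc()));  // prints [0, 3, 4]
    }
}
```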
-Yonik
On 8/29/05, Derya Kasapoglu <[EMAIL PROTECTED]> wrote:
> Hi,
>
> If I delete a document from the index, what does it do?
> I want t
On Monday 29 August 2005 19:21, Jeremy Meyer wrote:
> The expected behavior is to sometimes treat a character as indicating a
> new token and other times to ignore the same character?
It depends on whether there are digits in the token. It's documented in
the javacc source for the tokenizer(?).
Hi,
If I delete a document from the index, what does it do?
I want to know because I delete documents from the index that are
no longer in the document directories, like this:
IndexReader reader = IndexReader.open(dir);
for (int i=0; i [...] if (!file.exists()) reader.delete(i);
}
reader.cl
On Monday 29 August 2005 17:24, Greg Conway wrote:
> Hello. I've got a problem perhaps some of you can help with.
>
> I have an application that has to use fairly long queries (containing about
> 30 terms or'ed together) against an index of about 500K documents. Because
> of the limited vocabula
Hi Tom,
You could distribute your index files in a plain old directory outside
of a jar file and install them with your application, then use
FSDirectory to read from the installed location.
But I can think of at least two ways to get the index files packaged
into the application jar. One would b
this would indeed be useful, it's something i've considered doing as well. i'm
assuming a read-only implementation (perhaps with some static method for
creating a JAR from an existing Directory); not a concurrently indexed and
searched impl.
does anybody know of such code, or of any limitation
To add to other comments:
This functionality should also look at how common a term is in the corpus.
Using the corpus as "correct" set of terms to search on isn't always what
you want if the corpus is unclean (misspellings, etc.)
I believe this is why if you search on an uncommon term, Google w
The expected behavior is to sometimes treat a character as indicating a new
token and other times to ignore the same character?
This sounds like behavior that should be much better documented than it
currently is.
Why would this be the default? What cases is it meant for?
-----Original Message-----
Constructing a separate index as a dictionary is one part of the solution.
The other part is to construct a dictionary with a list of possible
"good words".
By "good words", I mean all legal queries, not necessarily "correct words".
Two approaches I can think of:
* Use a word list (it may not be the
That's StandardAnalyzer's expected behaviour. If you want
tokenization to occur only on white spaces, use WhitespaceAnalyzer. If
you want custom behaviour, you should write an Analyzer (there should
be a FAQ entry with an example).
Otis
--- "Is, Studcio" <[EMAIL PROTECTED]> wrote:
> Hello,
>
Hi,
I have been trying to control where Lucene creates the search index for
my web application.
I am tweaking the following code in order to specify the location for
the index, but it seems that Lucene is creating the index in the
location from which my CreateIndex.class is invoked.
Here is the
Hello,
I have a set of index files that I'd like to distribute with my Java
application. The only way this seems practical is to place the index files
in a jar file. I tried this, but the search choked when I told IndexSearcher
the index path inside the jar file (and placed the jar file path i
Hello. I've got a problem perhaps some of you can help with.
I have an application that has to use fairly long queries (containing about 30
terms or'ed together) against an index of about 500K documents. Because of the
limited vocabulary I'm indexing and querying over (~2000 terms), the size
Thank you for your help!
But it doesn't work that way!!
My code is:
IndexReader reader = IndexReader.open(dir);
for (int i=0; i [...]
--- Original Message ---
> From: "Mordo, Aviran (EXP N-NANNATEK)" <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Subject: RE: UpdateIndex
> Date: Mon,
Hello,
I've been using Lucene for a few weeks now in a small project and just ran
into a problem. My index contains words that contain one or more
underscores, e.g. XYZZZY_DE_SA0001 or XYZZZY_AT0001. Unfortunately the
tokenizer splits the word into multiple tokens at the
underscores, except
No, just at the end of the delete loop get a new reader instance.
Aviran
http://www.aviransplace.com
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Monday, August 29, 2005 10:10 AM
To: java-user@lucene.apache.org
Subject: RE: UpdateIndex
for (int i=0; i [...]
--- Original Message ---
> From: "Mordo, Aviran (EXP N-NANNATEK)" <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Subject: RE: UpdateIndex
> Date: Mon, 29 Aug 2005 09:28:59 -0400
>
> After you delete / add documents, you need to get a new IndexReader
> instance to re
On Aug 29, 2005, at 9:05 AM, Markus Fischer wrote:
I currently pass the search tokens as a Vector to my query function
and construct the string to pass to QueryParser.parse() by hand.
StringBuffer qStr = new StringBuffer();
qStr.append("title:" + queryString.trim() + "^7 ");
[...]
and this a
After you delete / add documents, you need to get a new IndexReader instance to
reflect the changes.
HTH
Aviran
http://www.aviransplace.com
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Monday, August 29, 2005 7:32 AM
To: java-user@lucene.apache.org
Subje
Hi,
I currently pass the search tokens as a Vector to my query function and
construct the string to pass to QueryParser.parse() by hand.
StringBuffer qStr = new StringBuffer();
qStr.append("title:" + queryString.trim() + "^7 ");
[...]
and I repeat this append for every field I want to search in.
Whe
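The hand-built query string described above can be generated in a loop. The sketch below only builds the string and does not call the query parser. The class BoostedQueryString and every field name other than "title" are assumptions for illustration. It mirrors the field:terms^boost shape from the message and does not escape special characters or handle multi-word input.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that repeats "field:terms^boost" for each field. */
final class BoostedQueryString {
    static String build(String terms, Map<String, Integer> fieldBoosts) {
        String trimmed = terms.trim();
        StringBuilder q = new StringBuilder();
        for (Map.Entry<String, Integer> e : fieldBoosts.entrySet()) {
            if (q.length() > 0) {
                q.append(' ');                 // separate clauses with a space
            }
            q.append(e.getKey()).append(':').append(trimmed)
             .append('^').append(e.getValue());
        }
        return q.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, Integer> boosts = new LinkedHashMap<>();
        boosts.put("title", 7);
        boosts.put("description", 2);          // assumed field name and boost
        System.out.println(build(" lucene ", boosts));
        // prints title:lucene^7 description:lucene^2
    }
}
```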
Not sure what Jeeves does, but we index the answers and also store the
questions in a lookup table. During a search we submit a regular search
to the FAQ index and we also do some query side analysis to see if the
input query is similar to any of the stored questions.
-Grant
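The message does not say how the similarity between the input query and the stored questions is measured. As one possible sketch, assuming a simple token-overlap (Jaccard) score is good enough, the comparison could look like this. QuestionSimilarity and its methods are hypothetical names.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical token-overlap score between a query and a stored question. */
final class QuestionSimilarity {
    /** Lower-cases the text and splits it on runs of non-word characters. */
    static Set<String> tokens(String text) {
        Set<String> out = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    /** Jaccard similarity: shared tokens divided by all distinct tokens. */
    static double jaccard(String a, String b) {
        Set<String> ta = tokens(a);
        Set<String> tb = tokens(b);
        Set<String> union = new HashSet<>(ta);
        union.addAll(tb);
        if (union.isEmpty()) {
            return 0.0;
        }
        Set<String> common = new HashSet<>(ta);
        common.retainAll(tb);
        return (double) common.size() / union.size();
    }

    public static void main(String[] args) {
        System.out.println(jaccard("a b c d", "A b")); // prints 0.5
    }
}
```

A stored question whose score exceeds some threshold could then be shown alongside the regular search results. The threshold itself would need tuning on real queries.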
>>> [EMAIL PROTECTED
Hi,
Once again, a question about updating!
I update my index by first deleting all documents from the index that are
no longer in the document directories, then I delete all documents from
the index that have changed, and finally I add all documents that
are not in the index but in the
java.net had an article on this not long ago. See
http://today.java.net/pub/a/today/2005/08/09/didyoumean.html .
On Mon, 29 Aug 2005, Martin Rode wrote:
Hi everybody,
Has anyone tried to code a solution like Google's "Did you mean?" in Lucene?
I would be very happy to hear your ideas, approa
Hi Luceners,
Apologies.
Has anybody on the forum attempted to use Lucene for search on an FAQ,
like the website "Ask Jeeves"?
If so, please enlighten me with some ideas.
With warm regards, have a nice day
[N.S.KARTHIK]
Hi Martin,
you might want to have a look at
http://today.java.net/pub/a/today/2005/08/09/didyoumean.html
This article discusses a solution that uses a separate index consisting of
n-grams as a dictionary. I haven't tried it myself yet, but I will give it a
try in the near future.
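For readers unfamiliar with n-grams, here is a small sketch of how a word is split into character n-grams and how two words can be compared by the n-grams they share. This is not the article's code. NGrams and its methods are hypothetical, and counting shared n-grams is only one simple way to rank candidate corrections.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for character n-grams. */
final class NGrams {
    /** Returns all substrings of length n, left to right. */
    static List<String> of(String word, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    /** Counts the distinct n-grams that both words contain. */
    static int shared(String a, String b, int n) {
        Set<String> common = new HashSet<>(of(a, n));
        common.retainAll(new HashSet<>(of(b, n)));
        return common.size();
    }

    public static void main(String[] args) {
        System.out.println(of("lucene", 3));               // prints [luc, uce, cen, ene]
        System.out.println(shared("lucene", "lucine", 3)); // prints 1
    }
}
```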
Regards,
Quoting Martin Rode <[EMAIL PROTECTED]>:
> Hi everybody,
>
> Has anyone tried to code a solution like Google's "Did you mean?" in
> Lucene?
>
> I would be very happy to hear your ideas, approaches, suggestions.
I know that what Google does is look at consecutive queries by the same user
that are
Hi everybody,
Has anyone tried to code a solution like Google's "Did you mean?" in
Lucene?
I would be very happy to hear your ideas, approaches, suggestions.
Best,
Martin