permission control or category-wise search with Lucene

2005-08-29 Thread seema pai
Hi, my site has a large database of television and movie titles in English and Spanish. The movie data runs from 1928 to date for selected studios such as MGM and Disney. Site users should be able to search for a movie or TV series by title, description, actors or characters. The

Re: custom sort

2005-08-29 Thread Chris Lu
You can just assign the field B some weight when creating the index? -- Chris Lu Lucene Search RAD on Any Database http://www.dbsight.net On 8/29/05, raymondcreel (sent by Nabble.com) <[EMAIL PROTECTED]> wrote: > > Is it possible to write a custom sort for a query such that the fir

RE: Inconsistent tokenizing of words containing underscores.

2005-08-29 Thread Aigner, Thomas
What seems to be working for me is a punctuation filter that removes / - _ etc and makes the token without them. Then "most" of the time the word XYZZZY_DE_SA0001 will be tokenized as XYZZZYDESA0001. For this to work, you will have to use the same punctuation filter on the strings before you sear
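
A minimal sketch of the idea described above, assuming a plain string helper rather than a real Lucene TokenFilter. The class and method names are hypothetical; the point is only that the same stripping runs on indexed text and on query text.

```java
import java.util.Locale;

// Toy normalizer, not a Lucene TokenFilter: it only illustrates applying
// the same punctuation stripping at index time and at query time.
class PunctuationStripper {
    // Remove '/', '-' and '_' so the remaining characters form one token.
    static String strip(String token) {
        StringBuilder out = new StringBuilder(token.length());
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (c != '/' && c != '-' && c != '_') {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Strip punctuation and lowercase; call this on both documents and queries.
    static String normalize(String text) {
        return strip(text).toLowerCase(Locale.ROOT);
    }
}
```

With this helper, `PunctuationStripper.strip("XYZZZY_DE_SA0001")` yields `XYZZZYDESA0001`, matching the example in the message. A query that skips the same step would still contain the underscores and would not match the indexed form.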

Corrupted indexes

2005-08-29 Thread Eric Bressler
I am running Lucene 1.4.3 and I have a situation where after adding and removing some objects, my index becomes corrupt. I have been careful to make sure that all adds and removes happen in a single thread (although I know that isn't necessarily needed) and it still occurs. I am not sure how to go

Re: Index files in jar

2005-08-29 Thread Dan Funk
Why not build a self-extracting jar file and extract the contents of the index to a temp directory? http://www.javaworld.com/javaworld/javatips/jw-javatip120.html Thomas Lepkowski wrote: Hello, I have a set of index files that I'd like to distribute with my Java application. The only way
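
The extract-to-temp-directory suggestion could look roughly like the sketch below. It assumes a modern JDK (java.nio.file, Java 7 or later) and that the index file names are known in advance, since a directory inside a jar cannot be listed with plain classpath calls. `IndexExtractor`, `resourceDir` and `fileNames` are hypothetical names, not part of Lucene.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class IndexExtractor {
    // Copy each named classpath resource into a fresh temp directory and
    // return that directory, which can then be opened as a normal index path.
    // resourceDir is resolved by Class.getResourceAsStream, so start it
    // with '/' for an absolute classpath location.
    static Path extract(Class<?> anchor, String resourceDir, String... fileNames)
            throws IOException {
        Path target = Files.createTempDirectory("lucene-index");
        for (String name : fileNames) {
            try (InputStream in = anchor.getResourceAsStream(resourceDir + "/" + name)) {
                if (in == null) {
                    throw new IOException("Missing resource: " + resourceDir + "/" + name);
                }
                copy(in, target.resolve(name));
            }
        }
        return target;
    }

    // Write one stream to a file, replacing any earlier copy.
    static void copy(InputStream in, Path dest) throws IOException {
        Files.copy(in, dest, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

The returned path is an ordinary directory on disk, so it can be handed to whatever filesystem-based index opening the application already uses.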

custom sort

2005-08-29 Thread raymondcreel (sent by Nabble.com)
Is it possible to write a custom sort for a query such that the first N documents that match a certain additional criteria get pushed to the top of the sort? For instance say you sort your query based on field A, but you want to tweak the results such that the first 10 documents in the result

Re: IndexReader delete(int i)

2005-08-29 Thread Yonik Seeley
Perhaps because you are not iterating over all the documents? numDocs() == maxDoc() - number_of_deleted_docs. So first try replacing numDocs() with maxDoc() -Yonik On 8/29/05, Derya Kasapoglu <[EMAIL PROTECTED]> wrote: > Hi, > > if I delete a document from index, what does it do? > I want t
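
The relationship in this reply can be illustrated with a toy model. This is not Lucene's IndexReader (whose corresponding methods are numDocs() and maxDoc()); `ToyIndex` is a hypothetical class that assumes deletion only marks a slot and never renumbers the remaining documents.

```java
import java.util.BitSet;

// Toy model only: document ids are slots 0..maxDoc()-1, and deleting a
// document marks its slot without shifting the ids of the others.
class ToyIndex {
    private final String[] docs;
    private final BitSet deleted = new BitSet();

    ToyIndex(String... docs) {
        this.docs = docs.clone();
    }

    // One greater than the highest document id, deleted or not.
    int maxDoc() {
        return docs.length;
    }

    // Number of live documents: maxDoc() minus the deleted count.
    int numDocs() {
        return docs.length - deleted.cardinality();
    }

    boolean isDeleted(int id) {
        return deleted.get(id);
    }

    void delete(int id) {
        deleted.set(id);
    }

    // Visit every slot by looping to maxDoc() and skipping deleted ones.
    int countLive() {
        int live = 0;
        for (int i = 0; i < maxDoc(); i++) {
            if (!isDeleted(i)) {
                live++;
            }
        }
        return live;
    }
}
```

In this model, a loop bounded by numDocs() stops early once anything has been deleted, so the highest-numbered slots are never examined. Bounding the loop by maxDoc() and skipping deleted slots visits every document.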

Re: Inconsistent tokenizing of words containing underscores.

2005-08-29 Thread Daniel Naber
On Monday 29 August 2005 19:21, Jeremy Meyer wrote: > The expected behavior is to sometimes treat a character as indicating a > new token and other times to ignore the same character? It depends on whether there are digits in the token. It's documented in the javacc source for the tokenizer(?).

IndexReader delete(int i)

2005-08-29 Thread Derya Kasapoglu
Hi, if I delete a document from index, what does it do? I want to know because if I delete documents from index which are not anymore in the document directories like that: IndexReader reader = IndexReader.open(dir); for (int i=0; i if (!file.exists()) reader.delete(i); } reader.cl

Re: query performance behavior not as expected

2005-08-29 Thread Paul Elschot
On Monday 29 August 2005 17:24, Greg Conway wrote: > Hello. I've got a problem perhaps some of you can help with. > > I have an application that has to use fairly long queries (containing about 30 terms or'ed together) against an index of about 500K documents. Because of the limited vocabula

RE: Index files in jar

2005-08-29 Thread Jon Schuster
Hi Tom, You could distribute your index files in a plain old directory outside of a jar file and install them with your application, then use FSDirectory to read from the installed location. But I can think of at least two ways to get the index files packaged into the application jar. One would b

RE: Index files in jar

2005-08-29 Thread Sale, Doug
this would indeed be useful, it's something i've considered doing as well. i'm assuming a read-only implementation (perhaps with some static method for creating a JAR from an existing Directory); not a concurrently indexed and searched impl. does anybody know of such code, or of any limitation

Re: Did you mean?

2005-08-29 Thread Jason Haruska
To add to other comments: This functionality should also look at how common a term is in the corpus. Using the corpus as the "correct" set of terms to search on isn't always what you want if the corpus is unclean (misspellings, etc.). I believe this is why if you search on an uncommon term, Google w

RE: Inconsistent tokenizing of words containing underscores.

2005-08-29 Thread Jeremy Meyer
The expected behavior is to sometimes treat a character as indicating a new token and other times to ignore the same character? This sounds like behavior that should be much better documented than it currently is. Why would this be the default? What cases is it meant for? -Original Message--

Re: Did you mean?

2005-08-29 Thread Chris Lu
Constructing a separate index as a dictionary is one part of the solution. The other part is to construct a dictionary with a list of possible "good words". By "good words", I mean all legal queries, not necessarily "correct words". Two approaches I can think of: * Use a word list (it may not be the

Re: Inconsistent tokenizing of words containing underscores.

2005-08-29 Thread Otis Gospodnetic
That's StandardAnalyzer's expected behaviour. If you want tokenization to occur only on white spaces, use WhitespaceAnalyzer. If you want custom behaviour, you should write an Analyzer (there should be a FAQ entry with an example). Otis --- "Is, Studcio" <[EMAIL PROTECTED]> wrote: > Hello, >

Index location

2005-08-29 Thread Mufaddal Khumri
Hi, I have been trying to control where lucene creates the search index for my web application. I am tweaking the following code in order to specify the location for the index, but it seems that lucene is creating the index in the location from where my CreateIndex.class is invoked. Here is the

Index files in jar

2005-08-29 Thread Thomas Lepkowski
Hello, I have a set of index files that I'd like to distribute with my Java application. The only way this seems practical is to place the index files in a jar file. I tried this, but the search choked when I told IndexSearcher the index path inside the jar file ( and placed the jar file path i

query performance behavior not as expected

2005-08-29 Thread Greg Conway
Hello. I've got a problem perhaps some of you can help with. I have an application that has to use fairly long queries (containing about 30 terms or'ed together) against an index of about 500K documents. Because of the limited vocabulary I'm indexing and querying over (~2000 terms), the size

RE: UpdateIndex

2005-08-29 Thread Derya Kasapoglu
Thank you for your help! But it doesn't work that way!! My code is: IndexReader reader = IndexReader.open(dir); for (int i=0; i --- Original Message --- > From: "Mordo, Aviran (EXP N-NANNATEK)" <[EMAIL PROTECTED]> > To: java-user@lucene.apache.org > Subject: RE: UpdateIndex > Date: Mon,

Inconsistent tokenizing of words containing underscores.

2005-08-29 Thread Is, Studcio
Hello, I've been using Lucene for a few weeks now in a small project and just ran into a problem. My index contains words that contain one or more underscores, e.g. XYZZZY_DE_SA0001 or XYZZZY_AT0001. Unfortunately the tokenizer tokenizes / splits the word into multiple tokens at the underscores, except

RE: UpdateIndex

2005-08-29 Thread Mordo, Aviran (EXP N-NANNATEK)
No, just at the end of the delete loop get a new reader instance. Aviran http://www.aviransplace.com -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Monday, August 29, 2005 10:10 AM To: java-user@lucene.apache.org Subject: RE: UpdateIndex for (int i=0; i --- Original Message ---

RE: UpdateIndex

2005-08-29 Thread dozean
for (int i=0; i --- Original Message --- > From: "Mordo, Aviran (EXP N-NANNATEK)" <[EMAIL PROTECTED]> > To: java-user@lucene.apache.org > Subject: RE: UpdateIndex > Date: Mon, 29 Aug 2005 09:28:59 -0400 > > After you delete / add documents, you need to get a new IndexReader > instance to re

Re: Creating parser query "by hand"

2005-08-29 Thread Erik Hatcher
On Aug 29, 2005, at 9:05 AM, Markus Fischer wrote: I currently pass the search tokens as Vector to my query function and construct the string to pass to the QueryParser.parse() by hand. StringBuffer qStr = new StringBuffer(); qStr.append("title:" + queryString.trim() + "^7 "); [...] and this a

RE: UpdateIndex

2005-08-29 Thread Mordo, Aviran (EXP N-NANNATEK)
After you delete / add documents, you need to get a new IndexReader instance to reflect the changes. HTH Aviran http://www.aviransplace.com -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Monday, August 29, 2005 7:32 AM To: java-user@lucene.apache.org Subje

Creating parser query "by hand"

2005-08-29 Thread Markus Fischer
Hi, I currently pass the search tokens as Vector to my query function and construct the string to pass to the QueryParser.parse() by hand. StringBuffer qStr = new StringBuffer(); qStr.append("title:" + queryString.trim() + "^7 "); [...] and this append for every field I want to search in. Whe
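
The per-field append described in this message can be written as a loop over a field-to-boost map. This is only a sketch of the string-building step: `QueryStringBuilder` is a hypothetical helper, the boost of 7 for `title` comes from the message, and any other field names or boosts are assumptions. It does not escape query-syntax characters in the user's term.

```java
import java.util.Map;

class QueryStringBuilder {
    // Build "field:term^boost" clauses separated by spaces, one per map entry,
    // in the map's iteration order.
    static String build(String term, Map<String, Integer> fieldBoosts) {
        String trimmed = term.trim();
        StringBuilder q = new StringBuilder();
        for (Map.Entry<String, Integer> e : fieldBoosts.entrySet()) {
            if (q.length() > 0) {
                q.append(' ');
            }
            q.append(e.getKey()).append(':').append(trimmed)
             .append('^').append(e.getValue());
        }
        return q.toString();
    }
}
```

Passing an insertion-ordered map such as a `LinkedHashMap` keeps the clauses in a predictable order. The resulting string is what would then be handed to the query parser.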

Re: ASK JEEVES

2005-08-29 Thread Grant Ingersoll
Not sure what Jeeves does, but we index the answers and also store the questions in a lookup table. During a search we submit a regular search to the FAQ index and we also do some query side analysis to see if the input query is similar to any of the stored questions. -Grant >>> [EMAIL PROTECTED

Re: UpdateIndex

2005-08-29 Thread dozean
Hi, once again a question about updating! I update my index by first deleting all the documents from the index that are no longer in the document directories, then I delete all documents from the index that have changed, and finally I add all documents to the index which are not in the index but in the

Re: Did you mean?

2005-08-29 Thread Joseph B. Ottinger
java.net had an article on this not long ago. See http://today.java.net/pub/a/today/2005/08/09/didyoumean.html . On Mon, 29 Aug 2005, Martin Rode wrote: Hi everybody, Has anyone tried to code a solution like Google's "Did you mean?" in Lucene? I would be very happy to hear your ideas, approa

ASK JEEVES

2005-08-29 Thread Karthik N S
Hi Luceners, apologies.. Has anybody on the forum attempted to use Lucene for search on FAQs, like the website "ASK JEEVES"? If so, please enlighten me with some ideas... WITH WARM REGARDS HAVE A NICE DAY [ N.S.KARTHIK]

Re: Did you mean?

2005-08-29 Thread Martin.Stein
Hi Martin, you might want to have a look at http://today.java.net/pub/a/today/2005/08/09/didyoumean.html This article discusses a solution that uses a separate index consisting of n-grams as a dictionary. I haven't tried it myself yet, but I will give it a try in the near future. Regards,
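
To make the n-gram dictionary idea concrete, the sketch below splits a word into character n-grams and counts how many two words share. It is an illustration only, not the code from the linked article: `NGrams` is a hypothetical helper, and the shared-gram count is a crude similarity measure chosen for brevity.

```java
import java.util.ArrayList;
import java.util.List;

class NGrams {
    // All contiguous substrings of length n, in order of appearance.
    static List<String> of(String word, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    // Count n-grams of a that also occur in b, consuming each match once.
    static int shared(String a, String b, int n) {
        List<String> remaining = new ArrayList<>(of(b, n));
        int count = 0;
        for (String gram : of(a, n)) {
            if (remaining.remove(gram)) {
                count++;
            }
        }
        return count;
    }
}
```

In a separate dictionary index, each known word would be stored alongside its n-grams, so a misspelled query can be matched against words that share many of its grams. The candidates would then still need ranking, for example by how common each word is in the corpus.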

Re: Did you mean?

2005-08-29 Thread Dave Kor
Quoting Martin Rode <[EMAIL PROTECTED]>: > Hi everybody, > > Has anyone tried to code a solution like Google's "Did you mean?" in > Lucene? > > I would be very happy to hear your ideas, approaches, suggestions. I know that what Google does is look at consecutive queries by the same user that are

Did you mean?

2005-08-29 Thread Martin Rode
Hi everybody, Has anyone tried to code a solution like Google's "Did you mean?" in Lucene? I would be very happy to hear your ideas, approaches, suggestions. Best, Martin