Sorting & SQL-Database

2006-06-30 Thread Dominik Bruhn
Hi, I use Lucene to index an SQL table which contains three fields: an index field, the text to search in, and another field. When adding a Lucene document, I let Lucene index the search field and also store the id along with it in the Lucene index. Upon searching I collect all ids and add them to

Re: Any existing query types that support equivalent of "-not interested" ?

2006-06-30 Thread markharw00d
Maybe this: SpanNotQuery(interested, SpanNearQuery(not,interested)) with a SpanTermQuery for each term? Thanks, Paul. This is working well for me and I can happily use multiple SpanTermQueries embedded in a SpanOrQuery in place of each of the single words in your example. SpanNotQuery

RE: Null field values

2006-06-30 Thread Seeta Somagani
Hi Erick, The fields that are missing are sort of primary keys and they exist in all the documents (including those that were returned in my search results) when I browsed through the index using Luke. And the field names are exactly the same all in the same case. I never get the three field value

Re: Null field values

2006-06-30 Thread Erick Erickson
There is no requirement that every document contain values for every field. Doc A could have fields z, y, x, and Doc B could have fields x, w, v. So, when you say "some of the values are being returned as null", do you mean that you *never* get any values for some field or you get values for a fie

Null field values

2006-06-30 Thread Seeta Somagani
Hi, I indexed some XML files using Lucene. When I open up the index using Luke, I can see that all the fields are stored correctly in the index. But, when I try to grab the fields from the hits, after searching, some of the values are being returned as null. Any suggestions about what might be

Re: Changing the MergeFactor - should I reindex?

2006-06-30 Thread Monsur Hossain
Thanks Otis. When two segments are merged into one, are they merged contiguously, so that the individual .del files for the individual segments will no longer be needed? Or are the .del files merged into a single larger .del file for the new segment? Monsur On 6/30/06, Otis Gospodnetic <[EMA

Re: Spellchecker Download at lucene wiki outdated

2006-06-30 Thread Otis Gospodnetic
Thanks Martin, I removed the attachment and the link to it. In the future, please feel free to edit Wiki pages yourself, that's the main idea behind Wikis. :) Otis - Original Message From: Martin Braun <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Friday, June 30, 2006 7:36:

Re: Changing the MergeFactor - should I reindex?

2006-06-30 Thread Otis Gospodnetic
Hi Monsur, You don't need to reindex everything after changing merge factor. As for growing the .del file, I believe the .del file simply keeps track of documents that have been deleted (and should thus not show up in search results). Thus, I think you can let the file grow, although at some

Re: many many boolean queries

2006-06-30 Thread Martin Kobele
thank you! Martin On Friday 30 June 2006 14:21, Erick Erickson wrote: > The tradeoff is that it'll blow up eventually . I have a really hard > time trusting increasing the clause count, since eventually, more > data/terms/something will blow my limit again. > > You probably want to think serious

Changing the MergeFactor - should I reindex?

2006-06-30 Thread Monsur Hossain
I have a system of 2 servers, one to index and one to search. The index server updates the Lucene index and then copies the 200 meg index over to the search server. Originally, the index server would optimize the index before copying. To improve performance, I stopped optimizing, dropped the me

Re: Spellchecker Download at lucene wiki outdated

2006-06-30 Thread Chris Hostetter
: I don't know who can update the Wiki Pages so I am just mailing here. anyone can edit the wiki, just create an account (click "Login" and it will give you that option) : So I wanted to build _only_ the spellcheck-contrib from the : SVN-repository, but it seems to me that there are no ant-tar

Re: many many boolean queries

2006-06-30 Thread Erick Erickson
The tradeoff is that it'll blow up eventually. I have a really hard time trusting increasing the clause count, since eventually, more data/terms/something will blow my limit again. You probably want to think seriously about using a filter, perhaps with a RegexTermEnum. The folks who really know

many many boolean queries

2006-06-30 Thread Martin Kobele
Hi, since I use many wildcards, I get the exception that the number of boolean queries exceeds the default value (1024). I could simply increase the value to something like 10,000. What would be the trade-off of using a high max value? Thanks! Martin
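
To illustrate where the limit in this question comes from: a wildcard query is, as far as I know, rewritten into one boolean clause per matching index term, so a broad pattern over a large term dictionary can exceed the maximum. The sketch below is a toy model in plain Java, not Lucene code. `ClauseLimitDemo`, `expandPrefix`, and the generated term list are hypothetical; only the default of 1024 is taken from the message above.

```java
import java.util.ArrayList;
import java.util.List;

public class ClauseLimitDemo {
    // Default limit quoted in the message above.
    static final int MAX_CLAUSES = 1024;

    // Toy expansion: one "clause" per term that starts with the prefix.
    // Throws once the number of clauses would exceed maxClauses.
    static List<String> expandPrefix(List<String> terms, String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        for (String term : terms) {
            if (term.startsWith(prefix)) {
                if (clauses.size() >= maxClauses) {
                    throw new IllegalStateException("too many clauses: limit is " + maxClauses);
                }
                clauses.add(term);
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        // Hypothetical term dictionary: doc0 .. doc1999.
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < 2000; i++) {
            terms.add("doc" + i);
        }
        // A narrow prefix stays under the limit.
        System.out.println(expandPrefix(terms, "doc19", MAX_CLAUSES).size());
        // A broad prefix matches all 2000 terms and hits the limit.
        try {
            expandPrefix(terms, "doc", MAX_CLAUSES);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model, raising the limit only moves the point of failure: the work and memory still grow with the number of matching terms, which is presumably the trade-off being asked about.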

Re: Any existing query types that support equivalent of "-not interested" ?

2006-06-30 Thread Paul Elschot
On Friday 30 June 2006 19:09, markharw00d wrote: > Erik Hatcher wrote: > > > wouldn't this work? +interested -"not interested" > > Hi Erik. > Yes, sorry brain is disengaged with all the heat here - my example > wasn't great and my scenario may be more complex than I originally > outlined. I

Re: Any existing query types that support equivalent of "-not interested" ?

2006-06-30 Thread markharw00d
Erik Hatcher wrote: wouldn't this work? +interested -"not interested" Hi Erik. Yes, sorry brain is disengaged with all the heat here - my example wasn't great and my scenario may be more complex than I originally outlined. I may have 20 different ways of saying "interested" and want to q

Re: Any existing query types that support equivalent of "-not interested" ?

2006-06-30 Thread Erik Hatcher
wouldn't this work? +interested -"not interested" On Jun 30, 2006, at 11:47 AM, mark harwood wrote: As an example - I want to search for the word "interested" without finding docs that have "not" immediately preceding it. I couldn't see anything in SpanQuerys that would help and you can't

Any existing query types that support equivalent of "-not interested" ?

2006-06-30 Thread mark harwood
As an example - I want to search for the word "interested" without finding docs that have "not" immediately preceding it. I couldn't see anything in SpanQuerys that would help and you can't construct phrase queries like "-not interested". If something doesn't exist I'll probably look into writi

Re: question

2006-06-30 Thread Aleksander M. Stensby
Dammit, I just wrote a long mail, and Opera decided to delete it before I got to send it :( Well, a short version: you are misunderstanding the field functions, I think. final Document doc = new Document(); doc.add(new Field("resume_name", resume_name, Store.YES, I

RE: Lock File

2006-06-30 Thread WATHELET Thomas
OK, thanks, I understand now. Thanks a lot. -Original Message- From: Michael McCandless [mailto:[EMAIL PROTECTED] Sent: 30 June 2006 16:10 To: java-user@lucene.apache.org Subject: Re: Lock File > It's not possible to change lockDir because it's a final static > variable? > Is it possib

Re: Lock File

2006-06-30 Thread Michael McCandless
It's not possible to change lockDir because it's a final static variable? Is it possible to change the lockDir? Correct, because it's final you cannot change it directly. But, you can set the Java system property org.apache.lucene.lockDir. This will change the lock directory, because the f
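
A minimal sketch of the system-property approach described in this reply. The property name `org.apache.lucene.lockDir` comes from the message above; the class `LockDirConfig`, the helper `useLockDir`, and the example directory are hypothetical. Since the message says the field is final, the assumption here is that the property has to be set before any Lucene class that reads it is loaded.

```java
import java.io.File;

public class LockDirConfig {
    // Property name taken from the reply above.
    static final String LOCK_DIR_PROPERTY = "org.apache.lucene.lockDir";

    // Hypothetical helper: sets the property and returns the value now in effect.
    static String useLockDir(File dir) {
        System.setProperty(LOCK_DIR_PROPERTY, dir.getAbsolutePath());
        return System.getProperty(LOCK_DIR_PROPERTY);
    }

    public static void main(String[] args) {
        // Example location only; any writable directory would do.
        File dir = new File(System.getProperty("java.io.tmpdir"), "lucene-locks");
        System.out.println("lockDir = " + useLockDir(dir));
        // Assumption: Lucene classes should only be touched after this point.
    }
}
```

The same property can also be passed on the command line with the standard JVM option `-Dorg.apache.lucene.lockDir=...`, which avoids any ordering concerns inside the program.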

Re: question

2006-06-30 Thread Erick Erickson
Several things (on a "quick look" basis). I don't see where you are retrieving the document fields you index. You're indexing "resume_name" and "details" fields. But your search code is trying to get the "path" and "title" fields out of the document. They won't be there. Of course, you may not ha

Re: question

2006-06-30 Thread Aleksander M. Stensby
Don't have time for a big answer, but you are searching in a field called "contents", and from my quick glance, there is no such field? As far as I can see, you have two fields called "resume_name" and "details". So, you would have to search in the field "details". Also remember that field-names are ca

RE: Lock File

2006-06-30 Thread WATHELET Thomas
It's not possible to change lockDir because it's a final static variable? Is it possible to change the lockDir? -Original Message- From: Michael McCandless [mailto:[EMAIL PROTECTED] Sent: 29 June 2006 22:26 To: java-user@lucene.apache.org Subject: Re: Lock File > When I create an ind

Re: question

2006-06-30 Thread amit_kkumar
Hi, I am able to index the database, but searching doesn't show any results. I am sending you part of my code; please check it and tell me where the error is. code--- class DBReader { static final File INDEX_DIR = new File("index"); public static void main(String args[]) { try { // The newInstance() cal

Re: Lexical error when asterisk quantifier is first in query string

2006-06-30 Thread Erik Hatcher
On Jun 30, 2006, at 8:29 AM, Björn Ekengren wrote: queryParser.ParseException: Lexical error at line 1, column 2. Encountered: after : "" I get this when I enter a query with an asterisk (or question mark) first in the query ( "*cene" ). Is this a bug or am I doing something wrong? Wildc

Re: Are there any problems with the hits.length() in luc 1.4?

2006-06-30 Thread Aleksander M. Stensby
Maybe you didn't close and reopen your writer/reader, so you were searching with a snapshot of the old index, and after you had closed the writer and opened a new reader you got the correct result? On Fri, 30 Jun 2006 14:22:40 +0200, Marcus Falck <[EMAIL PROTECTED]> wrote: Hi,

AW: Lexical error when asterisk quantifier is first in query string

2006-06-30 Thread Johannes Christen
Leading wildcards (*) are not allowed by the query parser. See FAQ: http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-4d62118417eaef0dcb87f4370583f809848ea695 Jo -Original Message- From: Björn Ekengren [mailto:[EMAIL PROTECTED] Sent: Friday, June 30, 2006 14:30 To
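
Given that restriction, one option is to check user input before it reaches the parser, so the user gets a friendlier message than a ParseException. The sketch below is plain Java and does not call Lucene; `LeadingWildcardCheck` and `hasLeadingWildcard` are hypothetical names, and the check is deliberately simple (it splits on whitespace only and does not understand field prefixes such as `title:*cene`).

```java
public class LeadingWildcardCheck {
    // Returns true if any whitespace-separated term starts with '*' or '?'.
    static boolean hasLeadingWildcard(String query) {
        for (String term : query.trim().split("\\s+")) {
            if (term.startsWith("*") || term.startsWith("?")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] samples = { "*cene", "luc*", "foo ?ar" };
        for (String q : samples) {
            System.out.println(q + " -> leading wildcard: " + hasLeadingWildcard(q));
        }
    }
}
```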

Re: Lock File

2006-06-30 Thread Michael McCandless
I have a clustered environment, with a load-balancer in the front assigning connections. Is it better to have one machine of the cluster running a searcher as a webservice (to be accessed by the other machines in the cluster) or to have an IndexReader/Searcher for each machine in the cluster? Ahh, OK

Lexical error when asterisk quantifier is first in query string

2006-06-30 Thread Björn Ekengren
queryParser.ParseException: Lexical error at line 1, column 2. Encountered: after : "" I get this when I enter a query with an asterisk (or question mark) first in the query ( "*cene" ). Is this a bug or am I doing something wrong? /B

Are there any problems with the hits.length() in luc 1.4?

2006-06-30 Thread Marcus Falck
Hi, I'm indexing around 200 million articles in lucene. I have for the moment added around 60 articles. Using this technique: 5000 docs in RAMDir Flush RAMDir to FSDir to create segmentfile with 5000. Mergefactor 10. I'm searching using the multisearcher. When I had around 4

RE: Lucene indexing PPT

2006-06-30 Thread mcarcelen
Hello Nick! Thanks for your help, it's useful for me. Bye -Original Message- From: Nick Burch [mailto:[EMAIL PROTECTED] Sent: Friday, June 30, 2006 12:19 To: java-user@lucene.apache.org Subject: Re: Lucene indexing PPT On Fri, 30 Jun 2006, mcarcelen wrote: > I'm trying to bui

Spellchecker Download at lucene wiki outdated

2006-06-30 Thread Martin Braun
Hi all, I don't know who can update the Wiki Pages so I am just mailing here. The download of spellchecker1.1.zip contribution does not work with Lucene-2.0 anymore. http://wiki.apache.org/jakarta-lucene/SpellChecker?highlight=spellchecker1.1.zip So I wanted to build _only_ the spellcheck-contri

Re: Lucene indexing PPT

2006-06-30 Thread Nick Burch
On Fri, 30 Jun 2006, mcarcelen wrote: > I'm trying to build an index with PPT files. I have downloaded the api > POI, "poi.bin.3.0" and "poi.src.3.0", but I don't know where I > have to unzip them. I'd like to build the index by the command line, the same > way as I don't know about the lucene

Lucene indexing PPT

2006-06-30 Thread mcarcelen
Hi everybody! I'm trying to build an index with PPT files. I have downloaded the api POI, "poi.bin.3.0" and "poi.src.3.0", but I don't know where I have to unzip them. I'd like to build the index by the command line, the same way as > java -cp lucene-core-2.0.0.jar;lucene-demos-2.0.0.jar org.a

RE: HitCollector and Sort Objects

2006-06-30 Thread Ramana Jelda
Yeah! The methods you mentioned do not exist, but there are some ways to do this. TopFieldDocs:search(Query query, Filter filter, int n, Sort sort) If the above method does not serve your purpose, then my suggestion is to use the method search(Query query, Filter filter, HitCollector results) and pa