Does lucene performance suffer with a lot of empty fields ?

2006-08-01 Thread Mek

I have one generic index, but am indexing a lot of different things, like
actors, politicians, scientists, and sportsmen.

Though there are some common fields, like name and DOB, there are also
fields for each of these types of people that are different.
E.g. actors will have "Movies, TV shows, ...", politicians will have "Political
party...", and scientists will have "publications, inventions ...".

Also, I do not want to create multiple indexes, as the number of such types,
and hence the number of indices, can get out of hand; e.g. I could decide to
add "footballers" or "tennis players".

I am sure I am not the first to face this problem.


From what I gather, I can go ahead and create one index and, for each
Document, only add the relevant fields. Is this correct?
I should still be able to search with queries like "mel Movies:braveheart",
right?

Would this impact search performance?
Any other words of caution for me?

Thanks,
mek


FileNotFoundException

2006-08-01 Thread WATHELET Thomas
When the indexing process is still running on an index and I try to search
that index, I get this error message:
java.io.FileNotFoundException:
\\tradluxstmp01\JavaIndex\tra\index_EN\_2hea.fnm (The system cannot find
the file specified)

How can I solve this?



RE: Sorting

2006-08-01 Thread Chris Hostetter

: I take your point that Berkley DB would be much less clumsy, but an
: application that's already using a relational database for other purposes
: might as well use that relational database, no?

if you already have some need to access data about each matching doc from
a relational DB, then sure, you might as well let it sort for you -- but
just because your app has some DB connections open doesn't mean that's a
worthwhile reason to ask it to do the sort ... your app might have some
network connections open to an IMAP server as well ... that doesn't mean
you should convert the docs to email messages and ask the IMAP server to
sort them :)

: I'm not really with you on the random access file, Chris. Here's where I am
: up to with my [mis-]understanding...
:
: I want to sort on 2 terms. Happily these can be ints (the first is an INT
: corresponding to a 10 minute timestamp "YYMMDDHHI" and the second INT is a
: hash of a string, used to group similar documents together within those 10
: minute timestamps). When I initially warm up the FieldCache (first search
: after opening the Searcher), I start by generating two random access files
: with int values at offsets corresponding to document IDs for each of these;
: the first file would have ints corresponding to the timestamp and the second
: would have integers corresponding to the hash. I'd then need to generate a
: third file which is equivalent to an array dimensioned by document ID, with
: document IDs in compound sort order??

i'm not sure why you think you need the third file ... you should be
able to use the two files you created exactly the way the existing code
would use the two arrays if you were using an in-memory FieldCache (with
file seeks instead of array lookups) ... i think the class you want to look
at is FieldSortedHitQueue.

: In a big index, it will take a while to walk through all of the documents to
: generate the first two random access files and the sort process required to
: generate the sorted file is going to be hard work.

well ... yes, but that's the trade-off: the reason for the RAM-based
FieldCache is speed. if you don't have that RAM to use, then doing
the same things on disk gets slower.
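The "file seeks instead of array lookups" idea can be sketched with plain java.io. The file name and the int values below are invented for illustration; in a real warm-up pass the per-document values would come from walking the index terms, as the in-memory FieldCache does.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class DiskIntArray {
    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("fieldcache", ".bin");
        f.deleteOnExit();
        int[] values = {170512, 170501, 170530};  // e.g. "YYMMDDHHI"-style ints

        // Warm-up pass: write the per-document value at offset docId * 4.
        RandomAccessFile raf = new RandomAccessFile(f, "rw");
        for (int docId = 0; docId < values.length; docId++) {
            raf.seek((long) docId * 4);
            raf.writeInt(values[docId]);
        }

        // Comparator-style lookup: a seek plus a read instead of an
        // array index.
        raf.seek(1L * 4);
        System.out.println(raf.readInt());  // the value stored for docId 1
        raf.close();
    }
}
```

Each lookup costs a disk seek instead of an array access, which is exactly the speed trade-off described above.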


Bear in mind, there have been some improvements recently to the ability to
grab individual stored fields per document (FieldSelector is the name of
the class i think) ... i haven't tried those out yet, but they could make
Sorting on a stored field (which wouldn't require building up any cache -
RAM or Disk based) feasible regardless of the size of your result sets ...
but i haven't tried that yet.



-Hoss


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: EMAIL ADDRESS: Tokenize (i.e. an EmailAnalyzer)

2006-08-01 Thread Chris Hostetter

: Sure I would love to!  Can you ping me at [EMAIL PROTECTED] and
: let me know what I need to do?   Do I just post it to JIRA?

instructions on submitting code can be found in the wiki..

http://wiki.apache.org/jakarta-lucene/HowToContribute

note in particular that since you are primarily submitting new files,
you'll need to "svn add" them locally in order for them to be included in
patches created by "svn diff".

As for where it might make sense for them to live: there is an existing
"contrib/analyzers" package which might make the most sense.

Also note that while test cases aren't strictly mandatory for newly
contributed code, they go a long way towards documenting expected
behavior, and encouraging committers to commit it :)



-Hoss





Re: dash-words

2006-08-01 Thread Martin Braun
Hi Yonik,

>> So a Phrase search to "The xmen story" will fail. With a slop of 1 the
>> doc will be found.
>>
>> But when generating the query I won't know when to use a slop. So adding
>> slops isn't a nice solution.
> 
> If you can't tolerate slop, this is a problem.

I use the WordDelimiterFilter now without slop, because in other cases
it's an improvement. But I (or rather my app) have now stumbled over a
non-phrase query:

If I am searching for a title like this (sorry for the German example):

"lage der arbeiterjugend in westberlin" (indexed with
WordDelimiterFilter + lowercase)

with a query like this +arbeiterjugend +west-berlin I get no results.

org.apache.lucene.queryParser.QueryParser.parse makes this query (with
WordDelimiterFilter) with Default QueryParser.AND_OPERATOR:

+titel:arbeiterjugend +titel:"west (berlin westberlin)"

with +arbeiterjugend +westberlin I get the result.

It seems that the synonyms don't work with the query. How do you solve
this in Solr? Do I have to build a TermQuery?

thanks in advance,

martin










RE: Sorting

2006-08-01 Thread Rob Staveley (Tom)
>  file seeks instead of array lookups

I'm with you now. So you do seeks in your comparator. For a large index you
might as well use java.io.RandomAccessFile for the "array", because there
would be little value in buffering when the comparator is liable to jump all
around the file. This sounds very expensive, though. If you don't open a
Searcher too frequently, it makes sense (in my muddled mind) to pre-sort to
reduce the number of seeks. That was the half-baked idea of the third file,
which essentially orders document IDs.
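For what it's worth, when both sort keys fit in 32 bits, a "third file" can sometimes be avoided by packing the two ints into a single long per document, so one file of longs already encodes the compound order. A sketch with invented values, assuming the timestamp int is non-negative:

```java
public class CompoundKey {
    // Primary key (timestamp) goes in the high 32 bits, secondary key
    // (the hash, treated as unsigned) in the low 32 bits: comparing the
    // packed longs sorts by timestamp first, then by hash.
    static long pack(int timestamp, int hash) {
        return ((long) timestamp << 32) | (hash & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        long earlier = pack(170501, 500);  // older timestamp wins ...
        long a = pack(170512, 7);          // ... then ties break on hash
        long b = pack(170512, 99);
        System.out.println(earlier < a && a < b);
    }
}
```

One seek into a single file of longs then replaces two seeks into two files of ints.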

> Bear in mind, there have been some improvements recently to the ability to
grab individual stored fields per document

I can't see anything like that in 2.0. Is that something in the Lucene HEAD
build?





searching oracle database records using apache Lucene

2006-08-01 Thread Sandip

Hi All,

I am confused about Apache Lucene.

I want to search my database table records using Apache Lucene.
But what I found is that Lucene is a full-text search engine. Does this mean
it is only used to search document text, or can it do anything else?

I want to search my database with a query like:

select * from tableName where username="abc";

using Apache Lucene.

I am using Oracle 9i/Java for this.
Any ideas/links/suggestions would be much appreciated.

Thanks in advance
Sandip Patil.
-- 
View this message in context: 
http://www.nabble.com/searching-oracle-databse-records-using-apache-Lucene-tf2032743.html#a5591986
Sent from the Lucene - Java Users forum at Nabble.com.





Search in Slide with Lucene

2006-08-01 Thread aslam bari
Dear All,
I am facing an unknown situation. I am using WebDAV search; it is working
fine, and I know it is slower than Lucene. I am using jakarta-slide-2.1 and
lucene-2.1. I have configured my domain.xml file as:

  ./index

I saw that the store/index folder is getting created, but when searching,
Slide is not using the Lucene index.

Can anybody tell me:
1) Am I using the right versions of Slide and Lucene?
2) How can I search Slide using the Lucene index?
3) What will be the structure of the query?

Thanks in advance...



Re: searching oracle database records using apache Lucene

2006-08-01 Thread amit_kkumar
hi sandip,

first get all those fields on which you want to search, and store
them in some variables.

then apply indexing with those variables.

then fire your search query.
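The steps above might look something like this sketch. The table, columns, connection details, and index path are all made up, it uses the Lucene 2.0 API plus JDBC, and error handling is omitted:

```java
import java.sql.*;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

// 1) fetch the fields you want to search from the database
Connection con = DriverManager.getConnection(
        "jdbc:oracle:thin:@dbhost:1521:orcl", "user", "password");
Statement st = con.createStatement();
ResultSet rs = st.executeQuery("select username, profile_text from users");

// 2) index those fields, one Lucene Document per row
IndexWriter writer = new IndexWriter("/tmp/users-index",
        new StandardAnalyzer(), true);
while (rs.next()) {
    Document doc = new Document();
    doc.add(new Field("username", rs.getString("username"),
            Field.Store.YES, Field.Index.UN_TOKENIZED));
    doc.add(new Field("profile", rs.getString("profile_text"),
            Field.Store.YES, Field.Index.TOKENIZED));
    writer.addDocument(doc);
}
writer.close();
con.close();

// 3) fire your search query against the index, e.g. username:abc
```

Note the equality-style lookup (username:abc) only works if the field was indexed UN_TOKENIZED, as above.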

regards

amit kumar

DISCLAIMER
==
This e-mail may contain privileged and confidential information which is the 
property of Persistent Systems Pvt. Ltd. It is intended only for the use of the 
individual or entity to which it is addressed. If you are not the intended 
recipient, you are not authorized to read, retain, copy, print, distribute or 
use this message. If you have received this communication in error, please 
notify the sender and delete all copies of this message. Persistent Systems 
Pvt. Ltd. does not accept any liability for virus infected mails.




Re: FileNotFoundException

2006-08-01 Thread Michael McCandless



> When the indexing process is still running on an index and I try to search
> that index, I get this error message:
> java.io.FileNotFoundException:
> \\tradluxstmp01\JavaIndex\tra\index_EN\_2hea.fnm (The system cannot find
> the file specified)
>
> How can I solve this?


Could you provide some more context about your application or a small 
test case that shows the error happening?  This sounds likely to be a 
locking issue.


Mike




RE: FileNotFoundException

2006-08-01 Thread WATHELET Thomas
For the indexing process I use the IndexModifier class.
The error happens when I try to search the index while the indexing
process is still running.

The code for indexing:

System.setProperty("org.apache.lucene.lockDir",
        System.getProperty("user.dir"));
File folder = new File(getIndexPath());
Directory dir = null;
if (folder.isDirectory() && folder.exists()) {
    dir = FSDirectory.getDirectory(getIndexPath(), false);
} else if (!folder.isFile() && !folder.exists()) {
    dir = FSDirectory.getDirectory(getIndexPath(), true);
} else {
    System.out.println("Bad index folder");
    System.exit(1);
}
boolean newIndex = true;
if (dir.fileExists("segments")) {
    newIndex = false;
}
// long lastindexation = dir.fileModified("segments");
writer = new IndexModifier(dir, new SimpleAnalyzer(), newIndex);
dir.close();
writer.setUseCompoundFile(true);
...

The code for searching:

MultiSearcher multisearch = new MultiSearcher(indexsearcher);
Hits hits = this.multisearch.search(this.getBoolQuery());
...








Re: Search in Slide with Lucene

2006-08-01 Thread Erik Hatcher
I believe you'll need to inquire with the Slide community, which  
unfortunately is a bit inactive lately.


Erik





Re: Does lucene performance suffer with a lot of empty fields ?

2006-08-01 Thread Erick Erickson

I can't speak to performance, but there's no problem having different fields
for different documents. Stated differently, you don't need to have all
fields in all documents. It took me a while to get my head out of database
tables and accept this.

I doubt there's a problem with speed, but as always, some measurements over
your particular data count most.

Erick
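A minimal sketch of what this looks like with the Lucene 2.0 field API. The names, values, and index path are made up; the point is that each Document gets only the fields that apply to it:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

// One index, different document "shapes".
IndexWriter writer = new IndexWriter("/tmp/people-index",
        new StandardAnalyzer(), true);

Document actor = new Document();
actor.add(new Field("name", "Mel Gibson",
        Field.Store.YES, Field.Index.TOKENIZED));
actor.add(new Field("Movies", "Braveheart",
        Field.Store.YES, Field.Index.TOKENIZED));
writer.addDocument(actor);

Document scientist = new Document();
scientist.add(new Field("name", "Marie Curie",
        Field.Store.YES, Field.Index.TOKENIZED));
scientist.add(new Field("publications", "Recherches sur les substances radioactives",
        Field.Store.YES, Field.Index.TOKENIZED));
writer.addDocument(scientist);  // no "Movies" field here, and that's fine

writer.close();
```

A query like name:mel AND Movies:braveheart simply never matches the scientist document, since a term query on a field a document lacks matches nothing.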



Re: searching oracle database records using apache Lucene

2006-08-01 Thread Erick Erickson

You're absolutely right: Lucene is a text searching tool, not a database
tool. There's no point in trying to jump through hoops to use Lucene if your
database already works for you.

If you're trying to do text searches, particularly if you want to ask
questions like "find the words biggest and large within 5 words of each
other", then you might want to think about Lucene. Or even if you just want
to make simple searches over text.

But to select rows from a database table, there's no reason to try to use
Lucene. Use a database API instead.

Best
Erick




Re: FileNotFoundException

2006-08-01 Thread Erick Erickson

two things come to mind

1> are you absolutely sure that your reader and writer are pointing to the
same place? Really, absolutely, positively sure? You've hard-coded the path
into both writer and reader just to be really, absolutely positively sure?
Or, you could let the writer close and *then* try the reader to see if it's
a timing issue or a path issue.

2> You say that the indexer is still open. Is there any chance it hasn't yet
written anything to disk? I'm not sure of the internals, but there has been
some discussion that internally a writer uses a RAMDirectory for a while and
then periodically flushes the results to disk. It's possible that your
writer hasn't written anything yet.

3> (so I can't count). Have you used Luke to open your index to see if that
works (and the file is in the place you expect)?

FWIW
Erick





RE: FileNotFoundException

2006-08-01 Thread WATHELET Thomas
It's the same when I try to open the index with Luke.




Re: searching oracle database records using apache Lucene

2006-08-01 Thread Vasily Borisov
Erick,
I'm not sure that is entirely true. E.g. in the E&P industry we have a
bunch of legacy relational databases that are tremendously complex.
Therefore the presentation layer for them is never good, since the user is
exposed to the data model complexity every time he uses the database.

So, giving up on the structured API idea, flattening out the content and
indexing it seems to be not a bad idea at all. Especially if you are facing
the problem of integrating results from several legacy databases, or
integrating a flat file store with a database.

Regards,
 Vasily   


---
Vasily Borisov
Director Business Development
Kadme AS
Tel: +47 51 87 42 54
Fax: + 47 51 87 42 51
Mob: +47 45 20 40 42




Re: FileNotFoundException

2006-08-01 Thread Erick Erickson

So it sounds like you're not writing the index to the place you think you
are. Have you just looked in the directories and checked that there are
files there? If Luke can't find them, they're not where you think they are.
Especially if your writer had closed before you looked.

Erick



Re: searching oracle database records using apache Lucene

2006-08-01 Thread Erick Erickson

I agree completely. I was mostly responding to what appeared to be an
attempt to use Lucene to actually execute a database query, which is
entirely different from restructuring legacy data into a more usable form as
you point out, and in that case all bets are off.

Erick

On 8/1/06, Vasily Borisov <[EMAIL PROTECTED]> wrote:


Eric,
I'm sure that is entirely true. E.g. in E&P industry we have a bunch of
legacy relational databases that are tremendously complex.
Therefore the presentation layer for them is never good since
the user is exposed to the data model complexity every time he uses this
database.

So, giving up on the structured API idea, flattening out the content and
indexing is seems to be not a bad idea at all. Especially if you are
facing the problem of integrating results from several legacy databases
or integrating the flat file store with a database.

Regards,
 Vasily


On Tue, 2006-08-01 at 09:18 -0400, Erick Erickson wrote:
> You're absolutely right, lucene is a text searching tool, not a database
> tool. There's no point in trying to jump through hoops to use lucene if
your
> database already works for you.
>
> If you're trying to do text searches, particularly if want to ask
questions
> like "find the words biggest and large within 5 words of each other",
then
> you might want to think about lucene. Or even if you want to just make
> simple searches over text.
>
> But to select rows from a database table, there's no reason to try to
use
> lucene. Use a database API instead.
>
> Best
> Erick
>
> On 8/1/06, Sandip <[EMAIL PROTECTED]> wrote:
> >
> >
> > Hi All,
> >
> > I am confused about Apache Lucene.
> >
> > I want to search my database table records using Apache Lucene.
> > But what I found is that Lucene is a full-text search engine. Does this
> > mean it is only used to search document text, or can it do anything else?
> >
> > I want to search my database with a query like, e.g.:
> >
> > select * from tableName where username="abc";
> > using Apache Lucene.
> >
> > I am using Oracle 9i/Java for this.
> > Any ideas/links/suggestions will be very much appreciated.
> >
> > Thanks in advance
> > Sandip Patil.
> > --
> > View this message in context:
> >
http://www.nabble.com/searching-oracle-databse-records-using-apache-Lucene-tf2032743.html#a5591986
> > Sent from the Lucene - Java Users forum at Nabble.com.
> >
> >
> >
> >
---
Vasily Borisov
Director Business Development
Kadme AS
Tel: +47 51 87 42 54
Fax: + 47 51 87 42 51
Mob: +47 45 20 40 42





Re: searching oracle database records using apache Lucene

2006-08-01 Thread karl wettin
On Tue, 2006-08-01 at 15:32 +0200, Vasily Borisov wrote:
> the presentation layer for them is never good since the user is
> exposed to the data model complexity 

Isn't that why we have facades?





RE: FileNotFoundException

2006-08-01 Thread WATHELET Thomas
I'm sure that it's the correct location.
When the indexing process is finished I can access the index.
I know why this happens, but I don't know how to solve it.
When I index a lot of files, files with the .cfs extension are created,
and after a few seconds each file is merged into another file.
For example:
I have a file with the name _8df.cfs, and after a few seconds this file
disappears (because it was merged into another file with a new name), so
the IndexSearcher can't find it.

-Original Message-
From: Erick Erickson [mailto:[EMAIL PROTECTED] 
Sent: 01 August 2006 15:49
To: java-user@lucene.apache.org
Subject: Re: FileNotFoundException

So it sounds like you're not writing the index to the place you think
you
are. Have you just looked in the directories and checked that there are
files there? If Luke can't find them, they're not where you think they
are.
Especially if your writer had closed before you looked.

Erick

On 8/1/06, WATHELET Thomas <[EMAIL PROTECTED]> wrote:
>
> It's the same when I try to open the index with luke
>
> -Original Message-
> From: Erick Erickson [mailto:[EMAIL PROTECTED]
> Sent: 01 August 2006 15:24
> To: java-user@lucene.apache.org
> Subject: Re: FileNotFoundException
>
> two things come to mind
>
> 1> are you absolutely sure that your reader and writer are pointing to
> the
> same place? Really, absolutely, positively sure? You've hard-coded the
> path
> into both writer and reader just to be really, absolutely positively
> sure?
> Or, you could let the writer close and *then* try the reader to see if
> it's
> a timing issue or a path issue.
>
> 2> You say that the indexer is still open. Is there any chance it
hasn't
> yet
> written anything to disk? I'm not sure of the internals, but there has
> been
> some discussion that internally a writer uses a RAMdir for a while
then
> periodically flushes the results to disk. It's possible that you're
> writer
> hasn't written anything yet.
>
> 3> (so I can't count). Have you used Luke to open your index to see if
> that
> works (and the file is in the place you expect)?
>
> FWIW
> Erick
>
> On 8/1/06, WATHELET Thomas <[EMAIL PROTECTED]> wrote:
> >
> > For the index process I use IndexModifier class.
> > That happens when I try to search something into the index in the
same
> > time that the index process still running.
> >
> > the code for indexing:
> >   System.setProperty("org.apache.lucene.lockDir", System
> > .getProperty("user.dir"));
> > File folder = new File(getIndexPath());
> > Directory dir = null;
> > if (folder.isDirectory() && folder.exists()) {
> > dir = FSDirectory.getDirectory(getIndexPath(), false);
> > } else if (!folder.isFile() && !folder.exists()) {
> > dir = FSDirectory.getDirectory(getIndexPath(), true);
> > } else {
> > System.out.println("Bad index folder");
> > System.exit(1);
> > }
> > boolean newIndex = true;
> > if (dir.fileExists("segments")) {
> > newIndex = false;
> > }
> > // long lastindexation = dir.fileModified("segments");
> > writer = new IndexModifier(dir, new SimpleAnalyzer(),
> newIndex);
> > dir.close();
> > writer.setUseCompoundFile(true);
> >   ...
> >
> > Code For searching:
> >
> >   MultiSearcher multisearch = new
> MultiSearcher(indexsearcher);
> >   Hits hits = this.multisearch.search(this.getBoolQuery());
> >   ...
> >
> > -Original Message-
> > From: Michael McCandless [mailto:[EMAIL PROTECTED]
> > Sent: 01 August 2006 13:45
> > To: java-user@lucene.apache.org
> > Subject: Re: FileNotFoundException
> >
> >
> > > When the indexing process still running on a index and I try to
> search
> > > something on this index I retrive this error message:
> > > java.io.FileNotFoundException:
> > > \\tradluxstmp01\JavaIndex\tra\index_EN\_2hea.fnm (The system
cannot
> > find
> > > the file specified)
> > >
> > > How can I solve this.
> >
> > Could you provide some more context about your application or a
small
> > test case that shows the error happening?  This sounds likely to be
a
> > locking issue.
> >
> > Mike
> >
> >
> >
> >
> >
> >
> >
> >
> >
>
>
>
>
>






Re: FileNotFoundException

2006-08-01 Thread Supriya Kumar Shyamal
I think it's a directory access synchronisation problem; I have also
posted about this before. The scenario can be like this:

When the IndexWriter object is created it reads the segment information from
the file "segments", which is nothing but a list of index files (.cfs and
several other types). At the same time an IndexSearcher object is created,
which also builds its list of index files from the segments file. Then you
invoke some write operation, which triggers index merging, fragmenting etc.,
and that modifies the file list in the segments file. But the IndexSearcher
object still holds the old file list, and that probably throws the
FileNotFoundException because the file is physically no longer there.


Maybe I am wrong, but I'm trying to shed some light on this issue.

I posted a similar problem with the subject "FileNotFoundException: occurs
during the optimization of index"; I am experiencing the same problem when
the index optimization task runs on the index while the search function is
running in parallel.


thx,
supriya
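The stale-file-list failure described above can be reproduced in miniature without Lucene at all; the segment file name below is purely illustrative:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

public class StaleListDemo {
    // Simulates a searcher that cached a segment file name (as if read from
    // "segments"), after which a merge deleted the file on disk. Returns
    // what the subsequent open attempt produced.
    static String openViaStaleList() throws IOException {
        File segment = File.createTempFile("_8df", ".cfs");
        String cachedName = segment.getPath(); // the searcher's cached list
        segment.delete();                      // a "merge" removes the file
        try {
            new FileInputStream(cachedName).close();
            return "opened";
        } catch (FileNotFoundException e) {
            return "FileNotFoundException";
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(openViaStaleList()); // prints FileNotFoundException
    }
}
```

The exception is exactly the one reported in this thread: the path in the searcher's cached list no longer exists on disk.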


RE: FileNotFoundException

2006-08-01 Thread WATHELET Thomas
Have you solved this problem?


Re: FileNotFoundException

2006-08-01 Thread Supriya Kumar Shyamal
I should say not exactly. The temporary solution I made is that I always
copy the existing index to a different directory, run the modification or
optimization task there, and then copy it back; something like a flip-flop
mechanism:

current index <-- searcher
copy to --> temp index <-- run optimization
temp index <-- switch searcher, so the searcher points to the temp index
copy back --> current index <-- switch the searcher back again

This is still a critical issue, and there is a promise in Lucene that
the locking mechanism will be much more sophisticated in a future release.


Thanks,
supriya
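The flip-flop above hinges on searchers resolving the index location through one mutable reference that the updater swaps atomically. A minimal sketch of just that switch (the directory names and the `runCycle` helper are hypothetical; the copy and optimization steps are elided):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicReference;

public class FlipFlopDemo {
    // Searchers (re)open against whatever directory this reference holds;
    // the updater swaps it atomically, so a search never sees a half-state.
    static final AtomicReference<Path> activeIndex = new AtomicReference<>();

    // One flip-flop cycle: point searchers at the temp copy while the
    // current index is optimized and copied back, then switch them back.
    static Path runCycle(Path current, Path temp) {
        activeIndex.set(current); // searchers use the current index
        // ... copy current -> temp, run optimization on temp ...
        activeIndex.set(temp);    // switch searchers to the optimized copy
        // ... copy temp -> current ...
        activeIndex.set(current); // switch the searchers back again
        return activeIndex.get();
    }

    public static void main(String[] args) throws IOException {
        Path current = Files.createTempDirectory("index-current");
        Path temp = Files.createTempDirectory("index-temp");
        System.out.println(runCycle(current, temp).equals(current));
    }
}
```

The `AtomicReference` swap is what keeps a running search from ever reading a directory that is mid-copy.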


Re: FileNotFoundException

2006-08-01 Thread Michael McCandless


I think it's a directory access synchronisation problem; I have also
posted about this before. The scenario can be like this:

When the IndexWriter object is created it reads the segment information from
the file "segments", which is nothing but a list of index files (.cfs and
several other types). At the same time an IndexSearcher object is created,
which also builds its list of index files from the segments file. Then you
invoke some write operation, which triggers index merging, fragmenting etc.,
and that modifies the file list in the segments file. But the IndexSearcher
object still holds the old file list, and that probably throws the
FileNotFoundException because the file is physically no longer there.

Maybe I am wrong, but I'm trying to shed some light on this issue.

I posted a similar problem with the subject "FileNotFoundException: occurs
during the optimization of index"; I am experiencing the same problem when
the index optimization task runs on the index while the search function is
running in parallel.


Lucene has file-based locking for exactly this reason.  Can you 
double-check that the same lockDir is being used in both your 
IndexModifier process and your searching process?


Also: this directory can't be an NFS mount -- there are known problems 
with the current Lucene locking implementation and NFS file systems. 
Are you using NFS?


Mike




RE: FileNotFoundException

2006-08-01 Thread WATHELET Thomas
Yes 

-Original Message-
From: Michael McCandless [mailto:[EMAIL PROTECTED] 
Sent: 01 August 2006 17:10
To: java-user@lucene.apache.org
Subject: Re: FileNotFoundException



Lucene has file-based locking for exactly this reason.  Can you 
double-check that the same lockDir is being used in both your 
IndexModifier process and your searching process?

Also: this directory can't be an NFS mount -- there are known problems 
with the current Lucene locking implementation and NFS file systems. 
Are you using NFS?

Mike








Re: FileNotFoundException

2006-08-01 Thread Michael McCandless


Yes 


Yes, you're certain you have the same lock dir for both modifier & 
search process?


Or, Yes you're using NFS as your lock dir?

Or, both?

Mike




RE: FileNotFoundException

2006-08-01 Thread WATHELET Thomas
OK, if I understood correctly, I have to put the lock file in the same place
in both my indexing process and my searching process.

-Original Message-
From: Michael McCandless [mailto:[EMAIL PROTECTED] 
Sent: 01 August 2006 17:14
To: java-user@lucene.apache.org
Subject: Re: FileNotFoundException


> Yes 

Yes, you're certain you have the same lock dir for both modifier & 
search process?

Or, Yes you're using NFS as your lock dir?

Or, both?

Mike








Re: FileNotFoundException

2006-08-01 Thread Michael McCandless



OK, if I understood correctly, I have to put the lock file in the same place
in both my indexing process and my searching process.


That's correct.

And, that place can't be an NFS mounted directory (until we fix locking 
implementation...).


The two different processes will use this lock file to make sure it's 
safe to read from or write to the files in the index.


Mike
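As an illustration only (Lucene's commit lock at the time was implemented differently, via plain lock files, not `java.nio`), the kind of cross-process mutual exclusion a shared lock file provides can be sketched with `FileLock`; the `withLock` helper is hypothetical:

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

public class LockFileDemo {
    // Acquire an exclusive lock on a shared lock file, run the critical
    // section (e.g. reading or rewriting "segments"), then release the lock.
    // Both processes must agree on the same lock-file path for this to work.
    static String withLock(File lockFile, Runnable criticalSection) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(lockFile, "rw");
             FileChannel channel = raf.getChannel();
             FileLock lock = channel.lock()) {
            criticalSection.run();
            return lock.isValid() ? "locked" : "unlocked";
        }
    }

    public static void main(String[] args) throws Exception {
        File lockFile = File.createTempFile("commit", ".lock");
        System.out.println(withLock(lockFile,
                () -> System.out.println("safe to read/write the index")));
    }
}
```

The key point from Mike's advice is the agreement on the lock-file location: if the writer and the searcher lock different paths, they exclude nobody.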




RE: FileNotFoundException

2006-08-01 Thread WATHELET Thomas
Ok thanks a lot.

-Original Message-
From: Michael McCandless [mailto:[EMAIL PROTECTED] 
Sent: 01 August 2006 17:19
To: java-user@lucene.apache.org
Subject: Re: FileNotFoundException


> Ok if I well understood I have to put the lock file at the same place
in
> my indexing process and searching process. 

That's correct.

And, that place can't be an NFS mounted directory (until we fix locking 
implementation...).

The two different processes will use this lock file to make sure it's 
safe to read from or write to the files in the index.

Mike








Re: FileNotFoundException

2006-08-01 Thread Supriya Kumar Shyamal
Yes, I use an NFS mount to share the index with the other search instances,
and all the instances have the same lock directory configured. The only
difference is that the NFS mount is read-only, so I had to disable the lock
mechanism for the search instances; locking is only enabled for the index
modification instance. We have a 6-node JBoss cluster for our application,
so 5 instances of JBoss search on the same index and the 6th instance is
used for index updates.


supriya

Michael McCandless wrote:


I think it's a directory access synchronisation problem; I have also
posted about this before. The scenario can be like this:

When the IndexWriter object is created it reads the segment information from
the file "segments", which is nothing but a list of index files (.cfs and
several other types). At the same time an IndexSearcher object is created,
which also builds its list of index files from the segments file. Then you
invoke some write operation, which triggers index merging, fragmenting etc.,
and that modifies the file list in the segments file. But the IndexSearcher
object still holds the old file list, and that probably throws the
FileNotFoundException because the file is physically no longer there.

Maybe I am wrong, but I'm trying to shed some light on this issue.

I posted a similar problem with the subject "FileNotFoundException: occurs
during the optimization of index"; I am experiencing the same problem when
the index optimization task runs on the index while the search function is
running in parallel.


Lucene has file-based locking for exactly this reason.  Can you 
double-check that the same lockDir is being used in both your 
IndexModifier process and your searching process?


Also: this directory can't be an NFS mount -- there are known problems 
with the current Lucene locking implementation and NFS file systems. 
Are you using NFS?


Mike







--
Mit freundlichen Grüßen / Regards

Supriya Kumar Shyamal

Software Developer
tel +49 (30) 443 50 99 -22
fax +49 (30) 443 50 99 -99
email [EMAIL PROTECTED]
___
artnology GmbH
Milastr. 4
10437 Berlin
___

http://www.artnology.com
__

News / Current projects:
* artnology wins the tender of the German Federal Ministry of the
  Interior: a software solution for managing the federal collection of
  contemporary artworks for cultural representation.

Project references:
* Global eShop and corporate site for Springer: www.springeronline.com
* E-detailing portal for Novartis: www.interaktiv.novartis.de
* Service-center platform for Biogen: www.ms-life.de
* eCRM system for Grünenthal: www.gruenenthal.com

___ 






Re: FileNotFoundException

2006-08-01 Thread Michael McCandless


Yes, I use an NFS mount to share the index with the other search instances,
and all the instances have the same lock directory configured. The only
difference is that the NFS mount is read-only, so I had to disable the lock
mechanism for the search instances; locking is only enabled for the index
modification instance. We have a 6-node JBoss cluster for our application,
so 5 instances of JBoss search on the same index and the 6th instance is
used for index updates.


OK unfortunately this won't work.

Well, it will "work" but you'll hit occasional FileNotFoundExceptions on 
your searchers, whenever a searcher tries to restart itself while the 
updater is writing a new segments file.


Even though the searchers are read-only, they still need to briefly
hold the commit lock to ensure the updater doesn't write a new segments
file while a searcher is reading it (and opening each segment).


We are working towards a fix for lock files over NFS mounts, first by 
decoupling locking from directory implementation 
(http://issues.apache.org/jira/browse/LUCENE-635) and second by creating 
better LockFactory implementations for different cases (eg at least a 
locking implementation based on native OS locks).  But this is still in 
process...


I think the best workaround for now is to take an approach like Solr:

  http://incubator.apache.org/solr/features.html
  http://incubator.apache.org/solr/tutorial.html

whereby the single writer will occasionally (at a known safe time) make 
a snapshot of its index, and then the multiple searchers can switch to 
that index once it's safe.


Mike




Re: Search matching

2006-08-01 Thread Simon Willnauer

I guess so, but without any information about your code nobody can tell
what's wrong. If you provide more information you will get help!

regards simon

On 8/1/06, Rajiv Roopan <[EMAIL PROTECTED]> wrote:

Hello, I have an index of locations for example. I'm indexing one field
using SimpleAnalyzer.

doc1: albany ny
doc2: hudson ny
doc3: new york ny
doc4: new york mills ny

when I search for "new york ny", the first result returned is always "new
york mills ny". Am I doing something incorrect?

thanks in advance,
rajiv







Re: Search matching

2006-08-01 Thread Rajiv Roopan

Ok, this is how I'm indexing. Both in indexing and searching I'm using
SimpleAnalyzer()

String loc = "New York, NY";
doc.add(new Field("location", loc, Field.Store.NO, Field.Index.TOKENIZED));

String loc2 = "New York Mills, NY";
doc.add(new Field("location", loc2, Field.Store.NO, Field.Index.TOKENIZED));


and this is how I'm searching...

String searchStr = "New York, NY";
Analyzer analyzer = new SimpleAnalyzer();
QueryParser parser = new QueryParser("location", analyzer);
parser.setDefaultOperator(QueryParser.AND_OPERATOR);
Query query = parser.parse(searchStr);

Hits hits = searcher.search(query);

I've tried all query types, and every time "new york mills, ny" is in hits(0).
Both results have a score of 1.0. I know I can add some kind of sort to
always put the shorter field first, but shouldn't "new york, ny" rank first
by default, due to the scoring algorithm, because it's a shorter field?

let me know if i'm missing something. thanks!

rajiv

On 8/1/06, Simon Willnauer <[EMAIL PROTECTED]> wrote:


I guess so, but without any information about your code nobody can tell what.
If you provide more information you will get help!

regards simon

On 8/1/06, Rajiv Roopan <[EMAIL PROTECTED]> wrote:
> Hello, I have an index of locations for example. I'm indexing one field
> using SimpleAnalyzer.
>
> doc1: albany ny
> doc2: hudson ny
> doc3: new york ny
> doc4: new york mills ny
>
> when I search for "new york ny" , the first result returned is always
"new
> york mills ny". Am I doing something incorrect?
>
> thanks in advance,
> rajiv
>
>





Re: FileNotFoundException

2006-08-01 Thread Michael McCandless



For the indexing process I use the IndexModifier class.
That happens when I try to search the index at the same time that the
indexing process is still running. 


the code for indexing:
  System.setProperty("org.apache.lucene.lockDir", System.getProperty("user.dir"));
  File folder = new File(getIndexPath());
  Directory dir = null;
  if (folder.isDirectory() && folder.exists()) {
      dir = FSDirectory.getDirectory(getIndexPath(), false);
  } else if (!folder.isFile() && !folder.exists()) {
      dir = FSDirectory.getDirectory(getIndexPath(), true);
  } else {
      System.out.println("Bad index folder");
      System.exit(1);
  }
  boolean newIndex = true;
  if (dir.fileExists("segments")) {
      newIndex = false;
  }
  // long lastindexation = dir.fileModified("segments");
  writer = new IndexModifier(dir, new SimpleAnalyzer(), newIndex);
  dir.close();
  writer.setUseCompoundFile(true);
  ...


BTW, one thing that I don't think is right is the "dir.close()" 
statement after you create the IndexModifier.  I think you should not 
call dir.close() until you are done with the IndexModifier (i.e., at the 
same time you call IndexModifier.close()).


It sounds like it's unrelated to your NFS locking issue but still could 
cause other problems...


Mike




Re: Search matching

2006-08-01 Thread Erik Hatcher

Rajiv,

Have a look at the details provided by IndexSearcher.explain() for  
those documents, and you'll get some insight into the factors used to  
rank them.  Since both scores are 1.0, you'll probably want to  
implement your own custom Similarity and override the lengthNorm() to  
adjust that factor.
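To see why both documents tie at exactly 1.0: the norm (which includes lengthNorm, by default 1/sqrt(numTerms)) is squeezed into a single byte with only a 3-bit mantissa before it is stored, so nearby values collapse together. The sketch below models that encoding (patterned after Lucene's SmallFloat-style 3-mantissa-bit encoding; treat the exact byte layout as an assumption for your version) and shows that a 3-term field and a 4-term field decode to the same stored norm:

```java
public class NormPrecision {
    // Model of a 3-mantissa-bit, zero-exponent-15 byte encoding for norms.
    static byte floatToByte315(float f) {
        int bits = Float.floatToRawIntBits(f);
        int smallfloat = bits >> (24 - 3);   // keep sign, exponent, 3 mantissa bits
        if (smallfloat <= ((63 - 15) << 3)) {
            return (bits <= 0) ? (byte) 0 : (byte) 1;  // underflow
        }
        if (smallfloat >= ((63 - 15) << 3) + 0x100) {
            return -1;                                  // overflow
        }
        return (byte) (smallfloat - ((63 - 15) << 3));
    }

    static float byte315ToFloat(byte b) {
        if (b == 0) return 0.0f;
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;             // restore the exponent bias
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        float norm3 = (float) (1.0 / Math.sqrt(3));  // "new york ny"
        float norm4 = (float) (1.0 / Math.sqrt(4));  // "new york mills ny"
        // Both round to the same byte, hence identical length factors:
        System.out.println(byte315ToFloat(floatToByte315(norm3)));  // prints: 0.5
        System.out.println(byte315ToFloat(floatToByte315(norm4)));  // prints: 0.5
    }
}
```

This is why overriding lengthNorm() (or expanding the query, below in the thread) is needed to break the tie; the one-byte norm simply cannot distinguish 3 terms from 4.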


Another technique you can use is to expand a user's query into a more  
sophisticated boolean query, such that a user's query for "new york  
ny" would become (in Query.toString format): +new +york +ny "new york  
ny", which would boost exact matches.


Erik


On Aug 1, 2006, at 1:19 PM, Rajiv Roopan wrote:


Ok, this is how I'm indexing. Both in indexing and searching I'm using
SimpleAnalyzer()

String loc = "New York, NY";
doc.add(new Field("location", loc, Field.Store.NO,  
Field.Index.TOKENIZED));


String loc2 = "New York Mills, NY";
doc.add(new Field("location", loc2, Field.Store.NO,  
Field.Index.TOKENIZED

));


and this is how I'm searching...

 String searchStr = "New York, NY";
   Analyzer analyzer = new SimpleAnalyzer();
   QueryParser parser = new QueryParser("location", analyzer);
   parser.setDefaultOperator(QueryParser.AND_OPERATOR);
   Query query = parser.parse( searchStr );

  Hits hits = searcher.search( query );

I've tried all query types, and every time "new york mills, ny" is in
hits(0). Both results have a score of 1.0. I know I can add some kind of
sort to always put the shorter field first, but shouldn't "new york, ny"
rank first by default, due to the scoring algorithm, because it's a
shorter field?


let me know if i'm missing something. thanks!

rajiv

On 8/1/06, Simon Willnauer <[EMAIL PROTECTED]> wrote:


I guess so, but without any information about your code nobody can tell what.
If you provide more information you will get help!

regards simon

On 8/1/06, Rajiv Roopan <[EMAIL PROTECTED]> wrote:
> Hello, I have an index of locations for example. I'm indexing  
one field

> using SimpleAnalyzer.
>
> doc1: albany ny
> doc2: hudson ny
> doc3: new york ny
> doc4: new york mills ny
>
> when I search for "new york ny" , the first result returned is  
always

"new
> york mills ny". Am I doing something incorrect?
>
> thanks in advance,
> rajiv
>
>









Search with accents

2006-08-01 Thread Eduardo S. Cordeiro

Hello there,

I have a Brazilian Portuguese index, which has been analyzed with
BrazilianAnalyzer. When searching words with accents, however, they're
not found -- for instance, if the index contains some text with the
word "maçã" and I search for that very word, I get no hits, but if I
search for "maca" (which is another Portuguese word) then the document
containing "maçã" is found.

I've seen posts in the archive indicating that I should use
ISOLatin1AccentFilter to handle this, but I don't quite see how:
should I leave indexing as it is and use this filter only for search
queries, or should I apply it in both cases?

Thank you,
Eduardo Cordeiro


RE: Search with accents

2006-08-01 Thread Zhang, Lisheng
Hi,

Have you used the same BrazilianAnalyzer when 
searching?

Best regards, Lisheng

-Original Message-
From: Eduardo S. Cordeiro [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 01, 2006 1:40 PM
To: java-user@lucene.apache.org
Subject: Search with accents


Hello there,

I have a brazilian portuguese index, which has been analyzed with
BrazilianAnalyzer. When searching words with accents, however, they're
not found -- for instance, if the index contains some text with the
word "maçã" and I search for that very word, I get no hits, but if I
search "maca" (which is another portuguese word) then the document
containing "maçã" is found.

I've seen posts in the archive indicating that I should use
ISOLatin1AccentFilter to handle this, but I don't quite see how:
should I leave indexation as it is and use this filter only for search
queries or should I apply it in both cases?

Thank you,
Eduardo Cordeiro




Re: Search with accents

2006-08-01 Thread Eduardo S. Cordeiro

Yes...here's how I create my QueryParser:

QueryParser parser = new QueryParser("text", new BrazilianAnalyzer());

2006/8/1, Zhang, Lisheng <[EMAIL PROTECTED]>:

Hi,

Have you used the same BrazilianAnalyzer when
searching?

Best regards, Lisheng

-Original Message-
From: Eduardo S. Cordeiro [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 01, 2006 1:40 PM
To: java-user@lucene.apache.org
Subject: Search with accents


Hello there,

I have a brazilian portuguese index, which has been analyzed with
BrazilianAnalyzer. When searching words with accents, however, they're
not found -- for instance, if the index contains some text with the
word "maçã" and I search for that very word, I get no hits, but if I
search "maca" (which is another portuguese word) then the document
containing "maçã" is found.

I've seen posts in the archive indicating that I should use
ISOLatin1AccentFilter to handle this, but I don't quite see how:
should I leave indexation as it is and use this filter only for search
queries or should I apply it in both cases?

Thank you,
Eduardo Cordeiro





RE: Search with accents

2006-08-01 Thread Zhang, Lisheng
Hi,

In this case I guess we may need to find out what 
exactly BrazilianAnalyzer does to the input string:

BrazilianAnalyzer braAnalyzer = new BrazilianAnalyzer();
TokenStream ts1 = braAnalyzer.tokenStream("text", new StringReader(queryStr));
Token tok;
while ((tok = ts1.next()) != null)
    System.out.println(tok.termText()); // what does BrazilianAnalyzer produce?

Also what exactly ISOLatin1AccentFilter can do:

WhitespaceAnalyzer wsAnalyzer = new WhitespaceAnalyzer();
TokenStream tmpts = wsAnalyzer.tokenStream("text", new StringReader(queryStr));
TokenStream ts2 = new ISOLatin1AccentFilter(tmpts);
// iterate ts2 the same way to see what ISOLatin1AccentFilter produces

to see what is wrong with ts1 and whether ts2 does a 
better job. I have never used ISOLatin1AccentFilter
before, so I am not sure this is really the right way
to test it; I am merely suggesting a way to test.
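For intuition, the effect of ISOLatin1AccentFilter on each token is essentially accent folding, which you can reproduce with the JDK alone (this illustrates the effect, not the filter's actual implementation):

```java
import java.text.Normalizer;

public class AccentFolding {
    // Decompose accented characters (NFD), then strip the combining marks,
    // leaving only the base Latin letters.
    static String fold(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD)
                         .replaceAll("\\p{M}", "");
    }

    public static void main(String[] args) {
        System.out.println(fold("maçã"));  // prints: maca
    }
}
```

Note this explains the reported behavior: if such folding is applied at index time but not at query time (or vice versa), "maçã" and "maca" end up as different terms, so the filter has to be applied in both places consistently.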

Best regards, Lisheng
 

-Original Message-
From: Eduardo S. Cordeiro [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 01, 2006 2:34 PM
To: java-user@lucene.apache.org
Subject: Re: Search with accents


Yes...here's how I create my QueryParser:

QueryParser parser = new QueryParser("text", new BrazilianAnalyzer());

2006/8/1, Zhang, Lisheng <[EMAIL PROTECTED]>:
> Hi,
>
> Have you used the same BrazilianAnalyzer when
> searching?
>
> Best regards, Lisheng
>
> -Original Message-
> From: Eduardo S. Cordeiro [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, August 01, 2006 1:40 PM
> To: java-user@lucene.apache.org
> Subject: Search with accents
>
>
> Hello there,
>
> I have a brazilian portuguese index, which has been analyzed with
> BrazilianAnalyzer. When searching words with accents, however, they're
> not found -- for instance, if the index contains some text with the
> word "maçã" and I search for that very word, I get no hits, but if I
> search "maca" (which is another portuguese word) then the document
> containing "maçã" is found.
>
> I've seen posts in the archive indicating that I should use
> ISOLatin1AccentFilter to handle this, but I don't quite see how:
> should I leave indexation as it is and use this filter only for search
> queries or should I apply it in both cases?
>
> Thank you,
> Eduardo Cordeiro
>
>
>




Re: Does lucene performance suffer with a lot of empty fields ?

2006-08-01 Thread Chris Hostetter
: From what I gather, I can go ahead & create an Index, & for each Document
: only add the relevant fields. Is this correct?
: I should still be able to search with queries like "mel Movies:braveheart".
: Right ?
:
: Would this impact the search performance ?
: Any other words of caution for me ?

it will absolutely work -- the one performance issue you may want to
consider is that by default a "fieldNorm" is computed for every document
and every field, and these are kept in memory -- there is a way to turn
them off on a per-field basis (you have to turn them off for every doc;
if even one doc wants a norm for field X, then every doc gets a norm for
field X).

how to "omitNorms" for a field, and what the pros (save space) and cons
(no "lengthNorm" or "field boosts") are have been discussed extensively in
the past.  Search the archives for anything I've put in quotes and you'll
find lots of info on this.
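The memory impact is easy to estimate: one norm byte is held per document for every field that has norms, whether or not a given document actually contains that field. A back-of-the-envelope sketch (the document count and field count here are hypothetical):

```java
public class NormsCost {
    public static void main(String[] args) {
        long numDocs = 10_000_000L;  // hypothetical index size
        int normedFields = 20;       // name, DOB, Movies, Political party, ...
        // One norm byte per document per normed field, regardless of
        // whether the document has a value for that field:
        long bytes = numDocs * normedFields;
        System.out.println(bytes / (1024 * 1024) + " MB of norms held in RAM");
        // prints: 190 MB of norms held in RAM
    }
}
```

So with many sparse, type-specific fields, omitting norms on the fields that don't need length normalization or boosts can save substantial memory.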


-Hoss





RE: Sorting

2006-08-01 Thread Chris Hostetter
: I'm with you now. So you do seeks in your comparator. For a large index you
: might as well use java.io.RandomAccessFile for the "array", because there
: would be little value in buffering when the comparator is liable to jump all

yep ... that's what I was getting at ... but I'm not so sure that buffering
won't be useful.  If I'm not mistaken, all Scorers are by contract
expected to score docs in docId order, so when your hits are being
collected for sorting you should always be moving forward in the file
-- but you may skip ahead a lot when the result set isn't a high percentage
of the total number of docs.
(I may be wrong about all Scorers going in docId order ... if you
explicitly use the 1.4 BooleanScorer you may not get that behavior, but I
think everything else works that way ... perhaps someone else can verify
that)

: around the file. This sounds very expensive, though. If you don't open a
: Searcher to frequently, it makes sense (in my muddled mind) to pre-sort to
: reduce the number of seeks. That was the half-baked idea of the third file,
: which essentially orders document IDs.

presort on what exactly, the field you want to sort on?  -- That's
essentially what the TermEnum is.  I'm not sure how having that helps you
... let's assume you've got some data structure (let's not worry about the
file/RAM or TermEnum distinction just yet) containing every document in
your index of 100,000,000 products sorted on the price field, and you've
done a search for "apple" and there are 1,000,000 docIds for matching
products ready to be collected by your new custom scoring code ... how
does the full list of all docIds sorted by price help you as you are given
docIds and have to decide if that doc is better or worse than the docs
you've already collected?

: > Bear in mind, there have been some improvements recently to the ability to
: grab individual stored fields per document
:
: I can't see anything like that in 2.0. Is that something in the Lucene HEAD
: build?

I guess so ... search the java-dev archives for "lazy field loading" or
"Fieldable" ... that should find some of the discussion about it and the
Jira issue with the changes.


-Hoss





Re: About search performance

2006-08-01 Thread zhongyi yuan

My question was about dealing with a BooleanQuery with a giant number of
clauses, which hurts performance. I wanted some other method to replace
that query and improve performance; using a Filter now achieves that goal.
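A self-contained sketch of why the Filter approach wins (using java.util.BitSet directly rather than Lucene's Filter class, so the names here are illustrative): instead of scoring thousands of OR clauses per hit, precompute one bit per document and intersect it with the query's matches:

```java
import java.util.BitSet;

public class BitSetFilterSketch {
    public static void main(String[] args) {
        int numDocs = 16;

        // "Filter": the docs allowed by the huge OR-list, one bit per docId,
        // built once and reused across queries.
        BitSet allowed = new BitSet(numDocs);
        allowed.set(2);
        allowed.set(5);
        allowed.set(9);

        // DocIds matched by the main query.
        BitSet queryHits = new BitSet(numDocs);
        queryHits.set(5);
        queryHits.set(9);
        queryHits.set(12);

        // A single bitwise AND replaces evaluating thousands of clauses.
        queryHits.and(allowed);
        System.out.println(queryHits);  // prints: {5, 9}
    }
}
```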

Thanks for the suggestions.




Indexsearcher - one instance in PHP via javabridge?

2006-08-01 Thread Michael Imbeault

Hello everyone,

I'm having tons of fun right now with Lucene indexing a large (15 
million documents) library. I'm developing the web front end, and I 
read on this mailing list that it's better to have one instance of 
IndexSearcher.


I'm using Lucene in PHP via JavaBridge (and Tomcat), but I can't figure 
out how to instantiate a single shared IndexSearcher, somehow get it 
in my webpage for each thread, and destroy it before I add stuff to the 
index (at the end of each day).


I'm trying to follow these instructions, but I have zero experience with 
Java, JVMs, Tomcat, etc. Could somebody help me with this one? Thanks in 
advance!


Instructions:
I commend you for giving all the information that's relevant. For the sake
of simplicity, and because it is the vast majority of use cases, could you
endorse the following as the simplest, most correct way (i.e. a best
practice) to implement Lucene for Web applications.

1- create an IndexSearcher instance in the servlet's init() method, and
cache it in the web application context,

2- in the doGet or doPost() methods, lookup the index searcher instance in
the web application context and use it to run queries,

3- close the IndexSearcher in the destroy() method.

This is *simple* and *correct*. It doesn't create a new IndexSearcher per
query, and it doesn't use a static field, a singleton, or a pool -- all
ideas that have been suggested but that have issues or are more difficult
to implement.
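The three steps above can be sketched as follows. FakeSearcher and the map are stand-ins for IndexSearcher and the ServletContext, so the lifecycle is visible without a servlet container; a real servlet would do the same three things in init(), doGet()/doPost(), and destroy():

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SearcherLifecycle {
    // Stand-in for the ServletContext attribute map.
    static final Map<String, Object> context = new ConcurrentHashMap<>();

    // Stand-in for org.apache.lucene.search.IndexSearcher.
    static class FakeSearcher {
        volatile boolean closed = false;
        void close() { closed = true; }
    }

    // 1. servlet init(): open one searcher and cache it in the app context.
    static void init() { context.put("searcher", new FakeSearcher()); }

    // 2. doGet()/doPost(): look up the shared searcher; never open a new one.
    static FakeSearcher lookup() { return (FakeSearcher) context.get("searcher"); }

    // 3. servlet destroy(): close the shared searcher exactly once.
    static void destroy() { lookup().close(); }

    public static void main(String[] args) {
        init();
        FakeSearcher a = lookup();
        FakeSearcher b = lookup();
        System.out.println(a == b);    // true: every request shares one instance
        destroy();
        System.out.println(a.closed);  // true: closed only at shutdown
    }
}
```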




Re: Indexsearcher - one instance in PHP via javabridge?

2006-08-01 Thread Chris Hostetter

: I'm trying to follow these instructions, but I have zero experience with
: Java, JVMs, Tomcat, etc. Could somebody help me with this one? Thanks in
: advance!

if you want to eliminate your need to write Java code (or servlets)
completely, take a look at Solr ... it provides a web-service-ish API
for indexing and searching, and handles all of the Lucene "best
practices" for you...

http://incubator.apache.org/solr/
http://incubator.apache.org/solr/tutorial.html

There are even some examples on the wiki about how to talk to Solr via PHP
(but I don't really know anything about PHP so I can't comment on the
quality of the code)

http://wiki.apache.org/solr/SolPHP



-Hoss

