Re: Off Topic: Lucene vs Derby (vs MySQL) for spatial indexing

2005-07-28 Thread Otis Gospodnetic
Ah, I saw it about a month or two ago when moving Simpy to PostgreSQL 8.0.3. I think I saw mentions of Java inside PostgreSQL in a development version (8.1.*). Otis -- http://simpy.com --- Dan Armbrust <[EMAIL PROTECTED]> wrote: > Otis Gospodnetic wrote: > > >You may also want to consider Pos

RE: Lucene vs Derby (vs MySQL) for spatial indexing

2005-07-28 Thread Tony Schwartz
Lucene will work perfectly for what you need. Use a "RangeFilter" for the latitude/longitude and a "Sort" on your population. If you have a crazy amount of data and limited memory, you can easily modify Lucene (it's open source) to handle filtering and sorting in a more "memory friendly" way. Since
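
For readers skimming the archive, the filter-then-sort idea in this reply can be sketched without Lucene at all. The following is only an illustration over an in-memory list, not Lucene's RangeFilter or Sort; the Place type, its field names, and the sample values in main are all made up for the example. It also assumes the box does not cross the 180-degree meridian.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class BoundingBoxSearch {

    /** Hypothetical record type; the field names are assumptions for this sketch. */
    static final class Place {
        final String name;
        final double lat;
        final double lon;
        final long population;

        Place(String name, double lat, double lon, long population) {
            this.name = name;
            this.lat = lat;
            this.lon = lon;
            this.population = population;
        }
    }

    /** Keeps places inside the inclusive box, then orders them by population, largest first. */
    static List<Place> search(List<Place> places,
                              double minLat, double maxLat,
                              double minLon, double maxLon) {
        List<Place> hits = new ArrayList<>();
        for (Place p : places) {
            boolean inLat = p.lat >= minLat && p.lat <= maxLat;
            boolean inLon = p.lon >= minLon && p.lon <= maxLon;
            if (inLat && inLon) {
                hits.add(p);
            }
        }
        hits.sort(Comparator.comparingLong((Place p) -> p.population).reversed());
        return hits;
    }

    public static void main(String[] args) {
        // Made-up values, for illustration only.
        List<Place> places = Arrays.asList(
                new Place("alpha", 10.0, 10.0, 100L),
                new Place("beta", 11.0, 11.0, 500L),
                new Place("gamma", 50.0, 50.0, 900L));
        for (Place p : search(places, 0.0, 20.0, 0.0, 20.0)) {
            System.out.println(p.name + " " + p.population);
        }
    }
}
```

A linear scan like this holds every record in memory, which is exactly the cost the reply hints at for very large data sets.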

Re: Searching a URL with a PrefixQuery / Too Many Clauses (again...)

2005-07-28 Thread Erik Hatcher
On Jul 28, 2005, at 12:37 PM, Chris May wrote: Works beautifully (at least on my 30K-document test index). I'll need to do some fiddling if I want to allow partial URLs (i.e. http://www2.warwick.ac.uk/ab* to match http://www2.warwick.ac.uk/about) but I can see how to do that, I think (and

Off Topic: Lucene vs Derby (vs MySQL) for spatial indexing

2005-07-28 Thread Dan Armbrust
Otis Gospodnetic wrote: You may also want to consider PostgreSQL for a few reasons: 3) it seems that the new versions let you embed Java directly into the database (perhaps something like Oracle's Java-embedding thing). Really? I realize this is off topic, but could you point me to some d

Re: European Languages search problem

2005-07-28 Thread Martin Rode
Otis, Thanks for the quick reply. The idea to emit multiple tokens is great! I was looking for a solution to another problem: I want to present a word completion list to the user, so I use reader.terms(new Term("start","here")). If I start searching at "henrie", the reader.terms() should re
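
The word-completion idea here relies on terms being kept in sorted order: seek to the first term at or after the typed prefix, then walk forward while terms still start with it. Below is a minimal sketch of that walk, assuming a plain TreeSet with natural String ordering stands in for the index's term dictionary. The class name, method name, and limit parameter are invented for the example and are not part of Lucene.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

final class PrefixCompletion {

    /**
     * Returns up to limit terms that start with prefix, in sorted order.
     * Assumes the set uses natural String ordering (no custom comparator).
     */
    static List<String> complete(TreeSet<String> terms, String prefix, int limit) {
        List<String> out = new ArrayList<>();
        // tailSet(prefix, true) starts at the first term >= prefix.
        for (String term : terms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix) || out.size() >= limit) {
                break; // sorted order means no later term can match
            }
            out.add(term);
        }
        return out;
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>(Arrays.asList(
                "hello", "henri", "henrietta", "henry", "here", "zebra"));
        System.out.println(complete(terms, "henri", 10));
    }
}
```

Stopping at the first non-matching term keeps the cost proportional to the number of completions rather than the size of the dictionary.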

Re: European Languages search problem

2005-07-28 Thread Otis Gospodnetic
Hi Martin, When you write your own tokenizer/analyzer for this, you'll probably want to emit multiple tokens for words that have umlauts and such - one version with ä -> ae, the other with ä -> a perhaps. As for stripping accents from characters, somebody posted ISOLatinFilter.java (I think that
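
The "emit multiple tokens" suggestion can be illustrated with plain string handling. This is a rough sketch only, not a Lucene TokenFilter and not the ISOLatinFilter mentioned above. The class and method names are invented, the expansion table covers only German umlauts and ß, and the stripped variant uses the JDK's java.text.Normalizer as a stand-in for whatever accent stripping you prefer.

```java
import java.text.Normalizer;
import java.util.LinkedHashSet;
import java.util.Set;

final class UmlautVariants {

    /** Returns the distinct variants of a word: umlauts expanded, and accents stripped. */
    static Set<String> variants(String word) {
        Set<String> out = new LinkedHashSet<>();
        out.add(expand(word));
        out.add(strip(word));
        return out;
    }

    /** Expands German umlauts to their two-letter forms (a-umlaut becomes "ae", and so on). */
    static String expand(String word) {
        // Compose first so that "a" followed by a combining diaeresis is also matched.
        String s = Normalizer.normalize(word, Normalizer.Form.NFC);
        return s.replace("\u00e4", "ae").replace("\u00f6", "oe").replace("\u00fc", "ue")
                .replace("\u00c4", "Ae").replace("\u00d6", "Oe").replace("\u00dc", "Ue")
                .replace("\u00df", "ss");
    }

    /** Drops combining marks after decomposition (a-umlaut becomes "a"); sharp s is left as-is. */
    static String strip(String word) {
        String decomposed = Normalizer.normalize(word, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        System.out.println(variants("M\u00e4dchen"));
    }
}
```

In an analyzer, both variants would typically be emitted for the same word so that a query written either way can match.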

European Languages search problem

2005-07-28 Thread Martin Rode
Hello everybody, First off, congrats on that great piece of software! I am working on a Europe-wide project where we have texts in more than one European language, namely French, German, and English. I have tried both the GermanAnalyzer and the FrenchAnalyzer, and neither is satisfactory for what I need. The

Re: Searching a URL with a PrefixQuery / Too Many Clauses (again...)

2005-07-28 Thread Chris May
Works beautifully (at least on my 30K-document test index). I'll need to do some fiddling if I want to allow partial URLs (i.e. http://www2.warwick.ac.uk/ab* to match http://www2.warwick.ac.uk/about) but I can see how to do that, I think (and I'm not sure I need it anyway). Thanks Scott!

Index merge and java heap space

2005-07-28 Thread Chris Fraschetti
I've read of people combining smaller indexes to help distribute indexing and such, but I've been unable to find any descriptions of large index merges. I've seen a post or two regarding a merge taking a nice amount of heap space (I've also observed this) but I wanted to poll you folks to see h

Re: Lucene vs Derby (vs MySQL) for spatial indexing

2005-07-28 Thread Dave Kor
Quoting Andrew Boyd <[EMAIL PROTECTED]>: > I did a small demonstration application using lucene's range query and it > worked fine. > I didn't use a DB at all > > > "Mosul_Iraq.html", "E043.13535" > "Mosul_Iraq.html", "N36.33608" > > Having the directional (E, W, N, S) worked out well > > Andrew

Re: Lucene vs Derby (vs MySQL) for spatial indexing

2005-07-28 Thread Andrew Boyd
I did a small demonstration application using lucene's range query and it worked fine. I didn't use a DB at all "Mosul_Iraq.html", "E043.13535" "Mosul_Iraq.html", "N36.33608" Having the directional (E, W, N, S) worked out well Andrew -Original Message- From: Barry Carter <[EMAIL PRO
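
The key format quoted in this message ("E043.13535", "N36.33608") works with term-based range queries because zero-padded, fixed-width strings compare lexicographically in the same order as the numbers they encode. Here is a minimal sketch of such an encoder. The class and method names are invented, and the widths (three integer digits for longitude, two for latitude, five decimals) are inferred from the two sample values rather than stated by the author.

```java
import java.util.Locale;

final class GeoKey {

    /** Encodes a longitude as hemisphere letter plus zero-padded degrees, e.g. E043.13535. */
    static String encodeLongitude(double lon) {
        char hemisphere = lon < 0 ? 'W' : 'E';
        // Width 9 = 3 integer digits + '.' + 5 decimals; the 0 flag pads on the left.
        return String.format(Locale.ROOT, "%c%09.5f", hemisphere, Math.abs(lon));
    }

    /** Encodes a latitude as hemisphere letter plus zero-padded degrees, e.g. N36.33608. */
    static String encodeLatitude(double lat) {
        char hemisphere = lat < 0 ? 'S' : 'N';
        // Width 8 = 2 integer digits + '.' + 5 decimals.
        return String.format(Locale.ROOT, "%c%08.5f", hemisphere, Math.abs(lat));
    }

    public static void main(String[] args) {
        System.out.println(encodeLongitude(43.13535));
        System.out.println(encodeLatitude(36.33608));
    }
}
```

One caveat with this scheme: within W and S the keys sort by absolute value, so they run in the opposite direction to the signed coordinate, and a range that spans two hemispheres would need to be split into two ranges.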

RE: Hardware Question

2005-07-28 Thread Michael Celona
Will using a striped RAID configuration (i.e. RAID 5/10) yield the same performance improvements as using multiple drives with ParallelIndexReader? Also, for searching, are you suggesting using ParallelMultiSearcher against multiple indexes on separate drives and/or using ParallelIndexReader? Mic

Re: updating an index... with existing documents ?

2005-07-28 Thread Erik Hatcher
On Jul 28, 2005, at 8:36 AM, Paul Libbrecht wrote: Dare I ask whether this implies that the fields are stored? I don't quite understand. The "reconstruct" feature of Luke (and thus the code you can borrow from) does not require that fields are stored - it pulls the indexed terms from the

RE: Hardware Question

2005-07-28 Thread Michael Celona
Someone posted to turn CFS off. I wasn't sure what that was, and after I looked it up I'm still unsure why someone would use that for Lucene. Michael -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] Sent: Wednesday, July 27, 2005 6:20 PM To: java-user@lucene.apache.org; [EMAIL

Re: updating an index... with existing documents ?

2005-07-28 Thread Paul Libbrecht
Dare I ask whether this implies that the fields are stored? thanks paul On 28 Jul 05, at 14:26, Erik Hatcher wrote: It is possible to reconstruct a document from the index, but it is a potentially lossy proposition, since stemming and other manglings might have gone on. Look at Luke and

Re: updating an index... with existing documents ?

2005-07-28 Thread Erik Hatcher
Paul, It is possible to reconstruct a document from the index, but it is a potentially lossy proposition, since stemming and other manglings might have gone on. Look at Luke and see how it does it (you can "reconstruct and edit" a document from its UI). Erik On Jul 28, 2005, at 5:37

Re: OutOfMemoryError

2005-07-28 Thread Lasse L
Hi, If I replace my lucene wrapper with a dummy one the problem goes away. If I close my index-thread every 30 minutes and start a new thread it also goes away. If I exit the thread on OutOfMemory errors it regains all memory. I do not use static variables. If I did they wouldn't get garbage colle

Re: Regex for legal user search input

2005-07-28 Thread Erik Hatcher
This really requires some experimentation, and I encourage all that are curious to write a little bit of toy code to play with combinations of analyzers and QueryParser techniques. On Jul 28, 2005, at 2:07 AM, Alex Kiselevski wrote: Just to make it clear for me. The last Lucene version suppo

updating an index... with existing documents ?

2005-07-28 Thread Paul Libbrecht
hi, My mission is currently to update an index by adding a flag field to some documents. For this, I seem to have only the following possibility: - search for the documents in question, store them, filter them - modify the documents accordingly - delete the modified documents - put b

Index scalability, IDF normalization for distributed indices

2005-07-28 Thread Sergey Chernov
Hi all, I would like to know: if I use Lucene in a distributed environment, e.g. with two indices on different document sets and in different locations, does Searcher use local IDF values for every index separately during query execution, or does it compute and use one global IDF value? Another ques

Re: Query text Tokenize issue

2005-07-28 Thread Erik Hatcher
On Jul 27, 2005, at 7:26 PM, Indu Abeyaratna wrote: I have a field indexed as a keyword, and I have two records, "J400-C-V1-S10-T1" and "J400-C-V-S10-T1". When I search for "J400-C-V1-S10-T1", it returns the matching record, but when I search for "J400-C-V-S10-T1" it doesn't return the matching

Re: another problem with Multisearcher

2005-07-28 Thread Daniel Cortes
It's very strange because the first search works fine, but the next search fails and gives me the error message java.io.IOException: Bad file descriptor at java.io.RandomAccessFile.seek(Native Method) at org.apache.lucene.store.FSInputStream.readInternal(FSDirectory.java:415)

Re: Lucene vs Derby (vs MySQL) for spatial indexing

2005-07-28 Thread markharw00d
MySQL has spatial extensions now too. Your queries lack any free-text criteria so are probably best handled by a database, not Lucene. >>In case anyone's interested, I'm writing a zoomable/pannable world map Save yourself some time. Just use the Google maps API. :-)