Re: Range Filter again - (time field)

2006-12-18 Thread Paul . Illingworth
Hello Abdul, The approach with dates and times is to index the full date and time, but to the resolution that you require, as opposed to splitting the time and date into separate fields. If you have a requirement to search to millisecond resolution then index to millisecond resolution using Dat
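A minimal sketch of the single-field idea, using only java.time rather than any Lucene helper: the full timestamp is rendered as one fixed-width string at the chosen resolution, so that string order matches chronological order. The class name, method names and the choice of UTC are assumptions made for illustration.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: one sortable key per timestamp, at a chosen resolution.
final class TimestampKeys {
    // Fixed-width patterns, most significant unit first, so that comparing
    // the strings gives the same order as comparing the instants.
    private static final DateTimeFormatter TO_MINUTE =
            DateTimeFormatter.ofPattern("yyyyMMddHHmm").withZone(ZoneOffset.UTC);
    private static final DateTimeFormatter TO_MILLI =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmssSSS").withZone(ZoneOffset.UTC);

    // Key truncated to minute resolution (fewer distinct terms).
    static String toMinute(Instant t) {
        return TO_MINUTE.format(t);
    }

    // Key at millisecond resolution (many more distinct terms).
    static String toMilli(Instant t) {
        return TO_MILLI.format(t);
    }
}
```

A coarser resolution produces fewer distinct values for a range search to enumerate, which is the reason to index only to the resolution actually needed.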

Re: Converting SQL statement to Lucene query

2006-06-07 Thread Paul . Illingworth
You could take a look at Apache Jackrabbit - it does this sort of thing. It's not exactly a library but it might give you some pointers. My understanding is that it uses an SQL-like syntax for defining queries that are converted into an abstract syntax tree which it can then convert into any q

Re: Lucene in Action

2006-06-06 Thread Paul . Illingworth
It's an invaluable book if you're new to Lucene. There have been some changes to the Lucene API since the book was published but you shouldn't let this put you off - they're relatively minor. I think Lucene In Action v2.0 might be a little while in coming (check out Otis' blog http://www.jrolle

Re: IOException Access Denied errors [ modified]

2006-05-24 Thread Paul . Illingworth
Another wild guess - it seems to be throwing the exception when merging segments. Are you sure you've got write access to the directory that the lock file is being created in? Lucene In Action has some details about index locking and how you can change the location of the lock file - I'm not s

Re: IOException Access Denied errors [ modified]

2006-05-24 Thread Paul . Illingworth
I can only think that the problem you're having is peculiar to your setup or the way in which you are using Lucene. A wild guess - are you reaching quota limits on your filesystem or something like this? Regards Paul I.

Re: OutOfMemory and IOException Access Denied errors

2006-05-19 Thread Paul . Illingworth
I guess you are executing your SQL and getting the whole result set. There are options on the JDBC Statement class that can be used for controlling the fetch size - by using these you should be able to limit the amount of data returned from the database so you don't get OOM. I haven't used the

Re: Partial token matches

2006-04-27 Thread Paul . Illingworth
Another approach may be to use n-grams. Index each word as follows: 2-gram field: in nf fo or rm ma at; 3-gram field: inf nfo for orm rma mat; 4-gram field: info nfor form orma rmat; and so on. To search for the term "form" simply search the 4-gram field. The prefix query approach may suffer
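The n-gram lists above can be generated with a simple sliding window. This is a minimal sketch in plain Java; the class and method names are hypothetical, and how the grams are then added to per-size fields in the index is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that splits a word into its character n-grams.
final class NGrams {
    // Returns every substring of length n, in order of position.
    // A word shorter than n yields an empty list.
    static List<String> of(String word, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }
}
```

For example, NGrams.of("informat", 4) yields info, nfor, form, orma, rmat, so a query for "form" becomes an exact term match against the 4-gram field rather than a prefix or wildcard expansion.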

RE: Return all distinct values

2006-03-30 Thread Paul . Illingworth
The IndexReader.terms() method gets a list of all the terms in an index. You need to somehow limit this to the terms for your ZipCode field which I don't know how to do. Luke has the ability to do this though so it is certainly possible. Regards Paul I.

Re: Joins between index and database

2006-03-24 Thread Paul . Illingworth
You can get the results from the database and then, if the number of results is small, create some boolean clauses to append to your existing Lucene query. If the number of results from the database is large then you can create a filter instead. (Assuming you have some common k

Re: Open an IndexWriter in parallel with an IndexReader on the same index.

2006-02-21 Thread Paul . Illingworth
I have a set of classes similar in function to IndexModifier but a little more advanced. The idea is to keep the IndexReaders and IndexWriters open as long as possible, only closing them when absolutely necessary. Using the concurrency package allows me to have multiple readers and a singl
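One way to get many concurrent readers alongside exclusive writes is a read-write lock. The sketch below uses java.util.concurrent.locks.ReentrantReadWriteLock and is only an illustration of that pattern, not the classes described in the post; IndexAccessGuard and its methods are hypothetical names, and the actual opening and closing of readers and writers is omitted.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical guard: many searches may run at once, updates run alone.
final class IndexAccessGuard {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Runs a search under the shared read lock and returns its result.
    <T> T read(Supplier<T> search) {
        lock.readLock().lock();
        try {
            return search.get();
        } finally {
            lock.readLock().unlock();
        }
    }

    // Runs an update under the exclusive write lock.
    void write(Runnable update) {
        lock.writeLock().lock();
        try {
            update.run();
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Searches passed to read() can overlap with each other, while an update passed to write() waits until no search is in progress and blocks new ones until it finishes.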

Re: Two strange things in Lucene

2006-01-24 Thread Paul . Illingworth
The TooManyClauses exception is due to the prefix query being rewritten to a boolean query that exceeds the boolean query's maximum number of clauses. It's an unchecked exception from the search method that you should probably explicitly catch and then return a helpful message to the user maybe

Index merging

2005-12-08 Thread Paul . Illingworth
Hello all, Whilst merging one index into another using IndexWriter.addIndexes(IndexReader[]) I got the following error. (index _file_path)\_5z.fnm (The system cannot find the file specified) It would appear that this occurred during the adding of the indexes. The indexes I was merging to an

Re: Search Problem

2005-11-29 Thread Paul . Illingworth
Lucene is case sensitive. Make sure the case in your query matches the case in the index. You could also try selecting the keyword analyser in Luke. Paul I. Dirk Hennig

RE: Insert new records into index

2005-11-11 Thread Paul . Illingworth
I queue up all my index operations. If the app stops the queue gets saved to disk. When the app restarts the queue is loaded and everything carries on. I haven't looked at the app failing just yet. I know the JVM has hooks that can be used to ensure clean up code gets called when the JVM exits

Re: Insert new records into index

2005-11-11 Thread Paul . Illingworth
Hello, You really do need to batch up your deletes and inserts otherwise it will take a long time. If you can, do all your deletes and then all of your inserts. I have gone to the trouble of queueing index operations and when a new operation comes along I reorder the job queue to ensure delet
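A minimal sketch of batching by reordering, assuming the goal is simply "all deletes, then all inserts" as described above. The Job type and its fields are hypothetical, and the sketch does not handle cases where moving a delete ahead of an earlier insert for the same document would change the outcome.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical job queue helper: groups deletes ahead of inserts.
final class IndexJobs {
    enum Kind { DELETE, INSERT }

    static final class Job {
        final Kind kind;
        final String docId;

        Job(Kind kind, String docId) {
            this.kind = kind;
            this.docId = docId;
        }
    }

    // Returns a new list with every delete first, then every insert.
    // Relative order within each group is preserved.
    static List<Job> deletesFirst(List<Job> queue) {
        List<Job> ordered = new ArrayList<>(queue.size());
        for (Job job : queue) {
            if (job.kind == Kind.DELETE) {
                ordered.add(job);
            }
        }
        for (Job job : queue) {
            if (job.kind == Kind.INSERT) {
                ordered.add(job);
            }
        }
        return ordered;
    }
}
```

Grouping the operations this way means the deleting component and the inserting component each only need to be opened once per batch instead of once per operation.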

Re: indexing records in hierachy

2005-11-02 Thread Paul . Illingworth
Hello, You could try looking at http://www.nabble.com/Hierarchical-Documents-t242604.html#a677841 where this has been discussed a little before. Regards Paul I. Urvashi Gadi

Committing IndexReader changes without closing

2005-10-07 Thread Paul . Illingworth
Hello, I have a situation where I wish to open an IndexReader and keep it open. I never want to add anything to this index but do want to delete from it. Periodically I would like to flush any deletions that may have been made to the index to disk (to protect the changes from being lost if the

Lucene 1.9 and Java 1.4

2005-09-28 Thread Paul . Illingworth
Dear all, I have been trying to follow some of the developments for the new version of Lucene (1.9?). My understanding is that this will require Java 1.4. Is this correct? Is this because of changes to "core" functionality within Lucene or is it because some new additional classes require Java

Re: MultiSearcher... Multiple Analyzer

2005-09-14 Thread Paul . Illingworth
Just some thoughts - no answers. As the analyser for each index is different then the query produced by the query parser will be different. It may be that you will have to create a query per index then run the multiple queries on each index separately. You would then need to somehow combine

Re: AW: cancel search

2005-09-09 Thread Paul . Illingworth
You could always create a subclass of RuntimeException and throw and catch this instead. "Kunemann Frank" <[EMAIL PROTECTED]> wrote on 09/09/2005 10:01:56: > Exceptions didn't work as you need to implement the HitCollector > class. Its method "collect" doesn't throw any exceptions and I don'
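A minimal sketch of the unchecked-exception idea in plain Java. HitSink is a hypothetical stand-in for a callback whose method declares no checked exceptions (it is not Lucene's HitCollector), and the loop in run() stands in for the search that drives it.

```java
// Unchecked, so it can be thrown from a callback that declares no exceptions.
final class SearchCancelled extends RuntimeException {
    SearchCancelled() {
        super("search cancelled");
    }
}

// Hypothetical stand-in for a hit callback with no throws clause.
interface HitSink {
    void collect(int doc, float score);
}

final class CancellableSearch {
    // Feeds totalHits fake hits to a sink that cancels after `limit` hits.
    // Returns how many hits were accepted before the search stopped.
    static int run(int totalHits, int limit) {
        int[] accepted = {0};
        HitSink sink = (doc, score) -> {
            if (accepted[0] >= limit) {
                throw new SearchCancelled(); // abort from inside the callback
            }
            accepted[0]++;
        };
        try {
            for (int doc = 0; doc < totalHits; doc++) {
                sink.collect(doc, 1.0f);
            }
        } catch (SearchCancelled e) {
            // Expected: the caller catches its own exception type and carries on.
        }
        return accepted[0];
    }
}
```

Catching only the dedicated subclass, rather than RuntimeException in general, keeps genuine programming errors from being mistaken for a cancellation.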

Re: Updating a Document without re-analyzing

2005-09-08 Thread Paul . Illingworth
Hello Paul, I came across this yesterday. http://mail-archives.apache.org/mod_mbox/lucene-java-dev/200504.mbox/[EMAIL PROTECTED] My understanding is that by splitting your fields into two indexes and putting your keyword fields into one and your complicated stuff into the other then you ca

Re: Updating the index and searching

2005-09-08 Thread Paul . Illingworth
Hello Brian, Updating an index is very straightforward. Simply open the index writer for your existing index and add the new documents. The issue is that if you need to search on the updated index you need to open a new index reader in order to see the new documents. This is the time-consuming

Updating the index and searching

2005-09-07 Thread Paul . Illingworth
Hello, I have an index into which documents get added and updated (by deleting and adding). When I run queries on the index these have to take into account all changes on the index so I open a new IndexReader. What I am finding is that when the index is large the opening of the index takes a c

Does order of BooleanQuery clauses affect search performance?

2005-08-26 Thread Paul . Illingworth
A simple question and I guess it may have been asked before. Does the order of Querys in a BooleanQuery affect search speed? By this I mean if the first clause of a BooleanQuery only returns a few results and the second clause returns lots of results and the two are ANDed is this faster than t

Re: Hierarchical Documents

2005-08-23 Thread Paul . Illingworth
I have been struggling with this sort of problem for some time and still haven't got an ideal solution. Initially I was going to go for the approach Erik has suggested for similar reasons - it allowed me to search within categories and within sub categories of those categories very simply. Un

RE: Lucene and numerical fields search

2005-07-12 Thread Paul . Illingworth
Hi Mickaël, Take a look at the org.apache.lucene.search.DateFilter class that comes with Lucene. This does date range filtering (I am using a modified version of this class for filtering my date format). It should be relatively straightforward to modify this for filtering numeric ranges. If yo

Re: Lucene and numerical fields search

2005-07-12 Thread Paul . Illingworth
I have similar requirements. To get around the "Too many clauses" problem I am creating a Filter (this takes one or two seconds to create on an index of around 25 documents) instead of using the RangeQuery. It's not ideal but it does sidestep the problem. If you are using the same range in

Re: Performance with multi index

2005-06-16 Thread Paul . Illingworth
I guess that if you have 10 indexes each with a merge factor of 10 with documents evenly distributed across those indexes then on average there will be a merge every 100 documents. If you have a single index there will be a merge every 10 documents. If you increase your merge factor from 10

Flushing IndexWriters and IndexReaders

2005-06-07 Thread Paul . Illingworth
I am using Lucene in an environment where searches are being carried out whilst documents are being added and deleted. Currently I have some index management code which caches the IndexReader and IndexWriter instances ensuring only one is ever open at a time. When a document is added then an In