Re: Lucene index on relational data

2008-04-14 Thread Rajesh parab
Hi Everyone, Any help on this topic would be very useful. Is anyone partitioning the data into 2 or more indexes and using ParallelReader to search these indexes? If yes, how do you handle updates to the indexes and make sure the doc ids for all indexes are in the same order? Regards, Rajesh ---

Re: Lucene index on relational data

2008-04-13 Thread Rajesh parab
Thanks Karl. I think your solution would be useful in case we would like to partition the index into two indexes and use ParallelReader to query both indexes simultaneously. If this solution is not going to be included in future Lucene releases, what other options do we have to update just one of t

Re: Lucene index on relational data

2008-04-13 Thread Rajesh parab
Hi Mathieu, I can definitely store the foreign key inside the dynamic index. However, if I understand correctly, for ParallelReader to work properly, doc ids for all documents in both the primary and secondary (dynamic) index should be in the same order. How can we achieve it if there are frequent changes

Re: Lucene index on relational data

2008-04-12 Thread Karl Wettin
Rajesh parab wrote: How do we specify the primary key or doc id so that a newly added document will use the same doc id? Do you have any sample code that makes use of this patch? Sorry, there is only the test case in the patch. Secondly, there was a comment saying it is a proof of concept and

Re: Lucene index on relational data

2008-04-12 Thread Mathieu Lecarme
Regarding data and its relationships - the use case I am trying to solve is to partition my data into 2 indexes, a primary index that will contain the majority of the data and is fairly static. The secondary index will have related information for the same data set in the primary index and this relate

Re: Lucene index on relational data

2008-04-11 Thread Rajesh parab
Thanks Karl. How do we specify the primary key or doc id so that a newly added document will use the same doc id? Do you have any sample code that makes use of this patch? Secondly, there was a comment saying it is a proof of concept and not a real project. Is anyone using this patch on their produ

Re: Lucene index on relational data

2008-04-11 Thread Karl Wettin
Rajesh parab wrote: https://issues.apache.org/jira/browse/LUCENE-879 As per the hack you mentioned in JIRA, if some of the documents are deleted and re-inserted into the secondary index, the other documents inside the index do not change their doc id. However, the newly added documents will

Re: Lucene index on relational data

2008-04-11 Thread Rajesh parab
How much data do you have? I have a hard time understanding the relationship between your objects and what sort of normalized data you add to the documents. If you are lucky it is just a single field or a few fields that need to be updated and you can manage to keep it in RAM and rebuild the whole thin

Re: Lucene index on relational data

2008-04-11 Thread Rajesh parab
While going over the forum, I found one more thread where Otis has asked a similar question about the synchronization of doc ids between 2 indexes. http://www.gossamer-threads.com/lists/lucene/java-user/50227?search_string=parallelreader;#50227 Otis, have you found the answer to your question? Reg

Re: Lucene index on relational data

2008-04-11 Thread Karl Wettin
How much data do you have? I have a hard time understanding the relationship between your objects and what sort of normalized data you add to the documents. If you are lucky it is just a single field or a few fields that need to be updated and you can manage to keep it in RAM and rebuild the whole th

Re: Lucene index on relational data

2008-04-11 Thread Rajesh parab
Thanks Mathieu, On your comments on partitioning of data - Yes. You can index unfolded data, which takes a lot of space, or use two queries on two indexes. The first builds a Filter for the second, just like with the previous JDBC example. You can even cache the filter, like Solr does with its faceted
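The two-query idea quoted above can be sketched in plain Java. This is a toy model only: it uses maps instead of Lucene indexes, and the class name `TwoStepFilterSketch`, the method names, the `status` values and the document keys are all hypothetical. It is meant to show the shape of "first query builds a filter for the second", not how Lucene's Filter classes are implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of "query one index, use the result as a filter on the other". Not Lucene API. */
public class TwoStepFilterSketch {

    /** Step 1: query the secondary (dynamic) data and collect the matching foreign keys. */
    public static Set<String> matchingKeys(Map<String, String> secondary, String wantedStatus) {
        Set<String> keys = new TreeSet<>();
        for (Map.Entry<String, String> e : secondary.entrySet()) {
            if (e.getValue().equals(wantedStatus)) {
                keys.add(e.getKey());
            }
        }
        return keys;
    }

    /** Step 2: query the primary data, keeping only hits whose key passed step 1. */
    public static List<String> search(Map<String, String> primary, String term, Set<String> allowed) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> e : primary.entrySet()) {
            if (allowed.contains(e.getKey()) && e.getValue().contains(term)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical primary data: key -> mostly static text.
        Map<String, String> primary = new LinkedHashMap<>();
        primary.put("doc1", "lucene index on relational data");
        primary.put("doc2", "lucene and compass");
        primary.put("doc3", "database blob locator");

        // Hypothetical secondary data: same keys -> frequently changing status.
        Map<String, String> secondary = new LinkedHashMap<>();
        secondary.put("doc1", "open");
        secondary.put("doc2", "closed");
        secondary.put("doc3", "open");

        // The key set plays the role of the filter; it could be cached and reused.
        Set<String> filter = matchingKeys(secondary, "open");
        System.out.println(search(primary, "lucene", filter)); // prints [doc1]
    }
}
```

Because the two sides are joined on a stored key rather than on doc ids, this approach does not depend on both indexes keeping documents in the same order.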

Re: Lucene index on relational data

2008-04-11 Thread Rajesh parab
Thanks for the details, Karl. I was looking for something like it. However, I have a question about the warning mentioned in the javadoc of ParallelReader. It says - It is up to you to make sure all indexes are created and modified the same way. For example, if you add documents to one index, you need t

Re: Lucene index on relational data

2008-04-11 Thread Mathieu Lecarme
On 11 Apr 08, at 19:29, Rajesh parab wrote: Thanks for these pointers Mathieu. We looked at Compass earlier, but the main issue with a database index is DB vendor support for BLOB locator. I understand that Oracle provides this support to get the partial data from a BLOB, but I guess the

Re: Lucene index on relational data

2008-04-11 Thread Karl Wettin
Hi Rajesh, I think you are looking for ParallelReader. public class ParallelReader extends IndexReader An IndexReader which reads multiple, parallel indexes. Each index added must have the same number of doc
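The doc-id alignment that the thread keeps returning to can be illustrated with a toy model in plain Java. This is not Lucene's API: the class `ParallelAlignmentSketch` and its methods are hypothetical, each "index" is just a list whose position stands in for the doc id, and removing an element models the state after deleted documents have been compacted away.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of doc-id-aligned "parallel" indexes. Plain Java, not Lucene's API. */
public class ParallelAlignmentSketch {

    /** Merge the fields stored at the same position (doc id) in both lists. */
    public static Map<String, String> document(List<Map<String, String>> primary,
                                               List<Map<String, String>> secondary,
                                               int docId) {
        if (primary.size() != secondary.size()) {
            throw new IllegalStateException("indexes must hold the same number of documents");
        }
        Map<String, String> merged = new HashMap<>(primary.get(docId));
        merged.putAll(secondary.get(docId));
        return merged;
    }

    /** Helper: build a single-field document. */
    public static Map<String, String> doc(String field, String value) {
        Map<String, String> d = new HashMap<>();
        d.put(field, value);
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, String>> primary = new ArrayList<>();
        primary.add(doc("title", "A"));
        primary.add(doc("title", "B"));

        List<Map<String, String>> secondary = new ArrayList<>();
        secondary.add(doc("status", "open"));   // belongs to A
        secondary.add(doc("status", "closed")); // belongs to B

        // Aligned: doc 0 combines A's title with A's status.
        System.out.println(document(primary, secondary, 0));

        // Update A in the secondary list only, as delete + re-add.
        secondary.remove(0);
        secondary.add(doc("status", "reopened")); // A's new status now sits at position 1

        // Misaligned: doc 0 now combines A's title with B's status.
        System.out.println(document(primary, secondary, 0));
    }
}
```

The second lookup shows why an update applied to only one of the indexes is a problem: positions shift, so fields from different logical records end up merged under one doc id.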

Re: Lucene index on relational data

2008-04-11 Thread Rajesh parab
Thanks for these pointers Mathieu. We looked at Compass earlier, but the main issue with a database index is DB vendor support for BLOB locator. I understand that Oracle provides this support to get the partial data from a BLOB, but I guess similar support is not available in SQL Server an

Re: Lucene index on relational data

2008-04-11 Thread Mathieu Lecarme
Have a look at Compass 2.0M3 http://www.kimchy.org/searchable-cascading-mapping/ Your multiple indexes will be nice for massive writes. With a classical read/write ratio, Compass will be much easier. M. Rajesh parab wrote: Hi, We are using Lucene 2.0 to index data stored inside a relational dat

Lucene index on relational data

2008-04-10 Thread Rajesh parab
Hi, We are using Lucene 2.0 to index data stored inside a relational database. Like any relational database, our database has quite a few one-to-one and one-to-many relationships. For example, let’s say an Object A has a one-to-many relationship with Object X and Object Y. As we need to de-normalize r