Hi,

We have been developing an enterprise logging service at Wachovia bank. The logs (business, application, error) for all of the bank's applications are consolidated in a single location in an Oracle 10g database.
In our second phase, we are now building a high-performing report viewer on top of it. To avoid the network and I/O cost, our search algorithm does not go to the Oracle 10g DB; instead it goes to a Lucene index. We have a Lucene index created for each application, and these indexes live on the same machine where the search algorithm runs. As more applications at the bank begin to consume this service, the Lucene indexes are growing. One of my team leads has suggested the following approach to resolve this issue:

*I think the best approach to restrict the index size is to keep it for some limited time and then archive it. In case the user wants to search against the old files, we might need to provide some configuration through which the Lucene searcher can point to the archived files and search their content. To implement this we would need to rename the index files with from and to dates before they are archived. While searching against the older files, the user would need to provide a date range, and the app can then point to the relevant archived index files for the search. Let me know your thoughts on this.*

At present this sounds the most logical approach to me. But we would then begin to store the archived Lucene indexes on a different machine, which might again cause the search algorithm to make a network trip whenever the search covers old archived data.

Is there a better design to resolve the above concern? Does Lucene provide some sort of API to handle these scenarios?

Regards,
Sandeep.
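P.S. To make the team lead's suggestion concrete, below is a rough sketch of how the searcher could pick out the archived indexes that overlap a user-supplied date range and search them as one logical index. This is only a sketch: it assumes a Lucene 2.x-era API (FSDirectory.getDirectory, MultiReader over IndexReader[]) and a hypothetical directory naming convention of <app>_<fromDate>_<toDate> (e.g. applogs_20080101_20080331). None of the names here come from our actual code.

import java.io.File;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.FSDirectory;

public class ArchiveSearcher {

    // Hypothetical naming convention: <app>_<fromDate>_<toDate>,
    // e.g. applogs_20080101_20080331
    private static final SimpleDateFormat FMT = new SimpleDateFormat("yyyyMMdd");

    /** Opens every archived index whose date range overlaps [from, to]. */
    public static IndexSearcher openSearcher(File archiveRoot, Date from, Date to)
            throws Exception {
        List<IndexReader> readers = new ArrayList<IndexReader>();
        File[] dirs = archiveRoot.listFiles();
        if (dirs == null) {
            throw new IllegalArgumentException("Not a directory: " + archiveRoot);
        }
        for (File dir : dirs) {
            String[] parts = dir.getName().split("_");
            if (parts.length != 3) continue;        // skip non-archive directories
            Date archFrom = FMT.parse(parts[1]);
            Date archTo   = FMT.parse(parts[2]);
            // keep this archive only if its range overlaps the requested range
            if (!archTo.before(from) && !archFrom.after(to)) {
                readers.add(IndexReader.open(FSDirectory.getDirectory(dir.getPath())));
            }
        }
        // MultiReader presents all matching archives as one logical index
        IndexReader[] arr = readers.toArray(new IndexReader[readers.size()]);
        return new IndexSearcher(new MultiReader(arr));
    }
}

The report viewer would then call something like openSearcher(new File("/indexes/archive"), from, to) and run an ordinary search (e.g. searcher.search(query, null, 50)) against the combined reader. The open question of the remote-machine network trip still stands, of course.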