Re: Using Hibernate to store Lucene Indexes in a Database

2006-09-08 Thread Tomi NA
On 9/8/06, Néstor Boscán <[EMAIL PROTECTED]> wrote: To reduce administration tasks. If you want to move your application from server to server you'll have to move the index files. I want to be able to move my application by just moving my database schema and deploying an EAR. Regards, Néstor Bo

Re: Indexing MS Powerpoint files with Lucene

2006-09-08 Thread Tomi NA
On 9/7/06, Andrzej Bialecki <[EMAIL PROTECTED]> wrote: Tomi NA wrote: > On 9/7/06, Nick Burch <[EMAIL PROTECTED]> wrote: >> On Thu, 7 Sep 2006, Tomi NA wrote: >> > On 9/7/06, Venkateshprasanna <[EMAIL PROTECTED]> wrote: >> >> Is there

Re: Indexing MS Powerpoint files with Lucene

2006-09-07 Thread Tomi NA
On 9/7/06, Nick Burch <[EMAIL PROTECTED]> wrote: On Thu, 7 Sep 2006, Tomi NA wrote: > On 9/7/06, Venkateshprasanna <[EMAIL PROTECTED]> wrote: >> Is there any filter available for extracting text from MS Powerpoint files >> and indexing them? >> The lucene websit

Re: Indexing MS Powerpoint files with Lucene

2006-09-07 Thread Tomi NA
On 9/7/06, Venkateshprasanna <[EMAIL PROTECTED]> wrote: Is there any filter available for extracting text from MS Powerpoint files and indexing them? The Lucene website suggests the POI project, which, it seems, does not support PPT files as of now. http://jakarta.apache.org/poi/hslf/index.html

Re: Using Lucene Index for Business Intelligence / Analytics

2006-09-02 Thread Tomi NA
On 8/31/06, Saurabh Dani <[EMAIL PROTECTED]> wrote: I don't have a lot of experience with reporting tools, or with how data is stored by high-priced tools that use OLAP and similar storage types, but we needed a solution for drill-down reports and searching w

Re: 30 million+ docs on a single server

2006-08-11 Thread Tomi NA
On 8/12/06, Mark Miller <[EMAIL PROTECTED]> wrote: I've made a nice little archive application with Lucene. I made it to handle our largest need: 2.5 million docs or so on a single server. Now the powers that be say: let's use it for a 30+ million document archive on a single server! (each doc siz

Re: accented characters, wildcards and other problems

2006-07-14 Thread Tomi NA
On 7/13/06, Otis Gospodnetic <[EMAIL PROTECTED]> wrote: Bok Tomi, What do you mean by "terms are misrepresented"? What should they be, and what are you seeing? I mean that 3 of the 5 accented characters appear in the index with their accents correctly displayed, but the remaining 2 accented characters appear

accented characters, wildcards and other problems

2006-07-13 Thread Tomi NA
I've done a bit of testing with accented characters (Croatian, to be specific) and can't really explain what I see when I explore the index with Luke. I've used accented characters in directory names, file names and file contents. Now, in the list of terms (in "Top ranking terms", "Overview" tab)

Re: combined filesystem and web search

2006-07-11 Thread Tomi NA
On 7/12/06, Steven Rowe <[EMAIL PROTECTED]> wrote: Tomi NA wrote: > I wish people would start selling .pdf books online... :( Your wish is granted: <http://www.manning.com/hatcher2/> Wow, that was fast! Thanks for the link. >> Then there's IndexMergeTool which

Re: combined filesystem and web search

2006-07-11 Thread Tomi NA
On 7/11/06, Erick Erickson <[EMAIL PROTECTED]> wrote: I can answer a few of these. If you haven't yet, you'd do yourself a favor by picking up the book "Lucene in Action". It's written to the 1.4 code base; the examples compile but give deprecation warnings for the 1.9 code base, and need a few more

combined filesystem and web search

2006-07-11 Thread Tomi NA
I plan to make Lucene (and Nutch) a key element in an intranet solution, but all I know about Lucene is what I've read in the last couple of days. Here's what I'd like opinions on. I would like to build a single point of access to data on intranet web pages and LAN shared documents. I've looked