Re: parameter create in IndexWriter

2006-09-07 Thread jacky
I see, thanks, Erick! I will change part of our application's architecture, because an IndexWriter will be kept open while the program is running. Best Regards. jacky - Original Message - From: "Erick Erickson" <[EMAIL PROTECTED]> To: Sent: Thursday, September 07, 2006

Re: SpanRegexQuery causes error

2006-09-07 Thread Luke Tan
It's spanFirst(spanRegexQuery(monthly:day * of every * months), 10)

java.lang.NullPointerException
java.lang.NullPointerException
    at java.util.Hashtable.get(Hashtable.java:336)
    at org.apache.lucene.index.MultiReader.norms(MultiReader.java:163)
    at org.apache.lucene.search.spans.SpanWeig

Re: Using Hibernate to store Lucene Indexes in a Database

2006-09-07 Thread Otis Gospodnetic
Just a google away: http://www.google.com/search?q=hibernate%20lucene%20spring Priceless! Otis - Original Message From: Néstor Boscán <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Thursday, September 7, 2006 7:48:52 PM Subject: Using Hibernate to store Lucene Indexes in a D

Re: Highligher Example

2006-09-07 Thread Mark Miller
Highlighting a PDF document, last time I looked (quite a while ago), involves supplying an xml file that describes offsets for highlighting. You can specify the file in the URL. You can also do simple highlighting by passing in a list of words to be highlighted, but this does not even catch min

Re: Highligher Example

2006-09-07 Thread Mag Gam
Thanks for the quick response Erik. I will be getting my LIA book back very soon, I forgot it at a destination :-( Let's assume there is a document called "hello.pdf" and it has the content "this is hello.pdf. It uses Acrobat" When I perform a search for "Acrobat", I want hello.pdf to show up, a

Re: Determining index from MultiSearcher

2006-09-07 Thread Chris Hostetter
: For ranking purposes, I need to know at least the String version of the : Directory associated with the index for each returned Document. : : Does anybody know if there currently some built-in functionality to do this? note the following methods... http://lucene.apache.org/java/docs/api/org/ap

Re: Highligher Example

2006-09-07 Thread Erik Hatcher
There are test cases in the Highlighter codebase that exercise it and show its use, as well as a few examples of it in the "Lucene in Action" codebase. These examples output plain text with some prefix and suffix surrounding the highlighted terms. Highlighting text in a PDF is possible,

Determining index from MultiSearcher

2006-09-07 Thread Shane Perry
I am currently working on an application which requires the ability to search multiple indexes using the same query. I can use the MultiSearcher object to do this without any problem. For ranking purposes, I need to know at least the String version of the Directory associated with the index f

Highligher Example

2006-09-07 Thread Mag Gam
Hey, does anyone have a search result highlighter example? I have various documents (PDF, DOC, TXT, PPT), and I would like to show highlights, similar to how Google does it... TIA

Using Hibernate to store Lucene Indexes in a Database

2006-09-07 Thread Néstor Boscán
Hi, Has anybody seen a solution that will store Lucene indexes in a database using Hibernate? Basically a HibernateDirectory, so I can store and retrieve the indexes from a database. Regards, Néstor Boscán

how to index rdf/owl file using lucene

2006-09-07 Thread khgcutg hsowhj
Hi All, How can we index an RDF/OWL file? Can anyone provide a small example, related papers, or any other literature on indexing and searching RDF/OWL files using Lucene? Any kind of help is appreciated. Regards, phani.

Re: best way indexing user queries

2006-09-07 Thread karl wettin
On Thu, 2006-09-07 at 15:46 +0200, Martin Braun wrote: > Hello, > > I would like to index the user submitted queries to a given index. As a > result of this I want to provide something like: people who searched for > test searched also with these queries: +title:test +author:somename. > > I think

Re: best way indexing user queries

2006-09-07 Thread Karel Tejnora
Discussed before; it's more a relational DB task than a Lucene one. A simple approach is to get a list of terms from your queries and store the relation document - query - terms. I have around 1.6e10 query-terms in PostgreSQL, and with a proper index a select takes around 0.6 ms (clustered, vacuumed, analyzed), 300
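
The query - terms relation described above can be sketched in memory. This is only an illustration, not the PostgreSQL schema from the message: the class name RelatedQueries, its methods, and the naive term extraction (strip a leading + or -, drop the field prefix, lowercase) are all assumptions made for the example, and a real system would more likely use Lucene's own query parsing and a database table.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for a "term -> queries" relation table.
public class RelatedQueries {
    // Maps each extracted term to the set of raw queries that contained it.
    private final Map<String, Set<String>> queriesByTerm = new HashMap<>();

    /** Stores one user-submitted query under each of its terms. */
    public void record(String query) {
        for (String term : terms(query)) {
            queriesByTerm.computeIfAbsent(term, k -> new LinkedHashSet<>()).add(query);
        }
    }

    /** Returns previously recorded queries that share at least one term with the given query. */
    public Set<String> related(String query) {
        Set<String> result = new LinkedHashSet<>();
        for (String term : terms(query)) {
            result.addAll(queriesByTerm.getOrDefault(term, Collections.emptySet()));
        }
        result.remove(query); // do not suggest the query to itself
        return result;
    }

    /** Naive term extraction (an assumption): "+title:test" becomes "test". */
    public static List<String> terms(String query) {
        List<String> out = new ArrayList<>();
        for (String clause : query.trim().split("\\s+")) {
            String t = clause.replaceFirst("^[+-]", "");
            int colon = t.indexOf(':');
            if (colon >= 0) {
                t = t.substring(colon + 1);
            }
            t = t.toLowerCase();
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }
}
```

After recording "+title:test +author:somename", a lookup for "test" returns that query, which is the "people who searched for test also searched with" behaviour asked about in the original post.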

best way indexing user queries

2006-09-07 Thread Martin Braun
Hello, I would like to index the user submitted queries to a given index. As a result of this I want to provide something like: people who searched for test searched also with these queries: +title:test +author:somename. I think the simple approach of just adding the queries as a string in a docu

Re: Indexer large file and hi performance indexing

2006-09-07 Thread Erick Erickson
Here's another approach that I *think* will work... Remember that opening an IndexReader takes a snapshot of the index, and later additions aren't visible until you open a new index reader. So, at some point you have an FSDir index only, and your program starts. You open your (shared) reader to

Re: Indexing MS Powerpoint files with Lucene

2006-09-07 Thread Andrzej Bialecki
Tomi NA wrote: On 9/7/06, Nick Burch <[EMAIL PROTECTED]> wrote: On Thu, 7 Sep 2006, Tomi NA wrote: > On 9/7/06, Venkateshprasanna <[EMAIL PROTECTED]> wrote: >> Is there any filter available for extracting text from MS Powerpoint files >> and indexing them? >> The lucene website suggests the PO

Re: SpanRegexQuery causes error

2006-09-07 Thread Erik Hatcher
What's the .toString on the query it parsed to? Keep in mind that "*" isn't the proper regular expression to match everything - it would be ".*" or some other pattern. Erik On Sep 7, 2006, at 7:41 AM, Luke Tan wrote: Hi, I am using code in http://mail-archives.apache.org/mod_mbo
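
The difference between "*" and ".*" can be checked with plain java.util.regex. This is a minimal sketch, not SpanRegexQuery itself: the class RegexWildcardDemo and the sample terms are hypothetical, and it is an assumption that the regex implementation used by the query behaves like java.util.regex. Note also that a regex query matches individual indexed terms, so each pattern is tested against a single term here.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; it does not use any Lucene API.
public class RegexWildcardDemo {

    /** Returns true if the whole term matches the regular expression. */
    public static boolean matchesTerm(String regex, String term) {
        return Pattern.compile(regex).matcher(term).matches();
    }

    /** Returns true if the string compiles as a java.util.regex pattern. */
    public static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A bare "*" is a quantifier with nothing to repeat, so it does not compile.
        System.out.println(isValidRegex("*"));            // false
        // ".*" means "any character, zero or more times" and matches any term.
        System.out.println(matchesTerm(".*", "three"));   // true
        // A trailing ".*" gives prefix-style matching, like "word*" in wildcard syntax.
        System.out.println(matchesTerm("word.*", "words")); // true
    }
}
```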

Re: parameter create in IndexWriter

2006-09-07 Thread Erick Erickson
See below... On 9/7/06, jacky <[EMAIL PROTECTED]> wrote: I am afraid I don't understand it. Input the wrong path? This will happen rarely, since the index path is always hard-coded in the config file. In your application, maybe, but not in the general case. In any case, this is one of t

Re: Indexing MS Powerpoint files with Lucene

2006-09-07 Thread Tomi NA
On 9/7/06, Nick Burch <[EMAIL PROTECTED]> wrote: On Thu, 7 Sep 2006, Tomi NA wrote: > On 9/7/06, Venkateshprasanna <[EMAIL PROTECTED]> wrote: >> Is there any filter available for extracting text from MS Powerpoint files >> and indexing them? >> The lucene website suggests the POI project, which,

Re: Indexing MS Powerpoint files with Lucene

2006-09-07 Thread Nick Burch
On Thu, 7 Sep 2006, Tomi NA wrote: On 9/7/06, Venkateshprasanna <[EMAIL PROTECTED]> wrote: Is there any filter available for extracting text from MS Powerpoint files and indexing them? The lucene website suggests the POI project, which, it seems does not support PPT files as of now. http://jak

SpanRegexQuery causes error

2006-09-07 Thread Luke Tan
Hi, I am using code in http://mail-archives.apache.org/mod_mbox/lucene-java-user/200605.mbox/[EMAIL PROTECTED] for wildcard search in phrase but it seems that I can only search something like: "one two three word*" but not "one * three word" It throws error: java.lang.NullPointerExceptio

Re: Indexing MS Powerpoint files with Lucene

2006-09-07 Thread Tomi NA
On 9/7/06, Venkateshprasanna <[EMAIL PROTECTED]> wrote: Is there any filter available for extracting text from MS Powerpoint files and indexing them? The lucene website suggests the POI project, which, it seems does not support PPT files as of now. http://jakarta.apache.org/poi/hslf/index.html

Re: Atomic index/search for a phrase

2006-09-07 Thread Erik Hatcher
A single TermQuery is surely the fastest query of all. But, what are you really trying to do? Indexing things untokenized is generally useful only for precise key-like fields, not for full-text ones. Erik On Sep 6, 2006, at 11:51 PM, Venkateshprasanna wrote: Whic

Re: Indexing MS Powerpoint files with Lucene

2006-09-07 Thread Gopikrishnan Subramani
Did you check the POI javadocs? Look for org.apache.poi.hslf.extractor.PowerPointExtractor. It's one of the most straightforward classes in POI as far as extracting text for indexing is concerned. -Gopi On 9/7/06, Venkateshprasanna <[EMAIL PROTECTED]> wrote: Is there any filter available for extra

RE: Indexer large file and hi performance indexing

2006-09-07 Thread HODAC, Olivier
Actually, latency is not possible. Do you think it is possible to tune the fswriter to flush to the file system every N elements (using maxMergeDocs and co.) and use "for each doc of the RAMindexer (addDocument to the FSindexer)"? What about performance? -Original Message- From: