Re: Using SpanRegexQuery to search year like 200?

2006-09-08 Thread Erik Hatcher
To use SpanRegexQuery, you need to understand regular expressions. The WildcardQuery syntax is _NOT_ the same as SpanRegexQuery syntax. WildcardQuery supports ? for a single-character match and * for multiple characters. SpanRegexQuery uses standard regular expression syntax. "200?" mat
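The difference between the two syntaxes can be checked with plain java.util.regex, without Lucene. This is only a sketch: the class and method names are made up for illustration, and it assumes the regex is matched against the whole term (whether SpanRegexQuery matches the whole term or only a prefix may depend on the regex implementation in use). In regular expression syntax ? makes the preceding character optional, while . stands for any single character.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: shows which terms a regex matches in full.
final class YearPatternDemo {
    static List<String> fullMatches(String regex, List<String> terms) {
        Pattern pattern = Pattern.compile(regex);
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            // matches() requires the entire term to match (an assumption about the query)
            if (pattern.matcher(term).matches()) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> terms = java.util.Arrays.asList("20", "200", "2000", "2005", "2009", "2010");
        System.out.println(fullMatches("200?", terms));     // the last 0 is optional
        System.out.println(fullMatches("200.", terms));     // any single character after 200
        System.out.println(fullMatches("200[0-9]", terms)); // any single digit after 200
    }
}
```

Under these assumptions, a year search in the 2000s would use a pattern such as 200. or 200[0-9] rather than the wildcard-style 200?.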

Re: SpanRegexQuery causes error

2006-09-08 Thread Erik Hatcher
We welcome you to package up this issue into a JUnit test case to demonstrate the bug, such that we can add it to our suite and fix the issue. I can't say for certain it's a bug just yet, but it seems suspicious. A simple JUnit test that could replicate this would be most helpful! Thanks,

RE: Using Hibernate to store Lucene Indexes in a Database

2006-09-08 Thread Néstor Boscán
Tomi, thanks for your thoughts. I'm new to Lucene, and coming from an Oracle background my mindset is that everything goes inside the database. Now that I know some of the losses, I have a better picture. Regards, Néstor Boscán -Original Message- From: Tomi NA [mailto:[EMAIL PROTECTED] E

Using SpanRegexQuery to search year like 200?

2006-09-08 Thread Luke Tan
Hi, Can this be used to search for the years 2000, 2001, 2002, ... 2009? SpanFirstQuery snq = new SpanFirstQuery(new SpanRegexQuery(new Term("year", "200?")), 1); I need to use it to search for something like "Who is born in 200?" Thanks

Re: Using Hibernate to store Lucene Indexes in a Database

2006-09-08 Thread Tomi NA
On 9/8/06, Néstor Boscán <[EMAIL PROTECTED]> wrote: To reduce administration tasks. If you want to move your application from server to server you'll have to move the index files. I want to be able to move my application by just moving my database schema and deploying an ear. Regards, Néstor Bo

Re: SpanRegexQuery causes error

2006-09-08 Thread Luke Tan
I use an analyzer with LowerCaseTokenizer only (no stop words or any other special treatment). The phrase is tokenized. On 9/9/06, Luke Tan wrote: I tried .* too but it gave the same error. I think it's a bug. I solved it using SpanTermQuery where the search phrase is broken into day of every months and

Re: SpanRegexQuery causes error

2006-09-08 Thread Luke Tan
I tried .* too but it gave the same error. I think it's a bug. I solved it using SpanTermQuery, where the search phrase is broken into day of every months and I nest these SpanTermQuery instances into a SpanNearQuery with slop > 1. Thanks. On 9/9/06, Erik Hatcher <[EMAIL PROTECTED]> wrote: On Sep 7, 2006

Re: Changing the Scoring api for OR parameters

2006-09-08 Thread Chris Hostetter
If you are already setting the document boost based on the "date" of the Document, then the next thing you should familiarize yourself with is the Similarity.coord function. Its specific purpose is for dealing with Queries which "aggregate" other queries (like a BooleanQuery does with its clauses) t
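For readers unfamiliar with the coord factor, here is a minimal standalone sketch. It assumes the formula used by Lucene's DefaultSimilarity, where coord is the number of matching clauses divided by the total number of clauses; the class and the flatCoord variant are hypothetical names for illustration, and in a real application the override would go in a Similarity subclass.

```java
// Standalone sketch; in Lucene this logic would live in a Similarity subclass.
final class CoordSketch {
    // Assumed default behaviour: documents matching more OR clauses get a larger factor.
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // Hypothetical alternative: ignore how many clauses matched.
    static float flatCoord(int overlap, int maxOverlap) {
        return 1.0f;
    }

    public static void main(String[] args) {
        // A two-clause OR query: one document matches one clause, another matches both.
        System.out.println(coord(1, 2));
        System.out.println(coord(2, 2));
        System.out.println(flatCoord(1, 2));
    }
}
```

With the assumed default, a document matching both clauses of a two-clause OR query receives twice the coord factor of a document matching only one; returning a constant removes that preference.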

Re: Preventing short documents from being boosted

2006-09-08 Thread Daniel Naber
On Friday 08 September 2006 13:30, Grant Ingersoll wrote: > http://www.gossamer-threads.com/lists/lucene/java-user/38967#38967 I'd welcome feedback on that similarity class, i.e. whether someone has used it successfully. If so, we could add it to the Lucene core (the old similarity w

Re: delete operation

2006-09-08 Thread karl wettin
On Fri, 2006-09-08 at 15:27 +0800, jacky wrote: > > So when the lucene database is updated, how to notify to reopen the > IndexSearcher since there may be several applications to search this > lucene database? Jira issue 550 contains easy to use decorated notification code that will do all th

Re: how to index rdf/owl file using lucene

2006-09-08 Thread Simon Willnauer
Just curious! RDF / OWL is XML, right?! So just download the next best XML API or use Java's built-in DOM / SAX, whatever, and extract the content you want to index, create your fields and pass the created document to the index writer. There you go. Best regards, Simon On 9/7/06, khgcutg hsowhj <[EMAI
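The extraction step suggested above could look roughly like the following, using only the DOM parser that ships with Java. This is a sketch under assumptions: the class and method names are invented for illustration, it only looks at direct child properties of rdf:Description elements, and a later property with the same local name overwrites an earlier one. Turning the resulting map into Lucene fields and passing the document to the index writer is left out.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper: pulls property name/value pairs out of RDF/XML.
final class RdfFieldExtractor {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    static Map<String, String> extractFields(String rdfXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for getElementsByTagNameNS
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(rdfXml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> fields = new LinkedHashMap<>();
            NodeList descriptions = doc.getElementsByTagNameNS(RDF_NS, "Description");
            for (int i = 0; i < descriptions.getLength(); i++) {
                NodeList children = descriptions.item(i).getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    if (child.getNodeType() == Node.ELEMENT_NODE) {
                        // Local name becomes the field name, text content the field value.
                        fields.put(child.getLocalName(), child.getTextContent().trim());
                    }
                }
            }
            return fields;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse RDF/XML", e);
        }
    }
}
```

Each entry of the returned map would then become one field of the document to be indexed.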

Re: FWD: Re: parser question

2006-09-08 Thread Michael D. Curtin
If your question is why are the queries '(field:software field:engineer)' and '(+field:software +field:engineer)' returning the same results, it could be because none of your documents have *only* "software" *or* "engineer", i.e. they all have both words or neither. You could tes
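The effect described above can be reproduced without Lucene. The sketch below models documents as sets of words and compares "match any term" (like field:software field:engineer) with "match all terms" (like +field:software +field:engineer); all names are hypothetical and only the matching logic is modelled, not scoring.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of optional vs. required clauses over tiny in-memory "documents".
final class BooleanMatchSketch {
    // OR semantics: a document matches if it contains at least one term.
    static List<Integer> matchAny(List<Set<String>> docs, String... terms) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            for (String term : terms) {
                if (docs.get(i).contains(term)) {
                    hits.add(i);
                    break;
                }
            }
        }
        return hits;
    }

    // AND semantics: a document matches only if it contains every term.
    static List<Integer> matchAll(List<Set<String>> docs, String... terms) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (docs.get(i).containsAll(Arrays.asList(terms))) {
                hits.add(i);
            }
        }
        return hits;
    }

    static Set<String> doc(String... words) {
        return new HashSet<>(Arrays.asList(words));
    }
}
```

If every document contains both words or neither, matchAny and matchAll return the same hits; adding a single document that contains only one of the words makes the two result lists differ.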

FWD: Re: parser question

2006-09-08 Thread Chris Salem
Any help with this? Chris Salem 440.946.5214 x5458 [EMAIL PROTECTED] - Forwarded Message - To: Mark Miller <[EMAIL PROTECTED]> From: Chris Salem <[EMAIL PROTECTED]> Sent: Wed 9/6/2006 3:58:49 PM Subject: Re: parser question It's an index of 10 fields and about 10,000 records. Chri

Re: SpanRegexQuery causes error

2006-09-08 Thread Erik Hatcher
On Sep 7, 2006, at 9:26 PM, Luke Tan wrote: spanFirst(spanRegexQuery(monthly:day * of every * months), 10) What analyzer did you use for your text? Again, that is not a valid regular expression. But also, you're using a single long string of several words within your SpanRegexQuery ter

RE: Using Hibernate to store Lucene Indexes in a Database

2006-09-08 Thread Néstor Boscán
Also, if you want to back up your application you just back up the database. Regards, Néstor Boscán -Original Message- From: Néstor Boscán [mailto:[EMAIL PROTECTED] Sent: Friday, September 08, 2006 10:29 a.m. To: 'java-user@lucene.apache.org' Subject: RE: Using Hibernate to sto

RE: Using Hibernate to store Lucene Indexes in a Database

2006-09-08 Thread Néstor Boscán
To reduce administration tasks. If you want to move your application from server to server you'll have to move the index files. I want to be able to move my application by just moving my database schema and deploying an ear. Regards, Néstor Boscán -Original Message- From: Marcus Falck [mai

RE: Highligher Example

2006-09-08 Thread Dejan Nenov
Second that - I was a client of Stellent - the libs work great but are expensive. To see Stellent in action - get a copy of the free X1 desktop search or the X1 server (Lucene based). Another alternative is KeyView from Verity - now Autonomy. -Original Message- From: mark harwood [mailto:[

RE: Using Hibernate to store Lucene Indexes in a Database

2006-09-08 Thread Ramana Jelda
Hi Marcus, Somehow I like your wording.. Can't stop replying to you. Jelda > -Original Message- > From: Marcus Falck [mailto:[EMAIL PROTECTED] > Sent: Friday, September 08, 2006 2:05 PM > To: java-user@lucene.apache.org > Subject: SV: Using Hibernate to store Lucene Indexes in a Database >

SV: Using Hibernate to store Lucene Indexes in a Database

2006-09-08 Thread Marcus Falck
I can't understand why you are interested in storing the directory in a database using Hibernate. It seems to me like you are trying to mix 2 good techniques in a destructive way. -Original Message- From: Néstor Boscán [mailto:[EMAIL PROTECTED] Sent: 8 September 2006 01:

Re: read past EOF

2006-09-08 Thread Michael McCandless
Bhavin Pandya wrote: It sounds like you're working with the index correctly, so I don't have any other ideas on why you're getting CFS files that are truncated. I would worry about the "cp" step filling up disk, but if you're nowhere near filling up disk that's not the root cause here. I h

Re: delete operation

2006-09-08 Thread Simon Willnauer
Another way to avoid reopening your IndexSearcher every time you delete a document is to use a global delete filter which excludes all deleted documents from being retrieved, i.e. included in your search results. That won't work with updates without using a buffer or something similar but if y
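The idea of a global delete filter can be sketched in plain Java as follows. This is an illustration only, not Lucene's Filter API: the class name, the use of string IDs, and the clear() step are assumptions. The filter remembers which document IDs were deleted since the searcher was opened and strips them from each result list.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical application-level filter shared by all searches.
final class DeletedDocFilter {
    private final Set<String> deletedIds = new HashSet<>();

    // Record a deletion that the currently open searcher does not know about yet.
    synchronized void markDeleted(String id) {
        deletedIds.add(id);
    }

    // Drop hits whose IDs were deleted since the searcher was opened.
    synchronized List<String> apply(List<String> hitIds) {
        List<String> visible = new ArrayList<>();
        for (String id : hitIds) {
            if (!deletedIds.contains(id)) {
                visible.add(id);
            }
        }
        return visible;
    }

    // Once the searcher has been reopened, the pending deletions can be forgotten.
    synchronized void clear() {
        deletedIds.clear();
    }
}
```

The searcher can then be reopened on a relaxed schedule, with clear() called after each reopen.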

Re: Preventing short documents from being boosted

2006-09-08 Thread Grant Ingersoll
http://www.gossamer-threads.com/lists/lucene/java-user/38967#38967 -Grant On Sep 8, 2006, at 5:57 AM, Wright, Tim wrote: Hi all, We have an issue where around 10-20% of our documents are much shorter (only a paragraph or so of text) than all the rest. Because Lucene considers document length

Re: read past EOF

2006-09-08 Thread Bhavin Pandya
Hi Mike, It sounds like you're working with the index correctly, so I don't have any other ideas on why you're getting CFS files that are truncated. I would worry about the "cp" step filling up disk, but if you're nowhere near filling up disk that's not the root cause here. I have found th

Re: delete operation

2006-09-08 Thread Michael McCandless
jacky wrote: > I have a question about the delete operation; I have not found any documentation in > Lucene's API javadoc: > When using delete(Term term) of IndexReader and committing while > an IndexSearcher is open, the deleted document can still be searched until > the IndexSearcher is reopened

Preventing short documents from being boosted

2006-09-08 Thread Wright, Tim
Hi all, We have an issue where around 10-20% of our documents are much shorter (only a paragraph or so of text) than all the rest. Because Lucene considers document length when indexing, most of the time these shorter documents end up being scored higher than the longer ones. We'd prefer it if
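To make the length effect concrete, here is a small standalone sketch. It assumes the default length normalization of 1/sqrt(number of terms in the field), as in Lucene's DefaultSimilarity; the flooredLengthNorm variant and the minTerms parameter are hypothetical, shown only as one possible way to stop very short documents from gaining an advantage.

```java
// Standalone sketch; in Lucene this logic would live in a Similarity subclass.
final class LengthNormSketch {
    // Assumed default: shorter fields get a larger normalization factor.
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // Hypothetical variant: treat every field shorter than minTerms as if it had minTerms terms.
    static float flooredLengthNorm(int numTerms, int minTerms) {
        return lengthNorm(Math.max(numTerms, minTerms));
    }

    public static void main(String[] args) {
        System.out.println(lengthNorm(100));             // a one-paragraph document
        System.out.println(lengthNorm(10000));           // a long document
        System.out.println(flooredLengthNorm(100, 400)); // short document, floored at 400 terms
    }
}
```

Under the assumed default, a 100-term document gets ten times the length factor of a 10,000-term document; the floor reduces that gap without changing the treatment of longer documents.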

Re: Highligher Example

2006-09-08 Thread mark harwood
If you have a budget for this stuff then Stellent provide tools for parsing multiple document types and also have a viewer that can display documents with their original formatting, plus your highlights. See http://www.stellent.com/en/products/outside_in/viewer_tech/index.htm I don't work for S

Re: Indexing MS Powerpoint files with Lucene

2006-09-08 Thread Tomi NA
On 9/7/06, Andrzej Bialecki <[EMAIL PROTECTED]> wrote: Tomi NA wrote: > On 9/7/06, Nick Burch <[EMAIL PROTECTED]> wrote: >> On Thu, 7 Sep 2006, Tomi NA wrote: >> > On 9/7/06, Venkateshprasanna <[EMAIL PROTECTED]> wrote: >> >> Is there any filter available for extracting text from MS >> Powerpoint

Changing the Scoring api for OR parameters

2006-09-08 Thread Marcus Falck
Hi everyone, I want to override the default scoring for queries containing the OR operator. For example, if I have the following headlines in my index: "Sun sues Microsoft" "Microsoft want to buy Tiscali" ".NU domain sues Microsoft" "The sun is shining" "Sun brings antitrus

Re: duplicate fields

2006-09-08 Thread jacky
Hi Daniel, How do you use a separate database to check for duplicate fields? It is interesting! Best Regards. jacky - Original Message - From: "Daniel Noll" <[EMAIL PROTECTED]> To: Sent: Friday, September 08, 2006 3:08 PM Subject: Re: duplicate fields > jack

delete operation

2006-09-08 Thread jacky
Hi, I have a question about the delete operation; I have not found any documentation in Lucene's API javadoc: when using delete(Term term) of IndexReader and committing while an IndexSearcher is open, the deleted document can still be searched until the IndexSearcher is reopened. I don't know how

Re: duplicate fields

2006-09-08 Thread Daniel Noll
jacky wrote: hi, 1. Is there an effective method to check whether the same field (holding a unique ID) already exists when adding to the Lucene index database? Make a search for this field? One way is to create an IndexReader and IndexSearcher on your index, which you reopen every now and then. But we do thi