Handling of unknown/multiple language documents

2005-09-04 Thread Hacking Bear
Hello, I'm new to Lucene. After some reading, I'm still not quite sure which Analyzer I should be using for handling documents in unknown or multiple languages. The documents I want to index may be written in languages other than the user/system's default language and one document may contain t

Re: Date boosts implementation

2005-09-04 Thread Chris Hostetter
: Could someone please give me some suggestions on how to implement date : boosts? I would like to boost the document when it is new and lower : the boost when it's old. you should check out this older thread... http://mail-archives.apache.org/mod_mbox/lucene-java-user/200501.mbox/[EMAIL PROTEC

Re: Can Span Queries contain boolean, prefix and other component queries?

2005-09-04 Thread Chris Hostetter
: >>[Query] : >>"Napol* Dynamite" near "film|movie" : >This can be done using nested SpanNearQuery's and SpanOrQuery's. : >A PhrasePrefixQuery can not be used as a SpanQuery. I've never really looked at SpanQueries very hard, but this thread got me a bit curious. Looking over the docs and the c

Date boosts implementation

2005-09-04 Thread Ben
Hi, could someone please give me some suggestions on how to implement date boosts? I would like to boost the document when it is new and lower the boost when it's old. Thanks, Ben
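One possible reading of "boost new documents, lower the boost for old ones" is an exponential decay on document age. The sketch below is only an illustration under that assumption: the class name RecencyBoost, the method boostFor, and the half-life parameter are hypothetical and come neither from this thread nor from Lucene. How the resulting factor is applied (at index time or at query time) is left open.

```java
// Hypothetical helper: maps a document's age to a boost factor in (0, 1].
final class RecencyBoost {

    /**
     * Returns 1.0 for a brand-new document and halves the value
     * every halfLifeDays days (assumed decay shape, not a Lucene API).
     */
    static float boostFor(long ageDays, double halfLifeDays) {
        if (halfLifeDays <= 0) {
            throw new IllegalArgumentException("halfLifeDays must be positive");
        }
        // Treat dates in the future as "age zero" so the boost never exceeds 1.0.
        long age = Math.max(0L, ageDays);
        return (float) Math.pow(0.5, age / halfLifeDays);
    }

    public static void main(String[] args) {
        // Print the boost for a few sample ages with a 30-day half-life.
        for (long age : new long[] {0, 30, 60, 365}) {
            System.out.println(age + " days old: " + boostFor(age, 30.0));
        }
    }
}
```

A shorter half-life makes the ranking favor recent documents more aggressively; a longer one flattens the effect.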

Re: Can Span Queries contain boolean, prefix and other component queries?

2005-09-04 Thread Sean O'Connor
Paul, Thanks for pointing me to the surround code. I have started playing with it, and am impressed. Now I just need to adjust my thinking a bit more to see if I can implement the tool correctly, and get my specific search functionality out of what it offers. You've been a great help, Sean

Re: Lucene contrib (surround), Subversion, and Eclipse

2005-09-04 Thread Sean O'Connor
Chris, I think my hesitation with my approach was due to being lazy, and not wanting to get up to speed with ant. I've gotten over much of the Eclipse internal project/build learning curve, so this is very likely a case of the 'golden hammer' syndrome, where I am ignoring the benefits of us

Re: SAME-operator (possible newbie question)

2005-09-04 Thread Chris Hostetter
: For example, given this data: : : author: a b c : author: d e f : : a search for "a SAME c" would match the first row, but "a SAME d" would : match nothing, which is what I want. if i understand you correctly, then you are describing a use case in which the index has two documents, each contain

Re: Lucene contrib (surround), Subversion, and Eclipse

2005-09-04 Thread Chris Hostetter
I don't use Eclipse (and in fact I've never actually built the contrib code from source), but if I remember correctly, one of the main reasons why the "sandbox" was retired and everything in it was moved to where it is in the "contrib" directory was so Lucene and all of the "contrib"uted code could be compiled

SAME-operator (possible newbie question)

2005-09-04 Thread Martin Malmsten-2
Is there a way to tell Lucene to restrict proximity searches to just one field? This would mimic the BRS/Search SAME-operator, which I use very often. For example, given this data: author: a b c author: d e f a search for "a SAME c" would match the first row, but "a SAME d" would match nothing

Lucene contrib (surround), Subversion, and Eclipse

2005-09-04 Thread Sean O'Connor
Hello, I am new to Subversion, JUnit and the Lucene contrib repository. I am looking over the 'surround' project at the moment. If there is anyone out there with Eclipse experience who uses the contrib Subversion (or CVS) repository, could you look over my approach listed below? I am using

Re: "Right" combination of analyzers for indexing and searching

2005-09-04 Thread Otis Gospodnetic
Hi Jeff, This is a tough question to answer, because there is no universal answer. The choice of Analyzer depends on what/how you are trying to index/search. I've used analyzers from the Lucene distributions, but have also written specialized ones. My suggestion for you is to start with the Sta

Re: Can Span Queries contain boolean, prefix and other component queries?

2005-09-04 Thread Paul Elschot
Sean, On Sunday 04 September 2005 20:43, Sean O'Connor wrote: > Hello, > I am trying to do some complex queries such as: > > [Field contents] > The movie Napoleon Dynamite is a movie about a kid named Napoleon who > has no Dynamite. > > [Query] > "Napol* Dynamite" near "film|movie" > > >

Re: Phrase frequency

2005-09-04 Thread Sean O'Connor
I believe the index just contains information about single terms. A PhraseQuery then searches the index for the parts of the phrase and returns the hit information. So, as far as I understand, there is no way to get the frequency of a phrase directly from an index, but you could create a PhraseQ

Can Span Queries contain boolean, prefix and other component queries?

2005-09-04 Thread Sean O'Connor
Hello, I am trying to do some complex queries such as: [Field contents] The movie Napoleon Dynamite is a movie about a kid named Napoleon who has no Dynamite. [Query] "Napol* Dynamite" near "film|movie" Is this possible with some version of a span query? Something like a PhrasePrefixQ

RE: How to search between dates?

2005-09-04 Thread houyang
"MMDD" is better, since it has fewer unique terms than the Unix timestamp, if you only care about the days. -----Original Message----- From: Filip Anselm [mailto:[EMAIL PROTECTED] Sent: Sunday, September 04, 2005 3:56 AM To: java-user@lucene.apache.org Subject: Re: How to searc

"Right" combination of analyzers for indexing and searching

2005-09-04 Thread Jeff Rodenburg
Question to those who've deployed and maintained Lucene: any recommendations or observations about practical decisions regarding analyzer choice in indexing & searching? What have you found in operation to work well, become difficult, yield better/worse results, affect performance, etc.? What wo

Re: How to search between dates?

2005-09-04 Thread Filip Anselm
DateFilter sounds great!! - But what is the best way to store dates in a Field? I get the time as a Unix timestamp, seconds since epoch - and usually I can cut it down to hours or days since epoch instead - if this has any effect on the performance... thanks... Chris Hostetter wrote: >: How do I
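To make the "cut it down to days" idea concrete, here is a minimal sketch that turns a Unix timestamp into a day-resolution string key, so every timestamp from the same day collapses to one term. It uses only the JDK; the class name DayKey, the yyyyMMdd pattern, and the choice of UTC are assumptions made for illustration, not something stated in this thread.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: converts epoch seconds to a day-resolution key.
final class DayKey {

    // Assumed format: zero-padded year, month, day, interpreted in UTC,
    // so that plain string order matches chronological order.
    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ofPattern("yyyyMMdd").withZone(ZoneOffset.UTC);

    static String fromEpochSeconds(long epochSeconds) {
        return DAY.format(Instant.ofEpochSecond(epochSeconds));
    }

    public static void main(String[] args) {
        // Two timestamps within the same UTC day map to the same key.
        long morning = Instant.parse("2005-09-04T03:56:00Z").getEpochSecond();
        long evening = Instant.parse("2005-09-04T20:43:00Z").getEpochSecond();
        System.out.println(fromEpochSeconds(morning));
        System.out.println(fromEpochSeconds(evening));
    }
}
```

Because the key is fixed-width and zero-padded, comparing two keys as strings orders them by date, which is what a term-based range comparison relies on. The trade-off is that anything finer than a day can no longer be distinguished.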

Re: How to search between dates?

2005-09-04 Thread Chris Hostetter
: How do I combine two queries - one made by the QueryParser and the : programmatically made RangeQuery? you could make them both children of a single BooleanQuery, but as long as you're going to write a little java code to put them together -- why not use a DateFilter instead? http://lucene.apa

Re: How to search between dates?

2005-09-04 Thread Filip Anselm
jian => I'll try the RangeQuery first, and if it doesn't give any performance problems I'll just stick to that method - it's easy, simple and doesn't involve other systems. But thanks for info... How do I combine two queries - one made by the QueryParser and the programmatically made RangeQuery? F