date:20050905

Re: Can Span Queries contain boolean, prefix and other component queries?

2005-09-05 Thread Paul Elschot

On Monday 05 September 2005 04:38, Chris Hostetter wrote:
> : >>[Query]
> : >>"Napol* Dynamite" near "film|movie"
>
> : >This can be done using nested SpanNearQuery's and SpanOrQuery's.
> : >A PhrasePrefixQuery can not be used as a SpanQuery.
>
> I've never really looked at SpanQueries very ha

Multiple Language Indexing and Searching

2005-09-05 Thread Olivier Jaquemet

Hi, I'd like to go into detail about the issues that occur when you want to index and search content in multiple languages. I have read the Lucene in Action book and many threads on this mailing list, the most interesting so far being this one: http://mail-archives.apache.org/mod_mbox/lucene-ja

Optimize, OutOfMemory + Merge

2005-09-05 Thread Martin Rode

Hi all, The code snippet below does NOT result in an optimized index in one of my test cases. As I understand it, an optimized index means that there is only ONE segment file in the index folder. After this code has run, I sometimes have 100 segment files in the directory. When I call optimiz

Re: Optimize, OutOfMemory + Merge

2005-09-05 Thread Erik Hatcher

You should call .optimize() instead of merging.

Erik

On Sep 5, 2005, at 5:22 AM, Martin Rode wrote:
Hi all, The code snippet below does NOT result in an optimized index in one of my test cases. As I understand it, an optimized index means that there is only ONE segment file in the ind

Re: SAME-opattor (possible newbie question)

2005-09-05 Thread Martin Malmsten

: For example, given this data:
:
: author: a b c
: author: d e f
:
: a search for "a SAME c" would match the first row, but "a SAME d" would
: match nothing, which is what I want.

if i understand you correctly, then you are describing a use case in which the index has two documents, each co

BM25 with Lucene

2005-09-05 Thread Karl Koch

Hello all, did somebody here implement and run the BM25 algorithm with Lucene (preferably Lucene 1.2, but any information or even code about that would be very helpful on any Lucene version). Kind Regards, Karl

TermVectorOffsetInfo class?

2005-09-05 Thread Koji Sekiguchi

Hi, I wanted to try the highlighter in contrib. I compiled it and got a compile error because there is no TermVectorOffsetInfo class, which is imported by TokenSources.java: import org.apache.lucene.index.TermVectorOffsetInfo; I tried to find the issues on Bugzilla, but couldn't find them. Where can

Re: TermVectorOffsetInfo class?

2005-09-05 Thread mark harwood

It's in the latest version of Lucene in SVN. If you don't want to work with the latest version of Lucene simply remove TokenSources.java - it's an optional class for use with the highlighter and provides a way of retrieving already-parsed document tokens from the index. Instead, you can simply run

RE: TermVectorOffsetInfo class?

2005-09-05 Thread Koji Sekiguchi

Hi Mark, Thank you for your advice. I want to work with the current version, 1.4.3, so I simply deleted the class and could compile the highlighter. Thank you, Koji

> -----Original Message-----
> From: mark harwood [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, September 06, 2005 12:44 AM
> To: java-user

Deleting All Documents With Certain Field Name

2005-09-05 Thread Luke

Would this not delete all records from the index that have a saleDate field?

reader.delete(new Term("salesDate", ""));

Thanks, Luke

Re: Deleting All Documents With Certain Field Name

2005-09-05 Thread Otis Gospodnetic

No. The delete method deletes all Documents with _matching_ terms.

Otis

--- Luke <[EMAIL PROTECTED]> wrote:
> Would this not delete all records from the index that have a saleDate
> field?
>
> reader.delete(new Term("salesDate", ""));
>
> Thanks,
>
> Luke
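The point about _matching_ terms can be pictured with a toy model in plain Java. This is only a sketch, not Lucene's API: it assumes each field holds a single untokenized value, and the class name `ExactTermDeleteModel`, the helper `doc`, and the sample dates and title are all hypothetical. A delete keyed on the empty string removes nothing unless some document's field value is exactly the empty string.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ExactTermDeleteModel {
    // Removes every document whose field holds exactly the given text.
    // Returns the number of documents removed.
    static int delete(List<Map<String, String>> docs, String field, String text) {
        int before = docs.size();
        docs.removeIf(doc -> text.equals(doc.get(field)));
        return before - docs.size();
    }

    // Hypothetical helper: builds a one-field document.
    static Map<String, String> doc(String field, String value) {
        Map<String, String> d = new HashMap<>();
        d.put(field, value);
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(doc("salesDate", "20050901"));   // hypothetical values
        docs.add(doc("salesDate", "20050905"));
        docs.add(doc("title", "no date here"));

        // The empty string matches no stored value, so nothing is deleted.
        System.out.println(delete(docs, "salesDate", ""));         // prints 0
        // An exact value matches one document.
        System.out.println(delete(docs, "salesDate", "20050905")); // prints 1
        System.out.println(docs.size());                           // prints 2
    }
}
```

Under this model, removing every document that has the field at all would mean repeating the delete once per distinct value stored in that field.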

List of values from refix query

2005-09-05 Thread Axel

Hi

Assuming that in the indexing process I set up 3 different documents doc1, doc2, doc3, with something like:

doc1.add(Field.Keyword("variable", "var_no1"));
doc1.add(Field.Keyword("variable", "var_test1"));
doc2.add(Field.Keyword("variable", "var_no2"));
doc2.add(Field.Keyword("variable", "var_

Re: List of values from refix query

2005-09-05 Thread Otis Gospodnetic

That looks correct. That's what PrefixQuery is for. If you use QueryParser and give it "var*", QP will convert that to PrefixQuery for you.

Otis

--- Axel <[EMAIL PROTECTED]> wrote:
> Hi
>
> Assuming that in the indexing process I setup 3 different documents
> doc1, doc2, doc3.
>
> with somet

Re: SAME-opattor (possible newbie question)

2005-09-05 Thread Chris Hostetter

: > : For example, given this data:
: > :
: > : author: a b c
: > : author: d e f
: > : a search for "a SAME c" would match the first row, but "a SAME d"
: > would
: > : match nothing, which is what I want.

: No, both fields are in the same document. Which is also why proximity
: does not work.

Re: List of values from refix query

2005-09-05 Thread Chris Hostetter

: How can I get all values across the documents with a given prefix?
: For prefix = "var" for example I would like to have a list of all 5 values.
:
: For prefix = "var_no" for example I would like to have a list of the values
: {"var_no1", "var_no2", "var_no3"}.

if you just want the values, you
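One way to picture listing the values for a prefix, independent of whatever the truncated reply goes on to recommend, is a scan over a sorted set of terms: seek to the first term at or after the prefix, then read forward until a term no longer starts with it. The sketch below is plain Java, not Lucene's API. The class name `PrefixTermListing` is hypothetical, and only the four values visible in the thread are used, since the fifth is cut off in the preview.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

class PrefixTermListing {
    // Collects, in sorted order, every term that starts with the prefix.
    static List<String> termsWithPrefix(SortedSet<String> terms, String prefix) {
        List<String> matches = new ArrayList<>();
        // tailSet(prefix) starts at the first term >= prefix; all terms
        // sharing the prefix are contiguous from that point on.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // past the last term with this prefix
            }
            matches.add(term);
        }
        return matches;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>();
        terms.add("var_no1");
        terms.add("var_test1");
        terms.add("var_no2");
        terms.add("var_no3");

        System.out.println(termsWithPrefix(terms, "var_no")); // prints [var_no1, var_no2, var_no3]
        System.out.println(termsWithPrefix(terms, "var"));    // prints [var_no1, var_no2, var_no3, var_test1]
    }
}
```

Because the set is sorted, the scan touches only the matching terms plus one extra, and it never needs to look at documents.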

Highlighter apply to Japanese

2005-09-05 Thread Koji Sekiguchi

Hi again, I'm using the highlighter to highlight terms in Japanese text, but I cannot get the output I want. If I use StandardAnalyzer or SnowballAnalyzer with English, getBestFragment() returns the expected output:

Sample: (SnowballAnalyzer)
Text: A meeting will be held in the City Hall
TokenStream: [

Multi-lang analyzer? Re: Multiple Language Indexing and Searching

2005-09-05 Thread Hacking Bear

Hi, I have a similar problem to deal with. In fact, a lot of the time the documents do not have any language information, or they may contain text in multiple languages. Further, the user would not like to always supply this information. Also the user may very well be interested in documents in m

use of Luke's getHighFreqTerms

2005-09-05 Thread Nils Hoeller

Hi, I've got only one little question: I'm using the class HighFreqTerms of the Luke project to find those terms in my index (made by Nutch). Now I wanted to filter the terms with a stopword list (junkwords). The method getHighFreqTerms gives me the ability to define a Hashtable junkwords ,

Hits document offset information? Span query or Surround?

2005-09-05 Thread Sean O'Connor

I believe I have heard that Span queries provide some way to access document offset information for their hits. Does anyone know if this is true, and if so, how I would go about it? Alternatively (preferably actually) does the surround code from the SVN development area have a way of r

Re: Highlighter apply to Japanese

2005-09-05 Thread markharw00d

I don't know the behaviour of the Japanese Analyzer you are using. Can you add to your example diagnosis the Token.getPositionIncrement, Token.startOffset and Token.endOffset for each of the tokens? The highlighter groups tokens with overlapping start and end offsets into a single TokenGroup f

Re: Hits document offset information? Span query or Surround?

2005-09-05 Thread markharw00d

>>I believe I have heard that Span queries provide some way to access document offset information for their hits somehow.

See http://marc.theaimsgroup.com/?l=lucene-user&m=112496111224218&w=2

Faithfully selecting extracts based *exactly* on query criteria will be hard given complex queries eg

Re: Highlighter apply to Japanese

2005-09-05 Thread Chris Lu

Hi, Koji, I had the same problem as you. This is because CJK's n-gram analysis is different from single-character analysis. My workaround is to use CJKHighlighter and CJKHighlightAnalyzer in the sandbox.

--
Chris Lu
Lucene Search RAD on Any Database
http://www.dbsight.net

On 9/5/05, Koji Se

22 matches
