date:20050823

Re: how to get newest library version?

2005-08-23 Thread Paul Elschot

On Tuesday 23 August 2005 23:45, Peter Veentjer - Anchor Men wrote: > Does anyone know how I can download the newest version of Lucene from the SVN? I have been trying (even the website) but I only get timeouts. I would even be happy with a newly built jar (based on the newest sources). So I hope

Re: Lucene and Xanga.com

2005-08-23 Thread Otis Gospodnetic

Nicely done, looks pretty and seems fast. How much data is being searched there? Otis --- Monsur Hossain <[EMAIL PROTECTED]> wrote: > Hey all. We just relaunched our search feature over here at > Xanga.com; the > Blogs, Metros and Blogrings sections are powered by Lucene.NET! You > can > che

Re: post-normalization score filter

2005-08-23 Thread Chris Hostetter

It doesn't look like there were any replies to this while I've been away, so I just wanted to point out that this isn't really a practical thing to do because the scores don't have any meaningful absolute value (i.e., you can't compare the scores from one search with the scores of another). the

Lucene and Xanga.com

2005-08-23 Thread Monsur Hossain

Hey all. We just relaunched our search feature over here at Xanga.com; the Blogs, Metros and Blogrings sections are powered by Lucene.NET! You can check it out here: http://search.xanga.com/ This is only the beginning of what we want to do with search and Lucene. I want to thank everyone on th

how to get newest library version?

2005-08-23 Thread Peter Veentjer - Anchor Men

Does anyone know how I can download the newest version of Lucene from the SVN? I have been trying (even the website) but I only get timeouts. I would even be happy with a newly built jar (based on the newest sources). So I hope someone can help me out so I can remove a MultiFieldQueryParser bug

Example of Field.TermVector.WITH_POSITIONS_OFFSETS usage?

2005-08-23 Thread Sean O'Connor

Hello, I am trying to work through term positions and how to get them from a collection of hits. Does setting TermVector.WITH_POSITIONS_OFFSETS to true save the start/end position of the term in the source text file? (I _think_ it does). If so, where would I start for trying to make th

Re: QueryParser not thread-safe

2005-08-23 Thread jian chen

Right. My philosophy is: make it work, then make it better. Don't waste time on something when you are not sure it would cause a performance problem. Jian On 8/23/05, Paul Elschot <[EMAIL PROTECTED]> wrote: > On Tuesday 23 August 2005 19:01, Miles Barr wrote: > > On Tue, 2005-08-23 at 13

Re: i18n query normalization

2005-08-23 Thread Ken Krugler

We have a multi-languaged index and we need to match accented characters with non accented characters. For example, if a document contains: mângão, the query: mangao should match it. I guess I would have to build some sort of analyzer/tokenizer for this. I was wondering if there are

Re: i18n query normalization

2005-08-23 Thread Daniel Naber

On Tuesday 23 August 2005 19:15, John Wang wrote: > We have a multi-languaged index and we need to match accented > characters with non accented characters. For example, if a document > contains: mângão, the query: mangao should match it. See ISOLatin1AccentFilter in contrib/analyzers in SVN. r
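The reply above points to ISOLatin1AccentFilter for this. As a rough illustration of the underlying idea only, here is a minimal sketch that folds accents using the JDK's java.text.Normalizer instead of the Lucene filter. The class name AccentFolding and the method fold are made up for this example; in a real index the same folding would have to be applied at both indexing and query time.

```java
import java.text.Normalizer;

public class AccentFolding {

    /**
     * Decomposes each character into base letter plus combining marks (NFD),
     * then removes the combining marks, so an accented "a" becomes a plain "a".
     */
    public static String fold(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // Unicode escapes avoid any dependence on the source file encoding.
        String indexed = "m\u00e2ng\u00e3o";
        System.out.println(fold(indexed)); // prints "mangao"
    }
}
```

This is a different technique from the filter named in the thread: it handles any character that has a canonical decomposition, but it will not rewrite letters without one (for example "ø").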

Re: QueryParser not thread-safe

2005-08-23 Thread Paul Elschot

On Tuesday 23 August 2005 19:01, Miles Barr wrote: > On Tue, 2005-08-23 at 13:47 -0300, [EMAIL PROTECTED] wrote: > > Hi! I've been having problems with lucene's QueryParser, apparently it is not thread-safe. > > > > That means I can't parse queries in threads where the queryparser object is cre

Re: QueryParser not thread-safe

2005-08-23 Thread Luke Francl

On Tue, 2005-08-23 at 12:01, Miles Barr wrote: > Using a non-threadsafe object in a threaded environment is fairly > standard in Java, just wrap it in a synchronized block. > > If you don't want all threads waiting on one query parser, create a pool > of them. Based on doing the simplest possib
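The suggestion quoted above, wrapping the non-thread-safe object in a synchronized block, can be sketched as follows. This is only an illustration under stated assumptions: StatefulParser is a hypothetical stand-in with mutable internal state, not Lucene's QueryParser, and SharedParserDemo and parseConcurrently are names invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SharedParserDemo {

    /** Hypothetical parser that reuses an internal buffer, so it is not thread-safe. */
    static class StatefulParser {
        private final StringBuilder buffer = new StringBuilder();

        String parse(String text) {
            buffer.setLength(0);
            for (String word : text.trim().split("\\s+")) {
                if (buffer.length() > 0) {
                    buffer.append(" AND ");
                }
                buffer.append(word);
            }
            return buffer.toString();
        }
    }

    private final StatefulParser parser = new StatefulParser();

    /** Only one thread at a time may use the shared parser instance. */
    public String parse(String text) {
        synchronized (parser) {
            return parser.parse(text);
        }
    }

    /** Parses all inputs on a thread pool while sharing a single parser. */
    public static List<String> parseConcurrently(List<String> inputs, int threads) {
        SharedParserDemo shared = new SharedParserDemo();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String input : inputs) {
                Callable<String> task = () -> shared.parse(input);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With the synchronized block in place, every thread sees a complete result for its own input. If contention on the single instance becomes a problem, the pool of parsers mentioned in the quoted message is the next step.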

i18n query normalization

2005-08-23 Thread John Wang

Hi: We have a multi-languaged index and we need to match accented characters with non accented characters. For example, if a document contains: mângão, the query: mangao should match it. I guess I would have to build some sort of analyzer/tokenizer for this. I was wondering if there a

Re: QueryParser not thread-safe

2005-08-23 Thread Miles Barr

On Tue, 2005-08-23 at 13:47 -0300, [EMAIL PROTECTED] wrote: > Hi! I've been having problems with lucene's QueryParser, apparently it is not > thread-safe. > > That means I can't parse queries in threads where the queryparser object is > created once and reused for each query. If I do, the resul

Re: hslf ppt files

2005-08-23 Thread Nick Burch

On Tue, 23 Aug 2005, Derya Kasapoglu wrote: does anybody have the POI HSLF classes to extract text from PowerPoint files? I know the classes are on the POI site, but they are not packaged in a jar! You'll need to either download it yourself from CVS and compile with ant, or grab a ni

QueryParser not thread-safe

2005-08-23 Thread jhandl

Hi! I've been having problems with lucene's QueryParser, apparently it is not thread-safe. That means I can't parse queries in threads where the queryparser object is created once and reused for each query. If I do, the resulting queries may have all kinds of weird problems, for example missin

Re: WhiteSpace Tokenizer question

2005-08-23 Thread Yonik Seeley

It's the QueryParser, not the Analyzer. When the query parser sees multiple tokens from what looks like a single word, it puts them in a phrase query. I think the only way to change that behavior would be to modify the QueryParser. -Yonik On 8/23/05, Dan Armbrust <[EMAIL PROTECTED]> wrote: > I w

RE: WhiteSpace Tokenizer question

2005-08-23 Thread Vanlerberghe, Luc

The query string is first parsed by QueryParser and what it believes to be single terms are then passed on to your analyzer. QueryParser only considers space, tab, \n and \r to be white space (See QueryParser.jj) QueryParser itself is not aware that '-' should be treated as white space so in your

WhiteSpace Tokenizer question

2005-08-23 Thread Dan Armbrust

I wrote a slightly modified version of the WhiteSpaceTokenizer that allows me to treat other characters as whitespace. My thought was that this would be an easy way to make it tokenize on characters such as "-". My tokenizer looks like this: public class CustomWhiteSpaceTokenizer extends Char
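The idea in the question, treating extra characters such as "-" as whitespace, can be sketched without Lucene as below. This is a standalone illustration, not the poster's tokenizer: DashAwareTokenizer and tokenize are names made up here, and isTokenChar only mirrors the style of predicate a character-based tokenizer would use.

```java
import java.util.ArrayList;
import java.util.List;

public class DashAwareTokenizer {

    /** A character belongs to a token unless it is whitespace or a dash. */
    static boolean isTokenChar(char c) {
        return !Character.isWhitespace(c) && c != '-';
    }

    /** Splits text into tokens at every run of non-token characters. */
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isTokenChar(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("full-text  search")); // prints [full, text, search]
    }
}
```

As the replies in this thread explain, the query side behaves differently: QueryParser splits the query string on whitespace before the analyzer runs, so a term like "full-text" reaches the analyzer as one word, and the multiple tokens it yields are combined into a phrase query.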

hslf ppt files

2005-08-23 Thread Derya Kasapoglu

Hi, does anybody have the POI HSLF classes to extract text from PowerPoint files? I know the classes are on the POI site, but they are not packaged in a jar! If I download all of them myself I get version problems! So maybe someone has a jar file and can send it to me? Thanks in advance By

Re: Why is delete() part of IndexREADER?

2005-08-23 Thread Cheolgoo Kang

It's because of Lucene's index structure. IndexWriter creates a new segment (one Lucene index is composed of several segments) when a document is added and doesn't care about old indexes that already exist. So IndexWriter should not have a delete() operation for old indexes. And so, the IndexReader has cont

Re: Hierarchical Documents

2005-08-23 Thread Dan Funk

People indexing XML documents tend to deal with the same kind of problem, there is an excellent article at the URL below showing how they handled some fairly complex hierarchical queries. http://www.idealliance.org/papers/xmle02/dx_xmle02/papers/03-02-08/03-02-08.html Rohit Lodha wrote: Hi A

Re: UpdateIndex

2005-08-23 Thread Miles Barr

On Tue, 2005-08-23 at 13:53 +0200, Derya Kasapoglu wrote: > Thank you for your help!!! > > I try it without Analyzer! > > document.add(Field.Keyword("path", file[i].getAbsolutePath())); > > then > > Term term = new Term("path", file[i].getAbsolutePath()); > Query query = new TermQuery(term); >

Re: UpdateIndex

2005-08-23 Thread Derya Kasapoglu

Thank you for your help!!! I tried it without an Analyzer! document.add(Field.Keyword("path", file[i].getAbsolutePath())); then Term term = new Term("path", file[i].getAbsolutePath()); Query query = new TermQuery(term); reader.delete(term); That is better! :) And it works > --- Ursprüngliche N

Re: UpdateIndex

2005-08-23 Thread Derya Kasapoglu

I meant the reader.hasDeletions() returns null and reader.delete(term) returns 0. So...! I store the path that way in the index: document.add(Field.Text("pathLC", file[i].getAbsolutePath())); and I use the StandardAnalyzer. I cannot search for the path if I store it as Keyword like that: document

Re: UpdateIndex

2005-08-23 Thread Miles Barr

On Tue, 2005-08-23 at 12:54 +0200, Derya Kasapoglu wrote: > Yes, it returns null. > But this is a little bit funny because the searching is correct > and it finds the document which has changed! > So what can I do!? > > Is there a way to get the document id? It can't return null since

Re: UpdateIndex

2005-08-23 Thread Derya Kasapoglu

Yes, it returns null. But this is a little bit funny because the searching is correct and it finds the document which has changed! So what can I do!? Is there a way to get the document id? > --- Original Message --- > From: Miles Barr <[EMAIL PROTECTED]> > To: java-user@lucene.a

Re: UpdateIndex

2005-08-23 Thread Miles Barr

On Tue, 2005-08-23 at 12:38 +0200, Derya Kasapoglu wrote: > I query the index for the path of the files in the directory and compare the > dates. > But I have a problem! > I find the files which have changed but I cannot delete the document > from the index, I don't know why! > > In the Field

Re: UpdateIndex

2005-08-23 Thread Derya Kasapoglu

Hi, I'm writing the deletion now and I do it this way: I query the index for the path of the files in the directory and compare the dates. But I have a problem! I find the files which have changed but I cannot delete the document from the index, I don't know why! In the Field "pathLC" is he

Re: Hierarchical Documents

2005-08-23 Thread Paul . Illingworth

I have been struggling with this sort of problem for some time and still haven't got an ideal solution. Initially I was going to go for the approach Erik has suggested for similar reasons - it allowed me to search within categories and within sub categories of those categories very simply. Un

Re: Hierarchical Documents

2005-08-23 Thread Erik Hatcher

On Aug 22, 2005, at 2:27 AM, Rohit Lodha wrote: Currently, Documents cannot contain other documents. I have a Graph of Objects (Documents) to search in. I could flatten them and search but... Is there any nice way to do it? I have used a technique of encoding a hierarchical path (like "/ ca

Re: Why is delete() part of IndexREADER?

2005-08-23 Thread Ray Tsang

I have come to peace with this problem. Basically, I think it's because you need to read/find what you are deleting first? hehe Writer just needs to write whatever it's been told to write. ray, On 8/23/05, Mikko Noromaa <[EMAIL PROTECTED]> wrote: > Hi, > > Why does IndexReader allow me to do write-

Why is delete() part of IndexREADER?

2005-08-23 Thread Mikko Noromaa

Hi, Why does IndexReader allow me to do write-operations like delete? I'd think this should be part of the IndexWriter class instead. I had created a wrapper class that callers can open for either writing or searching. It creates either an IndexWriter or an IndexSearcher and stores that inside itself


32 matches
