did you mean issue

2009-11-18 Thread m.harig
Hello all, I have a question about the spell checker. When I search for the keyword hoem, I get the suggestions in the following order (I am retrieving 4 suggested words): form, hold, home, them. I need the word home to be returned first, but it is in the third position, howeve
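One possible reading of the ordering question above: with plain Levenshtein distance, all four suggestions are two edits away from "hoem", so nothing favours "home". A distance that counts an adjacent swap as a single edit (optimal string alignment) puts "home" first. The following is a minimal standalone sketch of re-ranking a suggestion list that way; it does not use the Lucene spell checker API, and the class and method names (SuggestionRanker, osaDistance, rerank) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Lucene: re-ranks already retrieved suggestions.
public class SuggestionRanker {

    // Optimal string alignment distance: Levenshtein plus a rule that
    // swapping two adjacent characters costs one edit instead of two.
    static int osaDistance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int best = Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1); // delete / insert
                best = Math.min(best, d[i - 1][j - 1] + cost);          // substitute
                if (i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);         // adjacent swap
                }
                d[i][j] = best;
            }
        }
        return d[a.length()][b.length()];
    }

    // Returns a copy of the suggestions sorted by distance to the query.
    // The sort is stable, so ties keep their original relative order.
    static List<String> rerank(String query, List<String> suggestions) {
        List<String> sorted = new ArrayList<String>(suggestions);
        sorted.sort(Comparator.comparingInt((String s) -> osaDistance(query, s)));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> fromSpellChecker = Arrays.asList("form", "hold", "home", "them");
        System.out.println(rerank("hoem", fromSpellChecker));
    }
}
```

Whether this matches what the original poster needed is an assumption; weighting suggestions by term frequency in the index would be another way to break the tie.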

Re: Finding the highest term in a field

2009-11-18 Thread Daniel Noll
On Thu, Nov 19, 2009 at 16:01, Yonik Seeley wrote: > On Wed, Nov 18, 2009 at 10:48 PM, Daniel Noll wrote: >> But what if I want to find the highest?  TermEnum can't step backwards. > > I've also wanted to do the same. It's coming with the new flexible > indexing patch: > https://issues.apache.org

Keep URLs intact and not tokenized by the StandardTokenizer

2009-11-18 Thread Sudha Verma
Hi, I am using Lucene 2.9.1. I am reading in free text documents which I index using Lucene and the StandardAnalyzer at the moment. The StandardAnalyzer keeps email addresses intact and does not tokenize them. Is there something similar for URLs? This seems like a common need. So, I thought I'd

Re: lucene not returning correct results eventhough search query is present

2009-11-18 Thread Otis Gospodnetic
Hi, Please use java-user list for user questions. Are you sure the file got fully indexed in the first place? Use Luke to check. Also, see: IndexWriter.MaxFieldLength Otis -- Sematext is hiring -- http://sematext.com/about/jobs.html?mls Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NE

Re: Finding the highest term in a field

2009-11-18 Thread Yonik Seeley
On Wed, Nov 18, 2009 at 10:48 PM, Daniel Noll wrote: > But what if I want to find the highest?  TermEnum can't step backwards. I've also wanted to do the same. It's coming with the new flexible indexing patch: https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system

Finding the highest term in a field

2009-11-18 Thread Daniel Noll
Hi all. If I want to find the lowest term in a field, I can do something like this: public Date computeEarliestDate(IndexReader reader) throws IOException { TermEnum terms = reader.terms(new Term("date", "")); if (terms.term() == null || !"date".equals(terms.term().fie

RE: recovering terms hit from wildcard queries

2009-11-18 Thread Uwe Schindler
> Thanks - that might work though I believe would produce many queries > instead > of just one to maintain the specific Term used to match a given hit > document. > > I presume then I would get all the actual terms from the WildcardTermEnum > that my wildcard containing string refers to and then u

Re: recovering terms hit from wildcard queries

2009-11-18 Thread Christopher Tignor
Thanks - that might work, though I believe it would produce many queries instead of just one to maintain the specific Term used to match a given hit document. I presume then I would get all the actual terms from the WildcardTermEnum that my wildcard containing string refers to and then use them each i

Phrase query with terms at same location

2009-11-18 Thread Christopher Tignor
Hello, I have indexed words in my documents with part of speech tags at the same location as these words using a custom Tokenizer as described, very helpfully, here: http://mail-archives.apache.org/mod_mbox/lucene-java-user/200607.mbox/%3c20060712115026.38897.qm...@web26002.mail.ukl.yahoo.com%3e

Re: recovering terms hit from wildcard queries

2009-11-18 Thread Simon Willnauer
You could use WildcardTermEnum directly and pass your term and the reader to it. This will allow you to enumerate all terms that match your wildcard term. Is that what you are asking for? simon On Wed, Nov 18, 2009 at 10:39 PM, Christopher Tignor wrote: > Hello, > > Firstly, thanks for all the g
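To illustrate the idea of enumerating every indexed term that matches a wildcard, here is a minimal standalone sketch. It deliberately does not call Lucene: a sorted TreeSet stands in for the term dictionary, and the class and method names (WildcardTermSketch, toRegex, matchingTerms) are hypothetical. It assumes '*' matches any run of characters and '?' matches exactly one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical stand-in for a term dictionary; not Lucene's WildcardTermEnum.
public class WildcardTermSketch {

    // Translates a wildcard string into a regular expression,
    // quoting every literal character so it is matched verbatim.
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') sb.append(".*");
            else if (c == '?') sb.append('.');
            else sb.append(Pattern.quote(String.valueOf(c)));
        }
        return Pattern.compile(sb.toString());
    }

    // Seeks to the literal prefix before the first wildcard, then walks the
    // sorted terms forward until they no longer share that prefix.
    static List<String> matchingTerms(TreeSet<String> terms, String wildcard) {
        int firstWild = 0;
        while (firstWild < wildcard.length()
                && wildcard.charAt(firstWild) != '*'
                && wildcard.charAt(firstWild) != '?') {
            firstWild++;
        }
        String prefix = wildcard.substring(0, firstWild);
        Pattern pattern = toRegex(wildcard);
        List<String> hits = new ArrayList<String>();
        for (String term : terms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) break; // left the prefix range
            if (pattern.matcher(term).matches()) hits.add(term);
        }
        return hits;
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<String>(
                Arrays.asList("mine", "my", "myopic", "mystery", "name"));
        System.out.println(matchingTerms(terms, "my*"));
    }
}
```

The collected terms could then be compared against the terms of each hit document, which is the step the thread goes on to discuss.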

recovering terms hit from wildcard queries

2009-11-18 Thread Christopher Tignor
Hello, Firstly, thanks for all the good answers and support from this mailing list. Would it be possible, and if so, what would be the best way to recover the terms filled in for a wildcard query following a successful search? For example: If I parse and execute a query using the string "my*" and

Re: Problems with fragments size on highlight.

2009-11-18 Thread Mark Harwood
It could be the "merge contiguous fragments" feature that attempts to do exactly this to improve readability. It's an option you can turn off. On 15 Nov 2009, at 01:21, Felipe Lobo wrote: Hi, I'm having some problems with the size of the fragments when I'm doing the highlight. I pass on the

RE: Lucene Java 3.0.0 RC1 now available for testing

2009-11-18 Thread Uwe Schindler
There are already some proposals for reforming the whole Document/Field API because it does not match a full-text search engine using an inverted index. Stored fields and indexed fields should not be mixed together. The problem then disappears, because you are forced to split between indexing and st

Re: Lucene Java 3.0.0 RC1 now available for testing

2009-11-18 Thread Glen Newton
Yes, I would agree with you on the surprise aspect. :-) But you suggest that hiding complexity and being in control and having transparency are mutually exclusive, which isn't necessarily the case. I think I can live with the decisions made. :-) If I can think of a viable and complete alternative, I'l

Re: Token character positions

2009-11-18 Thread Grant Ingersoll
On Nov 17, 2009, at 10:37 AM, Christopher Tignor wrote: > Hello, > > Hoping someone might clear up a question for me: > > When tokenizing we provide the start and end character offsets for each > token locating it within the source text. > > If I tokenize the text "word" and then search for th