hello all
I have a doubt about the spell checker. When I search for the keyword "hoem"
I get the spell results in the following order (I am
retrieving 4 suggested words):
form
hold
home
them
My need is for the word "home" to be fetched first, but it's in the third
position, however
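For what it's worth, the ordering looks arbitrary because all four suggestions are the same plain Levenshtein distance (2) from "hoem", so edit distance alone gives the spell checker no reason to prefer "home". A quick stdlib sketch (not Lucene code) to verify the distances:

```java
public class EditDistanceDemo {
    // classic dynamic-programming Levenshtein distance
    public static int levenshtein(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++)
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
            }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        for (String s : new String[] {"form", "hold", "home", "them"})
            System.out.println(s + " -> " + levenshtein("hoem", s));
        // all four print distance 2 — the tie is broken arbitrarily
    }
}
```

If I remember right, SpellChecker.suggestSimilar also has a variant taking an IndexReader, a field name and a morePopular flag, which re-ranks tied suggestions by document frequency; that may get "home" first if it is the most frequent of the four in your index.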
On Thu, Nov 19, 2009 at 16:01, Yonik Seeley wrote:
> On Wed, Nov 18, 2009 at 10:48 PM, Daniel Noll wrote:
>> But what if I want to find the highest? TermEnum can't step backwards.
>
> I've also wanted to do the same. It's coming with the new flexible
> indexing patch:
> https://issues.apache.org/jira/browse/LUCENE-1458
Hi,
I am using Lucene 2.9.1.
I am reading in free-text documents, which I index using Lucene and the
StandardAnalyzer at the moment.
The StandardAnalyzer keeps email addresses intact and does not tokenize
them. Is there something similar for
URLs? This seems like a common need. So, I thought I'd
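Not an answer about a stock analyzer, but the idea can be sketched in plain Java (all names here are hypothetical, not Lucene API): split on whitespace and leave anything that looks like a URL as a single token, the way StandardAnalyzer leaves e-mail addresses intact:

```java
import java.util.*;

public class UrlAwareTokenizerSketch {
    // whitespace tokenizer that keeps URL-looking tokens whole;
    // everything else is lowercased and stripped of punctuation
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String tok : text.trim().split("\\s+")) {
            if (tok.matches("https?://\\S+")) {
                tokens.add(tok);                // keep the URL as one token
            } else {
                String t = tok.toLowerCase().replaceAll("[^a-z0-9]", "");
                if (!t.isEmpty()) tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("See http://lucene.apache.org for Details."));
        // [see, http://lucene.apache.org, for, details]
    }
}
```

A real solution would do this inside a custom Tokenizer or TokenFilter rather than on raw strings, but the recognition step is the same.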
Hi,
Please use java-user list for user questions.
Are you sure the file got fully indexed in the first place? Use Luke to check.
Also, see:
IndexWriter.MaxFieldLength
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NE
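A note on the MaxFieldLength pointer above: by default IndexWriter only indexes the first 10,000 terms of each field (MaxFieldLength.LIMITED, if I remember the constant right), so a long document can look "indexed" while most of its text is silently unsearchable. A toy illustration of that truncation (plain Java, not Lucene code):

```java
import java.util.*;

public class MaxFieldLengthToy {
    // only the first maxFieldLength tokens of a field get indexed;
    // anything after the cutoff is silently dropped
    public static List<String> indexedTokens(List<String> tokens, int maxFieldLength) {
        return tokens.subList(0, Math.min(tokens.size(), maxFieldLength));
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(indexedTokens(doc, 3));
        // [a, b, c] — "d" and "e" are never searchable
    }
}
```

Constructing the writer with MaxFieldLength.UNLIMITED (or raising the limit) avoids the cutoff, at the cost of more memory during indexing.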
On Wed, Nov 18, 2009 at 10:48 PM, Daniel Noll wrote:
> But what if I want to find the highest? TermEnum can't step backwards.
I've also wanted to do the same. It's coming with the new flexible
indexing patch:
https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system
Hi all.
If I want to find the lowest term in a field, I can do something like this:
public Date computeEarliestDate(IndexReader reader) throws IOException {
    TermEnum terms = reader.terms(new Term("date", ""));
    if (terms.term() == null || !"date".equals(terms.term().field()))
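For the "highest term" case raised in this thread: until the flexible-indexing work lands, the only option with TermEnum is to walk the (sorted) enumeration forward and keep the last term seen. A toy model of that walk over a sorted term list (plain Java, not Lucene code):

```java
import java.util.*;

public class HighestTermToy {
    // terms arrive in sorted order, like TermEnum does; since we cannot
    // step backwards, remember the last term seen until the enum is exhausted
    public static String highest(Iterator<String> sortedTerms) {
        String last = null;
        while (sortedTerms.hasNext()) last = sortedTerms.next();
        return last;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<String>(
            Arrays.asList("2009-11-17", "2009-11-18", "2009-11-16"));
        System.out.println(highest(terms.iterator())); // 2009-11-18
    }
}
```

This is O(number of terms) per query, which is exactly why being able to seek to the end of the enumeration is the attractive part of the new API.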
> Thanks - that might work, though I believe it would produce many queries
> instead
> of just one to maintain the specific Term used to match a given hit
> document.
>
> I presume then I would get all the actual terms from the WildcardTermEnum
> that my wildcard-containing string refers to and then use them each i
Thanks - that might work, though I believe it would produce many queries instead
of just one to maintain the specific Term used to match a given hit
document.
I presume then I would get all the actual terms from the WildcardTermEnum
that my wildcard-containing string refers to and then use them each i
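The approach being discussed in this thread — enumerate the concrete terms a wildcard pattern matches, then use each one — can be sketched with plain Java over an in-memory term list. The regex translation here is simplified and assumes the pattern contains only letters, '*' and '?' (not the WildcardTermEnum implementation, just the idea):

```java
import java.util.*;

public class WildcardExpansionSketch {
    // translate a simple wildcard pattern (* = any run, ? = one char) to a
    // regex and collect the index terms it matches
    public static List<String> expand(String pattern, Collection<String> terms) {
        String regex = pattern.replace("?", ".").replace("*", ".*");
        List<String> matches = new ArrayList<String>();
        for (String term : terms)
            if (term.matches(regex)) matches.add(term);
        return matches;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("my", "myself", "myth", "other");
        System.out.println(expand("my*", terms)); // [my, myself, myth]
    }
}
```

Each expanded term could then back its own TermQuery, which is what makes it possible to know afterwards which concrete term matched a given hit.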
Hello,
I have indexed words in my documents with part of speech tags at the same
location as these words using a custom Tokenizer as described, very
helpfully, here:
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200607.mbox/%3c20060712115026.38897.qm...@web26002.mail.ukl.yahoo.com%3e
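The trick from that archive link is a position increment of 0: the tag token is stacked at the same position as the word it annotates. A toy model of how position increments turn into absolute positions (plain Java with hypothetical names, not the TokenStream API):

```java
import java.util.*;

public class PositionIncrementToy {
    // each entry is {term, positionIncrement}; an increment of 0 stacks a
    // token (e.g. a part-of-speech tag) on the previous token's position
    public static Map<String, Integer> positions(String[][] stream) {
        Map<String, Integer> out = new LinkedHashMap<String, Integer>();
        int pos = -1;
        for (String[] tok : stream) {
            pos += Integer.parseInt(tok[1]);
            out.put(tok[0], pos);
        }
        return out;
    }

    public static void main(String[] args) {
        String[][] stream = {
            {"quick", "1"}, {"ADJ", "0"},   // tag shares position 0
            {"fox", "1"},   {"NOUN", "0"},  // tag shares position 1
        };
        System.out.println(positions(stream));
        // {quick=0, ADJ=0, fox=1, NOUN=1}
    }
}
```

Because word and tag share a position, a phrase or span query can match either one at that slot.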
You could use WildcardTermEnum directly and pass your term and the
reader to it. This will allow you to enumerate all terms that match
your wildcard term.
Is that what you are asking for?
simon
On Wed, Nov 18, 2009 at 10:39 PM, Christopher Tignor
wrote:
> Hello,
>
> Firstly, thanks for all the g
Hello,
Firstly, thanks for all the good answers and support from this mailing list.
Would it be possible and if so, what would be the best way to recover the
terms filled in for a wildcard query following a successful search?
For example:
If I parse and execute a query using the string "my*" and
It could be the "merge contiguous fragments" feature, which attempts to
do exactly this to improve readability.
It's an option you can turn off.
On 15 Nov 2009, at 01:21, Felipe Lobo wrote:
Hi, I'm having some problems with the size of the fragments when
I'm doing
the highlight. I pass on the
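For reference, merging contiguous fragments does roughly the following: adjacent highlight fragments whose character ranges touch or overlap are joined into one larger fragment, which improves readability but yields fragments bigger than the configured size. A sketch of that merge (plain Java, not the Highlighter API):

```java
import java.util.*;

public class MergeFragmentsSketch {
    // fragments as {start, end} character offsets, sorted by start;
    // fragments that touch or overlap are merged into one range
    public static List<int[]> merge(List<int[]> fragments) {
        List<int[]> merged = new ArrayList<int[]>();
        for (int[] f : fragments) {
            int[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && f[0] <= last[1]) {
                last[1] = Math.max(last[1], f[1]); // contiguous: extend previous
            } else {
                merged.add(new int[] {f[0], f[1]});
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<int[]> frags = Arrays.asList(
            new int[] {0, 100}, new int[] {100, 180}, new int[] {400, 450});
        for (int[] m : merge(frags)) System.out.println(m[0] + ".." + m[1]);
        // 0..180 and 400..450 — the first two fragments were contiguous
    }
}
```

If oversized fragments are the problem, disabling the merge keeps each fragment near the configured length at the cost of choppier snippets.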
There are already some proposals for reforming the whole Document/Field API,
because it does not match a full-text search engine using an inverted index.
Stored fields and indexed fields should not be mixed together. The problem
then disappears, because you are forced to split between indexing and
storing
Yes, I would agree with you on the surprise aspect. :-)
But you suggest that hiding complexity and being in control / having
transparency are mutually exclusive, which isn't necessarily the case.
I think I can live with the decisions made. :-)
If I can think of a viable and complete alternative, I'll
On Nov 17, 2009, at 10:37 AM, Christopher Tignor wrote:
> Hello,
>
> Hoping someone might clear up a question for me:
>
> When Tokenizing we provide the start and end character offsets for each
> token locating it within the source text.
>
> If I tokenize the text "word" and then search for th
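The offset convention being asked about can be shown concretely: startOffset is the index of a token's first character and endOffset is one past its last, so for "word" at the start of the source text the offsets are 0 and 4, and substring(start, end) recovers the token. A minimal stdlib sketch:

```java
public class OffsetDemo {
    public static void main(String[] args) {
        String text = "word and more";
        int start = text.indexOf("word");   // startOffset: first char, here 0
        int end = start + "word".length();  // endOffset: one past last char, here 4
        System.out.println(start + ".." + end + " -> " + text.substring(start, end));
        // 0..4 -> word
    }
}
```

The half-open convention is what lets end minus start equal the token length and lets adjacent tokens share a boundary without overlapping.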