Hi Hans,
>
> I'm in the process of upgrading from 2.0 to 2.1, but am missing the
> similar contrib (the jar only contains a Manifest). Is this a bug, or
> is it on purpose?
Take a look in:
lucene-2.1.0/contrib/queries/
This is the new home; the changelog explains why the code moved...
Hello All,
I am implementing a query auto-complete function à la Google. Right now
I am using a TermEnum enumerator on a specific field and listing the terms
found.
That works well for searches with only one term, but when the user
types two or three words the function autocompletes each term
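The single-term enumeration described above can be sketched by seeding the TermEnum with a prefix term, so enumeration starts at the first matching term instead of scanning the whole field. This is a sketch against the Lucene 2.x API; the class and method names are my own, and the caller is assumed to hold an open IndexReader.

```java
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermEnum;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class PrefixCompleter {
    // Collects up to "max" completion candidates for "prefix" from one field.
    public static List suggest(IndexReader reader, String field, String prefix, int max)
            throws IOException {
        List result = new ArrayList();
        // terms(Term) positions the enumerator at the first term >= the given one
        TermEnum terms = reader.terms(new Term(field, prefix));
        try {
            do {
                Term t = terms.term();
                if (t == null || !t.field().equals(field) || !t.text().startsWith(prefix)) {
                    break; // we left the field or the prefix range
                }
                result.add(t.text());
            } while (terms.next() && result.size() < max);
        } finally {
            terms.close();
        }
        return result;
    }
}
```

For multi-word input, one option is to run this only on the last (partial) word and combine it with a query over the preceding complete words.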
hello,
With a SpanFirstQuery I want to realize a "starts with" search -
that seems to work fine. But I have the problem that I have documents
with multiple titles, and I thought I could do an sfq search for each title
by adding multiple instances of the specific field:
fo
Hi Daniel,
>> so a doc from 1973 should get a boost of 1.1973 and a doc of 1975 should
>> get a boost of 1.1975 .
>
> The boost is stored with a limited resolution. Try boosting one doc by 10,
> the other one by 20 or something like that.
You're right. I thought that with the float values the r
Hello all,
I am trying to boost more recent docs, i.e. docs with a greater year
value, like this:
if (title.getEJ() != null) {
    titleDocument.setBoost(new Float("1." + title.getEJ()));
}
so a doc from 1973 should get a boost of 1.1973 and a do
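The reply in this thread is right about the limited resolution: the document boost is folded into the field norm, which Lucene stores in a single byte (a 3-bit-mantissa, 5-bit-exponent encoding). The following is my own self-contained re-implementation of that encoding for illustration, not the shipped Lucene class; it shows why boosts of 1.1973 and 1.1975 collapse to the same stored value, while 10 vs. 20 do not.

```java
// Illustrative re-implementation of Lucene's byte-per-norm encoding
// (3 mantissa bits, 5 exponent bits, zero-point 15). Not the shipped class.
public class NormDemo {
    public static byte floatToByte315(float f) {
        int bits = Float.floatToRawIntBits(f);
        int smallfloat = bits >> (24 - 3);
        if (smallfloat <= ((63 - 15) << 3)) {
            return (bits <= 0) ? (byte) 0 : (byte) 1; // underflow: 0 or smallest positive
        }
        if (smallfloat >= ((63 - 15) << 3) + 0x100) {
            return -1; // overflow: clamp to the largest representable value
        }
        return (byte) (smallfloat - ((63 - 15) << 3));
    }

    public static void main(String[] args) {
        // Both year-based boosts map to the same byte, so ranking cannot tell them apart.
        System.out.println(floatToByte315(1.1973f) == floatToByte315(1.1975f)); // true
        // Coarser boosts like 10 vs. 20 survive the encoding.
        System.out.println(floatToByte315(10f) == floatToByte315(20f)); // false
    }
}
```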
Hi Wooi,
>Just wondering, has anyone used Digester to extract XML content and
> index the XML file? Is there any source that I can refer to on how to
> extract the XML contents? Or is there any other XML parser that is easier to
> use?
Perhaps this article may help:
http://www-128.ibm.com
spinergywmy wrote:
> Hi Erick,
>
>I did take a look at the link that you provided, and I have tried myself,
> but I get no result.
>
>My search string is "third party license readme"
>
Hmm, at a quick look I would suggest that you have to split the string
into individual terms.
hi all,
I would like to implement the possibility to search for "C++" and "C#" -
I found in the archive the hint to customize the appropriate *.jj file
with the code in NutchAnalysis.jj:
// irregular words
| <#IRREGULAR_WORD: (|)>
| <#C_PLUS_PLUS: ("C"|"c") "++" >
| <#C_SHARP: ("C"|"c") "#"
hi Erik,
> "action and" is likely not a single Term, so you'll want to create a
> SpanNearQuery of those individual terms (that match the way they were
> when analyzed and indexed, mind you) and use a SpanNearQuery inside a
> SpanFirstQuery. Make sense?
Yes, it works (see below)!
... but with my
in the title), but I get (correct) results for "action".
What am I doing wrong here?
tia,
martin
>
> Erik
>
>
> On Nov 14, 2006, at 8:32 AM, Martin Braun wrote:
>
>> hi,
>>
>> I would like to provide an exact "PrefixField Search"
hi,
I would like to provide an exact "PrefixField search", i.e. a search for
exactly the first words in a field.
I think I can't use a PrefixQuery because it would also find substrings
inside the field, e.g.
action* would find titles like "Action and knowledge" but also (that's
what I don't want it
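As the replies in this thread suggest, the exact-prefix behaviour can be built from span queries: wrap the (analyzed) terms in a SpanNearQuery with slop 0 and in-order matching, then anchor it at the start of the field with SpanFirstQuery. A sketch against the Lucene 2.x API; the field name and helper are made up, and the words must match what the analyzer produced at index time.

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.spans.SpanFirstQuery;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;

public class TitlePrefixQuery {
    // Matches documents whose field *begins* with the given words, in order.
    public static SpanFirstQuery startsWith(String field, String[] words) {
        SpanQuery[] clauses = new SpanQuery[words.length];
        for (int i = 0; i < words.length; i++) {
            clauses[i] = new SpanTermQuery(new Term(field, words[i]));
        }
        // slop 0, in order: the words must be adjacent and in sequence
        SpanNearQuery phrase = new SpanNearQuery(clauses, 0, true);
        // end position = word count, so the phrase must start at position 0
        return new SpanFirstQuery(phrase, words.length);
    }
}
// e.g. startsWith("title", new String[] {"action", "and"}) should match
// "Action and knowledge" but not a title with "action and" in the middle.
```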
WATHELET Thomas wrote:
> How do I update a field in Lucene?
>
I think you'll have to delete the whole doc and add the doc with the new
field to the index...
hth,
martin
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional
Hi Breck,
I have tried your tutorial and built (hopefully) a successful
SpellCheck.model file of
49 MB.
My Lucene index directory is 2.4 GB. When I try to read the model with the
readmodel function,
I get an "Exception in thread "main" java.lang.OutOfMemoryError: Java
heap space", though I started j
Hi Breck,
thanks for your answer.
>>
>> I am not really satisfied with Lucene's spellcheck contribution because
>> the index has some (many?) misspelled words, so the "did you mean" class
>> (from the java.net example) is good at finding similarly misspelled words.
>> With the similarWords function the
hi all,
does anybody have practical experience with LingPipe's spellchecker
(http://www.alias-i.com/lingpipe/demos/tutorial/querySpellChecker/read-me.html)?
I am not really satisfied with Lucene's spellcheck contribution because
the index has some (many?) misspelled words, so the did you mean clas
Hello Rajiv,
perhaps CAPTCHAs will solve your problem:
http://en.wikipedia.org/wiki/CAPTCHA
Many open-source PHP products, such as phpMyFAQ and phpBB, use them,
so you can take a look at their code.
hth,
martin
Original message
From: Rajiv Roopan <[EMAIL PROTECTED]>
Betre
Hello,
I would like to index the user-submitted queries to a given index. As a
result I want to provide something like: people who searched for
test also searched with these queries: +title:test +author:somename.
I think the simple approach of just adding the queries as a string in a
docu
hello,
I am using FieldCache.DEFAULT.getStrings in combination with my own
HitCollector (I loop through all results and count the number of
occurrences of a field value in the results).
My problem is that I have field values like dt.|lat or ger.|eng., and it
seems that only the last token of the field
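If the composite value (e.g. "dt.|lat") is available as a single untokenized term, the counting loop has to split it before tallying. A self-contained sketch of that loop; the class name and the shape of the inputs (a cached value per doc id, plus the collected hit ids) are my own illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

public class ValueCounter {
    // values[doc] holds the cached field value for each doc id;
    // hits holds the doc ids gathered by the HitCollector.
    public static Map countValues(String[] values, int[] hits) {
        Map counts = new HashMap();
        for (int i = 0; i < hits.length; i++) {
            String value = values[hits[i]];
            if (value == null) continue;
            // split composite values like "dt.|lat" into individual tokens
            StringTokenizer st = new StringTokenizer(value, "|");
            while (st.hasMoreTokens()) {
                String token = st.nextToken();
                Integer old = (Integer) counts.get(token);
                counts.put(token, new Integer(old == null ? 1 : old.intValue() + 1));
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] hits = {0, 1, 2};
        String[] values = {"dt.|lat", "ger.|eng.", "dt.|lat"};
        System.out.println(countValues(values, hits)); // dt. and lat counted twice
    }
}
```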
Hi Thomas,
> Is it possible to update fields in an existing index?
> If yes, how do I proceed?
>
I think you can only delete a document and then reindex the updated
document:
public static int delTitle(String ID) {
    try {
        return writer.deleteDocuments(new Term("ID", ID))
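Since Lucene has no in-place field update, the full cycle is delete-then-re-add. A sketch under the same assumptions as the fragment above (an IndexWriter and a unique "ID" field); the class and method names are illustrative.

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import java.io.IOException;

public class TitleUpdater {
    // "Update" = delete the old doc by its ID term, then add the rebuilt one.
    public static void updateTitle(IndexWriter writer, String id, Document updatedDoc)
            throws IOException {
        writer.deleteDocuments(new Term("ID", id)); // removes every doc with this ID
        writer.addDocument(updatedDoc);             // re-adds the full document
    }
}
```

Note that the new document must carry all its fields again, including the unchanged ones.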
hello ould,
sid'ahmed wrote:
> Hello,
> I indexed my document, but when I search for a web address it returns
> no result,
> and when I search the same address with a query like "http*" it returns
> a result.
It depends on which analyzer you use:
the StandardAnalyzer will do this with
Hello Adrian,
>> I am indexing some text in a java object that is "%772B" with the
>> standard analyser and Lucene 2.
>>
>> Should I be able to search for this with the same text as the query, or
>> do I need to do any escaping of characters?
Besides Luke there are the AnalyzerUtils from the LIA
hi andy,
> How can I use HitCollector to iterate over every returned document?
You have to override the collect method of the HitCollector class and
then store the retrieved data in an array or map.
Here is just a source-code scratch (is = IndexSearcher)
is.search(query, null
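For reference, a minimal HitCollector sketch in the spirit of that scratch, against the Lucene 2.x Searcher.search(Query, Filter, HitCollector) signature; the class name and the choice of a map are my own.

```java
import org.apache.lucene.search.HitCollector;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class CollectAll {
    // collect(int doc, float score) is called once per matching document.
    public static Map collect(IndexSearcher is, Query query) throws IOException {
        final Map docScores = new HashMap();
        is.search(query, null, new HitCollector() {   // null = no filter
            public void collect(int doc, float score) {
                docScores.put(new Integer(doc), new Float(score));
            }
        });
        return docScores;
    }
}
```

Unlike Hits, this visits every match with no lazy paging, so keep the work inside collect() cheap.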
Hi Yonik,
>> So a phrase search for "The xmen story" will fail. With a slop of 1 the
>> doc will be found.
>>
>> But when generating the query I won't know when to use a slop. So adding
>> slops isn't a nice solution.
>
> If you can't tolerate slop, this is a problem.
I use the WordDelimiterFilte
Hi John,
> Just for the record - I've been using javamail POP and IMAP providers in
> the past, and they were prone to hanging with some servers, and resource
> intensive. I've been also using Outlook (proper, not Outlook Express -
> this is AFAIK impossible to work with) via a Java-COM bridge suc
Hi Yonik,
>> I can't figure out what the parameters do. ;)
>
> Yes, it will fail without slop... I don't think there is a practical
> way around that.
I am trying to analyze your WordDelimiterFilter.
If I have x-men, after analyzing (with catenateAll) I get this:
Analyzing "The x-men story
Yonik Seeley wrote:
> On 7/23/06, karl wettin <[EMAIL PROTECTED]> wrote:
>> I want to filter words with a dash in them.
>>
>> ["x-men"]
>> ["xmen"]
>> ["x", "men"]
>>
>> All of above should be synonyms. The problem is ["x", "men"] requiring a
>> distance between the terms and thus also matching
hi herbert,
>> WhitespaceAnalyzer looks brutal. Is it possible that I keep
>> StandardAnalyzer and at the same time to tell the parser to keep a
>> list of chars during indexing?
Perhaps it would be sufficient to use the WhitespaceAnalyzer and keep
StandardAnalyzer for the other fields by using a
hi miles,
thanks for the response.
I think I didn't explain my problem well enough.
The harder problem for me is how to get the proposals for the
refinement. I have a date range of 16xx to now, for about 4 bn. docs.
So the number of found documents could be quite large. But the
distribution of t
Hello all,
I want to realize a drill-down Function aka "narrow search" aka "refine
search".
I want to have something like:
Refine by Date:
* 1990-2000 (30 Docs)
* 2001-2003 (200 Docs)
* 2004-2006 (10 Docs)
But not only DateRanges but also for other Categories.
What I have found in the List-Arc
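One common Lucene 2.x approach to those per-range counts is to intersect the main query's result bits with one filter per category and take the cardinality. A sketch; the "date" field, the range bounds, and the class name are made up for illustration.

```java
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.QueryFilter;
import org.apache.lucene.search.RangeQuery;
import java.io.IOException;
import java.util.BitSet;

public class DateFacets {
    // Counts how many of the main query's hits fall into [from, to] on "date".
    public static int countInRange(IndexReader reader, Query mainQuery,
                                   String from, String to) throws IOException {
        // clone so we never mutate a filter's cached bit set
        BitSet hits = (BitSet) new QueryFilter(mainQuery).bits(reader).clone();
        RangeQuery range = new RangeQuery(
                new Term("date", from), new Term("date", to), true); // inclusive
        BitSet inRange = new QueryFilter(range).bits(reader);
        hits.and(inRange);          // intersect: hits that are also in the range
        return hits.cardinality();  // e.g. the "30 Docs" in "1990-2000 (30 Docs)"
    }
}
```

Calling this once per displayed range gives the counts; for many categories, caching the per-range bit sets across queries avoids recomputing them.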
he example doc which may produce tokens that do not match those of the
> indexed content. Use setAnalyzer() to ensure they are in sync.
>
>
>
>
> - Original Message
> From: Martin Braun <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Frid
Hello,
inspired by this thread, I also tried to implement a MoreLikeThis
search. But I have the same problem of a null query.
I did set the field name to a field that is stored in the index.
But "like" just returns null.
Here is my code:
Hits hits = this.is.search(new Ter
Hi all,
I don't know who can update the wiki pages, so I am just mailing here.
The download of the spellchecker1.1.zip
contribution does not work with Lucene 2.0 anymore.
http://wiki.apache.org/jakarta-lucene/SpellChecker?highlight=spellchecker1.1.zip
So I wanted to build _only_ the spellcheck-contri
[EMAIL PROTECTED] wrote:
> hi,
>
> my problem is that I am using a MySQL DB in which one table is
> present, and I want to index each row in the table and then search.
>
> Please reply:
>
> how can this be done?
http://wiki.apache.org/jakarta-lucene/LuceneFAQ
How can I use Lucene to index a database?
Co
Hi chris,
> searching every time using a new searcher was taking time. So for testing, I
> made it static and reused the same one. This gave me a lot of
> improvement.
> Previously my query took approx. 25 sec. But now most of the queries
> take between 100 and 800 ms.
Do you
hi,
>
> I'm hardly the lucene expert, but I don't think you can search just a
> portion of the index. But that's effectively what you're doing if you
> restrict the search to "son and.".
I think there is also the possibility to write a custom search filter
(org.apache.lucene.search.Filter), an
Hello all,
German words are often dash-concatenated, e.g. West-Berlin, or something
like "C*-algebras and W*-algebras".
I tend to write my own analyzer, like the SynonymAnalyzer from the
LIA book. I want to index these words like this:
West-Berlin => Westberlin | West | Berlin | "West Berlin"
C*-
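The West-Berlin expansion above can be sketched as a TokenFilter in the style of LIA's SynonymFilter: split variants are injected at the same position (position increment 0) so every form matches at query time. This is my own illustrative code against the Lucene 2.x Token API, not the book's class.

```java
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import java.io.IOException;
import java.util.LinkedList;

// For "west-berlin" this emits the original token plus "westberlin",
// "west", and "berlin", all stacked on the same position.
public class DashSynonymFilter extends TokenFilter {
    private final LinkedList pending = new LinkedList();

    public DashSynonymFilter(TokenStream in) { super(in); }

    public Token next() throws IOException {
        if (!pending.isEmpty()) {
            return (Token) pending.removeFirst();
        }
        Token t = input.next();
        if (t == null) return null;
        String text = t.termText();
        int dash = text.indexOf('-');
        if (dash > 0 && dash < text.length() - 1) {
            addSynonym(t, text.substring(0, dash) + text.substring(dash + 1)); // "westberlin"
            addSynonym(t, text.substring(0, dash));                            // "west"
            addSynonym(t, text.substring(dash + 1));                           // "berlin"
        }
        return t;
    }

    private void addSynonym(Token orig, String text) {
        Token syn = new Token(text, orig.startOffset(), orig.endOffset());
        syn.setPositionIncrement(0); // stack on the same position as the original
        pending.add(syn);
    }
}
```

The phrase variant ("West Berlin" as two adjacent tokens) would need the split parts emitted with a position increment of 1 instead, which is a separate design choice.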