Hello ctorresl,
you can use QueryParser, which builds the query automatically from the
query syntax (as Erick showed).
Or use the BooleanQuery class directly:
BooleanQuery query = new BooleanQuery();
query.add(a_termquery, Occur.SHOULD);
query.add(other_termquery, Occur.SHOULD);
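To illustrate what combining SHOULD clauses means, here is a minimal sketch in plain Java. It is not Lucene code: the class AnyTermSearch and its methods are hypothetical names invented for this example, and it only mimics the matching rule (a document matches if it contains at least one of the query terms), with no scoring at all.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index (hypothetical, for illustration only; not a Lucene class).
final class AnyTermSearch {
    // term -> ids of the documents that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Index a document: lowercase the text and split it on whitespace.
    public void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
    }

    // OR semantics: union of the postings of every query term.
    public Set<Integer> searchAny(String query) {
        Set<Integer> hits = new TreeSet<>();
        for (String term : query.toLowerCase(Locale.ROOT).split("\\s+")) {
            hits.addAll(postings.getOrDefault(term, Collections.<Integer>emptySet()));
        }
        return hits;
    }

    public static void main(String[] args) {
        AnyTermSearch idx = new AnyTermSearch();
        idx.add(1, "a house by the sea");
        idx.add(2, "garden tools");
        idx.add(3, "nothing relevant");
        System.out.println(idx.searchAny("house garden computer")); // prints [1, 2]
    }
}
```

Documents 1 and 2 each contain one of the three terms, so both match; document 3 contains none and is left out.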
On Thu, Jan 28, 2010 at 11:15 AM, Erick Erickson w
Robert,
Thank you for this great information. Let me look into these suggestions.
Ivan
--- On Wed, 1/27/10, Robert Muir wrote:
> From: Robert Muir
> Subject: Re: Average Precision - TREC-3
> To: java-user@lucene.apache.org
> Date: Wednesday, January 27, 2010, 2:52 PM
> Hi Ivan, it sounds to
Have you looked at the query syntax?
See...
http://lucene.apache.org/java/3_0_0/queryparsersyntax.html
And the book Lucene In Action has many examples
HTH
Erick
On Wed, Jan 27, 2010 at 6:55 PM, ctorresl wrote:
>
> Hello:
> I'm working with Lucene for my thesis, please I need answers to
>
ctorresl wrote:
> Hello:
> I'm working with Lucene for my thesis, please I need answers to
> these questions:
> 1. How can I tell Lucene to search for more than one term??? (for example:
> the query "house garden computer" will return documents in which at least
> one of the
> term appears) What cl
Hello:
I'm working with Lucene for my thesis, and I need answers to
these questions:
1. How can I tell Lucene to search for more than one term? (For example,
the query "house garden computer" should return documents in which at least
one of the
terms appears.) What classes do I need to use?
2. Lucen
Hi Ivan, it sounds to me like you are going about it the right way.
I too have complained about different document/topic formats before, at
least with non-TREC test collections that claim to be in TREC format.
Here is a description of what I do, for what it's worth.
1. if you use the trunk benchma
Thank you, Jose.
-----Original Message-----
From: José Ramón Pérez Agüera [mailto:jose.agu...@gmail.com]
Sent: Wednesday, January 27, 2010 1:42 PM
To: java-user@lucene.apache.org
Subject: Re: Average Precision - TREC-3
Hi Ivan,
you might want to use the lucene BM25 implementation. Results should b
Hi Ivan,
you might want to use the Lucene BM25 implementation. Results should be
better after changing the ranking function. Another option is the language model
implementation for Lucene:
http://nlp.uned.es/~jperezi/Lucene-BM25/
http://ilps.science.uva.nl/resources/lm-lucene
The main problem with this imple
Robert, Grant:
Thank you for your replies.
Our goal is to fine-tune our existing system to perform better on relevance.
I agree with Robert's comment that these collections are not completely
compatible. Yes, it is possible that the results will vary some depending on
the collections differ
Thank you very much.
I have studied your comments; they are useful.
I am new to Lucene 3.0. I hope it works well.
On Thu, Jan 28, 2010 at 1:21 AM, Robert Muir wrote:
> no, but you can take the tokenfilter itself and simply use it in your
> lucene
> application.
>
> it uses the old tokenstream API s
no, but you can take the tokenfilter itself and simply use it in your lucene
application.
it uses the old tokenstream API so if you want to use Lucene 3.0 or 3.1, you
will need a version that works with the new tokenstream API.
There is a patch available here for that:
https://issues.apache.org/ji
Robert:
Is this in Lucene yet? According to what I could find in JIRA, it's
still open. And it's not in the Javadocs on a quick scan.
Erick
On Wed, Jan 27, 2010 at 11:08 AM, Robert Muir wrote:
> WordDelimiterFilter has a splitOnCaseChange option that should be useful
> for
> this:
>
> http
Hello, forgive my ignorance here (I have not worked with these English TREC
collections), but is the TREC-3 test collection the same as the test
collection used in the 2007 paper you referenced?
It looks like that is a different collection; it's not really possible to
compare these relevance scores
WordDelimiterFilter has a splitOnCaseChange option that should be useful for
this:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory
From the example: PowerShot -> Power, Shot
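If you only need the split-on-case-change behaviour outside of an analysis chain, a plain Java regex can approximate it. The sketch below is not WordDelimiterFilter and does far less than it does; the class name CamelCaseSplit and the regex are assumptions made for this illustration. It splits wherever a lowercase letter or digit is followed by an uppercase letter.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Lucene or Solr.
final class CamelCaseSplit {
    // Zero-width boundary: lowercase letter or digit on the left, uppercase letter on the right.
    private static final Pattern CASE_CHANGE = Pattern.compile("(?<=[a-z0-9])(?=[A-Z])");

    public static List<String> split(String word) {
        return Arrays.asList(CASE_CHANGE.split(word));
    }

    public static void main(String[] args) {
        System.out.println(split("PowerShot"));           // prints [Power, Shot]
        System.out.println(split("getXmlRule"));          // prints [get, Xml, Rule]
        System.out.println(split("setTokenizeAnalyzer")); // prints [set, Tokenize, Analyzer]
    }
}
```

Note that this simple rule leaves a word such as XMLParser in one piece, because it never sees a lowercase-to-uppercase transition there.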
On Wed, Jan 27, 2010 at 11:01 AM, Phan The Dai wrote:
> Can everyone suggest
On Jan 26, 2010, at 8:28 AM, Ivan Provalov wrote:
> We are looking into making some improvements to relevance ranking of our
> search platform based on Lucene. We started by running the Ad Hoc TREC task
> on the TREC-3 data using "out-of-the-box" Lucene. The reason to run this old
> TREC-3 (
Can anyone suggest a solution for tokenizing camelcase words in Java?
Examples for camelcase words are: getXmlRule, setTokenizeAnalyzer.
They should be tokenized to get, Xml, Rule, set, Tokenize, Analyzer.
Thank you very much!
On Wed, Jan 27, 2010 at 4:53 PM, Asif Nawaz wrote:
>
>
> IndexSearcher is = new IndexSearcher("index");
> IndexReader ir = is.getIndexReader().open("index");
> System.out.println("No of documents in index = " + ir.numDocs());
> The last statement shows no of documents = 167. that means IndexReader i
Hello ,
I successfully set up the Chinese analyzer (CJKAnalyzer) and can search
Chinese text. However, when I use the Boolean operator AND,
I always get 0 hits. Searching for the 2 Chinese terms without the
“AND” operator works without problems. When I want to count only the
IndexSearcher is = new IndexSearcher("index");
IndexReader ir = is.getIndexReader().open("index");
System.out.println("No of documents in index = " + ir.numDocs());
The last statement prints a document count of 167, which means the IndexReader is
reading from the index, which is open. I think the problem may
In the demo example for hotel database searching, I am confused about how to open the
index and where that code should go. In the SearchEngine.java file I opened the
index this way:
IndexSearcher is = new IndexSearcher(IndexReader.open("index"));
but it's not working and still returns 0 hits :(
> D
On Wed, Jan 27, 2010 at 4:25 AM, Jamie wrote:
> We got to the bottom of it.
Thanks for bringing closure!
> Turned out to be a status page that was opening
> the reader to obtain docCount but not closing it. Thanks for your help!
If you only need the docCount in the index, it's much faster to us
Lots of other things to check are listed in the FAQ:
http://wiki.apache.org/lucene-java/LuceneFAQ#Why_am_I_getting_no_hits_.2BAC8_incorrect_hits.3F
--
Ian.
On Wed, Jan 27, 2010 at 11:47 AM, Simon Willnauer
wrote:
> Do you open the searcher / reader after you call commit on the writer?
>
> sim
Do you open the searcher / reader after you call commit on the writer?
simon
On Wed, Jan 27, 2010 at 12:40 PM, Asif Nawaz wrote:
>
> ok. it works when i add commit and close indexes. when open the index file
> with Lukes, it shows me the list of documents that were matched. But in my
> progr
OK, it works when I add commit and close the indexes. When I open the index with
Luke, it shows me the list of documents that were matched. But in my program
it returns no. of hits = 0. Why?
Hits hits = se.performSearch("significance");
System.out.println("hits length = " + hits.length());
do you close your index writer or commit it before you open your searcher?
one more thing, if you search for "Hotel" you might not find anything
if the querystring is not passed through the StandardAnalyzer you use
for indexing. (well, or another analyzer that does lowercasing).
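The lowercasing point can be shown with a small plain Java sketch. This is not Lucene code: the analyze method below is a hypothetical stand-in for an analyzer that lowercases and splits on non-word characters, and the "index" is just a set of terms. It only illustrates why a raw query term can miss when the indexed terms were normalized.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of an index-time / query-time analysis mismatch.
final class AnalyzerMismatch {
    // Stand-in for a lowercasing analyzer (not a Lucene class).
    public static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // Terms as they would be stored after analysis at indexing time.
        Set<String> indexedTerms = new HashSet<>(analyze("Grand Hotel, Lisbon"));

        // Raw query term, not analyzed: no match, because the index holds "hotel".
        System.out.println(indexedTerms.contains("Hotel")); // prints false

        // Query term passed through the same analysis: match.
        System.out.println(indexedTerms.contains(analyze("Hotel").get(0))); // prints true
    }
}
```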
BTW. you email is
I built an index to store 100 docs, each with the fields author, title and
abstract.
for (i=0;i<100;i++) {
writer = new IndexWriter("index", new StandardAnalyzer(), true, IndexWriter.MaxFieldLength.UNLIMITED);
doc.add(new Field("author", cfcDoc.getAu(), Field.Store.YES, Field.Index.TOKENIZED));
do
Hi Jake
We got to the bottom of it. Turned out to be a status page that was
opening the reader to obtain docCount but not closing it. Thanks for your
help!
Jamie
On 2010/01/27 10:48 AM, Jamie wrote:
Hi Jake
Ok. The number of file handles left open is increasing rapidly. For
instance, 4200 f
Hi Jake
Ok. The number of file handles left open is increasing rapidly. For
instance, 4200 file handles were left open by Lucene 2.9.1 over a period
of 16 min. You can see in the attached snapshot a picture from JPicus
showing the file handles that are left open. These index files are
delet
On Wed, Jan 27, 2010 at 12:17 AM, Jamie wrote:
> Hi Jake
>
>
> You were indexing but not searching? So you are never calling getReader()
>> in the first place?
>>
>>
> Of course, the call exists, it's just that during testing we did not execute
> any searches at all.
Oh! Re-reading your initi
Hi Jake
You were indexing but not searching? So you are never calling getReader()
in the first place?
Of course, the call exists, it's just that during testing we did not
execute any searches at all.
How have you been doing search in a realtime fashion with Lucene before
2.9's introduction