Chris Hostetter wrote:
>
> unless i'm mistaken, docFreq isn't the only method affected by deleted
> docs; things like termDocs, termPositions, terms, ... pretty much all of
> the IndexReader methods work that way (even getFieldNames could be
> misleading if the only doc with a field of that name
Tom Roberts is out of the office until 3rd September 2007 and will get back to
you on his return.
http://www.luxonline.org.uk
http://www.lux.org.uk
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail:
- /** Returns the number of documents containing the term t.
+ /** Returns the number of documents, including deleted, containing the term t.
there is a note about this in the javadocs for deleteDocument, but i agree
it's not entirely clear ...
unless i'm mistaken, docFreq isn't the only
I was running into some problems that turned out to be an undocumented
feature. Here is a javadoc suggestion:
- /** Returns the number of documents containing the term <code>t</code>.
+ /** Returns the number of documents, including deleted, containing the term <code>t</code>.
* @throws IOException if the
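The behavior the suggested javadoc describes can be sketched with a toy inverted index (this is not Lucene code; class and method names here are made up for illustration). Deleting a document only marks it in a deletion set; the postings lists are not rewritten until a merge/optimize, which is why docFreq keeps counting the deleted doc:

```java
import java.util.*;

// Toy sketch of why docFreq includes deleted docs: deletion is only a
// mark, the postings list for each term is left untouched.
public class DocFreqSketch {
    private final Map<String, List<Integer>> postings = new HashMap<>();
    private final Set<Integer> deleted = new HashSet<>();

    void addDoc(int docId, String... terms) {
        for (String t : terms) {
            postings.computeIfAbsent(t, k -> new ArrayList<>()).add(docId);
        }
    }

    void deleteDocument(int docId) {
        deleted.add(docId);  // mark only; postings untouched
    }

    // Mirrors the documented behavior: counts postings, deleted or not.
    int docFreq(String term) {
        return postings.getOrDefault(term, Collections.emptyList()).size();
    }

    // What a searcher effectively sees after skipping deleted docs.
    int liveDocFreq(String term) {
        int n = 0;
        for (int d : postings.getOrDefault(term, Collections.emptyList())) {
            if (!deleted.contains(d)) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        DocFreqSketch ix = new DocFreqSketch();
        ix.addDoc(0, "lucene");
        ix.addDoc(1, "lucene");
        ix.deleteDocument(1);
        System.out.println(ix.docFreq("lucene"));     // 2 (includes deleted)
        System.out.println(ix.liveDocFreq("lucene")); // 1
    }
}
```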
Yes. JUnit would also be good, if you have it.
if you want to write some there is a lot of good helper code already out
there for making sure the hits and scores produced by a query match the
explanations produced by that same query...
[EMAIL PROTECTED]:~/svn/lucene-clean$ ls
src/test/or
you should probably send this question to the nutch user mailing list (or
perhaps the hadoop user mailing list) ... this is the mailing list for the
Lucene java API that is used by nutch ... nothing in your stack trace
seems to indicate any problems in any Lucene Java code.
When i run nutch, i a
Antoine Baudoux wrote:
>
>
> That's some good news!
>
> Any idea on the release date for 2.3?
We're aiming for a release in early October. Keep your fingers crossed ;)
- Michael
Give these tips a try to see if they help:
http://wiki.apache.org/lucene-java/LuceneFAQ#head-3558e5121806fb4fce80fc022d889484a9248b71
Luke is your friend.
Cheers,
Grant
On Aug 30, 2007, at 6:06 AM, prabin meitei wrote:
Hi,
I am trying to search from an idlist (string containing comma
s
On Aug 30, 2007, at 3:40 PM, Peter Keegan wrote:
There are a couple of minor bugs in BoostingTermQuery.explain().
1. The computation of average payload score produces NaN if no
payloads were
found. It should probably be:
float avgPayloadScore = super.score() * (payloadsSeen > 0 ?
(payload
20 aug 2007 kl. 14.33 skrev Michael McCandless:
"karl wettin" <[EMAIL PROTECTED]> wrote:
I want to set documents in my IndexReader as deleted, but I will
never commit these deletions. Sort of a filter on a reader rather
than on a searcher, and no write-locks.
I could go hacking in IndexR
There are a couple of minor bugs in BoostingTermQuery.explain().
1. The computation of average payload score produces NaN if no payloads were
found. It should probably be:
float avgPayloadScore = super.score() * (payloadsSeen > 0 ? (payloadScore / payloadsSeen) : 1);
2. If the average payload sco
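The NaN Peter describes comes from integer-zero division promoted to float: when payloadsSeen is 0, payloadScore / payloadsSeen is 0.0f / 0 == NaN, and NaN propagates through the product with super.score(). A minimal plain-Java sketch of the guarded form (extracted out of context; the method and variable names mirror the snippet above, not actual Lucene source):

```java
// Sketch of the proposed fix: guard the division with a ternary so an
// empty payload stream falls back to a neutral factor of 1 instead of NaN.
public class AvgPayloadFix {
    static float avgPayloadScore(float superScore, float payloadScore, int payloadsSeen) {
        return superScore * (payloadsSeen > 0 ? (payloadScore / payloadsSeen) : 1);
    }

    public static void main(String[] args) {
        // Unguarded form: 0.0f / 0 is NaN, and NaN poisons the product.
        System.out.println(2.0f * (0.0f / 0));               // NaN
        System.out.println(avgPayloadScore(2.0f, 0.0f, 0));  // 2.0
        System.out.println(avgPayloadScore(2.0f, 3.0f, 2));  // 3.0
    }
}
```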
SearchBlox uses the Lucene Search API and delivers out-of-the-box search
functionality for rapid deployment and easy administration. SearchBlox
provides integrated HTTP/HTTPS, File System and Feed crawlers, support for
various document formats including HTML, Word, PDF, PowerPoint and Excel,
suppo
If I use BoostingTermQuery on a query containing terms without payloads, I
get very different results than doing the same query with TermQuery.
Presumably, this is because the BoostingSpanScorer/SpanScorer compute scores
differently than TermScorer. Is there a way to make BoostingTermQuery behave
l
Hi all...
I am indexing PDF documents using pdfbox 7.4; it works fine for some PDF
files, but for Japanese PDF files it gives the exception below:
caught a class java.io.IOException
with message: Unknown encoding for 'UniJIS-UCS2-H'
Can anyone help me with how to set the encoding while reading pd
30 aug 2007 kl. 11.24 skrev Madhu:
Hi all..
I am trying to index a 5MB excel file, but while indexing using POI 3 it's
giving me an out of memory exception.
Does anyone know how to index large excel files?
Increase the maximum VM heap size?
http://blogs.sun.com/watt/resource/jvm-
Hi,
I am trying to search from an idlist (a string containing comma separated
numeric values),
eg:
QueryParser vParser = new QueryParser("idlist", new AlphanumAnalyzer()); //
analyzer using a custom LetterTokenizer which tokenizes numbers also; class is
given below.
Query q = vParser.parse("55"); // exa
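The tokenizer the poster describes (Lucene's LetterTokenizer extended to treat digits as token characters) can be sketched in plain Java like this. The class name and the sample idlist values are illustrative, not the poster's actual code:

```java
import java.util.*;

// Sketch of an alphanumeric tokenizer: like Lucene's LetterTokenizer,
// which keeps only isLetter() runs, but also accepting digits, so a
// comma separated idlist such as "55,1032,25" yields the numeric tokens
// instead of dropping them.
public class AlphanumTokenizerSketch {
    static boolean isTokenChar(char c) {
        return Character.isLetterOrDigit(c);  // LetterTokenizer uses isLetter only
    }

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (isTokenChar(c)) {
                cur.append(c);
            } else if (cur.length() > 0) {
                tokens.add(cur.toString());   // delimiter ends the token
                cur.setLength(0);
            }
        }
        if (cur.length() > 0) tokens.add(cur.toString());
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("55,1032,25")); // [55, 1032, 25]
    }
}
```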
On Aug 29, 2007, at 10:33 PM, George Aroush wrote:
Just read the thread. Unfortunately, it doesn't offer a solution.
As I read it, it offered a number of solutions:
* Twiddle the *.fnm files (carefully)
* Use string substitution on the users query, so "foo:whatever" ->
"bar:whatever" unde
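The second suggestion, rewriting field prefixes in the raw query string before it reaches QueryParser, can be done with a simple regex substitution. A minimal sketch; the field names "foo" and "bar" are just the placeholders from the thread:

```java
// Sketch of query-string field renaming: rewrite "foo:" prefixes to
// "bar:" before parsing. The \b word boundary prevents accidentally
// matching a longer field name such as "myfoo:".
public class FieldRename {
    static String renameField(String query, String from, String to) {
        return query.replaceAll("\\b" + from + ":", to + ":");
    }

    public static void main(String[] args) {
        System.out.println(renameField("foo:whatever AND x:y", "foo", "bar"));
        // bar:whatever AND x:y
    }
}
```

Note this naive version also rewrites matches inside quoted phrases; for user-facing code you would want to skip quoted sections or rename at the Query-object level instead.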
Hi all..
I am trying to index a 5MB excel file, but while indexing using POI 3 it's
giving me an out of memory exception.
Does anyone know how to index large excel files?
Le 30 Aug 2007 à 05:38, Michael Busch a écrit :
Chris Lu wrote:
Hi, Antoine,
It does take a long time to open the index reader.
One thing you could do is to put new documents into one smaller index and
re-open it; it should be much faster.
We're planning to add a reopen() method to Ind
Le 29 Aug 2007 à 23:33, Chris Lu a écrit :
Hi, Antoine,
It does take a long time to open the index reader.
One thing you could do is to put new documents into one smaller index and
re-open it; it should be much faster.
Yes, but there is the problem of deleted/updated documents. Your
so